diff mbox series

[v1] fuse: add numa affinity for uring queues

Message ID 20250331205709.1148069-1-joannelkoong@gmail.com (mailing list archive)
State New
Headers show
Series [v1] fuse: add numa affinity for uring queues | expand

Commit Message

Joanne Koong March 31, 2025, 8:57 p.m. UTC
There is a 1:1 mapping between cpus and queues. Allocate the queue on
the numa node associated with the cpu to help reduce memory access
latencies.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 fs/fuse/dev_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Joanne Koong March 31, 2025, 11:42 p.m. UTC | #1
On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> There is a 1:1 mapping between cpus and queues. Allocate the queue on
> the numa node associated with the cpu to help reduce memory access
> latencies.
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  fs/fuse/dev_uring.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> index accdce2977c5..0762d6229ac6 100644
> --- a/fs/fuse/dev_uring.c
> +++ b/fs/fuse/dev_uring.c
> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
>         struct fuse_ring_queue *queue;
>         struct list_head *pq;
>
> -       queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
> +       queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid));
>         if (!queue)
>                 return NULL;
>         pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);

On the same note I guess we should also allocate pq on the
corresponding numa node too.
> --
> 2.47.1
>
Bernd Schubert April 1, 2025, 8:11 a.m. UTC | #2
On 4/1/25 01:42, Joanne Koong wrote:
> On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>>
>> There is a 1:1 mapping between cpus and queues. Allocate the queue on
>> the numa node associated with the cpu to help reduce memory access
>> latencies.
>>
>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>> ---
>>  fs/fuse/dev_uring.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
>> index accdce2977c5..0762d6229ac6 100644
>> --- a/fs/fuse/dev_uring.c
>> +++ b/fs/fuse/dev_uring.c
>> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
>>         struct fuse_ring_queue *queue;
>>         struct list_head *pq;
>>
>> -       queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
>> +       queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid));
>>         if (!queue)
>>                 return NULL;
>>         pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);
> 
> On the same note I guess we should also allocate pq on the
> corresponding numa node too.

So this is supposed to be called from a thread that already runs on this
numa node and then kmalloc will allocate anyway on the right node,
afaik. Do you have a use case where this is called from another node? If
you do, all allocations in this file should be changed.


Thanks,
Bernd
Joanne Koong April 1, 2025, 5:50 p.m. UTC | #3
On Tue, Apr 1, 2025 at 1:11 AM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
> On 4/1/25 01:42, Joanne Koong wrote:
> > On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >>
> >> There is a 1:1 mapping between cpus and queues. Allocate the queue on
> >> the numa node associated with the cpu to help reduce memory access
> >> latencies.
> >>
> >> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> >> ---
> >>  fs/fuse/dev_uring.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> >> index accdce2977c5..0762d6229ac6 100644
> >> --- a/fs/fuse/dev_uring.c
> >> +++ b/fs/fuse/dev_uring.c
> >> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
> >>         struct fuse_ring_queue *queue;
> >>         struct list_head *pq;
> >>
> >> -       queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
> >> +       queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid));
> >>         if (!queue)
> >>                 return NULL;
> >>         pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);
> >
> > On the same note I guess we should also allocate pq on the
> > corresponding numa node too.
>
> So this is supposed to be called from a thread that already runs on this
> numa node and then kmalloc will allocate anyway on the right node,
> afaik. Do you have a use case where this is called from another node? If
> you do, all allocations in this file should be changed.
>

I don't have a use case I'm using but imo it seems hardier to ensure
this at the kernel level for queue, pq, and ent allocations instead of
assuming userspace will always submit the registration from the thread
on the numa node corresponding to the qid it's registering. I don't
feel strongly about this though so if you think the responsibility
should be left to userspace, then that's fine with me.

Thanks,
Joanne

>
> Thanks,
> Bernd
Joanne Koong April 1, 2025, 5:53 p.m. UTC | #4
On Tue, Apr 1, 2025 at 10:50 AM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Tue, Apr 1, 2025 at 1:11 AM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
> >
> > On 4/1/25 01:42, Joanne Koong wrote:
> > > On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >>
> > >> There is a 1:1 mapping between cpus and queues. Allocate the queue on
> > >> the numa node associated with the cpu to help reduce memory access
> > >> latencies.
> > >>
> > >> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > >> ---
> > >>  fs/fuse/dev_uring.c | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
> > >> index accdce2977c5..0762d6229ac6 100644
> > >> --- a/fs/fuse/dev_uring.c
> > >> +++ b/fs/fuse/dev_uring.c
> > >> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
> > >>         struct fuse_ring_queue *queue;
> > >>         struct list_head *pq;
> > >>
> > >> -       queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
> > >> +       queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid));
> > >>         if (!queue)
> > >>                 return NULL;
> > >>         pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);
> > >
> > > On the same note I guess we should also allocate pq on the
> > > corresponding numa node too.
> >
> > So this is supposed to be called from a thread that already runs on this
> > numa node and then kmalloc will allocate anyway on the right node,
> > afaik. Do you have a use case where this is called from another node? If
> > you do, all allocations in this file should be changed.
> >
>
> I don't have a use case I'm using but imo it seems hardier to ensure
> this at the kernel level for queue, pq, and ent allocations instead of
> assuming userspace will always submit the registration from the thread
> on the numa node corresponding to the qid it's registering. I don't
> feel strongly about this though so if you think the responsibility
> should be left to userspace, then that's fine with me.

Thinking about this some more, if the responsibility for numa affinity
should be left to userspace, then that should hold true too for
configurable queues, no?

Thanks,
Joanne

>
> Thanks,
> Joanne
>
> >
> > Thanks,
> > Bernd
Bernd Schubert April 1, 2025, 5:56 p.m. UTC | #5
On 4/1/25 19:50, Joanne Koong wrote:
> On Tue, Apr 1, 2025 at 1:11 AM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
>>
>> On 4/1/25 01:42, Joanne Koong wrote:
>>> On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>>>>
>>>> There is a 1:1 mapping between cpus and queues. Allocate the queue on
>>>> the numa node associated with the cpu to help reduce memory access
>>>> latencies.
>>>>
>>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>>>> ---
>>>>  fs/fuse/dev_uring.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
>>>> index accdce2977c5..0762d6229ac6 100644
>>>> --- a/fs/fuse/dev_uring.c
>>>> +++ b/fs/fuse/dev_uring.c
>>>> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
>>>>         struct fuse_ring_queue *queue;
>>>>         struct list_head *pq;
>>>>
>>>> -       queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
>>>> +       queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid));
>>>>         if (!queue)
>>>>                 return NULL;
>>>>         pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);
>>>
>>> On the same note I guess we should also allocate pq on the
>>> corresponding numa node too.
>>
>> So this is supposed to be called from a thread that already runs on this
>> numa node and then kmalloc will allocate anyway on the right node,
>> afaik. Do you have a use case where this is called from another node? If
>> you do, all allocations in this file should be changed.
>>
> 
> I don't have a use case I'm using but imo it seems hardier to ensure
> this at the kernel level for queue, pq, and ent allocations instead of
> assuming userspace will always submit the registration from the thread
> on the numa node corresponding to the qid it's registering. I don't
> feel strongly about this though so if you think the responsibility
> should be left to userspace, then that's fine with me.
> 

What I basically mean is that these threads are going to send the
initial SQE, shouldn't it? Sounds a bit weird to me if the setup and
sending the registration SQE are from different threads than the later
handling. 


Thanks,
Bernd
diff mbox series

Patch

diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c
index accdce2977c5..0762d6229ac6 100644
--- a/fs/fuse/dev_uring.c
+++ b/fs/fuse/dev_uring.c
@@ -256,7 +256,7 @@  static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring,
 	struct fuse_ring_queue *queue;
 	struct list_head *pq;
 
-	queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT);
+	queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid));
 	if (!queue)
 		return NULL;
 	pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);