Message ID | 20240918083552.77531-1-haakon.bugge@oracle.com (mailing list archive) |
---|---|
Headers | show |
Series | Enable DIM for legacy ULPs and use it in RDS | expand |
On Wed, Sep 18, 2024 at 10:35:50AM +0200, Håkon Bugge wrote: > The Dynamic Interrupt Moderation mechanism can only be used by ULPs > using ib_alloc_cq() and family. We extend DIM to also cover legacy > ULPs using ib_create_cq(). The last commit takes advantage of this end > uses DIM in RDS. I would much prefer if you could move RDS off that horrible API finally instead of investing more effort into it and making it more complicated.
Hi Christoph, > On 19 Sep 2024, at 16:17, Christoph Hellwig <hch@infradead.org> wrote: > > On Wed, Sep 18, 2024 at 10:35:50AM +0200, Håkon Bugge wrote: >> The Dynamic Interrupt Moderation mechanism can only be used by ULPs >> using ib_alloc_cq() and family. We extend DIM to also cover legacy >> ULPs using ib_create_cq(). The last commit takes advantage of this end >> uses DIM in RDS. > > I would much prefer if you could move RDS off that horrible API finally > instead of investing more effort into it and making it more complicated. ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. Thxs, Håkon
On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: > > I would much prefer if you could move RDS off that horrible API finally > > instead of investing more effort into it and making it more complicated. > > ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. Then work on supporting it. RDS and SMC are the only users, so one of the maintainers needs to drive it.
在 2024/9/20 21:51, Christoph Hellwig 写道: > On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: >>> I would much prefer if you could move RDS off that horrible API finally >>> instead of investing more effort into it and making it more complicated. >> >> ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. > > Then work on supporting it. RDS and SMC are the only users, so one Some other open source projects are also the users. Zhu Yanjun > of the maintainers needs to drive it. >
On Sat, Sep 21, 2024 at 10:28:48AM +0800, Zhu Yanjun wrote: > 在 2024/9/20 21:51, Christoph Hellwig 写道: > > On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: > > > > I would much prefer if you could move RDS off that horrible API finally > > > > instead of investing more effort into it and making it more complicated. > > > > > > ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. > > > > Then work on supporting it. RDS and SMC are the only users, so one > > Some other open source projects are also the users. What are these projects? I see only RDS and SMC. Thanks > > Zhu Yanjun > > > of the maintainers needs to drive it. > > > >
On Sat, Sep 21, 2024 at 10:28:48AM +0800, Zhu Yanjun wrote: > 在 2024/9/20 21:51, Christoph Hellwig 写道: > > On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: > > > > I would much prefer if you could move RDS off that horrible API finally > > > > instead of investing more effort into it and making it more complicated. > > > > > > ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. > > > > Then work on supporting it. RDS and SMC are the only users, so one > > Some other open source projects are also the users. I'm not sure what you mean with "open source projects", but the only thing that matters is users in the kernel tree.
在 2024/9/23 20:03, Christoph Hellwig 写道: > On Sat, Sep 21, 2024 at 10:28:48AM +0800, Zhu Yanjun wrote: >> 在 2024/9/20 21:51, Christoph Hellwig 写道: >>> On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: >>>>> I would much prefer if you could move RDS off that horrible API finally >>>>> instead of investing more effort into it and making it more complicated. >>>> >>>> ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. >>> >>> Then work on supporting it. RDS and SMC are the only users, so one >> >> Some other open source projects are also the users. > > I'm not sure what you mean with "open source projects", but the only > thing that matters is users in the kernel tree. The users that I mentioned is not in the kernel tree. Best Regards, Zhu Yanjun >
On Tue, Sep 24, 2024 at 09:58:24AM +0800, Zhu Yanjun wrote:
> The users that I mentioned is not in the kernel tree.
And why do you think that would matter the slightest?
On Fri, Sep 20, 2024 at 06:51:18AM -0700, Christoph Hellwig wrote: > On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: > > > I would much prefer if you could move RDS off that horrible API finally > > > instead of investing more effort into it and making it more complicated. > > > > ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. > > Then work on supporting it. RDS and SMC are the only users, so one > of the maintainers needs to drive it. I took a quick look at what it would take, and adding IB_CQ_SOLICITED support to the cq API looks pretty trivial, you'll just need to pass it to ib_cq_pool_get by adding a new argument and the pass it down to __ib_alloc_cq. So yes, please get started at moving RDS out of the stone age.
> On 24 Sep 2024, at 09:01, Christoph Hellwig <hch@infradead.org> wrote: > > On Fri, Sep 20, 2024 at 06:51:18AM -0700, Christoph Hellwig wrote: >> On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: >>>> I would much prefer if you could move RDS off that horrible API finally >>>> instead of investing more effort into it and making it more complicated. >>> >>> ib_alloc_cq() and family does not support arming the CQ with the IB_CQ_SOLICITED flag, which RDS uses. >> >> Then work on supporting it. RDS and SMC are the only users, so one >> of the maintainers needs to drive it. > > I took a quick look at what it would take, and adding IB_CQ_SOLICITED > support to the cq API looks pretty trivial, you'll just need to pass > it to ib_cq_pool_get by adding a new argument and the pass it > down to __ib_alloc_cq. So yes, please get started at moving RDS out > of the stone age. Agree. I'll work on that. I am also inclined to improve it by having designated CPUs where the worker threads polling the CQs execute, as that often improves cache hit rates. But that will come as another commit. Thxs, Håkon
在 2024/9/24 14:54, Christoph Hellwig 写道: > On Tue, Sep 24, 2024 at 09:58:24AM +0800, Zhu Yanjun wrote: >> The users that I mentioned is not in the kernel tree. > > And why do you think that would matter the slightest? I noticed that the same cq functions are used. And I also made tests with this patch series. Without this patch series, dim mechanism will not be invoked. Zhu Yanjun >
> On 24 Sep 2024, at 15:59, Zhu Yanjun <yanjun.zhu@linux.dev> wrote: > > 在 2024/9/24 14:54, Christoph Hellwig 写道: >> On Tue, Sep 24, 2024 at 09:58:24AM +0800, Zhu Yanjun wrote: >>> The users that I mentioned is not in the kernel tree. >> And why do you think that would matter the slightest? > > I noticed that the same cq functions are used. And I also made tests with this patch series. Without this patch series, dim mechanism will not be invoked. Christoph alluded to say: Do not modify the old cq_create_cq() code in order to support DIM, it is better to change the ULP to use ib_alloc_cq(), and get DIM enabled that way. Thxs, Håkon
On Tue, 2024-09-24 at 08:59 +0000, Haakon Bugge wrote: > > > > On 24 Sep 2024, at 09:01, Christoph Hellwig <hch@infradead.org> > > wrote: > > > > On Fri, Sep 20, 2024 at 06:51:18AM -0700, Christoph Hellwig wrote: > > > On Fri, Sep 20, 2024 at 09:46:06AM +0000, Haakon Bugge wrote: > > > > > I would much prefer if you could move RDS off that horrible > > > > > API finally > > > > > instead of investing more effort into it and making it more > > > > > complicated. > > > > > > > > ib_alloc_cq() and family does not support arming the CQ with > > > > the IB_CQ_SOLICITED flag, which RDS uses. > > > > > > Then work on supporting it. RDS and SMC are the only users, so > > > one > > > of the maintainers needs to drive it. > > > > I took a quick look at what it would take, and adding > > IB_CQ_SOLICITED > > support to the cq API looks pretty trivial, you'll just need to > > pass > > it to ib_cq_pool_get by adding a new argument and the pass it > > down to __ib_alloc_cq. So yes, please get started at moving RDS > > out > > of the stone age. > > Agree. I'll work on that. I am also inclined to improve it by having > designated CPUs where the worker threads polling the CQs execute, as > that often improves cache hit rates. But that will come as another > commit. > > > Thxs, Håkon > Ok, sounds like a good plan then. Thanks for working on this Haakon! Allison
在 2024/9/24 23:16, Haakon Bugge 写道: > > >> On 24 Sep 2024, at 15:59, Zhu Yanjun <yanjun.zhu@linux.dev> wrote: >> >> 在 2024/9/24 14:54, Christoph Hellwig 写道: >>> On Tue, Sep 24, 2024 at 09:58:24AM +0800, Zhu Yanjun wrote: >>>> The users that I mentioned is not in the kernel tree. >>> And why do you think that would matter the slightest? >> >> I noticed that the same cq functions are used. And I also made tests with this patch series. Without this patch series, dim mechanism will not be invoked. > > Christoph alluded to say: Do not modify the old cq_create_cq() code in order to support DIM, it is better to change the ULP to use ib_alloc_cq(), and get DIM enabled that way. Hi, Haakon To be honest, I like your original commit that enable DIM for legacy ULPs because this can fix this problem once for all and improve the old ib_create_cq function. The idea from Christoph will cause a lot of changes in ULPs. I am not very sure if these changes cause risks or not. Thus, I prefer to your original commit. But I will follow the advice from Leon and Jason. Zhu Yanjun > > > Thxs, Håkon >
On Wed, Sep 25, 2024 at 10:04:21AM +0800, Zhu Yanjun wrote: > 在 2024/9/24 23:16, Haakon Bugge 写道: > > > > > > > On 24 Sep 2024, at 15:59, Zhu Yanjun <yanjun.zhu@linux.dev> wrote: > > > > > > 在 2024/9/24 14:54, Christoph Hellwig 写道: > > > > On Tue, Sep 24, 2024 at 09:58:24AM +0800, Zhu Yanjun wrote: > > > > > The users that I mentioned is not in the kernel tree. > > > > And why do you think that would matter the slightest? > > > > > > I noticed that the same cq functions are used. And I also made tests with this patch series. Without this patch series, dim mechanism will not be invoked. > > > > Christoph alluded to say: Do not modify the old cq_create_cq() code in order to support DIM, it is better to change the ULP to use ib_alloc_cq(), and get DIM enabled that way. > > Hi, Haakon > > To be honest, I like your original commit that enable DIM for legacy ULPs > because this can fix this problem once for all and improve the old > ib_create_cq function. > > The idea from Christoph will cause a lot of changes in ULPs. I am not very > sure if these changes cause risks or not. > > Thus, I prefer to your original commit. But I will follow the advice from > Leon and Jason. Christoph was very clear and he summarized our position very well. We said similar thing to SMC folks in 2022 [1] and RDS is no different here. So no, "old ib_create_cq" shouldn't be used by ULPs. Thanks [1] https://lore.kernel.org/netdev/YePesYRnrKCh1vFy@unreal/ > > Zhu Yanjun > > > > > > > Thxs, Håkon > > > >
在 2024/9/25 17:26, Leon Romanovsky 写道: > On Wed, Sep 25, 2024 at 10:04:21AM +0800, Zhu Yanjun wrote: >> 在 2024/9/24 23:16, Haakon Bugge 写道: >>> >>> >>>> On 24 Sep 2024, at 15:59, Zhu Yanjun <yanjun.zhu@linux.dev> wrote: >>>> >>>> 在 2024/9/24 14:54, Christoph Hellwig 写道: >>>>> On Tue, Sep 24, 2024 at 09:58:24AM +0800, Zhu Yanjun wrote: >>>>>> The users that I mentioned is not in the kernel tree. >>>>> And why do you think that would matter the slightest? >>>> >>>> I noticed that the same cq functions are used. And I also made tests with this patch series. Without this patch series, dim mechanism will not be invoked. >>> >>> Christoph alluded to say: Do not modify the old cq_create_cq() code in order to support DIM, it is better to change the ULP to use ib_alloc_cq(), and get DIM enabled that way. >> >> Hi, Haakon >> >> To be honest, I like your original commit that enable DIM for legacy ULPs >> because this can fix this problem once for all and improve the old >> ib_create_cq function. >> >> The idea from Christoph will cause a lot of changes in ULPs. I am not very >> sure if these changes cause risks or not. >> >> Thus, I prefer to your original commit. But I will follow the advice from >> Leon and Jason. > > Christoph was very clear and he summarized our position very well. We > said similar thing to SMC folks in 2022 [1] and RDS is no different here. Thanks, Leon. I will read this link carefully. > > So no, "old ib_create_cq" shouldn't be used by ULPs. Got it. I have replaced ib_create_cq with ib cq pool APIs. Perhaps drivers/infiniband/ulp/srpt/ib_srpt.c is a good example to use ib cq pool APIs. Best Regards, Zhu Yanjun > > Thanks > > [1] https://lore.kernel.org/netdev/YePesYRnrKCh1vFy@unreal/ > >> >> Zhu Yanjun >> >>> >>> >>> Thxs, Håkon >>> >> >>