Message ID | 20171006214243.11296-4-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Fri, Oct 06, 2017 at 02:42:39PM -0700, Bart Van Assche wrote: > Although the non-SRQ mode needs more resources that mode has three > advantages over SRQ: > - It works with all RDMA adapters, even those that do not support > SRQ. > - Posting WRs and polling WCs does not trigger lock contention > because only one thread at a time accesses a WR or WC queue in > non-SRQ mode. > - The end-to-end flow control mechanism is used. > > From the IB spec: > > C9-150.2.1: For QPs that are not associated with an SRQ, each HCA > receive queue shall generate end-to-end flow control credits. If > a QP is associated with an SRQ, the HCA receive queue shall not > generate end-to-end flow control credits. > > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > --- > drivers/infiniband/ulp/srpt/ib_srpt.c | 130 +++++++++++++++++++++++++--------- > drivers/infiniband/ulp/srpt/ib_srpt.h | 4 ++ > 2 files changed, 99 insertions(+), 35 deletions(-) > > diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c > index 6cf95ad870cc..a21b7a96635c 100644 > --- a/drivers/infiniband/ulp/srpt/ib_srpt.c > +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c > @@ -75,11 +75,19 @@ module_param(srp_max_req_size, int, 0444); > MODULE_PARM_DESC(srp_max_req_size, > "Maximum size of SRP request messages in bytes."); > > +static bool use_srq; > +module_param(use_srq, bool, S_IRUGO | S_IWUSR); > +MODULE_PARM_DESC(use_srq, "Whether or not to use SRQ"); > + It is a little bit strange to ask from user to decide if his adapter supports SRQ or not. It should be automatically. > static int srpt_srq_size = DEFAULT_SRPT_SRQ_SIZE; > module_param(srpt_srq_size, int, 0444); > MODULE_PARM_DESC(srpt_srq_size, > "Shared receive queue (SRQ) size."); > > +static int srpt_sq_size = DEF_SRPT_SQ_SIZE; > +module_param(srpt_sq_size, int, 0444); > +MODULE_PARM_DESC(srpt_sq_size, "Per-channel send queue (SQ) size."); > + Thanks.
On Sun, 2017-10-08 at 13:03 +0300, Leon Romanovsky wrote: > On Fri, Oct 06, 2017 at 02:42:39PM -0700, Bart Van Assche wrote: > > Although the non-SRQ mode needs more resources that mode has three > > advantages over SRQ: > > - It works with all RDMA adapters, even those that do not support > > SRQ. > > - Posting WRs and polling WCs does not trigger lock contention > > because only one thread at a time accesses a WR or WC queue in > > non-SRQ mode. > > - The end-to-end flow control mechanism is used. > > > > From the IB spec: > > > > C9-150.2.1: For QPs that are not associated with an SRQ, each > > HCA > > receive queue shall generate end-to-end flow control credits. > > If > > a QP is associated with an SRQ, the HCA receive queue shall not > > generate end-to-end flow control credits. > > > > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > > --- > > drivers/infiniband/ulp/srpt/ib_srpt.c | 130 > > +++++++++++++++++++++++++--------- > > drivers/infiniband/ulp/srpt/ib_srpt.h | 4 ++ > > 2 files changed, 99 insertions(+), 35 deletions(-) > > > > diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c > > b/drivers/infiniband/ulp/srpt/ib_srpt.c > > index 6cf95ad870cc..a21b7a96635c 100644 > > --- a/drivers/infiniband/ulp/srpt/ib_srpt.c > > +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c > > @@ -75,11 +75,19 @@ module_param(srp_max_req_size, int, 0444); > > MODULE_PARM_DESC(srp_max_req_size, > > "Maximum size of SRP request messages in > > bytes."); > > > > +static bool use_srq; > > +module_param(use_srq, bool, S_IRUGO | S_IWUSR); > > +MODULE_PARM_DESC(use_srq, "Whether or not to use SRQ"); > > + > > It is a little bit strange to ask from user to decide if his adapter > supports SRQ or not. > > It should be automatically. I think Bart's intent is that the driver not use SRQ as the default behavior even if the adapter supports it, so querying the adapter for support and enabling it if it exists would not achieve his desired result. This would then be used to override that behavior. Is that correct Bart?
On Mon, 2017-10-09 at 12:56 -0400, Doug Ledford wrote: > On Sun, 2017-10-08 at 13:03 +0300, Leon Romanovsky wrote: > > It is a little bit strange to ask from user to decide if his adapter > > supports SRQ or not. > > > > It should be automatically. > > I think Bart's intent is that the driver not use SRQ as the default > behavior even if the adapter supports it, so querying the adapter for > support and enabling it if it exists would not achieve his desired > result. This would then be used to override that behavior. Is that > correct Bart? Hello Leon and Doug, The changes realized by this patch are: - Instead of using SRQ as default, use non-SRQ mode as default. - If SRQ has been chosen as default, and if SRQ is not supported, fall back to non-SRQ mode (see also the if (IS_ERR(sdev->srq)) ... code). Please let me know if you have any further questions about this patch. Bart.
On Mon, 2017-10-09 at 17:01 +0000, Bart Van Assche wrote: > On Mon, 2017-10-09 at 12:56 -0400, Doug Ledford wrote: > > On Sun, 2017-10-08 at 13:03 +0300, Leon Romanovsky wrote: > > > It is a little bit strange to ask from user to decide if his > > > adapter > > > supports SRQ or not. > > > > > > It should be automatically. > > > > I think Bart's intent is that the driver not use SRQ as the default > > behavior even if the adapter supports it, so querying the adapter > > for > > support and enabling it if it exists would not achieve his desired > > result. This would then be used to override that behavior. Is > > that > > correct Bart? > > Hello Leon and Doug, > > The changes realized by this patch are: > - Instead of using SRQ as default, use non-SRQ mode as default. > - If SRQ has been chosen as default, and if SRQ is not supported, > fall back > to non-SRQ mode (see also the if (IS_ERR(sdev->srq)) ... code). > > Please let me know if you have any further questions about this > patch. Nope, I'm good.
On Mon, Oct 09, 2017 at 05:01:33PM +0000, Bart Van Assche wrote: > On Mon, 2017-10-09 at 12:56 -0400, Doug Ledford wrote: > > On Sun, 2017-10-08 at 13:03 +0300, Leon Romanovsky wrote: > > > It is a little bit strange to ask from user to decide if his adapter > > > supports SRQ or not. > > > > > > It should be automatically. > > > > I think Bart's intent is that the driver not use SRQ as the default > > behavior even if the adapter supports it, so querying the adapter for > > support and enabling it if it exists would not achieve his desired > > result. This would then be used to override that behavior. Is that > > correct Bart? > > Hello Leon and Doug, > > The changes realized by this patch are: > - Instead of using SRQ as default, use non-SRQ mode as default. > - If SRQ has been chosen as default, and if SRQ is not supported, fall back > to non-SRQ mode (see also the if (IS_ERR(sdev->srq)) ... code). > > Please let me know if you have any further questions about this patch. Yes, in case HCA supports SRQ, when do you set that module parameter? In the commit message, you mentioned disadvantages of using SRQ is a default and among them - locks contention, which can be changed in the future. Won't it mean that users stuck with current default, because change of default will "break" their scripts? Setting visible to user default won't allow us to change SRP behavior in the future. I wouldn't recommend to make such option accessible by users. Thanks > > Bart.
On 10/10/2017 12:14 AM, Leon Romanovsky wrote: > On Mon, Oct 09, 2017 at 05:01:33PM +0000, Bart Van Assche wrote: >> On Mon, 2017-10-09 at 12:56 -0400, Doug Ledford wrote: >>> On Sun, 2017-10-08 at 13:03 +0300, Leon Romanovsky wrote: >>>> It is a little bit strange to ask from user to decide if his adapter >>>> supports SRQ or not. >>>> >>>> It should be automatically. >>> >>> I think Bart's intent is that the driver not use SRQ as the default >>> behavior even if the adapter supports it, so querying the adapter for >>> support and enabling it if it exists would not achieve his desired >>> result. This would then be used to override that behavior. Is that >>> correct Bart? >> >> Hello Leon and Doug, >> >> The changes realized by this patch are: >> - Instead of using SRQ as default, use non-SRQ mode as default. >> - If SRQ has been chosen as default, and if SRQ is not supported, fall back >> to non-SRQ mode (see also the if (IS_ERR(sdev->srq)) ... code). >> >> Please let me know if you have any further questions about this patch. > > Yes, in case HCA supports SRQ, when do you set that module parameter? You set it in your /etc/modprobe.d/ib_srp.conf file or the equivalent in your OS. > In the commit message, you mentioned disadvantages of using SRQ is a > default and among them - locks contention, which can be changed in the > future. Won't it mean that users stuck with current default, because > change of default will "break" their scripts? No, it won't. If you change the default, you don't remove the variable, you just change what its setting is. Then existing modprobe.d files become redundant, but nothing breaks. People that don't want the new setting add a new file to the modprobe.d directory to change the option. > Setting visible to user default won't allow us to change SRP behavior in > the future. No it doesn't. > I wouldn't recommend to make such option accessible by users. > > Thanks > >> >> Bart.
On Tue, Oct 10, 2017 at 10:34:19AM -0400, Doug Ledford wrote: > On 10/10/2017 12:14 AM, Leon Romanovsky wrote: > > On Mon, Oct 09, 2017 at 05:01:33PM +0000, Bart Van Assche wrote: > >> On Mon, 2017-10-09 at 12:56 -0400, Doug Ledford wrote: > >>> On Sun, 2017-10-08 at 13:03 +0300, Leon Romanovsky wrote: > >>>> It is a little bit strange to ask from user to decide if his adapter > >>>> supports SRQ or not. > >>>> > >>>> It should be automatically. > >>> > >>> I think Bart's intent is that the driver not use SRQ as the default > >>> behavior even if the adapter supports it, so querying the adapter for > >>> support and enabling it if it exists would not achieve his desired > >>> result. This would then be used to override that behavior. Is that > >>> correct Bart? > >> > >> Hello Leon and Doug, > >> > >> The changes realized by this patch are: > >> - Instead of using SRQ as default, use non-SRQ mode as default. > >> - If SRQ has been chosen as default, and if SRQ is not supported, fall back > >> to non-SRQ mode (see also the if (IS_ERR(sdev->srq)) ... code). > >> > >> Please let me know if you have any further questions about this patch. > > > > Yes, in case HCA supports SRQ, when do you set that module parameter? > > You set it in your /etc/modprobe.d/ib_srp.conf file or the equivalent in > your OS. Doug, But my question was "when" and not "how". When should I set this parameter to true? > > > In the commit message, you mentioned disadvantages of using SRQ is a > > default and among them - locks contention, which can be changed in the > > future. Won't it mean that users stuck with current default, because > > change of default will "break" their scripts? > > No, it won't. If you change the default, you don't remove the variable, > you just change what its setting is. Then existing modprobe.d files > become redundant, but nothing breaks. People that don't want the new > setting add a new file to the modprobe.d directory to change the option. Not accurate, now I won't set any parameter because I'm relying on the fact that the default is without SRQ. Once the default will be changed, it will break my assumption. > > > Setting visible to user default won't allow us to change SRP behavior in > > the future. > > No it doesn't. > > > I wouldn't recommend to make such option accessible by users. > > > > Thanks > > > >> > >> Bart. > > > -- > Doug Ledford <dledford@redhat.com> > GPG Key ID: B826A3330E572FDD > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD >
On Tue, 2017-10-10 at 18:00 +0300, Leon Romanovsky wrote: > On Tue, Oct 10, 2017 at 10:34:19AM -0400, Doug Ledford wrote: > > On 10/10/2017 12:14 AM, Leon Romanovsky wrote: > > > > Doug, > But my question was "when" and not "how". When should I set this > parameter to true? Whenever you want. When do you set the maximum TCP window size to 4MB? If you have an issue with things being one way, you try another. Things like this are system specific and often there is no universal answer, especially since the answer here could very easily depend on things like how many targets you are connecting to, how fast those targets are, etc. > > > > > In the commit message, you mentioned disadvantages of using SRQ > > > is a > > > default and among them - locks contention, which can be changed > > > in the > > > future. Won't it mean that users stuck with current default, > > > because > > > change of default will "break" their scripts? > > > > No, it won't. If you change the default, you don't remove the > > variable, > > you just change what its setting is. Then existing modprobe.d > > files > > become redundant, but nothing breaks. People that don't want the > > new > > setting add a new file to the modprobe.d directory to change the > > option. > > Not accurate, now I won't set any parameter because I'm relying on > the > fact that the default is without SRQ. Once the default will be > changed, > it will break my assumption. So? Your assumptions are never a reason to prevent ongoing development. By this argument, we would never have been able to turn on SCSI Multi-queue by default because when it was first added to the kernel it was off by default and people might have made an assumption that we would later break when we turned it on by default.
On Tue, Oct 10, 2017 at 11:13:22AM -0400, Doug Ledford wrote: > On Tue, 2017-10-10 at 18:00 +0300, Leon Romanovsky wrote: > > On Tue, Oct 10, 2017 at 10:34:19AM -0400, Doug Ledford wrote: > > > On 10/10/2017 12:14 AM, Leon Romanovsky wrote: > > > > > > Doug, > > But my question was "when" and not "how". When should I set this > > parameter to true? > > Whenever you want. When do you set the maximum TCP window size to 4MB? > If you have an issue with things being one way, you try another. > Things like this are system specific and often there is no universal > answer, especially since the answer here could very easily depend on > things like how many targets you are connecting to, how fast those > targets are, etc. Bart clearly mentioned disadvantages of XRQ and left me wonder why user needs to enable it anyway. This is what I'm asking and this is what I'm hoping to see in the commit message. I know little about SRP and by asking the questions, I'm trying to fill the gaps. > > > > > > > > In the commit message, you mentioned disadvantages of using SRQ > > > > is a > > > > default and among them - locks contention, which can be changed > > > > in the > > > > future. Won't it mean that users stuck with current default, > > > > because > > > > change of default will "break" their scripts? > > > > > > No, it won't. If you change the default, you don't remove the > > > variable, > > > you just change what its setting is. Then existing modprobe.d > > > files > > > become redundant, but nothing breaks. People that don't want the > > > new > > > setting add a new file to the modprobe.d directory to change the > > > option. > > > > Not accurate, now I won't set any parameter because I'm relying on > > the > > fact that the default is without SRQ. Once the default will be > > changed, > > it will break my assumption. > > So? Your assumptions are never a reason to prevent ongoing > development. By this argument, we would never have been able to turn > on SCSI Multi-queue by default because when it was first added to the > kernel it was off by default and people might have made an assumption > that we would later break when we turned it on by default. I'm not fully understand the example, The transition to MQ was long process and one of the difficulties was requirement to smoothly transit all devices in such way that users won't be affected. This is why it was disabled by default. Thanks > > -- > Doug Ledford <dledford@redhat.com> > GPG KeyID: B826A3330E572FDD > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD >
On Tue, 2017-10-10 at 18:44 +0300, Leon Romanovsky wrote: > Bart clearly mentioned disadvantages of XRQ and left me wonder why user > needs to enable it anyway. This is what I'm asking and this is what I'm > hoping to see in the commit message. I assume that you meant SRQ instead of XRQ? For HCA's that support SRQ a choice has to be made between the lower memory usage of SRQ or the higher performance of RC. I think only the user can make that choice. Hence the new kernel module parameter. Bart.
On Tue, 2017-10-10 at 18:44 +0300, Leon Romanovsky wrote: > On Tue, Oct 10, 2017 at 11:13:22AM -0400, Doug Ledford wrote: > > On Tue, 2017-10-10 at 18:00 +0300, Leon Romanovsky wrote: > > > On Tue, Oct 10, 2017 at 10:34:19AM -0400, Doug Ledford wrote: > > > > On 10/10/2017 12:14 AM, Leon Romanovsky wrote: > > > > > > > > > > > Doug, > > > But my question was "when" and not "how". When should I set this > > > parameter to true? > > > > Whenever you want. When do you set the maximum TCP window size to > > 4MB? > > If you have an issue with things being one way, you try another. > > Things like this are system specific and often there is no > > universal > > answer, especially since the answer here could very easily depend > > on > > things like how many targets you are connecting to, how fast those > > targets are, etc. > > Bart clearly mentioned disadvantages of XRQ and left me wonder why > user > needs to enable it anyway. This is what I'm asking and this is what > I'm > hoping to see in the commit message. That's more than a commit message needs to be IMO. > I know little about SRP and by asking the questions, I'm trying to > fill > the gaps. It's not really SRP related. As Bart points out in his next message, it's a tradeoff between lower resource consumption and higher contention. It's a general tradeoff that is common to RDMA. > > > > > > > > > > > In the commit message, you mentioned disadvantages of using > > > > > SRQ > > > > > is a > > > > > default and among them - locks contention, which can be > > > > > changed > > > > > in the > > > > > future. Won't it mean that users stuck with current default, > > > > > because > > > > > change of default will "break" their scripts? > > > > > > > > No, it won't. If you change the default, you don't remove the > > > > variable, > > > > you just change what its setting is. Then existing modprobe.d > > > > files > > > > become redundant, but nothing breaks. People that don't want > > > > the > > > > new > > > > setting add a new file to the modprobe.d directory to change > > > > the > > > > option. > > > > > > Not accurate, now I won't set any parameter because I'm relying > > > on > > > the > > > fact that the default is without SRQ. Once the default will be > > > changed, > > > it will break my assumption. > > > > So? Your assumptions are never a reason to prevent ongoing > > development. By this argument, we would never have been able to > > turn > > on SCSI Multi-queue by default because when it was first added to > > the > > kernel it was off by default and people might have made an > > assumption > > that we would later break when we turned it on by default. > > I'm not fully understand the example, The transition to MQ was long > process and one of the difficulties was requirement to smoothly > transit > all devices in such way that users won't be affected. The difficulty was making sure it was robust. > This is why it was > disabled by default. It was off by default because they wanted it to get testing and bake time before enabling it by default. However, the reason they did it is really immaterial to my point: we can change defaults and that is not considered "breaking user's assumptions". The only way we would break user's system in this specific situation is if we removed the module option. That would break working setups that had a modprobe.d file to set the variable. Changing what the variable is by default doesn't do that. > Thanks > > > > > -- > > Doug Ledford <dledford@redhat.com> > > GPG KeyID: B826A3330E572FDD > > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 > > 2FDD > >
On Tue, Oct 10, 2017 at 04:04:34PM +0000, Bart Van Assche wrote: > On Tue, 2017-10-10 at 18:44 +0300, Leon Romanovsky wrote: > > Bart clearly mentioned disadvantages of XRQ and left me wonder why user > > needs to enable it anyway. This is what I'm asking and this is what I'm > > hoping to see in the commit message. > > I assume that you meant SRQ instead of XRQ? For HCA's that support SRQ a > choice has to be made between the lower memory usage of SRQ or the higher > performance of RC. I think only the user can make that choice. Hence the > new kernel module parameter. considering our general dislike of module parameters, could you achieve this via some communicationfrom srp_daemon instead? Perhaps even on a per target basis? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2017-10-10 at 11:04 -0600, Jason Gunthorpe wrote: > considering our general dislike of module parameters, could you > achieve this via some communication from srp_daemon instead? > > Perhaps even on a per target basis? The SRP target driver already creates one /sys/kernel/config/target/srpt/$GUID directory per HCA port. How about adding an attribute to that configfs directory? Bart.
On Tue, Oct 10, 2017 at 05:10:55PM +0000, Bart Van Assche wrote: > On Tue, 2017-10-10 at 11:04 -0600, Jason Gunthorpe wrote: > > considering our general dislike of module parameters, could you > > achieve this via some communication from srp_daemon instead? > > > > Perhaps even on a per target basis? > > The SRP target driver already creates one /sys/kernel/config/target/srpt/$GUID > directory per HCA port. How about adding an attribute to that configfs directory? Sounds good. Thanks > > Bart.
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c index 6cf95ad870cc..a21b7a96635c 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.c +++ b/drivers/infiniband/ulp/srpt/ib_srpt.c @@ -75,11 +75,19 @@ module_param(srp_max_req_size, int, 0444); MODULE_PARM_DESC(srp_max_req_size, "Maximum size of SRP request messages in bytes."); +static bool use_srq; +module_param(use_srq, bool, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(use_srq, "Whether or not to use SRQ"); + static int srpt_srq_size = DEFAULT_SRPT_SRQ_SIZE; module_param(srpt_srq_size, int, 0444); MODULE_PARM_DESC(srpt_srq_size, "Shared receive queue (SRQ) size."); +static int srpt_sq_size = DEF_SRPT_SQ_SIZE; +module_param(srpt_sq_size, int, 0444); +MODULE_PARM_DESC(srpt_sq_size, "Per-channel send queue (SQ) size."); + static int srpt_get_u64_x(char *buffer, struct kernel_param *kp) { return sprintf(buffer, "0x%016llx", *(u64 *)kp->arg); @@ -295,6 +303,7 @@ static void srpt_get_ioc(struct srpt_port *sport, u32 slot, { struct srpt_device *sdev = sport->sdev; struct ib_dm_ioc_profile *iocp; + int send_queue_depth; iocp = (struct ib_dm_ioc_profile *)mad->data; @@ -310,6 +319,12 @@ static void srpt_get_ioc(struct srpt_port *sport, u32 slot, return; } + if (sdev->use_srq) + send_queue_depth = sdev->srq_size; + else + send_queue_depth = min(SRPT_RQ_SIZE, + sdev->device->attrs.max_qp_wr); + memset(iocp, 0, sizeof(*iocp)); strcpy(iocp->id_string, SRPT_ID_STRING); iocp->guid = cpu_to_be64(srpt_service_guid); @@ -322,7 +337,7 @@ static void srpt_get_ioc(struct srpt_port *sport, u32 slot, iocp->io_subclass = cpu_to_be16(SRP_IO_SUBCLASS); iocp->protocol = cpu_to_be16(SRP_PROTOCOL); iocp->protocol_version = cpu_to_be16(SRP_PROTOCOL_VERSION); - iocp->send_queue_depth = cpu_to_be16(sdev->srq_size); + iocp->send_queue_depth = cpu_to_be16(send_queue_depth); iocp->rdma_read_depth = 4; iocp->send_size = cpu_to_be32(srp_max_req_size); iocp->rdma_size = cpu_to_be32(min(sport->port_attrib.srp_max_rdma_size, @@ -686,6 +701,9 @@ static void srpt_free_ioctx_ring(struct srpt_ioctx **ioctx_ring, { int i; + if (!ioctx_ring) + return; + for (i = 0; i < ring_size; ++i) srpt_free_ioctx(sdev, ioctx_ring[i], dma_size, dir); kfree(ioctx_ring); @@ -757,7 +775,7 @@ static bool srpt_test_and_set_cmd_state(struct srpt_send_ioctx *ioctx, /** * srpt_post_recv() - Post an IB receive request. */ -static int srpt_post_recv(struct srpt_device *sdev, +static int srpt_post_recv(struct srpt_device *sdev, struct srpt_rdma_ch *ch, struct srpt_recv_ioctx *ioctx) { struct ib_sge list; @@ -774,7 +792,10 @@ static int srpt_post_recv(struct srpt_device *sdev, wr.sg_list = &list; wr.num_sge = 1; - return ib_post_srq_recv(sdev->srq, &wr, &bad_wr); + if (sdev->use_srq) + return ib_post_srq_recv(sdev->srq, &wr, &bad_wr); + else + return ib_post_recv(ch->qp, &wr, &bad_wr); } /** @@ -1517,7 +1538,7 @@ static void srpt_handle_new_iu(struct srpt_rdma_ch *ch, break; } - srpt_post_recv(ch->sport->sdev, recv_ioctx); + srpt_post_recv(ch->sport->sdev, ch, recv_ioctx); return; out_wait: @@ -1616,7 +1637,7 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch) struct srpt_device *sdev = sport->sdev; const struct ib_device_attr *attrs = &sdev->device->attrs; u32 srp_sq_size = sport->port_attrib.srp_sq_size; - int ret; + int i, ret; WARN_ON(ch->rq_size < 1); @@ -1640,7 +1661,6 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch) = (void(*)(struct ib_event *, void*))srpt_qp_event; qp_init->send_cq = ch->cq; qp_init->recv_cq = ch->cq; - qp_init->srq = sdev->srq; qp_init->sq_sig_type = IB_SIGNAL_REQ_WR; qp_init->qp_type = IB_QPT_RC; /* @@ -1654,6 +1674,12 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch) qp_init->cap.max_rdma_ctxs = srp_sq_size / 2; qp_init->cap.max_send_sge = min(attrs->max_sge, SRPT_MAX_SG_PER_WQE); qp_init->port_num = ch->sport->port; + if (sdev->use_srq) { + qp_init->srq = sdev->srq; + } else { + qp_init->cap.max_recv_wr = ch->rq_size; + qp_init->cap.max_recv_sge = qp_init->cap.max_send_sge; + } ch->qp = ib_create_qp(sdev->pd, qp_init); if (IS_ERR(ch->qp)) { @@ -1669,6 +1695,10 @@ static int srpt_create_ch_ib(struct srpt_rdma_ch *ch) goto err_destroy_cq; } + if (!sdev->use_srq) + for (i = 0; i < ch->rq_size; i++) + srpt_post_recv(sdev, ch, ch->ioctx_recv_ring[i]); + atomic_set(&ch->sq_wr_avail, qp_init->cap.max_send_wr); pr_debug("%s: max_cqe= %d max_sge= %d sq_size = %d cm_id= %p\n", @@ -1818,6 +1848,10 @@ static void srpt_release_channel_work(struct work_struct *w) ch->sport->sdev, ch->rq_size, ch->rsp_size, DMA_TO_DEVICE); + srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_recv_ring, + sdev, ch->rq_size, + srp_max_req_size, DMA_FROM_DEVICE); + mutex_lock(&sdev->mutex); list_del_init(&ch->list); if (ch->release_done) @@ -1975,6 +2009,19 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, ch->ioctx_ring[i]->ch = ch; list_add_tail(&ch->ioctx_ring[i]->free_list, &ch->free_list); } + if (!sdev->use_srq) { + ch->ioctx_recv_ring = (struct srpt_recv_ioctx **) + srpt_alloc_ioctx_ring(ch->sport->sdev, ch->rq_size, + sizeof(*ch->ioctx_recv_ring[0]), + srp_max_req_size, + DMA_FROM_DEVICE); + if (!ch->ioctx_recv_ring) { + pr_err("rejected SRP_LOGIN_REQ because creating a new QP RQ ring failed.\n"); + rej->reason = + cpu_to_be32(SRP_LOGIN_REJ_INSUFFICIENT_RESOURCES); + goto free_ring; + } + } ret = srpt_create_ch_ib(ch); if (ret) { @@ -1982,7 +2029,7 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, SRP_LOGIN_REJ_INSUFFICIENT_RESOURCES); pr_err("rejected SRP_LOGIN_REQ because creating" " a new RDMA channel failed.\n"); - goto free_ring; + goto free_recv_ring; } ret = srpt_ch_qp_rtr(ch, ch->qp); @@ -2073,6 +2120,11 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id, destroy_ib: srpt_destroy_ch_ib(ch); +free_recv_ring: + srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_recv_ring, + ch->sport->sdev, ch->rq_size, + srp_max_req_size, DMA_FROM_DEVICE); + free_ring: srpt_free_ioctx_ring((struct srpt_ioctx **)ch->ioctx_ring, ch->sport->sdev, ch->rq_size, @@ -2502,20 +2554,38 @@ static void srpt_add_one(struct ib_device *device) srq_attr.attr.srq_limit = 0; srq_attr.srq_type = IB_SRQT_BASIC; - sdev->srq = ib_create_srq(sdev->pd, &srq_attr); - if (IS_ERR(sdev->srq)) - goto err_pd; + sdev->srq = use_srq ? ib_create_srq(sdev->pd, &srq_attr) : + ERR_PTR(-ENOTSUPP); + if (IS_ERR(sdev->srq)) { + pr_debug("ib_create_srq() failed: %ld\n", PTR_ERR(sdev->srq)); + + /* SRQ not supported. */ + sdev->use_srq = false; + } else { + pr_debug("create SRQ #wr= %d max_allow=%d dev= %s\n", + sdev->srq_size, sdev->device->attrs.max_srq_wr, + device->name); - pr_debug("%s: create SRQ #wr= %d max_allow=%d dev= %s\n", - __func__, sdev->srq_size, sdev->device->attrs.max_srq_wr, - device->name); + sdev->use_srq = true; + + sdev->ioctx_ring = (struct srpt_recv_ioctx **) + srpt_alloc_ioctx_ring(sdev, sdev->srq_size, + sizeof(*sdev->ioctx_ring[0]), + srp_max_req_size, + DMA_FROM_DEVICE); + if (!sdev->ioctx_ring) + goto err_pd; + + for (i = 0; i < sdev->srq_size; ++i) + srpt_post_recv(sdev, NULL, sdev->ioctx_ring[i]); + } if (!srpt_service_guid) srpt_service_guid = be64_to_cpu(device->node_guid); sdev->cm_id = ib_create_cm_id(device, srpt_cm_handler, sdev); if (IS_ERR(sdev->cm_id)) - goto err_srq; + goto err_ring; /* print out target login information */ pr_debug("Target login info: id_ext=%016llx,ioc_guid=%016llx," @@ -2535,16 +2605,6 @@ static void srpt_add_one(struct ib_device *device) srpt_event_handler); ib_register_event_handler(&sdev->event_handler); - sdev->ioctx_ring = (struct srpt_recv_ioctx **) - srpt_alloc_ioctx_ring(sdev, sdev->srq_size, - sizeof(*sdev->ioctx_ring[0]), - srp_max_req_size, DMA_FROM_DEVICE); - if (!sdev->ioctx_ring) - goto err_event; - - for (i = 0; i < sdev->srq_size; ++i) - srpt_post_recv(sdev, sdev->ioctx_ring[i]); - WARN_ON(sdev->device->phys_port_cnt > ARRAY_SIZE(sdev->port)); for (i = 1; i <= sdev->device->phys_port_cnt; i++) { @@ -2559,7 +2619,7 @@ static void srpt_add_one(struct ib_device *device) if (srpt_refresh_port(sport)) { pr_err("MAD registration failed for %s-%d.\n", sdev->device->name, i); - goto err_ring; + goto err_event; } } @@ -2572,16 +2632,16 @@ static void srpt_add_one(struct ib_device *device) pr_debug("added %s.\n", device->name); return; -err_ring: - srpt_free_ioctx_ring((struct srpt_ioctx **)sdev->ioctx_ring, sdev, - sdev->srq_size, srp_max_req_size, - DMA_FROM_DEVICE); err_event: ib_unregister_event_handler(&sdev->event_handler); err_cm: ib_destroy_cm_id(sdev->cm_id); -err_srq: - ib_destroy_srq(sdev->srq); +err_ring: + if (sdev->use_srq) + ib_destroy_srq(sdev->srq); + srpt_free_ioctx_ring((struct srpt_ioctx **)sdev->ioctx_ring, sdev, + sdev->srq_size, srp_max_req_size, + DMA_FROM_DEVICE); err_pd: ib_dealloc_pd(sdev->pd); free_dev: @@ -2625,12 +2685,12 @@ static void srpt_remove_one(struct ib_device *device, void *client_data) spin_unlock(&srpt_dev_lock); srpt_release_sdev(sdev); - ib_destroy_srq(sdev->srq); - ib_dealloc_pd(sdev->pd); - + if (sdev->use_srq) + ib_destroy_srq(sdev->srq); srpt_free_ioctx_ring((struct srpt_ioctx **)sdev->ioctx_ring, sdev, sdev->srq_size, srp_max_req_size, DMA_FROM_DEVICE); - sdev->ioctx_ring = NULL; + ib_dealloc_pd(sdev->pd); + kfree(sdev); } diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h index 976e924d7400..e0dbd5444d5e 100644 --- a/drivers/infiniband/ulp/srpt/ib_srpt.h +++ b/drivers/infiniband/ulp/srpt/ib_srpt.h @@ -252,6 +252,7 @@ enum rdma_ch_state { * @free_list: Head of list with free send I/O contexts. * @state: channel state. See also enum rdma_ch_state. * @ioctx_ring: Send ring. + * @ioctx_recv_ring: Receive I/O context ring. * @list: Node for insertion in the srpt_device.rch_list list. * @cmd_wait_list: List of SCSI commands that arrived before the RTU event. This * list contains struct srpt_ioctx elements and is protected @@ -281,6 +282,7 @@ struct srpt_rdma_ch { struct list_head free_list; enum rdma_ch_state state; struct srpt_send_ioctx **ioctx_ring; + struct srpt_recv_ioctx **ioctx_recv_ring; struct list_head list; struct list_head cmd_wait_list; struct se_session *sess; @@ -347,6 +349,7 @@ struct srpt_port { * @srq: Per-HCA SRQ (shared receive queue). * @cm_id: Connection identifier. * @srq_size: SRQ size. + * @use_srq: Whether or not to use SRQ. * @ioctx_ring: Per-HCA SRQ. * @rch_list: Per-device channel list -- see also srpt_rdma_ch.list. * @ch_releaseQ: Enables waiting for removal from rch_list. @@ -362,6 +365,7 @@ struct srpt_device { struct ib_srq *srq; struct ib_cm_id *cm_id; int srq_size; + bool use_srq; struct srpt_recv_ioctx **ioctx_ring; struct list_head rch_list; wait_queue_head_t ch_releaseQ;
Although the non-SRQ mode needs more resources that mode has three advantages over SRQ: - It works with all RDMA adapters, even those that do not support SRQ. - Posting WRs and polling WCs does not trigger lock contention because only one thread at a time accesses a WR or WC queue in non-SRQ mode. - The end-to-end flow control mechanism is used. From the IB spec: C9-150.2.1: For QPs that are not associated with an SRQ, each HCA receive queue shall generate end-to-end flow control credits. If a QP is associated with an SRQ, the HCA receive queue shall not generate end-to-end flow control credits. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> --- drivers/infiniband/ulp/srpt/ib_srpt.c | 130 +++++++++++++++++++++++++--------- drivers/infiniband/ulp/srpt/ib_srpt.h | 4 ++ 2 files changed, 99 insertions(+), 35 deletions(-)