Message ID | 1519881025.363156294.1491837154312.JavaMail.zimbra@kalray.eu (mailing list archive)
---|---
State | Superseded
> +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
> +{
> +	int sig_limit;
> +
> +	/* We signal completion every queue depth/2 and also
> +	 * handle the case of possible device with queue_depth=1,
> +	 * where we would need to signal every message.
> +	 */
> +	sig_limit = max(queue->queue_size / 2, 1);
> +	return (++queue->sig_count % sig_limit) == 0;
> +}

I would love if we could do this at the ib_qp level. I also know
you found a real bug and we want it fixed ASAP, so maybe for now
we should merge this one:

Reviewed-by: Christoph Hellwig <hch@lst.de>

and then do the generic version. Marta, do you want to tackle the
generic version, or should I look into this once I get a little time?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
>> +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
>> +{
>> +	int sig_limit;
>> +
>> +	/* We signal completion every queue depth/2 and also
>> +	 * handle the case of possible device with queue_depth=1,
>> +	 * where we would need to signal every message.
>> +	 */
>> +	sig_limit = max(queue->queue_size / 2, 1);
>> +	return (++queue->sig_count % sig_limit) == 0;
>> +}
>
> I would love if we could do this at the ib_qp level. I also know
> you found a real bug and we want it fixed ASAP, so maybe for now
> we should merge this one:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> and then do the generic version. Marta, do you want to tackle the
> generic version or should I look into this once I get a little time.

I can look into whether I can do the same thing at the ib_qp level. It
would be great to have other people's feedback on this one before it is
merged. We're testing this one now, so shall we see what I can do in a
week or so? Is that fine for you?

Marta
On Mon, 2017-04-10 at 17:16 +0200, Christoph Hellwig wrote:
> > +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
> > +{
> > +	int sig_limit;
> > +
> > +	/* We signal completion every queue depth/2 and also
> > +	 * handle the case of possible device with queue_depth=1,
> > +	 * where we would need to signal every message.
> > +	 */
> > +	sig_limit = max(queue->queue_size / 2, 1);
> > +	return (++queue->sig_count % sig_limit) == 0;
> > +}
>
> I would love if we could do this at the ib_qp level.

Hello Christoph,

Please keep in mind that some but not all RDMA drivers need a construct
like this.

Bart.
On Mon, Apr 10, 2017 at 03:27:14PM +0000, Bart Van Assche wrote:
> > I would love if we could do this at the ib_qp level.
>
> Hello Christoph,
>
> Please keep in mind that some but not all RDMA drivers need a construct like
> this.

Sure, I don't want to force anyone to use it. But I'd like to have a
generic helper that everyone can use instead of reinventing the wheel.
E.g. at least iser will need the same fix, and many drivers would
benefit from not always signalling completions even if they don't have
the ability yet.
On Mon, 2017-04-10 at 17:12 +0200, Marta Rybczynska wrote:
> In the case of small NVMe-oF queue size (<32) we may enter
> a deadlock caused by the fact that the IB completions aren't sent
> waiting for 32 and the send queue will fill up.
>
> The error is seen as (using mlx5):
> [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
> [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
>
> This patch changes the way the signaling is done so
> that it depends on the queue depth now. The magic define has
> been removed completely.
>
> Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
> Signed-off-by: Samuel Jones <sjones@kalray.eu>
> ---
> Changes from v1:
> * signal by queue size/2, remove hardcoded 32
> * support queue depth of 1
>
>  drivers/nvme/host/rdma.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 47a479f..4de1b92 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1029,6 +1029,18 @@ static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
>  		nvme_rdma_wr_error(cq, wc, "SEND");
>  }
>
> +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
> +{
> +	int sig_limit;
> +
> +	/* We signal completion every queue depth/2 and also
> +	 * handle the case of possible device with queue_depth=1,
> +	 * where we would need to signal every message.
> +	 */
> +	sig_limit = max(queue->queue_size / 2, 1);
> +	return (++queue->sig_count % sig_limit) == 0;
> +}
> +
>  static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>  		struct nvme_rdma_qe *qe, struct ib_sge *sge, u32 num_sge,
>  		struct ib_send_wr *first, bool flush)
> @@ -1056,9 +1068,6 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>  	 * Would have been way to obvious to handle this in hardware or
>  	 * at least the RDMA stack..
>  	 *
> -	 * This messy and racy code sniplet is copy and pasted from the iSER
> -	 * initiator, and the magic '32' comes from there as well.
> -	 *
>  	 * Always signal the flushes. The magic request used for the flush
>  	 * sequencer is not allocated in our driver's tagset and it's
>  	 * triggered to be freed by blk_cleanup_queue(). So we need to
> @@ -1066,7 +1075,7 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>  	 * embedded in request's payload, is not freed when __ib_process_cq()
>  	 * calls wr_cqe->done().
>  	 */
> -	if ((++queue->sig_count % 32) == 0 || flush)
> +	if (nvme_rdma_queue_sig_limit(queue) || flush)
>  		wr.send_flags |= IB_SEND_SIGNALED;
>
>  	if (first)

Hello Marta,

The approach of this patch is suboptimal from a performance point of
view. If the number of WRs submitted since the last signaled WR were
tracked in a member variable, that would allow getting rid of the
(relatively slow) division operation.

Bart.
> On Mon, 2017-04-10 at 17:12 +0200, Marta Rybczynska wrote:
>> In the case of small NVMe-oF queue size (<32) we may enter
>> a deadlock caused by the fact that the IB completions aren't sent
>> waiting for 32 and the send queue will fill up.
>>
>> The error is seen as (using mlx5):
>> [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
>> [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
>>
>> This patch changes the way the signaling is done so
>> that it depends on the queue depth now. The magic define has
>> been removed completely.
>>
>> Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
>> Signed-off-by: Samuel Jones <sjones@kalray.eu>
>> ---
>> Changes from v1:
>> * signal by queue size/2, remove hardcoded 32
>> * support queue depth of 1
>>
>>  drivers/nvme/host/rdma.c | 17 +++++++++++++----
>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index 47a479f..4de1b92 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -1029,6 +1029,18 @@ static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
>>  		nvme_rdma_wr_error(cq, wc, "SEND");
>>  }
>>
>> +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
>> +{
>> +	int sig_limit;
>> +
>> +	/* We signal completion every queue depth/2 and also
>> +	 * handle the case of possible device with queue_depth=1,
>> +	 * where we would need to signal every message.
>> +	 */
>> +	sig_limit = max(queue->queue_size / 2, 1);
>> +	return (++queue->sig_count % sig_limit) == 0;
>> +}
>> +
>>  static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>  		struct nvme_rdma_qe *qe, struct ib_sge *sge, u32 num_sge,
>>  		struct ib_send_wr *first, bool flush)
>> @@ -1056,9 +1068,6 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>  	 * Would have been way to obvious to handle this in hardware or
>>  	 * at least the RDMA stack..
>>  	 *
>> -	 * This messy and racy code sniplet is copy and pasted from the iSER
>> -	 * initiator, and the magic '32' comes from there as well.
>> -	 *
>>  	 * Always signal the flushes. The magic request used for the flush
>>  	 * sequencer is not allocated in our driver's tagset and it's
>>  	 * triggered to be freed by blk_cleanup_queue(). So we need to
>> @@ -1066,7 +1075,7 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>  	 * embedded in request's payload, is not freed when __ib_process_cq()
>>  	 * calls wr_cqe->done().
>>  	 */
>> -	if ((++queue->sig_count % 32) == 0 || flush)
>> +	if (nvme_rdma_queue_sig_limit(queue) || flush)
>>  		wr.send_flags |= IB_SEND_SIGNALED;
>>
>>  	if (first)
>
> Hello Marta,
>
> The approach of this patch is suboptimal from a performance point of
> view. If the number of WRs submitted since the last signaled WR were
> tracked in a member variable, that would allow getting rid of the
> (relatively slow) division operation.

Hello Bart,

I think that we can remove the division (the modulo by sig_limit). The
sig_count is a u8, so it is already the type of variable you propose.
It doesn't seem to be used anywhere else, so we can change the way it
is used in the snippet to count up to the signaling moment. It will
give something like:

static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
{
	int sig_limit;

	/* We signal completion every queue depth/2 and also
	 * handle the case of possible device with queue_depth=1,
	 * where we would need to signal every message.
	 */
	sig_limit = max(queue->queue_size / 2, 1);
	queue->sig_count++;
	if (queue->sig_count < sig_limit)
		return 0;
	queue->sig_count = 0;
	return 1;
}

Do you like it better?

Marta
On 4/11/2017 11:52 AM, Marta Rybczynska wrote:
>> On Mon, 2017-04-10 at 17:12 +0200, Marta Rybczynska wrote:
>>> In the case of small NVMe-oF queue size (<32) we may enter
>>> a deadlock caused by the fact that the IB completions aren't sent
>>> waiting for 32 and the send queue will fill up.
>>>
>>> The error is seen as (using mlx5):
>>> [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
>>> [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
>>>
>>> This patch changes the way the signaling is done so
>>> that it depends on the queue depth now. The magic define has
>>> been removed completely.
>>>
>>> Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
>>> Signed-off-by: Samuel Jones <sjones@kalray.eu>
>>> ---
>>> Changes from v1:
>>> * signal by queue size/2, remove hardcoded 32
>>> * support queue depth of 1
>>>
>>>  drivers/nvme/host/rdma.c | 17 +++++++++++++----
>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>> index 47a479f..4de1b92 100644
>>> --- a/drivers/nvme/host/rdma.c
>>> +++ b/drivers/nvme/host/rdma.c
>>> @@ -1029,6 +1029,18 @@ static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
>>>  		nvme_rdma_wr_error(cq, wc, "SEND");
>>>  }
>>>
>>> +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
>>> +{
>>> +	int sig_limit;
>>> +
>>> +	/* We signal completion every queue depth/2 and also
>>> +	 * handle the case of possible device with queue_depth=1,
>>> +	 * where we would need to signal every message.
>>> +	 */
>>> +	sig_limit = max(queue->queue_size / 2, 1);
>>> +	return (++queue->sig_count % sig_limit) == 0;
>>> +}
>>> +
>>>  static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>>  		struct nvme_rdma_qe *qe, struct ib_sge *sge, u32 num_sge,
>>>  		struct ib_send_wr *first, bool flush)
>>> @@ -1056,9 +1068,6 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>>  	 * Would have been way to obvious to handle this in hardware or
>>>  	 * at least the RDMA stack..
>>>  	 *
>>> -	 * This messy and racy code sniplet is copy and pasted from the iSER
>>> -	 * initiator, and the magic '32' comes from there as well.
>>> -	 *
>>>  	 * Always signal the flushes. The magic request used for the flush
>>>  	 * sequencer is not allocated in our driver's tagset and it's
>>>  	 * triggered to be freed by blk_cleanup_queue(). So we need to
>>> @@ -1066,7 +1075,7 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>>  	 * embedded in request's payload, is not freed when __ib_process_cq()
>>>  	 * calls wr_cqe->done().
>>>  	 */
>>> -	if ((++queue->sig_count % 32) == 0 || flush)
>>> +	if (nvme_rdma_queue_sig_limit(queue) || flush)
>>>  		wr.send_flags |= IB_SEND_SIGNALED;
>>>
>>>  	if (first)
>>
>> Hello Marta,
>>
>> The approach of this patch is suboptimal from a performance point of
>> view. If the number of WRs submitted since the last signaled WR were
>> tracked in a member variable, that would allow getting rid of the
>> (relatively slow) division operation.
>>
>
> Hello Bart,
>
> I think that we can remove the division (the modulo by sig_limit). The
> sig_count is a u8, so it is already the type of variable you propose.
> It doesn't seem to be used anywhere else, so we can change the way it
> is used in the snippet to count up to the signaling moment. It will
> give something like:
>
> static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
> {
> 	int sig_limit;
>
> 	/* We signal completion every queue depth/2 and also
> 	 * handle the case of possible device with queue_depth=1,
> 	 * where we would need to signal every message.
> 	 */
> 	sig_limit = max(queue->queue_size / 2, 1);
> 	queue->sig_count++;
> 	if (queue->sig_count < sig_limit)
> 		return 0;
> 	queue->sig_count = 0;
> 	return 1;
> }
>
> Do you like it better?

Hi Marta,

I think that Bart (and I agree) meant to avoid doing the division in
the fast path. You can add a new variable called sig_limit to
rdma_ctrl and set it once during the initialization stage. Then just
do:

	if ((++queue->sig_count % queue->ctrl->sig_limit) == 0 || flush)

in the nvme_rdma_post_send function.

Also, as you mentioned, sig_count is a u8, so you need to avoid
setting sig_limit to a value bigger than 255. So please take the
minimum of 32 (or any other suggestion?) and the result of
max(queue->queue_size / 2, 1).

thanks,
Max.

>
> Marta
> On 4/11/2017 11:52 AM, Marta Rybczynska wrote:
>>> On Mon, 2017-04-10 at 17:12 +0200, Marta Rybczynska wrote:
>>>> In the case of small NVMe-oF queue size (<32) we may enter
>>>> a deadlock caused by the fact that the IB completions aren't sent
>>>> waiting for 32 and the send queue will fill up.
>>>>
>>>> The error is seen as (using mlx5):
>>>> [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273):
>>>> [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12
>>>>
>>>> This patch changes the way the signaling is done so
>>>> that it depends on the queue depth now. The magic define has
>>>> been removed completely.
>>>>
>>>> Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
>>>> Signed-off-by: Samuel Jones <sjones@kalray.eu>
>>>> ---
>>>> Changes from v1:
>>>> * signal by queue size/2, remove hardcoded 32
>>>> * support queue depth of 1
>>>>
>>>>  drivers/nvme/host/rdma.c | 17 +++++++++++++----
>>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>>> index 47a479f..4de1b92 100644
>>>> --- a/drivers/nvme/host/rdma.c
>>>> +++ b/drivers/nvme/host/rdma.c
>>>> @@ -1029,6 +1029,18 @@ static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
>>>>  		nvme_rdma_wr_error(cq, wc, "SEND");
>>>>  }
>>>>
>>>> +static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
>>>> +{
>>>> +	int sig_limit;
>>>> +
>>>> +	/* We signal completion every queue depth/2 and also
>>>> +	 * handle the case of possible device with queue_depth=1,
>>>> +	 * where we would need to signal every message.
>>>> +	 */
>>>> +	sig_limit = max(queue->queue_size / 2, 1);
>>>> +	return (++queue->sig_count % sig_limit) == 0;
>>>> +}
>>>> +
>>>>  static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>>>  		struct nvme_rdma_qe *qe, struct ib_sge *sge, u32 num_sge,
>>>>  		struct ib_send_wr *first, bool flush)
>>>> @@ -1056,9 +1068,6 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>>>  	 * Would have been way to obvious to handle this in hardware or
>>>>  	 * at least the RDMA stack..
>>>>  	 *
>>>> -	 * This messy and racy code sniplet is copy and pasted from the iSER
>>>> -	 * initiator, and the magic '32' comes from there as well.
>>>> -	 *
>>>>  	 * Always signal the flushes. The magic request used for the flush
>>>>  	 * sequencer is not allocated in our driver's tagset and it's
>>>>  	 * triggered to be freed by blk_cleanup_queue(). So we need to
>>>> @@ -1066,7 +1075,7 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
>>>>  	 * embedded in request's payload, is not freed when __ib_process_cq()
>>>>  	 * calls wr_cqe->done().
>>>>  	 */
>>>> -	if ((++queue->sig_count % 32) == 0 || flush)
>>>> +	if (nvme_rdma_queue_sig_limit(queue) || flush)
>>>>  		wr.send_flags |= IB_SEND_SIGNALED;
>>>>
>>>>  	if (first)
>>>
>>> Hello Marta,
>>>
>>> The approach of this patch is suboptimal from a performance point of
>>> view. If the number of WRs submitted since the last signaled WR were
>>> tracked in a member variable, that would allow getting rid of the
>>> (relatively slow) division operation.
>>>
>>
>> Hello Bart,
>>
>> I think that we can remove the division (the modulo by sig_limit). The
>> sig_count is a u8, so it is already the type of variable you propose.
>> It doesn't seem to be used anywhere else, so we can change the way it
>> is used in the snippet to count up to the signaling moment. It will
>> give something like:
>>
>> static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
>> {
>> 	int sig_limit;
>>
>> 	/* We signal completion every queue depth/2 and also
>> 	 * handle the case of possible device with queue_depth=1,
>> 	 * where we would need to signal every message.
>> 	 */
>> 	sig_limit = max(queue->queue_size / 2, 1);
>> 	queue->sig_count++;
>> 	if (queue->sig_count < sig_limit)
>> 		return 0;
>> 	queue->sig_count = 0;
>> 	return 1;
>> }
>>
>> Do you like it better?
>
> Hi Marta,
>
> I think that Bart (and I agree) meant to avoid doing the division in
> the fast path. You can add a new variable called sig_limit to
> rdma_ctrl and set it once during the initialization stage. Then just
> do:
>
> 	if ((++queue->sig_count % queue->ctrl->sig_limit) == 0 || flush)
>
> in the nvme_rdma_post_send function.
>
> Also, as you mentioned, sig_count is a u8, so you need to avoid
> setting sig_limit to a value bigger than 255. So please take the
> minimum of 32 (or any other suggestion?) and the result of
> max(queue->queue_size / 2, 1).
>

Hello Max,

Do you mean the division by 2 or the modulo? I would say that the
division by two should compile to a shift operation on most
architectures, shouldn't it?

Marta
On Tue, 2017-04-11 at 10:52 +0200, Marta Rybczynska wrote:
> I think that we can remove the division (the modulo by sig_limit). The
> sig_count is a u8, so it is already the type of variable you propose.
> It doesn't seem to be used anywhere else, so we can change the way it
> is used in the snippet to count up to the signaling moment. It will
> give something like:
>
> static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
> {
> 	int sig_limit;
>
> 	/* We signal completion every queue depth/2 and also
> 	 * handle the case of possible device with queue_depth=1,
> 	 * where we would need to signal every message.
> 	 */
> 	sig_limit = max(queue->queue_size / 2, 1);
> 	queue->sig_count++;
> 	if (queue->sig_count < sig_limit)
> 		return 0;
> 	queue->sig_count = 0;
> 	return 1;
> }

Hello Marta,

Thank you for having addressed my feedback. Although this is not
important, I think it is possible to optimize the above code further as
follows:
* Instead of initializing / resetting sig_count to zero, initialize it
  to sig_limit.
* Instead of incrementing sig_count in nvme_rdma_queue_sig_limit(),
  decrement it.
* Instead of comparing sig_count against sig_limit, compare it against
  zero.

Bart.
I've hand-applied this one with a few formatting tweaks now that
I got the ACK from Sagi. I'd love to see if someone could figure
out how we could move the improved atomic + ilog version into the
rdma core somehow.
> I've hand-applied this one with a few formatting tweaks now that
> I got the ACK from Sagi. I'd love to see if someone could figure
> out how we could move the improved atomic + ilog version into the
> rdma core somehow.

I'll submit the atomic+ilog version in nvme next week. I have no
access to the test systems this week.

Marta
>> I've hand-applied this one with a few formatting tweaks now that
>> I got the ACK from Sagi. I'd love to see if someone could figure
>> out how we could move the improved atomic + ilog version into the
>> rdma core somehow.
>
> I'll submit the atomic+ilog version in nvme next week. I have no
> access to the test systems this week.
>

I've just posted the new version, with a changed title as the v2 is
merged. The new thread name is:
[PATCH] nvme-rdma: remove race conditions from IB signalling

Marta
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 47a479f..4de1b92 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1029,6 +1029,18 @@ static void nvme_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
 		nvme_rdma_wr_error(cq, wc, "SEND");
 }
 
+static inline nvme_rdma_queue_sig_limit(struct nvme_rdma_queue *queue)
+{
+	int sig_limit;
+
+	/* We signal completion every queue depth/2 and also
+	 * handle the case of possible device with queue_depth=1,
+	 * where we would need to signal every message.
+	 */
+	sig_limit = max(queue->queue_size / 2, 1);
+	return (++queue->sig_count % sig_limit) == 0;
+}
+
 static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
 		struct nvme_rdma_qe *qe, struct ib_sge *sge, u32 num_sge,
 		struct ib_send_wr *first, bool flush)
@@ -1056,9 +1068,6 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
 	 * Would have been way to obvious to handle this in hardware or
 	 * at least the RDMA stack..
 	 *
-	 * This messy and racy code sniplet is copy and pasted from the iSER
-	 * initiator, and the magic '32' comes from there as well.
-	 *
 	 * Always signal the flushes. The magic request used for the flush
 	 * sequencer is not allocated in our driver's tagset and it's
 	 * triggered to be freed by blk_cleanup_queue(). So we need to
@@ -1066,7 +1075,7 @@ static int nvme_rdma_post_send(struct nvme_rdma_queue *queue,
 	 * embedded in request's payload, is not freed when __ib_process_cq()
 	 * calls wr_cqe->done().
 	 */
-	if ((++queue->sig_count % 32) == 0 || flush)
+	if (nvme_rdma_queue_sig_limit(queue) || flush)
 		wr.send_flags |= IB_SEND_SIGNALED;
 
 	if (first)