[libibverbs] XRC - Sample application issues

Message ID 1375974336-26314-1-git-send-email-yishaih@mellanox.com (mailing list archive)
State Rejected

Commit Message

Yishai Hadas Aug. 8, 2013, 3:05 p.m. UTC
Fix a synchronization issue when clients shut down: prevent the case where
a client misses a response from the daemon and then waits forever.

Fix typo in error message.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
---

This patch is on top of V9 of XRC series that was already sent.
It should be squashed into latest patch #7 named 'Add XRC sample application'.

Jay, Sean - please review.


 examples/xsrq_pingpong.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

Comments

Hefty, Sean Aug. 16, 2013, 3:10 p.m. UTC | #1
> @@ -884,6 +884,13 @@ int main(int argc, char *argv[])
>  	if (ctx.use_event)
>  		ibv_ack_cq_events(ctx.recv_cq, num_cq_events);
> 
> +	/* Process should wait before closing its resources to make sure
> +	  * latest daemon's response sent via its target QP destined to an XSRQ
> +	  * created by another client won't be lost.
> +	  * Failure to do so will cause the client to wait for that sent message forever.
> +	  * See comment on pp_post_send.
> +	*/
> +	sleep(1);

I dislike adding sleep calls into code.  Isn't there a more robust way to handle this?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yishai Hadas Aug. 18, 2013, 9:05 a.m. UTC | #2
On 8/16/2013 06:11 PM, Sean Hefty wrote:
>> @@ -884,6 +884,13 @@ int main(int argc, char *argv[])
>>   	if (ctx.use_event)
>>   		ibv_ack_cq_events(ctx.recv_cq, num_cq_events);
>>
>> +	/* Process should wait before closing its resources to make sure
>> +	  * latest daemon's response sent via its target QP destined to an XSRQ
>> +	  * created by another client won't be lost.
>> +	  * Failure to do so will cause the client to wait for that sent message forever.
>> +	  * See comment on pp_post_send.
>> +	*/
>> +	sleep(1);
> I dislike adding sleep calls into code.  Isn't there a more robust way to handle this?

     In general I agree that this sleep is a workaround for a synchronization hole in this sample application.
     For that reason I added a five-line comment describing the problem and the reason for the sleep statement.
     Long term, you could synchronize all the processes through the sockets mechanism to ensure they terminate only after all packets have been received.


Steve Wise Aug. 19, 2013, 3:27 p.m. UTC | #3
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of
> Hefty, Sean
> Sent: Friday, August 16, 2013 10:11 AM
> To: Yishai Hadas; linux-rdma@vger.kernel.org; roland@purestorage.com
> Cc: ogerlitz@mellanox.com; tzahio@mellanox.com; Sternberg, Jay E; Eli Cohen
> Subject: RE: [PATCH libibverbs] XRC - Sample application issues
> 
> > @@ -884,6 +884,13 @@ int main(int argc, char *argv[])
> >  	if (ctx.use_event)
> >  		ibv_ack_cq_events(ctx.recv_cq, num_cq_events);
> >
> > +	/* Process should wait before closing its resources to make sure
> > +	  * latest daemon's response sent via its target QP destined to an XSRQ
> > +	  * created by another client won't be lost.
> > +	  * Failure to do so will cause the client to wait for that sent message forever.
> > +	  * See comment on pp_post_send.
> > +	*/
> > +	sleep(1);
> 
> I dislike adding sleep calls into code.  Isn't there a more robust way to handle this?

Perhaps you could synchronize between the processes using the TCP socket used to exchange the QP info...

Steve.
 


Jason Gunthorpe Aug. 19, 2013, 7:12 p.m. UTC | #4
On Sun, Aug 18, 2013 at 12:05:48PM +0300, Yishai Hadas wrote:
> On 8/16/2013 06:11 PM, Sean Hefty wrote:
> >>@@ -884,6 +884,13 @@ int main(int argc, char *argv[])
> >>  	if (ctx.use_event)
> >>  		ibv_ack_cq_events(ctx.recv_cq, num_cq_events);
> >>
> >>+	/* Process should wait before closing its resources to make sure
> >>+	  * latest daemon's response sent via its target QP destined to an XSRQ
> >>+	  * created by another client won't be lost.
> >>+	  * Failure to do so will cause the client to wait for that sent message forever.
> >>+	  * See comment on pp_post_send.
> >>+	*/
> >>+	sleep(1);
> >I dislike adding sleep calls into code.  Isn't there a more robust way to handle this?
> 
> In general I agree this sleep is a workaround that comes to solve a
> synchronization hole in this sample application.  For that reason I
> put 5 lines comment to describe the problem and the reason for that
> sleep statement.  Long term you could think of synchronizing all the
> processes through the sockets mechanism to assure they terminate
> when all packets are received.

This example is very close to the only code that people will have
access to when trying to work with XRC.

It should be complete and correct under all cases. So no sleeps, IMHO.

Jason
Ido Shamai Aug. 20, 2013, 5:16 a.m. UTC | #5
On 8/19/2013 10:12 PM, Jason Gunthorpe wrote:
> On Sun, Aug 18, 2013 at 12:05:48PM +0300, Yishai Hadas wrote:
>> On 8/16/2013 06:11 PM, Sean Hefty wrote:
>>>> @@ -884,6 +884,13 @@ int main(int argc, char *argv[])
>>>>   	if (ctx.use_event)
>>>>   		ibv_ack_cq_events(ctx.recv_cq, num_cq_events);
>>>>
>>>> +	/* Process should wait before closing its resources to make sure
>>>> +	  * latest daemon's response sent via its target QP destined to an XSRQ
>>>> +	  * created by another client won't be lost.
>>>> +	  * Failure to do so will cause the client to wait for that sent message forever.
>>>> +	  * See comment on pp_post_send.
>>>> +	*/
>>>> +	sleep(1);
>>> I dislike adding sleep calls into code.  Isn't there a more robust way to handle this?
>> In general I agree this sleep is a workaround that comes to solve a
>> synchronization hole in this sample application.  For that reason I
>> put 5 lines comment to describe the problem and the reason for that
>> sleep statement.  Long term you could think of synchronizing all the
>> processes through the sockets mechanism to assure they terminate
>> when all packets are received.
> This example is very close to the only code that people will have
> access to when trying to work with XRC.
Latest perftest package also contains a use case of XRC.
No sleep.

Ido
> It should be complete and correct under all cases. So no sleeps, IMHO.
>
> Jason

Patch

diff --git a/examples/xsrq_pingpong.c b/examples/xsrq_pingpong.c
index 984740d..aceef2e 100644
--- a/examples/xsrq_pingpong.c
+++ b/examples/xsrq_pingpong.c
@@ -376,7 +376,7 @@  static int connect_qps(int index)
 			  IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU |
 			  IBV_QP_DEST_QPN | IBV_QP_RQ_PSN |
 			  IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER)) {
-		fprintf(stderr, "Failed to modify send QP[%d] to RTR\n", index);
+		fprintf(stderr, "Failed to modify recv QP[%d] to RTR\n", index);
 		return 1;
 	}
 
@@ -884,6 +884,13 @@  int main(int argc, char *argv[])
 	if (ctx.use_event)
 		ibv_ack_cq_events(ctx.recv_cq, num_cq_events);
 
+	/* Process should wait before closing its resources to make sure
+	  * latest daemon's response sent via its target QP destined to an XSRQ
+	  * created by another client won't be lost.
+	  * Failure to do so will cause the client to wait for that sent message forever.
+	  * See comment on pp_post_send.
+	*/
+	sleep(1);
 	if (pp_close_ctx())
 		return 1;
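
For comparison, the socket-based synchronization suggested in the comments could look roughly like the sketch below. It is not part of xsrq_pingpong.c and was not proposed as a patch on the list. The names pp_sync_shutdown, SYNC_MSG, send_all, recv_all and sync_demo are hypothetical, and a socketpair() stands in for the TCP socket that the example uses to exchange QP info. Each side announces that it has received everything it expects and then blocks until the peer announces the same; only after that would it release its resources. This is a two-party version. With several clients, the daemon would presumably have to collect the message from every client before answering any of them.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SYNC_MSG "done"	/* hypothetical end-of-test token */

/* Send exactly len bytes, retrying on short sends and EINTR. */
static int send_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = send(fd, p, len, MSG_NOSIGNAL);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Receive exactly len bytes; fail if the peer closes early. */
static int recv_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = recv(fd, p, len, 0);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical helper: announce that this process has received all of
 * its messages, then wait for the same announcement from the peer.
 * The caller would release its verbs resources only after this returns 0.
 */
static int pp_sync_shutdown(int sockfd)
{
	char reply[sizeof SYNC_MSG];

	if (send_all(sockfd, SYNC_MSG, sizeof SYNC_MSG))
		return -1;
	if (recv_all(sockfd, reply, sizeof reply))
		return -1;
	return memcmp(reply, SYNC_MSG, sizeof SYNC_MSG) ? -1 : 0;
}

/* Demo: two processes joined by a socketpair run the handshake. */
static int sync_demo(void)
{
	int sv[2], status, rc;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child plays the role of the other process. */
		close(sv[0]);
		_exit(pp_sync_shutdown(sv[1]) ? 1 : 0);
	}

	close(sv[1]);
	rc = pp_sync_shutdown(sv[0]);
	close(sv[0]);
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	if (rc)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

In main(), a call such as pp_sync_shutdown(sockfd) placed where the patch puts sleep(1), just before pp_close_ctx(), would make the wait depend on the peer's progress rather than on a fixed delay. This assumes the socket used for the QP exchange is still open at that point.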