Message ID | c144abfe27bd0bf36ff1e3c9c912d7c3@imap.linux.ibm.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
> Expose '/unackcqe_default' variable so completion event
> acknowledgment can be done in batch instead of singles
> to minimize locking overheads.
>
> The default value of this variable is 1 so the existing
> functionality is preserved. Having a higher value delays
> acknowledging until the specified limit is reached for
> retrieved completion events.

I would rather just change the default behavior, rather than introduce more checks and options into the code.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2014-09-05 23:31, Hefty, Sean wrote:
>> Expose '/unackcqe_default' variable so completion event
>> acknowledgment can be done in batch instead of singles
>> to minimize locking overheads.
>>
>> The default value of this variable is 1 so the existing
>> functionality is preserved. Having a higher value delays
>> acknowledging until the specified limit is reached for
>> retrieved completion events.
>
> I would rather just change the default behavior, rather than introduce
> more checks and options into the code.

Hi Sean,

That's indeed a good approach. But we need to have an upper limit before a batch of events is flushed. Can we decide the limit based on the completion event queue size?

Thank You.
- Sreedhar
> That's indeed a good approach. But we need to have an upper limit
> before a batch of events is flushed. Can we decide the limit based on
> the completion event queue size?

I think that's fine. We only need to ack before wrapping the 32-bit value, plus ensure that everything is acked when closing the rsocket.

But does this change really make any difference? At the point where ack is called, there's no contention, so the overhead seems small considering that we are in a part of the code where we likely just blocked the thread.
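Sean's two constraints above — flush before the 32-bit unacked count can wrap, and flush everything outstanding at close — can be sketched roughly as follows. This is only an illustration, not code from rsocket: `pending_ack()`, `flush_pending()`, and the fake ack function are hypothetical stand-ins for the bookkeeping that would surround the real `ibv_ack_cq_events()` call.

```c
#include <assert.h>
#include <stdint.h>

/* Running total of acknowledged events; stands in for the effect of
 * ibv_ack_cq_events(cq, nevents) from libibverbs. */
static uint64_t acked_total;

static void fake_ack_cq_events(uint32_t nevents)
{
	acked_total += nevents;
}

/* Account for one newly retrieved completion event and flush the
 * batch when the limit is hit, or unconditionally just before the
 * 32-bit counter would wrap.  Returns the new unacked count. */
static uint32_t pending_ack(uint32_t unacked, uint32_t limit)
{
	unacked++;
	if (unacked >= limit || unacked == UINT32_MAX) {
		fake_ack_cq_events(unacked);
		return 0;
	}
	return unacked;
}

/* Called on close: whatever is still pending must be acked. */
static void flush_pending(uint32_t unacked)
{
	if (unacked)
		fake_ack_cq_events(unacked);
}
```

With a limit of 1 every event is acknowledged as soon as it is retrieved, matching the existing behavior; a larger limit amortizes the locking inside the ack call over several events.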
On 2014-09-06 22:07, Hefty, Sean wrote:
>> That's indeed a good approach. But we need to have an upper limit
>> before a batch of events is flushed. Can we decide the limit based on
>> the completion event queue size?
>
> I think that's fine. We only need to ack before wrapping the 32-bit
> value, plus ensure that everything is acked when closing the rsocket.

Sure. I will get back to you with a modified patch.

> But does this change really make any difference? At the point where
> ack is called, there's no contention, so the overhead seems small
> considering that we are in a part of the code where we likely just
> blocked the thread.

Yes. It indeed makes a small improvement when all cores are fully saturated in scale-up environments.

Thank You.
- Sreedhar
diff --git a/src/rsocket.c b/src/rsocket.c
index ffea0ca..3c39da7 100644
--- a/src/rsocket.c
+++ b/src/rsocket.c
@@ -118,6 +118,7 @@ static uint32_t polling_time = 10;
 static uint16_t restart_onintr = 0;
 static uint16_t next_comp_vector = 0;
 static uint64_t comp_vector_mask = 0;
+static uint32_t def_unackcqe = 1;
 
 /*
  * Immediate data format is determined by the upper bits
@@ -384,6 +385,7 @@ struct rsocket {
 	dlist_entry	  iomap_list;
 	dlist_entry	  iomap_queue;
 	int		  iomap_pending;
+	int		  unackcqe;
 };
 
 #define DS_UDP_TAG 0x55555555
@@ -581,6 +583,14 @@ void rs_configure(void)
 			}
 		}
 	}
+
+	if ((f = fopen(RS_CONF_DIR "/unackcqe_default", "r"))) {
+		(void) fscanf(f, "%u", &def_unackcqe);
+		fclose(f);
+		if (def_unackcqe < 1) {
+			def_unackcqe = 1;
+		}
+	}
 	init = 1;
 out:
 	pthread_mutex_unlock(&mut);
@@ -638,6 +648,7 @@ static struct rsocket *rs_alloc(struct rsocket *inherited_rs, int type)
 			rs->target_iomap_size = def_iomap_size;
 		}
 	}
 
+	rs->unackcqe = 0;
 	fastlock_init(&rs->slock);
 	fastlock_init(&rs->rlock);
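The hunks above only add the configuration knob and the per-socket counter; the consuming side is not part of this patch. A minimal sketch of how such a counter might be used follows. Apart from `def_unackcqe` and the `unackcqe` field, every name here is an illustrative stand-in (the real code would operate on `struct rsocket` and call `ibv_ack_cq_events()` on the CQ):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so the sketch is self-contained; in rsocket.c these
 * would be struct rsocket, its CQ, and ibv_ack_cq_events(). */
struct fake_rs {
	int unackcqe;	/* field added by the patch */
	int acked;	/* instrumentation for this sketch only */
};

static uint32_t def_unackcqe = 1;	/* default from the patch */

static void fake_ibv_ack_cq_events(struct fake_rs *rs, int nevents)
{
	rs->acked += nevents;
}

/* Hypothetical consuming path: called once per retrieved CQ event;
 * acks go out in batches of def_unackcqe instead of one at a time. */
static void rs_batched_ack(struct fake_rs *rs)
{
	if (++rs->unackcqe >= (int) def_unackcqe) {
		fake_ibv_ack_cq_events(rs, rs->unackcqe);
		rs->unackcqe = 0;
	}
}

/* Hypothetical close-time path: flush anything still pending. */
static void rs_flush_acks(struct fake_rs *rs)
{
	if (rs->unackcqe) {
		fake_ibv_ack_cq_events(rs, rs->unackcqe);
		rs->unackcqe = 0;
	}
}
```

With the default `def_unackcqe = 1`, `rs_batched_ack()` acknowledges every event as it is retrieved, which is why the patch preserves existing behavior unless the config file raises the limit.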