Message ID | 20181204233729.26776-9-axboe@kernel.dk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/26] fs: add an iopoll method to struct file_operations | expand |
Jens Axboe <axboe@kernel.dk> writes: > It's 192 bytes, fairly substantial. Most items don't need to be cleared, > especially not upfront. Clear the ones we do need to clear, and leave > the other ones for setup when the iocb is prepared and submitted. What performance gains do you see from this? -Jeff > Reviewed-by: Christoph Hellwig <hch@lst.de> > Signed-off-by: Jens Axboe <axboe@kernel.dk> > --- > fs/aio.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/fs/aio.c b/fs/aio.c > index eaceb40e6cf5..522c04864d82 100644 > --- a/fs/aio.c > +++ b/fs/aio.c > @@ -1009,14 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) > { > struct aio_kiocb *req; > > - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); > + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); > if (unlikely(!req)) > return NULL; > > percpu_ref_get(&ctx->reqs); > + req->ki_ctx = ctx; > INIT_LIST_HEAD(&req->ki_list); > refcount_set(&req->ki_refcnt, 0); > - req->ki_ctx = ctx; > + req->ki_eventfd = NULL; > return req; > } > > @@ -1730,6 +1731,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb) > if (unlikely(!req->file)) > return -EBADF; > > + req->head = NULL; > + req->woken = false; > + req->cancelled = false; > + > apt.pt._qproc = aio_poll_queue_proc; > apt.pt._key = req->events; > apt.iocb = aiocb;
On 12/6/18 12:27 PM, Jeff Moyer wrote: > Jens Axboe <axboe@kernel.dk> writes: > >> It's 192 bytes, fairly substantial. Most items don't need to be cleared, >> especially not upfront. Clear the ones we do need to clear, and leave >> the other ones for setup when the iocb is prepared and submitted. > > What performance gains do you see from this? Before this, I had 1% in memset doing high IOPS. With it, that's gone. 1% is a lot, when you have just one thread doing everything from submission to completion.
Jens Axboe <axboe@kernel.dk> writes: > On 12/6/18 12:27 PM, Jeff Moyer wrote: >> Jens Axboe <axboe@kernel.dk> writes: >> >>> It's 192 bytes, fairly substantial. Most items don't need to be cleared, >>> especially not upfront. Clear the ones we do need to clear, and leave >>> the other ones for setup when the iocb is prepared and submitted. >> >> What performance gains do you see from this? > > Before this, I had 1% in memset doing high IOPS. With it, that's gone. > 1% is a lot, when you have just one thread doing everything from submission > to completion. I'm used to customers complaining about fractions of a percent, so I get it. :-) I just wanted to know we had some measurable impact, as I've seen bugs crop up from code like this in the past. Thanks! Jeff
On 12/6/18 12:38 PM, Jeff Moyer wrote: > Jens Axboe <axboe@kernel.dk> writes: > >> On 12/6/18 12:27 PM, Jeff Moyer wrote: >>> Jens Axboe <axboe@kernel.dk> writes: >>> >>>> It's 192 bytes, fairly substantial. Most items don't need to be cleared, >>>> especially not upfront. Clear the ones we do need to clear, and leave >>>> the other ones for setup when the iocb is prepared and submitted. >>> >>> What performance gains do you see from this? >> >> Before this, I had 1% in memset doing high IOPS. With it, that's gone. >> 1% is a lot, when you have just one thread doing everything from submission >> to completion. > > I'm used to customers complaining about fractions of a percent, so I get > it. :-) I just wanted to know we had some measurable impact, as I've > seen bugs crop up from code like this in the past. Oh for sure, I wouldn't do it if there wasn't a noticeable gain from this!
diff --git a/fs/aio.c b/fs/aio.c index eaceb40e6cf5..522c04864d82 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1009,14 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx) { struct aio_kiocb *req; - req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO); + req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL); if (unlikely(!req)) return NULL; percpu_ref_get(&ctx->reqs); + req->ki_ctx = ctx; INIT_LIST_HEAD(&req->ki_list); refcount_set(&req->ki_refcnt, 0); - req->ki_ctx = ctx; + req->ki_eventfd = NULL; return req; } @@ -1730,6 +1731,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb) if (unlikely(!req->file)) return -EBADF; + req->head = NULL; + req->woken = false; + req->cancelled = false; + apt.pt._qproc = aio_poll_queue_proc; apt.pt._key = req->events; apt.iocb = aiocb;