
[4/4] aio: allow direct aio poll comletions for keyed wakeups

Message ID 20180806083058.14724-5-hch@lst.de (mailing list archive)
State New, archived
Series [1/4] timerfd: add support for keyed wakeups

Commit Message

Christoph Hellwig Aug. 6, 2018, 8:30 a.m. UTC
If we get a keyed wakeup for an aio poll waitqueue and can acquire the
ctx_lock without spinning, we can complete the iocb straight from the
wakeup callback and avoid a context switch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Avi Kivity <avi@scylladb.com>
---
 fs/aio.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
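
As a rough stand-alone illustration of the pattern (a userspace sketch, not
the kernel code: pthread mutexes stand in for the spinlocks, defer_to_worker()
is a stub for schedule_work(), and all names here are invented for the
example), the fast path looks roughly like this:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* stand-ins for struct kioctx and struct aio_kiocb */
struct ctx_sketch {
	pthread_mutex_t ctx_lock;
};

struct iocb_sketch {
	struct ctx_sketch *ctx;
};

/* stub for schedule_work(): completion happens later, in another context */
static void defer_to_worker(struct iocb_sketch *iocb)
{
	printf("deferring iocb %p: completion needs a context switch\n",
	       (void *)iocb);
}

static void complete_inline(struct iocb_sketch *iocb, unsigned int mask)
{
	printf("completed iocb %p inline, mask 0x%x, no context switch\n",
	       (void *)iocb, mask);
}

/* wakeup fast path: complete inline only if the lock can be taken for free */
static bool wakeup_fast_path(struct iocb_sketch *iocb, unsigned int mask)
{
	if (pthread_mutex_trylock(&iocb->ctx->ctx_lock) == 0) {
		/* the real patch also does list_del(&iocb->ki_list) here */
		pthread_mutex_unlock(&iocb->ctx->ctx_lock);
		complete_inline(iocb, mask);
		return true;
	}
	defer_to_worker(iocb);
	return false;
}

int main(void)
{
	struct ctx_sketch ctx = { .ctx_lock = PTHREAD_MUTEX_INITIALIZER };
	struct iocb_sketch iocb = { .ctx = &ctx };

	wakeup_fast_path(&iocb, 0x1);
	return 0;
}

Build with gcc -pthread; the point is only the shape of the trylock-or-defer
decision, not the aio bookkeeping around it.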

Comments

Andrew Morton Aug. 6, 2018, 10:27 p.m. UTC | #1
On Mon,  6 Aug 2018 10:30:58 +0200 Christoph Hellwig <hch@lst.de> wrote:

> If we get a keyed wakeup for an aio poll waitqueue and can acquire the
> ctx_lock without spinning, we can complete the iocb straight from the
> wakeup callback and avoid a context switch.

Why do we try to avoid spinning on the lock?

> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1672,13 +1672,26 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
>  		void *key)
>  {
>  	struct poll_iocb *req = container_of(wait, struct poll_iocb, wait);
> +	struct aio_kiocb *iocb = container_of(req, struct aio_kiocb, poll);
>  	__poll_t mask = key_to_poll(key);
>  
>  	req->woken = true;
>  
>  	/* for instances that support it check for an event match first: */
> -	if (mask && !(mask & req->events))
> -		return 0;
> +	if (mask) {
> +		if (!(mask & req->events))
> +			return 0;
> +
> +		/* try to complete the iocb inline if we can: */

ie, this comment explains "what" but not "why".

(There's a typo in Subject:, btw)

> +		if (spin_trylock(&iocb->ki_ctx->ctx_lock)) {
> +			list_del(&iocb->ki_list);
> +			spin_unlock(&iocb->ki_ctx->ctx_lock);
> +
> +			list_del_init(&req->wait.entry);
> +			aio_poll_complete(iocb, mask);
> +			return 1;
> +		}
> +	}
>  
>  	list_del_init(&req->wait.entry);
>  	schedule_work(&req->work);
Christoph Hellwig Aug. 7, 2018, 7:25 a.m. UTC | #2
On Mon, Aug 06, 2018 at 03:27:05PM -0700, Andrew Morton wrote:
> On Mon,  6 Aug 2018 10:30:58 +0200 Christoph Hellwig <hch@lst.de> wrote:
> 
> > If we get a keyed wakeup for an aio poll waitqueue and can acquire the
> > ctx_lock without spinning, we can complete the iocb straight from the
> > wakeup callback and avoid a context switch.
> 
> Why do we try to avoid spinning on the lock?

Because we are called with the waitqueue lock held, and that lock nests
inside the ctx_lock.

> > +		/* try to complete the iocb inline if we can: */
> 
> ie, this comment explains "what" but not "why".
> 
> (There's a typo in Subject:, btw)

Because it is faster obviously.  I can update the comment.
Andrew Morton Aug. 7, 2018, 4:04 p.m. UTC | #3
On Tue, 7 Aug 2018 09:25:55 +0200 Christoph Hellwig <hch@lst.de> wrote:

> On Mon, Aug 06, 2018 at 03:27:05PM -0700, Andrew Morton wrote:
> > On Mon,  6 Aug 2018 10:30:58 +0200 Christoph Hellwig <hch@lst.de> wrote:
> > 
> > > If we get a keyed wakeup for an aio poll waitqueue and can acquire the
> > > ctx_lock without spinning, we can complete the iocb straight from the
> > > wakeup callback and avoid a context switch.
> > 
> > Why do we try to avoid spinning on the lock?
> 
> Because we are called with the waitqueue lock held, and that lock nests
> inside the ctx_lock.

Ah.

> > > +		/* try to complete the iocb inline if we can: */
> > 
> > ie, this comment explains "what" but not "why".
> > 
> > (There's a typo in Subject:, btw)
> 
> Because it is faster obviously.  I can update the comment.

I meant the comment could explain why it's a trylock instead of a
spin_lock().
Christoph Hellwig Aug. 8, 2018, 9:57 a.m. UTC | #4
On Tue, Aug 07, 2018 at 09:04:41AM -0700, Andrew Morton wrote:
> > Because it is faster obviously.  I can update the comment.
> 
> I meant the comment could explain why it's a trylock instead of a
> spin_lock().

We could do something like the patch below.

Al, do you want me to resend or can you just fold it in?

diff --git a/fs/aio.c b/fs/aio.c
index 5943098a87c6..84df2c2bf80b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1684,7 +1684,8 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 
 		/*
 		 * Try to complete the iocb inline if we can to avoid a costly
-		 * context switch.
+		 * context switch.  As the waitqueue lock nests inside the ctx
+		 * lock we can only do that if we can get it without waiting.
 		 */
 		if (spin_trylock(&iocb->ki_ctx->ctx_lock)) {
 			list_del(&iocb->ki_list);
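
To spell out the locking argument as a self-contained sketch (userspace
pthread mutexes standing in for the two spinlocks; the function names are
invented for illustration): other paths take the ctx_lock first and the
waitqueue lock inside it, while the wakeup callback already holds the
waitqueue lock, so blocking on the ctx_lock there would reverse the
established order.  Trylock either succeeds immediately or falls back to the
workqueue, so no ABBA deadlock is possible:

#include <pthread.h>

static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;	/* outer lock */
static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;	/* inner lock */

/* the established order elsewhere: ctx_lock first, then the waitqueue lock */
static void ordered_path(void)
{
	pthread_mutex_lock(&ctx_lock);
	pthread_mutex_lock(&wq_lock);
	/* ... e.g. walk and cancel pending waiters ... */
	pthread_mutex_unlock(&wq_lock);
	pthread_mutex_unlock(&ctx_lock);
}

/* the wakeup callback runs with the inner (waitqueue) lock already held */
static void wake_callback(void)
{
	/*
	 * A plain lock here would acquire wq_lock -> ctx_lock, the reverse
	 * of ordered_path(), i.e. an ABBA deadlock once both run
	 * concurrently.  Trylock never blocks: either we get the lock and
	 * complete inline, or we defer to a worker.
	 */
	if (pthread_mutex_trylock(&ctx_lock) == 0) {
		/* complete inline */
		pthread_mutex_unlock(&ctx_lock);
	}
	/* else: schedule deferred completion instead of waiting */
}

int main(void)
{
	ordered_path();

	pthread_mutex_lock(&wq_lock);	/* as the wakeup path would hold it */
	wake_callback();
	pthread_mutex_unlock(&wq_lock);
	return 0;
}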

Patch

diff --git a/fs/aio.c b/fs/aio.c
index 2fd19521d8a8..29f2b5b57d32 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1672,13 +1672,26 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 		void *key)
 {
 	struct poll_iocb *req = container_of(wait, struct poll_iocb, wait);
+	struct aio_kiocb *iocb = container_of(req, struct aio_kiocb, poll);
 	__poll_t mask = key_to_poll(key);
 
 	req->woken = true;
 
 	/* for instances that support it check for an event match first: */
-	if (mask && !(mask & req->events))
-		return 0;
+	if (mask) {
+		if (!(mask & req->events))
+			return 0;
+
+		/* try to complete the iocb inline if we can: */
+		if (spin_trylock(&iocb->ki_ctx->ctx_lock)) {
+			list_del(&iocb->ki_list);
+			spin_unlock(&iocb->ki_ctx->ctx_lock);
+
+			list_del_init(&req->wait.entry);
+			aio_poll_complete(iocb, mask);
+			return 1;
+		}
+	}
 
 	list_del_init(&req->wait.entry);
 	schedule_work(&req->work);