diff mbox series

Revert "fs/aio: Make io_cancel() generate completions again"

Message ID 20240304182945.3646109-1-bvanassche@acm.org (mailing list archive)
State New
Headers show
Series Revert "fs/aio: Make io_cancel() generate completions again" | expand

Commit Message

Bart Van Assche March 4, 2024, 6:29 p.m. UTC
Patch "fs/aio: Make io_cancel() generate completions again" is based on the
assumption that calling kiocb->ki_cancel() does not complete R/W requests.
This is incorrect: the two drivers that call kiocb_set_cancel_fn() callers
set a cancellation function that calls usb_ep_dequeue(). According to its
documentation, usb_ep_dequeue() calls the completion routine with status
-ECONNRESET. Hence this revert.

Cc: Benjamin LaHaise <ben@communityfibre.ca>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Avi Kivity <avi@scylladb.com>
Cc: Sandeep Dhavale <dhavale@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: stable@vger.kernel.org
Reported-by: syzbot+b91eb2ed18f599dd3c31@syzkaller.appspotmail.com
Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 fs/aio.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

Comments

Eric Biggers March 4, 2024, 7:31 p.m. UTC | #1
On Mon, Mar 04, 2024 at 10:29:44AM -0800, Bart Van Assche wrote:
> Patch "fs/aio: Make io_cancel() generate completions again" is based on the
> assumption that calling kiocb->ki_cancel() does not complete R/W requests.
> This is incorrect: the two drivers that call kiocb_set_cancel_fn() callers
> set a cancellation function that calls usb_ep_dequeue(). According to its
> documentation, usb_ep_dequeue() calls the completion routine with status
> -ECONNRESET. Hence this revert.
> 
> Cc: Benjamin LaHaise <ben@communityfibre.ca>
> Cc: Eric Biggers <ebiggers@google.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Avi Kivity <avi@scylladb.com>
> Cc: Sandeep Dhavale <dhavale@google.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Kent Overstreet <kent.overstreet@linux.dev>
> Cc: stable@vger.kernel.org
> Reported-by: syzbot+b91eb2ed18f599dd3c31@syzkaller.appspotmail.com
> Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  fs/aio.c | 27 ++++++++++++++++-----------
>  1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index 28223f511931..da18dbcfcb22 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -2165,11 +2165,14 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
>  #endif
>  
>  /* sys_io_cancel:
> - *	Attempts to cancel an iocb previously passed to io_submit(). If the
> - *	operation is successfully cancelled 0 is returned. May fail with
> - *	-EFAULT if any of the data structures pointed to are invalid. May
> - *	fail with -EINVAL if aio_context specified by ctx_id is invalid. Will
> - *	fail with -ENOSYS if not implemented.
> + *	Attempts to cancel an iocb previously passed to io_submit.  If
> + *	the operation is successfully cancelled, the resulting event is
> + *	copied into the memory pointed to by result without being placed
> + *	into the completion queue and 0 is returned.  May fail with
> + *	-EFAULT if any of the data structures pointed to are invalid.
> + *	May fail with -EINVAL if aio_context specified by ctx_id is
> + *	invalid.  May fail with -EAGAIN if the iocb specified was not
> + *	cancelled.  Will fail with -ENOSYS if not implemented.
>   */
>  SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
>  		struct io_event __user *, result)
> @@ -2200,12 +2203,14 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
>  	}
>  	spin_unlock_irq(&ctx->ctx_lock);
>  
> -	/*
> -	 * The result argument is no longer used - the io_event is always
> -	 * delivered via the ring buffer.
> -	 */
> -	if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
> -		aio_complete_rw(&kiocb->rw, -EINTR);
> +	if (!ret) {
> +		/*
> +		 * The result argument is no longer used - the io_event is
> +		 * always delivered via the ring buffer. -EINPROGRESS indicates
> +		 * cancellation is progress:
> +		 */
> +		ret = -EINPROGRESS;
> +	}

Acked-by: Eric Biggers <ebiggers@google.com>

It does look like all the ->ki_cancel functions complete the request already, so
this patch was unnecessary and just introduced a bug.

Note that IOCB_CMD_POLL installs a ->ki_cancel function too, and that's how
syzbot hit the use-after-free so easily.

I assume that the patch just wasn't tested?  Or did you find that it actually
fixed something (how)?

By the way, libaio (https://pagure.io/libaio) has a test suite for these system
calls.  How about adding a test case that cancels an IOCB_CMD_POLL request and
verifies that the completion event is received?

- Eric
Christian Brauner March 5, 2024, 8:50 a.m. UTC | #2
On Mon, Mar 04, 2024 at 11:31:53AM -0800, Eric Biggers wrote:
> On Mon, Mar 04, 2024 at 10:29:44AM -0800, Bart Van Assche wrote:
> > Patch "fs/aio: Make io_cancel() generate completions again" is based on the
> > assumption that calling kiocb->ki_cancel() does not complete R/W requests.
> > This is incorrect: the two drivers that call kiocb_set_cancel_fn() callers
> > set a cancellation function that calls usb_ep_dequeue(). According to its
> > documentation, usb_ep_dequeue() calls the completion routine with status
> > -ECONNRESET. Hence this revert.
> > 
> > Cc: Benjamin LaHaise <ben@communityfibre.ca>
> > Cc: Eric Biggers <ebiggers@google.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: Avi Kivity <avi@scylladb.com>
> > Cc: Sandeep Dhavale <dhavale@google.com>
> > Cc: Jens Axboe <axboe@kernel.dk>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Kent Overstreet <kent.overstreet@linux.dev>
> > Cc: stable@vger.kernel.org
> > Reported-by: syzbot+b91eb2ed18f599dd3c31@syzkaller.appspotmail.com
> > Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
> > Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> > ---
> >  fs/aio.c | 27 ++++++++++++++++-----------
> >  1 file changed, 16 insertions(+), 11 deletions(-)
> > 
> > diff --git a/fs/aio.c b/fs/aio.c
> > index 28223f511931..da18dbcfcb22 100644
> > --- a/fs/aio.c
> > +++ b/fs/aio.c
> > @@ -2165,11 +2165,14 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
> >  #endif
> >  
> >  /* sys_io_cancel:
> > - *	Attempts to cancel an iocb previously passed to io_submit(). If the
> > - *	operation is successfully cancelled 0 is returned. May fail with
> > - *	-EFAULT if any of the data structures pointed to are invalid. May
> > - *	fail with -EINVAL if aio_context specified by ctx_id is invalid. Will
> > - *	fail with -ENOSYS if not implemented.
> > + *	Attempts to cancel an iocb previously passed to io_submit.  If
> > + *	the operation is successfully cancelled, the resulting event is
> > + *	copied into the memory pointed to by result without being placed
> > + *	into the completion queue and 0 is returned.  May fail with
> > + *	-EFAULT if any of the data structures pointed to are invalid.
> > + *	May fail with -EINVAL if aio_context specified by ctx_id is
> > + *	invalid.  May fail with -EAGAIN if the iocb specified was not
> > + *	cancelled.  Will fail with -ENOSYS if not implemented.
> >   */
> >  SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
> >  		struct io_event __user *, result)
> > @@ -2200,12 +2203,14 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
> >  	}
> >  	spin_unlock_irq(&ctx->ctx_lock);
> >  
> > -	/*
> > -	 * The result argument is no longer used - the io_event is always
> > -	 * delivered via the ring buffer.
> > -	 */
> > -	if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
> > -		aio_complete_rw(&kiocb->rw, -EINTR);
> > +	if (!ret) {
> > +		/*
> > +		 * The result argument is no longer used - the io_event is
> > +		 * always delivered via the ring buffer. -EINPROGRESS indicates
> > +		 * cancellation is progress:
> > +		 */
> > +		ret = -EINPROGRESS;
> > +	}
> 
> Acked-by: Eric Biggers <ebiggers@google.com>
> 
> It does look like all the ->ki_cancel functions complete the request already, so
> this patch was unnecessary and just introduced a bug.
> 
> Note that IOCB_CMD_POLL installs a ->ki_cancel function too, and that's how
> syzbot hit the use-after-free so easily.
> 
> I assume that the patch just wasn't tested?  Or did you find that it actually
> fixed something (how)?

We've been wrestling aio cancellations for a while now and aimed to
actually remove it but apparently it's used in the wild. I still very
much prefer if we could finally nuke this code.

> 
> By the way, libaio (https://pagure.io/libaio) has a test suite for these system
> calls.  How about adding a test case that cancels an IOCB_CMD_POLL request and
> verifies that the completion event is received?

Yes, please.
Christian Brauner March 5, 2024, 9:01 a.m. UTC | #3
On Mon, 04 Mar 2024 10:29:44 -0800, Bart Van Assche wrote:
> Patch "fs/aio: Make io_cancel() generate completions again" is based on the
> assumption that calling kiocb->ki_cancel() does not complete R/W requests.
> This is incorrect: the two drivers that call kiocb_set_cancel_fn() callers
> set a cancellation function that calls usb_ep_dequeue(). According to its
> documentation, usb_ep_dequeue() calls the completion routine with status
> -ECONNRESET. Hence this revert.
> 
> [...]

I'm not enthusiastic about how we handled this. There was apparently
more guesswork involved than anything else and I had asked multiple
times whether that patch is really required. So please, let's be more
careful going forward.

---

Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/1] Revert "fs/aio: Make io_cancel() generate completions again"
      https://git.kernel.org/vfs/vfs/c/d435ca3d38eb
Bart Van Assche March 5, 2024, 8:29 p.m. UTC | #4
On 3/5/24 00:50, Christian Brauner wrote:
> We've been wrestling aio cancellations for a while now and aimed to
> actually remove it but apparently it's used in the wild. I still very
> much prefer if we could finally nuke this code.

io_cancel() is being used by at least the Android user space code for
cancelling pending USB writes. As far as I know we (Linux kernel
developers) are not allowed to break existing user space code. See also:
* 
https://android.googlesource.com/platform/frameworks/av/+/refs/heads/main/media/mtp/MtpFfsHandle.cpp
* 
https://android.googlesource.com/platform/packages/modules/adb/+/refs/heads/main/daemon/usb.cpp

Thanks,

Bart.
diff mbox series

Patch

diff --git a/fs/aio.c b/fs/aio.c
index 28223f511931..da18dbcfcb22 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -2165,11 +2165,14 @@  COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 #endif
 
 /* sys_io_cancel:
- *	Attempts to cancel an iocb previously passed to io_submit(). If the
- *	operation is successfully cancelled 0 is returned. May fail with
- *	-EFAULT if any of the data structures pointed to are invalid. May
- *	fail with -EINVAL if aio_context specified by ctx_id is invalid. Will
- *	fail with -ENOSYS if not implemented.
+ *	Attempts to cancel an iocb previously passed to io_submit.  If
+ *	the operation is successfully cancelled, the resulting event is
+ *	copied into the memory pointed to by result without being placed
+ *	into the completion queue and 0 is returned.  May fail with
+ *	-EFAULT if any of the data structures pointed to are invalid.
+ *	May fail with -EINVAL if aio_context specified by ctx_id is
+ *	invalid.  May fail with -EAGAIN if the iocb specified was not
+ *	cancelled.  Will fail with -ENOSYS if not implemented.
  */
 SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
 		struct io_event __user *, result)
@@ -2200,12 +2203,14 @@  SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
 	}
 	spin_unlock_irq(&ctx->ctx_lock);
 
-	/*
-	 * The result argument is no longer used - the io_event is always
-	 * delivered via the ring buffer.
-	 */
-	if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
-		aio_complete_rw(&kiocb->rw, -EINTR);
+	if (!ret) {
+		/*
+		 * The result argument is no longer used - the io_event is
+		 * always delivered via the ring buffer. -EINPROGRESS indicates
+		 * cancellation is progress:
+		 */
+		ret = -EINPROGRESS;
+	}
 
 	percpu_ref_put(&ctx->users);