
Revert "aio: block exit_aio() until all context requests are completed"

Message ID 5555A33B.20006@de.ibm.com (mailing list archive)
State New, archived

Commit Message

Christian Borntraeger May 15, 2015, 7:41 a.m. UTC
I see a significant latency (can be minutes with 2000 disks and HZ=100)
when exiting a QEMU process that has lots of disk devices via aio. The
process sits idle, doing nothing, as a zombie in exit_aio(), waiting for
completion.

It turns out that
commit 6098b45b32 ("aio: block exit_aio() until all context requests are
completed") caused the delay.

Patch description was:

It seems that exit_aio() also needs to wait for all iocbs to complete (like
io_destroy), but we missed the wait step in current implemention, so fix
it in the same way as we did in io_destroy.

Now, io_destroy() is required to block until everything is cleaned up,
per the interface description in its manpage:
DESCRIPTION
The  io_destroy()  system call will attempt to cancel all outstanding
asynchronous I/O operations against ctx_id, will block on the completion
of all operations that could not be canceled, and will destroy the ctx_id.

Does process exit require the same full blocking? We might be able to
clean up the process and let the aio data structures be freed lazily.
Opinions or better ideas?

Christian



--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jeff Moyer May 15, 2015, 1:42 p.m. UTC | #1
Christian Borntraeger <borntraeger@de.ibm.com> writes:

> I see a significant latency (can be minutes with 2000 disks and HZ=100)
> when exiting a QEMU process that has lots of disk devices via aio. The
> process sits idle doing nothing as zombie in exit_aio waiting for the
> completion.
>
> Turns out that 
> commit 6098b45b32 ("aio: block exit_aio() until all context requests are
> completed") caused the delay.
>
> Patch description was:
>
> It seems that exit_aio() also needs to wait for all iocbs to complete (like
> io_destroy), but we missed the wait step in current implemention, so fix
> it in the same way as we did in io_destroy.
>
> Now: io_destroy requires to block until everything is cleaned up from its
> interface description in the manpage:
> DESCRIPTION
> The  io_destroy()  system call will attempt to cancel all outstanding
> asynchronous I/O operations against ctx_id, will block on the completion
> of all operations that could not be canceled, and will destroy the ctx_id.
>
> Does process exit require the same full blocking? We might be able to
> cleanup the process and let the aio data structures be freed lazily.
> Opinions or better ideas?

This has already been fixed:

commit dc48e56d761610da4ea1088d1bea0a030b8e3e43
Author: Jens Axboe <axboe@fb.com>
Date:   Wed Apr 15 11:17:23 2015 -0600

    aio: fix serial draining in exit_aio()

Cheers,
Jeff
Christian Borntraeger May 15, 2015, 3:26 p.m. UTC | #2
Am 15.05.2015 um 15:42 schrieb Jeff Moyer:
> Christian Borntraeger <borntraeger@de.ibm.com> writes:
> 
>> I see a significant latency (can be minutes with 2000 disks and HZ=100)
>> when exiting a QEMU process that has lots of disk devices via aio. The
>> process sits idle doing nothing as zombie in exit_aio waiting for the
>> completion.
>>
>> Turns out that 
>> commit 6098b45b32 ("aio: block exit_aio() until all context requests are
>> completed") caused the delay.
>>
>> Patch description was:
>>
>> It seems that exit_aio() also needs to wait for all iocbs to complete (like
>> io_destroy), but we missed the wait step in current implemention, so fix
>> it in the same way as we did in io_destroy.
>>
>> Now: io_destroy requires to block until everything is cleaned up from its
>> interface description in the manpage:
>> DESCRIPTION
>> The  io_destroy()  system call will attempt to cancel all outstanding
>> asynchronous I/O operations against ctx_id, will block on the completion
>> of all operations that could not be canceled, and will destroy the ctx_id.
>>
>> Does process exit require the same full blocking? We might be able to
>> cleanup the process and let the aio data structures be freed lazily.
>> Opinions or better ideas?
> 
> This has already been fixed:
> 
> commit dc48e56d761610da4ea1088d1bea0a030b8e3e43
> Author: Jens Axboe <axboe@fb.com>
> Date:   Wed Apr 15 11:17:23 2015 -0600
> 
>     aio: fix serial draining in exit_aio()
> 
> Cheers,
> Jeff
> 
Cool, thanks. As the original patch was cc'ed to stable, shouldn't the fix also be backported?

Christian

Jens Axboe May 16, 2015, 3:16 p.m. UTC | #3
On 05/15/2015 09:26 AM, Christian Borntraeger wrote:
> Am 15.05.2015 um 15:42 schrieb Jeff Moyer:
>> Christian Borntraeger <borntraeger@de.ibm.com> writes:
>>
>>> I see a significant latency (can be minutes with 2000 disks and HZ=100)
>>> when exiting a QEMU process that has lots of disk devices via aio. The
>>> process sits idle doing nothing as zombie in exit_aio waiting for the
>>> completion.
>>>
>>> Turns out that
>>> commit 6098b45b32 ("aio: block exit_aio() until all context requests are
>>> completed") caused the delay.
>>>
>>> Patch description was:
>>>
>>> It seems that exit_aio() also needs to wait for all iocbs to complete (like
>>> io_destroy), but we missed the wait step in current implemention, so fix
>>> it in the same way as we did in io_destroy.
>>>
>>> Now: io_destroy requires to block until everything is cleaned up from its
>>> interface description in the manpage:
>>> DESCRIPTION
>>> The  io_destroy()  system call will attempt to cancel all outstanding
>>> asynchronous I/O operations against ctx_id, will block on the completion
>>> of all operations that could not be canceled, and will destroy the ctx_id.
>>>
>>> Does process exit require the same full blocking? We might be able to
>>> cleanup the process and let the aio data structures be freed lazily.
>>> Opinions or better ideas?
>>
>> This has already been fixed:
>>
>> commit dc48e56d761610da4ea1088d1bea0a030b8e3e43
>> Author: Jens Axboe <axboe@fb.com>
>> Date:   Wed Apr 15 11:17:23 2015 -0600
>>
>>      aio: fix serial draining in exit_aio()
>>
>> Cheers,
>> Jeff
>>
> Cool, thanks. As the original patch was cc'ed to stable, shouldn't the fix also be backported?

I'll email stable.

Patch

diff --git a/fs/aio.c b/fs/aio.c
index a793f70..1e6bcdb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -820,8 +820,6 @@  void exit_aio(struct mm_struct *mm)

 	for (i = 0; i < table->nr; ++i) {
 		struct kioctx *ctx = table->table[i];
-		struct completion requests_done =
-			COMPLETION_INITIALIZER_ONSTACK(requests_done);

 		if (!ctx)
 			continue;
@@ -833,10 +831,7 @@  void exit_aio(struct mm_struct *mm)
 		 * that it needs to unmap the area, just set it to 0.
 		 */
 		ctx->mmap_size = 0;
-		kill_ioctx(mm, ctx, &requests_done);
-
-		/* Wait until all IO for the context are done. */
-		wait_for_completion(&requests_done);
+		kill_ioctx(mm, ctx, NULL);
 	}

 	RCU_INIT_POINTER(mm->ioctx_table, NULL);