From: Joe Thornber
To: device-mapper development <dm-devel@redhat.com>
Cc: snitzer@redhat.com, linux-kernel@vger.kernel.org, Minfei Huang, linux-raid@vger.kernel.org, agk@redhat.com
Subject: Re: [dm-devel] [PATCH] dm-io: Prevent the danging point of the sync io callback function
Date: Fri, 27 Jun 2014 16:11:06 +0100
Message-ID: <20140627151105.GA30592@debian>
In-Reply-To: <1403841690-4401-1-git-send-email-huangminfei@ucloud.cn>

On Fri, Jun 27, 2014 at 12:01:30PM +0800, Minfei Huang wrote:
> The io address in callback function will become the danging point,
> cause by the thread of sync io wakes up by other threads
> and return to relieve the io address,

Yes, well found.  I prefer the following fix however.

- Joe

Author: Joe Thornber
Date:   Fri Jun 27 15:49:29 2014 +0100

    [dm-io] Fix a race condition in the wake up code for sync_io

    There's a race condition between the atomic_dec_and_test(&io->count)
    in dec_count() and the waking of the sync_io() thread.
    If the thread is spuriously woken immediately after the decrement it
    may exit, making the on-stack io struct invalid, yet dec_count()
    could still be using it.

    There are smaller fixes than the one here (e.g. just take the io
    object off the stack), but I feel this code could use a clean-up:

    - dec_count()

      - It always calls a callback fn now.
      - It always frees the io object back to the pool.

    - sync_io()

      - Take the io object off the stack and allocate it from the pool,
        the same as async_io().
      - Use a completion object rather than an explicit io_schedule()
        loop.  The callback triggers the completion.

    Reported-by: Minfei Huang
---
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 3842ac7..a0982e81 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -10,6 +10,7 @@
 #include <linux/device-mapper.h>

 #include <linux/bio.h>
+#include <linux/completion.h>
 #include <linux/mempool.h>
 #include <linux/module.h>
 #include <linux/sched.h>
@@ -32,7 +33,6 @@ struct dm_io_client {
 struct io {
         unsigned long error_bits;
         atomic_t count;
-        struct task_struct *sleeper;
         struct dm_io_client *client;
         io_notify_fn callback;
         void *context;
@@ -111,28 +111,27 @@ static void retrieve_io_and_region_from_bio(struct bio *bio, struct io **io,
  * We need an io object to keep track of the number of bios that
  * have been dispatched for a particular io.
  *---------------------------------------------------------------*/
-static void dec_count(struct io *io, unsigned int region, int error)
+static void complete_io(struct io *io)
 {
-        if (error)
-                set_bit(region, &io->error_bits);
+        unsigned long error_bits = io->error_bits;
+        io_notify_fn fn = io->callback;
+        void *context = io->context;

-        if (atomic_dec_and_test(&io->count)) {
-                if (io->vma_invalidate_size)
-                        invalidate_kernel_vmap_range(io->vma_invalidate_address,
-                                                     io->vma_invalidate_size);
+        if (io->vma_invalidate_size)
+                invalidate_kernel_vmap_range(io->vma_invalidate_address,
+                                             io->vma_invalidate_size);

-                if (io->sleeper)
-                        wake_up_process(io->sleeper);
+        mempool_free(io, io->client->pool);
+        fn(error_bits, context);
+}

-                else {
-                        unsigned long r = io->error_bits;
-                        io_notify_fn fn = io->callback;
-                        void *context = io->context;
+static void dec_count(struct io *io, unsigned int region, int error)
+{
+        if (error)
+                set_bit(region, &io->error_bits);

-                        mempool_free(io, io->client->pool);
-                        fn(r, context);
-                }
-        }
+        if (atomic_dec_and_test(&io->count))
+                complete_io(io);
 }

 static void endio(struct bio *bio, int error)
@@ -375,48 +374,49 @@ static void dispatch_io(int rw, unsigned int num_regions,
         dec_count(io, 0, 0);
 }

+struct sync_io {
+        unsigned long error_bits;
+        struct completion complete;
+};
+
+static void sync_complete(unsigned long error, void *context)
+{
+        struct sync_io *sio = context;
+        sio->error_bits = error;
+        complete(&sio->complete);
+}
+
 static int sync_io(struct dm_io_client *client, unsigned int num_regions,
                    struct dm_io_region *where, int rw, struct dpages *dp,
                    unsigned long *error_bits)
 {
-        /*
-         * gcc <= 4.3 can't do the alignment for stack variables, so we must
-         * align it on our own.
-         * volatile prevents the optimizer from removing or reusing
-         * "io_" field from the stack frame (allowed in ANSI C).
-         */
-        volatile char io_[sizeof(struct io) + __alignof__(struct io) - 1];
-        struct io *io = (struct io *)PTR_ALIGN(&io_, __alignof__(struct io));
+        struct io *io;
+        struct sync_io sio;

         if (num_regions > 1 && (rw & RW_MASK) != WRITE) {
                 WARN_ON(1);
                 return -EIO;
         }

+        init_completion(&sio.complete);
+
+        io = mempool_alloc(client->pool, GFP_NOIO);
         io->error_bits = 0;
         atomic_set(&io->count, 1); /* see dispatch_io() */
-        io->sleeper = current;
+        io->callback = sync_complete;
+        io->context = &sio;
         io->client = client;

         io->vma_invalidate_address = dp->vma_invalidate_address;
         io->vma_invalidate_size = dp->vma_invalidate_size;

         dispatch_io(rw, num_regions, where, dp, io, 1);
-
-        while (1) {
-                set_current_state(TASK_UNINTERRUPTIBLE);
-
-                if (!atomic_read(&io->count))
-                        break;
-
-                io_schedule();
-        }
-        set_current_state(TASK_RUNNING);
+        wait_for_completion_io(&sio.complete);

         if (error_bits)
-                *error_bits = io->error_bits;
+                *error_bits = sio.error_bits;

-        return io->error_bits ? -EIO : 0;
+        return sio.error_bits ? -EIO : 0;
 }

 static int async_io(struct dm_io_client *client, unsigned int num_regions,
@@ -434,7 +434,6 @@ static int async_io(struct dm_io_client *client, unsigned int num_regions,
         io = mempool_alloc(client->pool, GFP_NOIO);
         io->error_bits = 0;
         atomic_set(&io->count, 1); /* see dispatch_io() */
-        io->sleeper = NULL;
         io->client = client;
         io->callback = fn;
         io->context = context;
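
For anyone following the thread who is less familiar with the pattern, below is a rough userspace sketch of the shape the patch gives sync_io(): the waiter owns a small context on its own stack, the completion path copies the result into that context and then signals it, and the io object is handed back to the pool before the callback runs, so nothing dereferences io after the waiter is released.  This is an illustration only, not the kernel code: pthreads stand in for struct completion, malloc/free stand in for the mempool, and the names fake_io, sync_ctx and fake_endio are invented for this sketch.  It compiles standalone with -pthread.

/*
 * Userspace analogue of the completion-based sync_io() pattern.
 * NOT kernel code; names and primitives are stand-ins.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct sync_ctx {                       /* plays the role of struct sync_io */
        unsigned long error_bits;
        pthread_mutex_t lock;           /* mutex + cond + done flag stand in */
        pthread_cond_t cond;            /* for the kernel's struct completion */
        int done;
};

struct fake_io {                        /* plays the role of struct io */
        unsigned long error_bits;
        void (*callback)(unsigned long, void *);
        void *context;
};

/* Analogue of sync_complete(): copy the result out, then signal. */
static void sync_complete(unsigned long error, void *context)
{
        struct sync_ctx *sio = context;

        pthread_mutex_lock(&sio->lock);
        sio->error_bits = error;
        sio->done = 1;
        pthread_cond_signal(&sio->cond);
        pthread_mutex_unlock(&sio->lock);
}

/*
 * Analogue of the reworked dec_count()/complete_io(): the io object is
 * freed ("mempool_free") before the callback runs, so nothing touches it
 * once the waiter can proceed.
 */
static void *fake_endio(void *arg)
{
        struct fake_io *io = arg;
        void (*fn)(unsigned long, void *) = io->callback;
        void *context = io->context;
        unsigned long bits = io->error_bits;

        free(io);
        fn(bits, context);
        return NULL;
}

int main(void)
{
        pthread_t thread;
        struct fake_io *io = malloc(sizeof(*io));       /* "mempool_alloc" */
        struct sync_ctx sio = {
                .lock = PTHREAD_MUTEX_INITIALIZER,
                .cond = PTHREAD_COND_INITIALIZER,
        };

        io->error_bits = 0;
        io->callback = sync_complete;
        io->context = &sio;

        pthread_create(&thread, NULL, fake_endio, io);  /* "dispatch_io" */

        /* Analogue of wait_for_completion_io(&sio.complete). */
        pthread_mutex_lock(&sio.lock);
        while (!sio.done)
                pthread_cond_wait(&sio.cond, &sio.lock);
        pthread_mutex_unlock(&sio.lock);

        pthread_join(thread, NULL);
        printf("error_bits = %lu\n", sio.error_bits);
        return 0;
}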