From patchwork Thu Feb 14 23:21:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nikos Tsironis X-Patchwork-Id: 10814005 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 53F761399 for ; Thu, 14 Feb 2019 23:22:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 41F012F3BA for ; Thu, 14 Feb 2019 23:22:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3602E2F3C7; Thu, 14 Feb 2019 23:22:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 471FD2F3C1 for ; Thu, 14 Feb 2019 23:22:00 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 84FBDC0D18B1; Thu, 14 Feb 2019 23:21:58 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id C701F5D706; Thu, 14 Feb 2019 23:21:56 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 6F0C53F7D0; Thu, 14 Feb 2019 23:21:53 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id x1ENLqhc019922 for ; Thu, 14 Feb 2019 18:21:52 -0500 Received: by smtp.corp.redhat.com (Postfix) id 57D7E19487; Thu, 14 Feb 2019 23:21:52 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from mx1.redhat.com (ext-mx10.extmail.prod.ext.phx2.redhat.com [10.5.110.39]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B4DFC19C65; Thu, 14 Feb 2019 23:21:44 +0000 (UTC) Received: from mx0.arrikto.com (mx0.arrikto.com [212.71.252.59]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7520F118E26; Thu, 14 Feb 2019 23:21:41 +0000 (UTC) Received: from troi.prod.arr (mail.arr [10.99.0.5]) by mx0.arrikto.com (Postfix) with ESMTP id 729DD182004; Fri, 15 Feb 2019 01:21:39 +0200 (EET) Received: from snf-864.vm.snf.arr (snf-864.vm.snf.arr [10.97.70.29]) by troi.prod.arr (Postfix) with ESMTPSA id 15B972BB; Fri, 15 Feb 2019 01:21:39 +0200 (EET) From: Nikos Tsironis To: agk@redhat.com, snitzer@redhat.com, dm-devel@redhat.com Date: Fri, 15 Feb 2019 01:21:38 +0200 Message-Id: <20190214232138.25667-1-ntsironis@arrikto.com> X-Greylist: Sender passed SPF test, ACL 242 matched, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 14 Feb 2019 23:21:43 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 14 Feb 2019 23:21:43 +0000 (UTC) for IP:'212.71.252.59' DOMAIN:'mx0.arrikto.com' HELO:'mx0.arrikto.com' FROM:'ntsironis@arrikto.com' RCPT:'' X-RedHat-Spam-Score: -0.001 (SPF_PASS) 212.71.252.59 mx0.arrikto.com 212.71.252.59 mx0.arrikto.com X-Scanned-By: MIMEDefang 2.78 on 10.5.110.39 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-loop: dm-devel@redhat.com Cc: ejt@redhat.com Subject: [dm-devel] [PATCH v2] dm thin: Fix bug wrt FUA request completion X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Thu, 14 Feb 2019 23:21:59 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP When provisioning a new data block for a virtual block, either because the block was previously unallocated or because we are breaking sharing, if the whole block of data is being overwritten the bio that triggered the provisioning is issued immediately, skipping copying or zeroing of the data block. When this bio completes the new mapping is inserted in to the pool's metadata by process_prepared_mapping(), where the bio completion is signaled to the upper layers. This completion is signaled without first committing the metadata. If the bio in question has the REQ_FUA flag set and the system crashes right after its completion and before the next metadata commit, then the write is lost despite the REQ_FUA flag requiring that I/O completion for this request is only signaled after the data has been committed to non-volatile storage. Fix this by deferring the completion of overwrite bios, with the REQ_FUA flag set, after the metadata has been committed. Signed-off-by: Nikos Tsironis --- drivers/md/dm-thin.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 50 insertions(+), 5 deletions(-) Changes in v2: - Add missing bio_list_init() in pool_create() v1: https://www.redhat.com/archives/dm-devel/2019-February/msg00064.html diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index ca8af21bf644..e83b63608262 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -257,6 +257,7 @@ struct pool { spinlock_t lock; struct bio_list deferred_flush_bios; + struct bio_list deferred_flush_completions; struct list_head prepared_mappings; struct list_head prepared_discards; struct list_head prepared_discards_pt2; @@ -956,6 +957,39 @@ static void process_prepared_mapping_fail(struct dm_thin_new_mapping *m) mempool_free(m, &m->tc->pool->mapping_pool); } +static void complete_overwrite_bio(struct thin_c *tc, struct bio *bio) +{ + struct pool *pool = tc->pool; + unsigned long flags; + + /* + * If the bio has the REQ_FUA flag set we must commit the metadata + * before signaling its completion. + */ + if (!bio_triggers_commit(tc, bio)) { + bio_endio(bio); + return; + } + + /* + * Complete bio with an error if earlier I/O caused changes to the + * metadata that can't be committed, e.g, due to I/O errors on the + * metadata device. + */ + if (dm_thin_aborted_changes(tc->td)) { + bio_io_error(bio); + return; + } + + /* + * Batch together any bios that trigger commits and then issue a + * single commit for them in process_deferred_bios(). + */ + spin_lock_irqsave(&pool->lock, flags); + bio_list_add(&pool->deferred_flush_completions, bio); + spin_unlock_irqrestore(&pool->lock, flags); +} + static void process_prepared_mapping(struct dm_thin_new_mapping *m) { struct thin_c *tc = m->tc; @@ -988,7 +1022,7 @@ static void process_prepared_mapping(struct dm_thin_new_mapping *m) */ if (bio) { inc_remap_and_issue_cell(tc, m->cell, m->data_block); - bio_endio(bio); + complete_overwrite_bio(tc, bio); } else { inc_all_io_entry(tc->pool, m->cell->holder); remap_and_issue(tc, m->cell->holder, m->data_block); @@ -2317,7 +2351,7 @@ static void process_deferred_bios(struct pool *pool) { unsigned long flags; struct bio *bio; - struct bio_list bios; + struct bio_list bios, bio_completions; struct thin_c *tc; tc = get_first_thin(pool); @@ -2328,26 +2362,36 @@ static void process_deferred_bios(struct pool *pool) } /* - * If there are any deferred flush bios, we must commit - * the metadata before issuing them. + * If there are any deferred flush bios, we must commit the metadata + * before issuing them or signaling their completion. */ bio_list_init(&bios); + bio_list_init(&bio_completions); + spin_lock_irqsave(&pool->lock, flags); bio_list_merge(&bios, &pool->deferred_flush_bios); bio_list_init(&pool->deferred_flush_bios); + + bio_list_merge(&bio_completions, &pool->deferred_flush_completions); + bio_list_init(&pool->deferred_flush_completions); spin_unlock_irqrestore(&pool->lock, flags); - if (bio_list_empty(&bios) && + if (bio_list_empty(&bios) && bio_list_empty(&bio_completions) && !(dm_pool_changed_this_transaction(pool->pmd) && need_commit_due_to_time(pool))) return; if (commit(pool)) { + bio_list_merge(&bios, &bio_completions); + while ((bio = bio_list_pop(&bios))) bio_io_error(bio); return; } pool->last_commit_jiffies = jiffies; + while ((bio = bio_list_pop(&bio_completions))) + bio_endio(bio); + while ((bio = bio_list_pop(&bios))) generic_make_request(bio); } @@ -2954,6 +2998,7 @@ static struct pool *pool_create(struct mapped_device *pool_md, INIT_DELAYED_WORK(&pool->no_space_timeout, do_no_space_timeout); spin_lock_init(&pool->lock); bio_list_init(&pool->deferred_flush_bios); + bio_list_init(&pool->deferred_flush_completions); INIT_LIST_HEAD(&pool->prepared_mappings); INIT_LIST_HEAD(&pool->prepared_discards); INIT_LIST_HEAD(&pool->prepared_discards_pt2);