From patchwork Wed Jul 12 20:05:10 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adam Buchbinder
X-Patchwork-Id: 9837347
From: Adam Buchbinder
To: linux-btrfs@vger.kernel.org
Cc: Adam Buchbinder
Subject: [PATCH] btrfs-progs: Fix data races in btrfs-image.
Date: Wed, 12 Jul 2017 13:05:10 -0700
Message-Id: <20170712200510.18753-1-abuchbinder@google.com>
X-Mailer: git-send-email 2.13.2.932.g7449e964c-goog
X-Mailing-List: linux-btrfs@vger.kernel.org

Making the code data-race safe requires that reads *and* writes happen
under a mutex lock if any of the accesses are writes. See Dmitri Vyukov,
"Benign data races: what could possibly go wrong?" for more details.

The fix here was to put most of the main loop of restore_worker under
the mutex, releasing it only around the uncompress() call, and to take
the same mutex around the write to mdres->compress_method in
add_cluster. Also fix the "decompressiion" and "crrupted" typos in
error messages.

This race was detected using fsck-tests/012-leaf-corruption.
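
To illustrate the rule above, here is a minimal sketch (not code from
btrfs-progs) of a field that one thread writes and another reads, with
both sides taking the same mutex. The names shared_state,
set_compress_method, get_compress_method, reader_thread and run_demo are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for shared restore state; not the btrfs-progs struct. */
struct shared_state {
	pthread_mutex_t mutex;
	int compress_method;	/* written by the main thread, read by a worker */
};

/* Writer side: every store to the shared field happens under the mutex. */
static void set_compress_method(struct shared_state *s, int method)
{
	pthread_mutex_lock(&s->mutex);
	s->compress_method = method;
	pthread_mutex_unlock(&s->mutex);
}

/* Reader side: loads take the same mutex, so they never overlap a store. */
static int get_compress_method(struct shared_state *s)
{
	int method;

	pthread_mutex_lock(&s->mutex);
	method = s->compress_method;
	pthread_mutex_unlock(&s->mutex);
	return method;
}

static void *reader_thread(void *data)
{
	struct shared_state *s = data;

	/* May see the old or the new value, but never overlaps the store. */
	(void)get_compress_method(s);
	return NULL;
}

/* Runs one writer and one reader; returns the final value, or -1 on error. */
int run_demo(void)
{
	struct shared_state s = { .compress_method = 0 };
	pthread_t tid;
	int final;

	if (pthread_mutex_init(&s.mutex, NULL))
		return -1;
	if (pthread_create(&tid, NULL, reader_thread, &s)) {
		pthread_mutex_destroy(&s.mutex);
		return -1;
	}
	set_compress_method(&s, 1);	/* runs concurrently with the reader */
	pthread_join(tid, NULL);

	final = get_compress_method(&s);
	assert(final == 1);	/* the locked store is visible after the join */
	pthread_mutex_destroy(&s.mutex);
	return final;
}
```

Calling run_demo() from a main() and building with
`cc -pthread -fsanitize=thread` should produce no report. Dropping the
lock from either helper is expected to bring back a warning of the kind
shown below.
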

==================
WARNING: ThreadSanitizer: data race
  Write of size 4 by main thread:
    #0 add_cluster btrfs-progs/image/main.c:1931
    #1 restore_metadump btrfs-progs/image/main.c:2566
    #2 main btrfs-progs/image/main.c:2859

  Previous read of size 4 by thread T6:
    #0 restore_worker btrfs-progs/image/main.c:1720

  Location is stack of main thread.

  Thread T6 (running) created by main thread at:
    #0 pthread_create
    #1 mdrestore_init btrfs-progs/image/main.c:1868
    #2 restore_metadump btrfs-progs/image/main.c:2534
    #3 main btrfs-progs/image/main.c:2859

SUMMARY: ThreadSanitizer: data race btrfs-progs/image/main.c:1931 in add_cluster

Signed-off-by: Adam Buchbinder
---
 image/main.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/image/main.c b/image/main.c
index 1eca414..a5d01d8 100644
--- a/image/main.c
+++ b/image/main.c
@@ -1715,14 +1715,15 @@ static void *restore_worker(void *data)
 		}
 		async = list_entry(mdres->list.next, struct async_work, list);
 		list_del_init(&async->list);
-		pthread_mutex_unlock(&mdres->mutex);
 
 		if (mdres->compress_method == COMPRESS_ZLIB) {
 			size = compress_size;
+			pthread_mutex_unlock(&mdres->mutex);
 			ret = uncompress(buffer, (unsigned long *)&size,
 					 async->buffer, async->bufsize);
+			pthread_mutex_lock(&mdres->mutex);
 			if (ret != Z_OK) {
-				error("decompressiion failed with %d", ret);
+				error("decompression failed with %d", ret);
 				err = -EIO;
 			}
 			outbuf = buffer;
@@ -1798,7 +1799,6 @@ error:
 		if (!mdres->multi_devices &&
 		    async->start == BTRFS_SUPER_INFO_OFFSET)
 			write_backup_supers(outfd, outbuf);
-		pthread_mutex_lock(&mdres->mutex);
 		if (err && !mdres->error)
 			mdres->error = err;
 		mdres->num_items--;
@@ -1899,7 +1899,7 @@ static int fill_mdres_info(struct mdrestore_struct *mdres,
 		ret = uncompress(buffer, (unsigned long *)&size,
 				 async->buffer, async->bufsize);
 		if (ret != Z_OK) {
-			error("decompressiion failed with %d", ret);
+			error("decompression failed with %d", ret);
 			free(buffer);
 			return -EIO;
 		}
@@ -1928,7 +1928,9 @@ static int add_cluster(struct meta_cluster *cluster,
 	u32 i, nritems;
 	int ret;
 
+	pthread_mutex_lock(&mdres->mutex);
 	mdres->compress_method = header->compress;
+	pthread_mutex_unlock(&mdres->mutex);
 
 	bytenr = le64_to_cpu(header->bytenr) + BLOCK_SIZE;
 	nritems = le32_to_cpu(header->nritems);
@@ -2171,7 +2173,7 @@ static int search_for_chunk_blocks(struct mdrestore_struct *mdres,
 				continue;
 			}
 			error(
-	"unknown state after reading cluster at %llu, probably crrupted data",
+	"unknown state after reading cluster at %llu, probably corrupted data",
 					cluster_bytenr);
 			ret = -EIO;
 			break;
@@ -2220,7 +2222,7 @@ static int search_for_chunk_blocks(struct mdrestore_struct *mdres,
 						 (unsigned long *)&size, tmp,
 						 bufsize);
 				if (ret != Z_OK) {
-					error("decompressiion failed with %d",
+					error("decompression failed with %d",
 							ret);
 					ret = -EIO;
 					break;
@@ -2340,7 +2342,7 @@ static int build_chunk_tree(struct mdrestore_struct *mdres,
 		ret = uncompress(tmp, (unsigned long *)&size,
 				 buffer, le32_to_cpu(item->size));
 		if (ret != Z_OK) {
-			error("decompressiion failed with %d", ret);
+			error("decompression failed with %d", ret);
 			free(buffer);
 			free(tmp);
 			return -EIO;
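
The locking pattern in the restore_worker hunk can be sketched in
isolation as follows. This is an illustrative sketch under stated
assumptions, not the btrfs-progs code: work_state, expensive_work (a
stand-in for uncompress), worker and run_workers are hypothetical names.
The mutex is held for every access to shared state and released only
around the slow step, which touches nothing shared.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_ITEMS 8
#define NUM_WORKERS 4

/* Hypothetical shared state, loosely modelled on what the worker loop shares. */
struct work_state {
	pthread_mutex_t mutex;
	int next_item;	/* index of the next unclaimed item */
	int num_items;	/* items not yet finished */
	long total;	/* sum of all results */
};

/* Stand-in for uncompress(): uses only its argument, so it needs no lock. */
static long expensive_work(int item)
{
	return (long)item * item;
}

static void *worker(void *data)
{
	struct work_state *s = data;

	pthread_mutex_lock(&s->mutex);
	while (s->next_item < NUM_ITEMS) {
		int item = s->next_item++;	/* claim work under the lock */
		long result;

		/* Drop the lock only around the slow, thread-private step. */
		pthread_mutex_unlock(&s->mutex);
		result = expensive_work(item);
		pthread_mutex_lock(&s->mutex);

		/* Back under the lock: publish the result to shared state. */
		s->total += result;
		s->num_items--;
	}
	pthread_mutex_unlock(&s->mutex);
	return NULL;
}

/* Processes all items with a few workers; returns the total, or -1 on error. */
long run_workers(void)
{
	struct work_state s = { .next_item = 0, .num_items = NUM_ITEMS, .total = 0 };
	pthread_t tids[NUM_WORKERS];
	int started = 0;
	int i;

	if (pthread_mutex_init(&s.mutex, NULL))
		return -1;
	for (i = 0; i < NUM_WORKERS; i++) {
		if (pthread_create(&tids[started], NULL, worker, &s))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&s.mutex);

	if (started == 0)
		return -1;
	assert(s.num_items == 0);	/* every claimed item was also finished */
	return s.total;
}
```

With these assumptions, run_workers() returns 140 (the sum of the
squares of 0 through 7) however the items end up distributed among the
workers. Holding the lock for the rest of the loop serializes that part
of the work, which trades some parallelism for simpler reasoning about
shared fields.
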