From patchwork Wed Mar 30 18:55:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Christoph Böhmwalder
X-Patchwork-Id: 12796267
From: Christoph Böhmwalder
To: Jens Axboe
Cc: Philipp Reisner, drbd-dev@lists.linbit.com, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, Lars Ellenberg, Christoph Böhmwalder,
 stable@vger.kernel.org
Subject: [RESEND PATCH] drbd: fix potential silent data corruption
Date: Wed, 30 Mar 2022 20:55:51 +0200
Message-Id: <20220330185551.3553196-1-christoph.boehmwalder@linbit.com>
X-Mailer: git-send-email 2.32.0
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

From: Lars Ellenberg

Scenario:
---------

bio chain generated by blk_queue_split().
Some split bio fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.

We would clobber the already recorded error status with BLK_STS_OK,
causing silent data corruption.

Reproducer:
-----------

How to trigger this in the real world within seconds:

DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").

Cause significant read ahead.

Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
would need to wait for a "stripe_head" to become available,
are rejected immediately.

For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.

Signed-off-by: Lars Ellenberg
Signed-off-by: Christoph Böhmwalder
Cc: # 4.13.x
---
 drivers/block/drbd/drbd_req.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index c04394518b07..e1e58e91ee58 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -180,7 +180,8 @@ void start_new_tl_epoch(struct drbd_connection *connection)
 void complete_master_bio(struct drbd_device *device,
 		struct bio_and_error *m)
 {
-	m->bio->bi_status = errno_to_blk_status(m->error);
+	if (unlikely(m->error))
+		m->bio->bi_status = errno_to_blk_status(m->error);
 	bio_endio(m->bio);
 	dec_ap_bio(device);
 }
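
Illustration (not part of the patch): the clobbering described in the
scenario can be sketched in plain userspace C. This is only a toy model
under stated assumptions. The names toy_bio, toy_status, toy_child_done,
toy_complete_unconditional, toy_complete_guarded and toy_run_scenario are
hypothetical and do not exist in the kernel; they merely stand in for a
parent bio's bi_status, a failing split, and the two variants of the final
completion step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for blk_status_t values; not the kernel's. */
enum toy_status { TOY_STS_OK = 0, TOY_STS_IOERR = 10 };

/* Toy "parent bio": only the status field matters for this sketch. */
struct toy_bio {
	enum toy_status status;
};

/* A completed split records its status on the parent only if it failed. */
static void toy_child_done(struct toy_bio *parent, enum toy_status child)
{
	assert(parent != NULL);
	if (child != TOY_STS_OK)
		parent->status = child;
}

/* Models the pre-patch completion: the status is always overwritten. */
static void toy_complete_unconditional(struct toy_bio *bio, int error)
{
	bio->status = error ? TOY_STS_IOERR : TOY_STS_OK;
}

/* Models the patched completion: the status is written only on error. */
static void toy_complete_guarded(struct toy_bio *bio, int error)
{
	if (error)
		bio->status = TOY_STS_IOERR;
}

/*
 * One split fails, then the last part of the parent completes without
 * error. Returns the status the submitter would finally observe.
 */
static enum toy_status toy_run_scenario(int guarded)
{
	struct toy_bio parent = { TOY_STS_OK };

	toy_child_done(&parent, TOY_STS_IOERR);	/* a split was rejected */
	if (guarded)
		toy_complete_guarded(&parent, 0);
	else
		toy_complete_unconditional(&parent, 0);
	return parent.status;
}
```

In this model, toy_run_scenario(0) ends with TOY_STS_OK even though a split
failed, which mirrors the silent corruption case, while toy_run_scenario(1)
keeps TOY_STS_IOERR.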