From patchwork Sun Sep 24 06:14:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13396909
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 1/3] btrfs: introduce allow_backup_super_failure sysfs interface
Date: Sun, 24 Sep 2023 15:44:12 +0930
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Currently btrfs allows backup super blocks to fail their writeback, as
long as the primary one is still fine.

This tolerance may be a little too loose for some debug purposes, so
this patch introduces the following sysfs interface:

  /sys/fs/btrfs//debug/allow_backup_super_failure

This is a read-write entry whose content is 0 or 1, indicating whether
backup super blocks are allowed to fail their writeback.

The default value is 1, meaning backup super blocks are allowed to fail
their writeback. Writing any non-zero number sets the value to 1; input
that does not parse as a signed 8-bit decimal number is rejected with
-EINVAL.
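
To illustrate the write semantics above, here is a minimal userspace
sketch, not kernel code. The helper name parse_allow_flag() is
hypothetical, and strtol() stands in for the kernel's kstrtos8(), so
corner cases such as leading whitespace may differ:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical userspace model of the store handler: parse @buf as a
 * signed 8-bit decimal number and collapse it to a boolean.
 * Returns 0 on success, -EINVAL if the input does not parse.
 */
int parse_allow_flag(const char *buf, bool *flag)
{
	char *end;
	long val;

	assert(buf != NULL && flag != NULL);

	errno = 0;
	val = strtol(buf, &end, 10);
	/* No digits, overflow, or outside the s8 range. */
	if (end == buf || errno || val < -128 || val > 127)
		return -EINVAL;
	/* Writes done with "echo" carry one trailing newline. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* Same effect as the "!!new_number" in the patch. */
	*flag = (val != 0);
	return 0;
}
```

With this model, "0\n" yields false, "1" and "5\n" both yield true, and
"abc" or "300" are rejected.
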

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c |  7 +++++--
 fs/btrfs/fs.h      |  3 +++
 fs/btrfs/sysfs.c   | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dc577b3c53f6..d8eb968e9e5e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2722,6 +2722,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	INIT_LIST_HEAD(&fs_info->allocated_roots);
 	INIT_LIST_HEAD(&fs_info->allocated_ebs);
 	spin_lock_init(&fs_info->eb_leak_lock);
+	fs_info->allow_backup_super_failure = true;
 #endif
 	extent_map_tree_init(&fs_info->mapping_tree);
 	btrfs_init_block_rsv(&fs_info->global_block_rsv,
@@ -3841,8 +3842,10 @@ static int write_dev_supers(struct btrfs_device *device,
  */
 static int wait_dev_supers(struct btrfs_device *device, int max_mirrors)
 {
+	struct btrfs_fs_info *fs_info = device->fs_info;
 	int i;
 	int errors = 0;
+	bool allow_super_failure = READ_ONCE(fs_info->allow_backup_super_failure);
 	bool primary_failed = false;
 	int ret;
 	u64 bytenr;
@@ -3890,8 +3893,8 @@ static int wait_dev_supers(struct btrfs_device *device, int max_mirrors)
 	}

 	/* log error, force error return */
-	if (primary_failed) {
-		btrfs_err(device->fs_info, "error writing primary super block to device %llu",
+	if (primary_failed || (!allow_super_failure && errors)) {
+		btrfs_err(device->fs_info, "error writing super block to device %llu",
 			  device->devid);
 		return -1;
 	}
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 19f9a444bcd8..2dff41cb463d 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -685,6 +685,9 @@ struct btrfs_fs_info {
 	struct btrfs_work qgroup_rescan_work;
 	/* Protected by qgroup_rescan_lock */
 	bool qgroup_rescan_running;
+
+	/* If we allow backup superblocks writeback to fail. */
+	bool allow_backup_super_failure;
 	u8 qgroup_drop_subtree_thres;

 	/*
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 8b75e974f30b..852090622a76 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -614,12 +614,49 @@ static const struct attribute *discard_attrs[] = {

 #ifdef CONFIG_BTRFS_DEBUG

+static ssize_t allow_backup_super_failure_show(struct kobject *debug_kobj,
+					       struct kobj_attribute *a,
+					       char *buf)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(debug_kobj->parent);
+
+	ASSERT(fs_info);
+	return sysfs_emit(buf, "%d\n",
+			  READ_ONCE(fs_info->allow_backup_super_failure));
+}
+
+static ssize_t allow_backup_super_failure_store(struct kobject *debug_kobj,
+						struct kobj_attribute *a,
+						const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(debug_kobj->parent);
+	s8 new_number;
+	int ret;
+
+	ASSERT(fs_info);
+
+	ret = kstrtos8(buf, 10, &new_number);
+	if (ret)
+		return -EINVAL;
+	WRITE_ONCE(fs_info->allow_backup_super_failure, !!new_number);
+	return len;
+}
+BTRFS_ATTR_RW(debug, allow_backup_super_failure, allow_backup_super_failure_show,
+	      allow_backup_super_failure_store);
+
 /*
  * Per-filesystem runtime debugging exported via sysfs.
  *
  * Path: /sys/fs/btrfs/UUID/debug/
+ *
+ * - allow_backup_super_failure
+ *   RW, binary (0/1), determines if we allow backup superblock writeback to fail.
+ *
+ *   NOTE: Even with this set to 0, btrfs may still allow some errors to
+ *   happen as btrfs can tolerate up to "rw_devs - 1" failures.
  */
 static const struct attribute *btrfs_debug_mount_attrs[] = {
+	BTRFS_ATTR_PTR(debug, allow_backup_super_failure),
 	NULL,
 };

From patchwork Sun Sep 24 06:14:13 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13396911
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 2/3] btrfs: introduce super_failure_tolerance sysfs interface
Date: Sun, 24 Sep 2023 15:44:13 +0930
Message-ID: <9cc262a52ddb23a7948c8338b660449ec8598914.1695535440.git.wqu@suse.com>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Currently btrfs has a questionable tolerance for how many devices can
fail their super block writeback: it allows "num_devices - 1" to fail.

This can already be problematic for multi-device filesystems, but
unfortunately I don't have anything better for now.

Instead, this patch allows debug builds to configure the tolerance
through a new sysfs interface:

  /sys/fs/btrfs//debug/super_failure_tolerance

The value is an s8. For values >= 0, it is the tolerance itself.
E.g. if the value is 0, no device is allowed to fail its super block
writeback.
If the value is 2 and the fs only has 2 devices, all devices are allowed
to fail their super block writeback (aka, very dangerous).

If the value is negative, the tolerance is num_devices plus the value.
If the absolute value is >= num_devices, all devices are allowed to
fail.
E.g. if the value is -1 (default) and we have 2 devices, the tolerance
is 1 (at most one device can fail).
If the value is -2 and we have 1 device, all devices are allowed to fail
(again, very dangerous).
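
The rule above can be modelled in userspace as follows. This is only a
sketch mirroring the description, not the kernel code itself:
max_super_errors() is a hypothetical name, and INT_MAX stands for "all
devices may fail":

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical userspace model of the tolerance rule.
 *
 * @num_devs:  number of devices in the filesystem
 * @tolerance: the s8 value written to super_failure_tolerance
 *
 * Returns the maximum number of devices that may fail their super
 * block writeback.
 */
int max_super_errors(int num_devs, signed char tolerance)
{
	assert(num_devs >= 0);

	/* Non-negative values are used directly. */
	if (tolerance >= 0)
		return tolerance;

	/* abs(tolerance) >= num_devs: every device may fail. */
	if (-tolerance >= num_devs)
		return INT_MAX;

	/* Otherwise the tolerance is num_devs plus the negative value. */
	return num_devs + tolerance;
}
```

For the examples above, max_super_errors(2, -1) gives 1,
max_super_errors(2, 2) gives 2, and max_super_errors(1, -2) gives
INT_MAX.
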

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c | 27 ++++++++++++++++++++++++---
 fs/btrfs/fs.h      | 18 ++++++++++++++++++
 fs/btrfs/sysfs.c   | 30 ++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index d8eb968e9e5e..062e28ac94b1 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2723,6 +2723,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	INIT_LIST_HEAD(&fs_info->allocated_ebs);
 	spin_lock_init(&fs_info->eb_leak_lock);
 	fs_info->allow_backup_super_failure = true;
+	fs_info->super_failure_tolerance = -1;
 #endif
 	extent_map_tree_init(&fs_info->mapping_tree);
 	btrfs_init_block_rsv(&fs_info->global_block_rsv,
@@ -4033,6 +4034,26 @@ int btrfs_get_num_tolerated_disk_barrier_failures(u64 flags)
 	return min_tolerated;
 }

+static int calculate_max_super_errors(struct btrfs_fs_info *fs_info)
+{
+	int num_devs = btrfs_super_num_devices(fs_info->super_copy);
+	int tolerance_value = READ_ONCE(fs_info->super_failure_tolerance);
+
+	if (tolerance_value >= 0)
+		return tolerance_value;
+
+	ASSERT(num_devs >= 0);
+
+	/*
+	 * Now tolerance_value is negative, check if
+	 * abs(@tolerance_value) is >= @num_devs. If so we allow all devices
+	 * to fail.
+	 */
+	if (-tolerance_value >= num_devs)
+		return INT_MAX;
+	return num_devs + tolerance_value;
+}
+
 int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)
 {
 	struct list_head *head;
@@ -4060,7 +4081,7 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)

 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
 	head = &fs_info->fs_devices->devices;
-	max_errors = btrfs_super_num_devices(fs_info->super_copy) - 1;
+	max_errors = calculate_max_super_errors(fs_info);

 	if (do_barriers) {
 		ret = barrier_all_devices(fs_info);
@@ -4138,8 +4159,8 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 	if (total_errors > max_errors) {
 		btrfs_handle_fs_error(fs_info, -EIO,
-				      "%d errors while writing supers",
-				      total_errors);
+				      "failed to write supers: errors %d tolerance %d",
+				      total_errors, max_errors);
 		return -EIO;
 	}
 	return 0;
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 2dff41cb463d..7608a1cf612f 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -688,6 +688,24 @@ struct btrfs_fs_info {

 	/* If we allow backup superblocks writeback to fail. */
 	bool allow_backup_super_failure;
+
+	/*
+	 * Tolerance on how many devices can fail their superblock writeback.
+	 *
+	 * If the value >= 0, then the value itself is the tolerance.
+	 * If the value < 0, then (num_devices + value) is the tolerance.
+	 *
+	 * Default value is -1.
+	 *
+	 * E.g. 0 means we do not allow any device to fail its super block writeback.
+	 *
+	 * If there are 3 devices and the value is -1, then it means we allow up to 2
+	 * devices to fail their super block writeback.
+	 *
+	 * If there are 3 devices and the value is -3 or -4, we would allow all devices
+	 * to fail their super block writeback, which can be very DANGEROUS!
+	 */
+	s8 super_failure_tolerance;
 	u8 qgroup_drop_subtree_thres;

 	/*
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 852090622a76..bd9f574c2471 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -644,6 +644,35 @@ static ssize_t allow_backup_super_failure_store(struct kobject *debug_kobj,
 BTRFS_ATTR_RW(debug, allow_backup_super_failure, allow_backup_super_failure_show,
 	      allow_backup_super_failure_store);

+static ssize_t super_failure_tolerance_show(struct kobject *debug_kobj,
+					    struct kobj_attribute *a,
+					    char *buf)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(debug_kobj->parent);
+
+	ASSERT(fs_info);
+	return sysfs_emit(buf, "%d\n",
+			  READ_ONCE(fs_info->super_failure_tolerance));
+}
+
+static ssize_t super_failure_tolerance_store(struct kobject *debug_kobj,
+					     struct kobj_attribute *a,
+					     const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(debug_kobj->parent);
+	s8 new_number;
+	int ret;
+
+	ASSERT(fs_info);
+
+	ret = kstrtos8(buf, 10, &new_number);
+	if (ret)
+		return -EINVAL;
+	WRITE_ONCE(fs_info->super_failure_tolerance, new_number);
+	return len;
+}
+BTRFS_ATTR_RW(debug, super_failure_tolerance, super_failure_tolerance_show,
+	      super_failure_tolerance_store);
 /*
  * Per-filesystem runtime debugging exported via sysfs.
  *
@@ -657,6 +686,7 @@ BTRFS_ATTR_RW(debug, allow_backup_super_failure, allow_backup_super_failure_show
  */
 static const struct attribute *btrfs_debug_mount_attrs[] = {
 	BTRFS_ATTR_PTR(debug, allow_backup_super_failure),
+	BTRFS_ATTR_PTR(debug, super_failure_tolerance),
 	NULL,
 };

From patchwork Sun Sep 24 06:14:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13396912
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 3/3] btrfs: introduce allow_data_failure sysfs interface
Date: Sun, 24 Sep 2023 15:44:14 +0930
Message-ID: <9cb5abd136ffd38b357f8acd3f939ea33ee36e42.1695535440.git.wqu@suse.com>
X-Mailer: git-send-email 2.42.0
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Currently, if btrfs fails to write data blocks, it does not really cause
any great damage; mostly the involved writeback callers, such as fsync()
or direct IO on that inode, get -EIO.

Normally that's not a big deal, but it can be an indicator of a bigger
problem (e.g. unreliable hardware).

Thus this patch allows debug builds to toggle whether any data writeback
failure is allowed:

  /sys/fs/btrfs//debug/allow_data_failure

The entry is read-write. 0 means the fs will not tolerate any data
writeback failure and will fall back to read-only after such a failure.
The default value is 1.
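
A minimal userspace model of the behaviour described above, for
illustration only. struct fs_model and handle_data_write_result() are
hypothetical names; the patch itself calls btrfs_handle_fs_error()
rather than setting a flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant filesystem state. */
struct fs_model {
	bool allow_data_failure;	/* the sysfs knob, default true */
	bool read_only;			/* set once the fs gives up */
};

/*
 * Model of a data write completion: the error is always passed back to
 * the caller (e.g. as -EIO from fsync()), and the filesystem is
 * additionally flipped read-only when failures are not tolerated.
 */
int handle_data_write_result(struct fs_model *fs, int error)
{
	assert(fs != NULL);

	if (error && !fs->allow_data_failure)
		fs->read_only = true;
	return error;
}
```

With allow_data_failure set to true, a failed write only returns the
error; with false, the same failure also marks the model read-only.
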

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c   |  1 +
 fs/btrfs/extent_io.c |  8 +++++++-
 fs/btrfs/fs.h        |  2 ++
 fs/btrfs/inode.c     |  9 ++++++++-
 fs/btrfs/sysfs.c     | 32 ++++++++++++++++++++++++++++++++
 5 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 062e28ac94b1..160f8f6b906d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2723,6 +2723,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	INIT_LIST_HEAD(&fs_info->allocated_ebs);
 	spin_lock_init(&fs_info->eb_leak_lock);
 	fs_info->allow_backup_super_failure = true;
+	fs_info->allow_data_failure = true;
 	fs_info->super_failure_tolerance = -1;
 #endif
 	extent_map_tree_init(&fs_info->mapping_tree);
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 5e5852a4ffb5..95725c5027de 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -483,8 +483,14 @@ static void end_bio_extent_writepage(struct btrfs_bio *bbio)
 				   bvec->bv_offset, bvec->bv_len);

 		btrfs_finish_ordered_extent(bbio->ordered, page, start, len, !error);
-		if (error)
+		if (error) {
 			mapping_set_error(page->mapping, error);
+			if (!READ_ONCE(fs_info->allow_data_failure))
+				btrfs_handle_fs_error(fs_info, -EIO,
+	"data write back failed, root %lld ino %llu fileoff %llu",
+					BTRFS_I(inode)->root->root_key.objectid,
+					btrfs_ino(BTRFS_I(inode)), start);
+		}
 		btrfs_page_clear_writeback(fs_info, page, start, len);
 	}

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index 7608a1cf612f..fa26ae33a29d 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -689,6 +689,8 @@ struct btrfs_fs_info {
 	/* If we allow backup superblocks writeback to fail. */
 	bool allow_backup_super_failure;
+	/* If we allow data writeback to fail. */
+	bool allow_data_failure;

 	/*
 	 * Tolerance on how many devices can fail their superblock writeback.
 	 *
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 514d2e8a4f52..4388eeced1bf 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7703,13 +7703,20 @@ static void btrfs_dio_end_io(struct btrfs_bio *bbio)
 	struct btrfs_dio_private *dip =
 		container_of(bbio, struct btrfs_dio_private, bbio);
 	struct btrfs_inode *inode = bbio->inode;
+	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct bio *bio = &bbio->bio;

 	if (bio->bi_status) {
-		btrfs_warn(inode->root->fs_info,
+		btrfs_warn(fs_info,
 		"direct IO failed ino %llu op 0x%0x offset %#llx len %u err no %d",
 			   btrfs_ino(inode), bio->bi_opf,
 			   dip->file_offset, dip->bytes, bio->bi_status);
+		if (!READ_ONCE(fs_info->allow_data_failure))
+			btrfs_handle_fs_error(fs_info, -EIO,
+"direct IO data write back failed, root %lld ino %llu fileoff %llu len %u",
+				inode->root->root_key.objectid,
+				btrfs_ino(inode), dip->file_offset,
+				dip->bytes);
 	}

 	if (btrfs_op(bio) == BTRFS_MAP_WRITE) {
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index bd9f574c2471..a32a7b2d1b7a 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -673,6 +673,37 @@ static ssize_t super_failure_tolerance_store(struct kobject *debug_kobj,
 }
 BTRFS_ATTR_RW(debug, super_failure_tolerance, super_failure_tolerance_show,
 	      super_failure_tolerance_store);
+
+static ssize_t allow_data_failure_show(struct kobject *debug_kobj,
+				       struct kobj_attribute *a,
+				       char *buf)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(debug_kobj->parent);
+
+	ASSERT(fs_info);
+	return sysfs_emit(buf, "%d\n",
+			  READ_ONCE(fs_info->allow_data_failure));
+}
+
+static ssize_t allow_data_failure_store(struct kobject *debug_kobj,
+					struct kobj_attribute *a,
+					const char *buf, size_t len)
+{
+	struct btrfs_fs_info *fs_info = to_fs_info(debug_kobj->parent);
+	s8 new_number;
+	int ret;
+
+	ASSERT(fs_info);
+
+	ret = kstrtos8(buf, 10, &new_number);
+	if (ret)
+		return -EINVAL;
+	WRITE_ONCE(fs_info->allow_data_failure, !!new_number);
+	return len;
+}
+BTRFS_ATTR_RW(debug, allow_data_failure, allow_data_failure_show,
+	      allow_data_failure_store);
+
 /*
  * Per-filesystem runtime debugging exported via sysfs.
  *
@@ -686,6 +717,7 @@ BTRFS_ATTR_RW(debug, super_failure_tolerance, super_failure_tolerance_show,
  */
 static const struct attribute *btrfs_debug_mount_attrs[] = {
 	BTRFS_ATTR_PTR(debug, allow_backup_super_failure),
+	BTRFS_ATTR_PTR(debug, allow_data_failure),
 	BTRFS_ATTR_PTR(debug, super_failure_tolerance),
 	NULL,
 };