From patchwork Sat Apr 2 01:30:51 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anand Jain
X-Patchwork-Id: 8730061
Return-Path:
X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id BB764C0553
	for ; Sat, 2 Apr 2016 01:31:45 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id E2BEC2039D
	for ; Sat, 2 Apr 2016 01:31:44 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 038152038A
	for ; Sat, 2 Apr 2016 01:31:44 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932417AbcDBBbe (ORCPT ); Fri, 1 Apr 2016 21:31:34 -0400
Received: from aserp1040.oracle.com ([141.146.126.69]:23952 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932415AbcDBBbc (ORCPT );
	Fri, 1 Apr 2016 21:31:32 -0400
Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74])
	by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2)
	with ESMTP id u321VTxk028801
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
	Sat, 2 Apr 2016 01:31:30 GMT
Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235])
	by userv0022.oracle.com (8.14.4/8.13.8) with ESMTP id u321VSUR027329
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Sat, 2 Apr 2016 01:31:29 GMT
Received: from abhmp0019.oracle.com (abhmp0019.oracle.com [141.146.116.25])
	by aserv0121.oracle.com (8.13.8/8.13.8) with ESMTP id u321VSv7021521;
	Sat, 2 Apr 2016 01:31:28 GMT
Received: from arch2.localdomain (/42.60.24.64)
	by default (Oracle Beehive Gateway v4.0) with ESMTP ;
	Fri, 01 Apr 2016 18:31:28 -0700
From: Anand Jain
To:
	linux-btrfs@vger.kernel.org
Cc: yauhen.kharuzhy@zavadatar.com, dsterba@suse.cz
Subject: [PATCH 13/13] btrfs: check for failed device and hot replace
Date: Sat, 2 Apr 2016 09:30:51 +0800
Message-Id: <1459560651-14809-14-git-send-email-anand.jain@oracle.com>
X-Mailer: git-send-email 2.7.0
In-Reply-To: <1459560651-14809-1-git-send-email-anand.jain@oracle.com>
References: <1459560651-14809-1-git-send-email-anand.jain@oracle.com>
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

This patch checks for a failed device and kicks off auto replace.
If the user wants to disable auto replace, that can be done through
a future sysfs or ioctl interface that sets the
fs_info->no_auto_replace parameter to 1.

Signed-off-by: Anand Jain
Tested-by: Austin S. Hemmelgarn
---
 fs/btrfs/ctree.h   |  2 ++
 fs/btrfs/disk-io.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 47e9cd9dd29a..67bb36bb82ee 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1862,6 +1862,8 @@ struct btrfs_fs_info {
 	struct list_head pinned_chunks;
 
 	int creating_free_space_tree;
+
+	int no_auto_replace;
 };
 
 struct btrfs_subvolume_writers {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b523e56b34e9..f205e7e94948 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1869,6 +1869,38 @@ sleep:
 	return 0;
 }
 
+static int btrfs_recuperate(struct btrfs_root *root)
+{
+	int ret = 0;
+	int found = 0;
+	struct btrfs_device *device;
+	struct btrfs_fs_devices *fs_devices;
+
+	fs_devices = root->fs_info->fs_devices;
+
+	mutex_lock(&fs_devices->device_list_mutex);
+	rcu_read_lock();
+	list_for_each_entry_rcu(device,
+			&fs_devices->devices, dev_list) {
+		if (device->failed) {
+			found = 1;
+			break;
+		}
+	}
+	rcu_read_unlock();
+	mutex_unlock(&fs_devices->device_list_mutex);
+
+	/*
+	 * We are using the replace code which should be interrupt-able
+	 * during unmount, and as of now there is no user land stop
+	 * request that we support and this will run until it is complete
+	 */
+	if (found && !root->fs_info->no_auto_replace)
+		ret = btrfs_auto_replace_start(root, device);
+
+	return ret;
+}
+
 /*
  * returns:
  * < 0 : Check didn't run, std error
@@ -1944,6 +1976,8 @@ static int health_kthread(void *arg)
 
 		/* Check devices health */
 		btrfs_update_devices_health(root);
+		btrfs_recuperate(root);
+
 		mutex_unlock(&root->fs_info->health_mutex);
 
 sleep: