From patchwork Mon Aug 26 21:29:57 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11115657
Subject: [PATCH 01/11] xfs_scrub: fix handling of read-verify pool runtime errors
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Mon, 26 Aug 2019 14:29:57 -0700
Message-ID: <156685499722.2841898.17281881491093468208.stgit@magnolia>
In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia>
References: <156685499099.2841898.18430382226915450537.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Fix some bogosity with how we handle runtime errors in the read verify
pool functions. First of all, memory allocation failures shouldn't be
recorded as disk IO errors; we should just complain and abort the phase.
Second, we need to collect any other runtime errors in the IO thread and
abort the phase instead of silently ignoring them.

Signed-off-by: Darrick J. Wong
---
scrub/read_verify.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 425342b4..573bc4e0 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -53,6 +53,7 @@ struct read_verify_pool { struct disk *disk; /* which disk? */ read_verify_ioerr_fn_t ioerr_fn; /* io error callback */ size_t miniosz; /* minimum io size, bytes */ + int errors_seen; }; /* @@ -91,6 +92,7 @@ read_verify_pool_init( rvp->ctx = ctx; rvp->disk = disk; rvp->ioerr_fn = ioerr_fn; + rvp->errors_seen = false; error = ptvar_alloc(submitter_threads, sizeof(struct read_verify), &rvp->rvstate); if (error) @@ -149,6 +151,7 @@ read_verify( unsigned long long verified = 0; ssize_t sz; ssize_t len; + int ret; rvp = (struct read_verify_pool *)wq->wq_ctx; while (rv->io_length > 0) { @@ -173,7 +176,12 @@ read_verify( } free(rv); - ptcounter_add(rvp->verified_bytes, verified); + ret = ptcounter_add(rvp->verified_bytes, verified); + if (ret) { + str_liberror(rvp->ctx, ret, + _("updating bytes verified counter")); + rvp->errors_seen = true; + } } /* Queue a read verify request. */ @@ -188,18 +196,25 @@ read_verify_queue( dbg_printf("verify fd %d start %"PRIu64" len %"PRIu64"\n", rvp->disk->d_fd, rv->io_start, rv->io_length); + /* Worker thread saw a runtime error, don't queue more.
*/ + if (rvp->errors_seen) + return false; + + /* Otherwise clone the request and queue the copy. */ tmp = malloc(sizeof(struct read_verify)); if (!tmp) { - rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, - rv->io_length, errno, rv->io_end_arg); - return true; + str_errno(rvp->ctx, _("allocating read-verify request")); + rvp->errors_seen = true; + return false; } + memcpy(tmp, rv, sizeof(*tmp)); ret = workqueue_add(&rvp->wq, read_verify, 0, tmp); if (ret) { str_liberror(rvp->ctx, ret, _("queueing read-verify work")); free(tmp); + rvp->errors_seen = true; return false; } rv->io_length = 0;

From patchwork Mon Aug 26 21:30:03 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11115659
Subject: [PATCH 02/11] xfs_scrub: abort all read verification work immediately on error
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Mon, 26 Aug 2019 14:30:03 -0700
Message-ID: <156685500339.2841898.8444017253685790369.stgit@magnolia>
In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia>
References: <156685499099.2841898.18430382226915450537.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Add a new abort function to the read verify pool code so that the caller
can immediately abort all pending verification work if things start
going wrong. There's no point in waiting for queued work to run if we've
already decided to bail.

Signed-off-by: Darrick J. Wong
---
scrub/phase6.c | 6 +++---
scrub/read_verify.c | 10 ++++++++++
scrub/read_verify.h | 1 +
3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/scrub/phase6.c b/scrub/phase6.c index 35dda1f9..00b13d34 100644 --- a/scrub/phase6.c +++ b/scrub/phase6.c @@ -514,16 +514,16 @@ _("Could not create data device media verifier.")); out_rtpool: if (vs.rvp_realtime) { - read_verify_pool_flush(vs.rvp_realtime); + read_verify_pool_abort(vs.rvp_realtime); read_verify_pool_destroy(vs.rvp_realtime); } out_logpool: if (vs.rvp_log) { - read_verify_pool_flush(vs.rvp_log); + read_verify_pool_abort(vs.rvp_log); read_verify_pool_destroy(vs.rvp_log); } out_datapool: - read_verify_pool_flush(vs.rvp_data); + read_verify_pool_abort(vs.rvp_data); read_verify_pool_destroy(vs.rvp_data); out_rbad: bitmap_free(&vs.r_bad);
diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 573bc4e0..82d4a16a 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -117,6 +117,16 @@ read_verify_pool_init( return NULL; } +/* Abort all verification work. */ +void +read_verify_pool_abort( + struct read_verify_pool *rvp) +{ + if (!rvp->errors_seen) + rvp->errors_seen = ECANCELED; + workqueue_terminate(&rvp->wq); +} + /* Finish up any read verification work. */ void read_verify_pool_flush(
diff --git a/scrub/read_verify.h b/scrub/read_verify.h index 5fabe5e0..f0ed8902 100644 --- a/scrub/read_verify.h +++ b/scrub/read_verify.h @@ -19,6 +19,7 @@ struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx, struct disk *disk, size_t miniosz, read_verify_ioerr_fn_t ioerr_fn, unsigned int submitter_threads); +void read_verify_pool_abort(struct read_verify_pool *rvp); void read_verify_pool_flush(struct read_verify_pool *rvp); void read_verify_pool_destroy(struct read_verify_pool *rvp);

From patchwork Mon Aug 26 21:30:09 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 11115661
Subject: [PATCH 03/11] xfs_scrub: fix read-verify pool error communication problems
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Mon, 26 Aug 2019 14:30:09 -0700
Message-ID: <156685500962.2841898.14630278226060711525.stgit@magnolia>
In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia>
References: <156685499099.2841898.18430382226915450537.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Fix all the places in the read-verify pool functions where we either
fail to check for runtime errors or fail to communicate them properly to
callers. Then fix all the callers to report the error messages instead
of hiding them.

Signed-off-by: Darrick J. Wong
---
scrub/phase6.c | 89 ++++++++++++++++++++++++++++++++++++---------------
scrub/read_verify.c | 87 ++++++++++++++++++++++----------------------------
scrub/read_verify.h | 16 +++++----
3 files changed, 109 insertions(+), 83 deletions(-)

diff --git a/scrub/phase6.c b/scrub/phase6.c index 00b13d34..010cceef 100644 --- a/scrub/phase6.c +++ b/scrub/phase6.c @@ -387,6 +387,7 @@ xfs_check_rmap( { struct media_verify_state *vs = arg; struct read_verify_pool *rvp; + int ret; rvp = xfs_dev_to_pool(ctx, vs, map->fmr_device); @@ -415,28 +416,48 @@ xfs_check_rmap( /* XXX: Filter out directory data blocks. */ /* Schedule the read verify command for (eventual) running. */ - read_verify_schedule_io(rvp, map->fmr_physical, map->fmr_length, vs); + ret = read_verify_schedule_io(rvp, map->fmr_physical, map->fmr_length, + vs); + if (ret) { + str_liberror(ctx, ret, descr); + return false; + } out: /* Is this the last extent? Fire off the read. */ - if (map->fmr_flags & FMR_OF_LAST) - read_verify_force_io(rvp); + if (map->fmr_flags & FMR_OF_LAST) { + ret = read_verify_force_io(rvp); + if (ret) { + str_liberror(ctx, ret, descr); + return false; + } + } return true; } /* Wait for read/verify actions to finish, then return # bytes checked.
*/ -static uint64_t +static int clean_pool( - struct read_verify_pool *rvp) + struct read_verify_pool *rvp, + unsigned long long *bytes_checked) { - uint64_t ret; + uint64_t pool_checked; + int ret; if (!rvp) return 0; - read_verify_pool_flush(rvp); - ret = read_verify_bytes(rvp); + ret = read_verify_pool_flush(rvp); + if (ret) + goto out_destroy; + + ret = read_verify_bytes(rvp, &pool_checked); + if (ret) + goto out_destroy; + + *bytes_checked += pool_checked; +out_destroy: read_verify_pool_destroy(rvp); return ret; } @@ -469,43 +490,57 @@ xfs_scan_blocks( goto out_dbad; } - vs.rvp_data = read_verify_pool_init(ctx, ctx->datadev, + ret = read_verify_pool_alloc(ctx, ctx->datadev, ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr, - scrub_nproc(ctx)); - if (!vs.rvp_data) { - str_info(ctx, ctx->mntpoint, -_("Could not create data device media verifier.")); + scrub_nproc(ctx), &vs.rvp_data); + if (ret) { + str_liberror(ctx, ret, _("creating datadev media verifier")); goto out_rbad; } if (ctx->logdev) { - vs.rvp_log = read_verify_pool_init(ctx, ctx->logdev, + ret = read_verify_pool_alloc(ctx, ctx->logdev, ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr, - scrub_nproc(ctx)); - if (!vs.rvp_log) { - str_info(ctx, ctx->mntpoint, - _("Could not create log device media verifier.")); + scrub_nproc(ctx), &vs.rvp_log); + if (ret) { + str_liberror(ctx, ret, + _("creating logdev media verifier")); goto out_datapool; } } if (ctx->rtdev) { - vs.rvp_realtime = read_verify_pool_init(ctx, ctx->rtdev, + ret = read_verify_pool_alloc(ctx, ctx->rtdev, ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr, - scrub_nproc(ctx)); - if (!vs.rvp_realtime) { - str_info(ctx, ctx->mntpoint, - _("Could not create realtime device media verifier.")); + scrub_nproc(ctx), &vs.rvp_realtime); + if (ret) { + str_liberror(ctx, ret, + _("creating rtdev media verifier")); goto out_logpool; } } moveon = xfs_scan_all_spacemaps(ctx, xfs_check_rmap, &vs); if (!moveon) goto out_rtpool; - ctx->bytes_checked += 
clean_pool(vs.rvp_data); - ctx->bytes_checked += clean_pool(vs.rvp_log); - ctx->bytes_checked += clean_pool(vs.rvp_realtime); + + ret = clean_pool(vs.rvp_data, &ctx->bytes_checked); + if (ret) { + str_liberror(ctx, ret, _("flushing datadev verify pool")); + moveon = false; + } + + ret = clean_pool(vs.rvp_log, &ctx->bytes_checked); + if (ret) { + str_liberror(ctx, ret, _("flushing logdev verify pool")); + moveon = false; + } + + ret = clean_pool(vs.rvp_realtime, &ctx->bytes_checked); + if (ret) { + str_liberror(ctx, ret, _("flushing rtdev verify pool")); + moveon = false; + } /* Scan the whole dir tree to see what matches the bad extents. */ - if (!bitmap_empty(vs.d_bad) || !bitmap_empty(vs.r_bad)) + if (moveon && (!bitmap_empty(vs.d_bad) || !bitmap_empty(vs.r_bad))) moveon = xfs_report_verify_errors(ctx, &vs); bitmap_free(&vs.r_bad); diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 82d4a16a..a964d2cf 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -65,37 +65,37 @@ struct read_verify_pool { * @submitter_threads is the number of threads that may be sending verify * requests at any given time. 
*/ -struct read_verify_pool * -read_verify_pool_init( +int +read_verify_pool_alloc( struct scrub_ctx *ctx, struct disk *disk, size_t miniosz, read_verify_ioerr_fn_t ioerr_fn, - unsigned int submitter_threads) + unsigned int submitter_threads, + struct read_verify_pool **prvp) { struct read_verify_pool *rvp; - bool ret; - int error; + int ret; rvp = calloc(1, sizeof(struct read_verify_pool)); if (!rvp) - return NULL; + return errno; - error = posix_memalign((void **)&rvp->readbuf, page_size, + ret = posix_memalign((void **)&rvp->readbuf, page_size, RVP_IO_MAX_SIZE); - if (error || !rvp->readbuf) + if (ret) goto out_free; - error = ptcounter_alloc(nproc, &rvp->verified_bytes); - if (error) + ret = ptcounter_alloc(nproc, &rvp->verified_bytes); + if (ret) goto out_buf; rvp->miniosz = miniosz; rvp->ctx = ctx; rvp->disk = disk; rvp->ioerr_fn = ioerr_fn; - rvp->errors_seen = false; - error = ptvar_alloc(submitter_threads, sizeof(struct read_verify), + rvp->errors_seen = 0; + ret = ptvar_alloc(submitter_threads, sizeof(struct read_verify), &rvp->rvstate); - if (error) + if (ret) goto out_counter; /* Run in the main thread if we only want one thread. */ if (nproc == 1) @@ -104,7 +104,8 @@ read_verify_pool_init( disk_heads(disk)); if (ret) goto out_rvstate; - return rvp; + *prvp = rvp; + return 0; out_rvstate: ptvar_free(rvp->rvstate); @@ -114,7 +115,7 @@ read_verify_pool_init( free(rvp->readbuf); out_free: free(rvp); - return NULL; + return ret; } /* Abort all verification work. */ @@ -128,11 +129,11 @@ read_verify_pool_abort( } /* Finish up any read verification work. */ -void +int read_verify_pool_flush( struct read_verify_pool *rvp) { - workqueue_terminate(&rvp->wq); + return workqueue_terminate(&rvp->wq); } /* Finish up any read verification work and tear it down. 
*/ @@ -187,15 +188,12 @@ read_verify( free(rv); ret = ptcounter_add(rvp->verified_bytes, verified); - if (ret) { - str_liberror(rvp->ctx, ret, - _("updating bytes verified counter")); - rvp->errors_seen = true; - } + if (ret) + rvp->errors_seen = ret; } /* Queue a read verify request. */ -static bool +static int read_verify_queue( struct read_verify_pool *rvp, struct read_verify *rv) @@ -208,34 +206,33 @@ read_verify_queue( /* Worker thread saw a runtime error, don't queue more. */ if (rvp->errors_seen) - return false; + return rvp->errors_seen; /* Otherwise clone the request and queue the copy. */ tmp = malloc(sizeof(struct read_verify)); if (!tmp) { - str_errno(rvp->ctx, _("allocating read-verify request")); - rvp->errors_seen = true; - return false; + rvp->errors_seen = errno; + return errno; } memcpy(tmp, rv, sizeof(*tmp)); ret = workqueue_add(&rvp->wq, read_verify, 0, tmp); if (ret) { - str_liberror(rvp->ctx, ret, _("queueing read-verify work")); free(tmp); - rvp->errors_seen = true; - return false; + rvp->errors_seen = ret; + return ret; } + rv->io_length = 0; - return true; + return 0; } /* * Issue an IO request. We'll batch subsequent requests if they're * within 64k of each other */ -bool +int read_verify_schedule_io( struct read_verify_pool *rvp, uint64_t start, @@ -250,7 +247,7 @@ read_verify_schedule_io( assert(rvp->readbuf); rv = ptvar_get(rvp->rvstate, &ret); if (ret) - return false; + return ret; req_end = start + length; rv_end = rv->io_start + rv->io_length; @@ -277,38 +274,32 @@ read_verify_schedule_io( rv->io_end_arg = end_arg; } - return true; + return 0; } /* Force any stashed IOs into the verifier. 
*/ -bool +int read_verify_force_io( struct read_verify_pool *rvp) { struct read_verify *rv; - bool moveon; int ret; assert(rvp->readbuf); rv = ptvar_get(rvp->rvstate, &ret); if (ret) - return false; + return ret; if (rv->io_length == 0) - return true; + return 0; - moveon = read_verify_queue(rvp, rv); - if (moveon) - rv->io_length = 0; - return moveon; + return read_verify_queue(rvp, rv); } /* How many bytes has this process verified? */ -uint64_t +int read_verify_bytes( - struct read_verify_pool *rvp) + struct read_verify_pool *rvp, + uint64_t *bytes_checked) { - uint64_t ret; - - ptcounter_value(rvp->verified_bytes, &ret); - return ret; + return ptcounter_value(rvp->verified_bytes, bytes_checked); } diff --git a/scrub/read_verify.h b/scrub/read_verify.h index f0ed8902..650c46d4 100644 --- a/scrub/read_verify.h +++ b/scrub/read_verify.h @@ -15,17 +15,17 @@ typedef void (*read_verify_ioerr_fn_t)(struct scrub_ctx *ctx, struct disk *disk, uint64_t start, uint64_t length, int error, void *arg); -struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx, - struct disk *disk, size_t miniosz, - read_verify_ioerr_fn_t ioerr_fn, - unsigned int submitter_threads); +int read_verify_pool_alloc(struct scrub_ctx *ctx, struct disk *disk, + size_t miniosz, read_verify_ioerr_fn_t ioerr_fn, + unsigned int submitter_threads, + struct read_verify_pool **prvp); void read_verify_pool_abort(struct read_verify_pool *rvp); -void read_verify_pool_flush(struct read_verify_pool *rvp); +int read_verify_pool_flush(struct read_verify_pool *rvp); void read_verify_pool_destroy(struct read_verify_pool *rvp); -bool read_verify_schedule_io(struct read_verify_pool *rvp, uint64_t start, +int read_verify_schedule_io(struct read_verify_pool *rvp, uint64_t start, uint64_t length, void *end_arg); -bool read_verify_force_io(struct read_verify_pool *rvp); -uint64_t read_verify_bytes(struct read_verify_pool *rvp); +int read_verify_force_io(struct read_verify_pool *rvp); +int 
read_verify_bytes(struct read_verify_pool *rvp, uint64_t *bytes); #endif /* XFS_SCRUB_READ_VERIFY_H_ */

From patchwork Mon Aug 26 21:30:15 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11115663
Subject: [PATCH 04/11] xfs_scrub: fix queue-and-stash of non-contiguous verify requests
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Mon, 26 Aug 2019 14:30:15 -0700
Message-ID: <156685501582.2841898.903114008316471296.stgit@magnolia>
In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia>
References: <156685499099.2841898.18430382226915450537.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

read_verify_schedule_io is supposed to have the ability to decide that a
retained aggregate extent verification request is not sufficiently
contiguous with the request that is being scheduled, and therefore it
needs to queue the retained request and use the new request to start
building a new aggregate request.

Unfortunately, it stupidly returns after queueing the IO, so we lose the
incoming request. Fix the code so that we only return early if there's a
runtime error.

Signed-off-by: Darrick J. Wong
---
scrub/read_verify.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/scrub/read_verify.c b/scrub/read_verify.c index a964d2cf..3bc56bdc 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -265,8 +265,13 @@ read_verify_schedule_io( rv->io_length = max(req_end, rv_end) - rv->io_start; } else { /* Otherwise, issue the stashed IO (if there is one) */ - if (rv->io_length > 0) - return read_verify_queue(rvp, rv); + if (rv->io_length > 0) { + int res; + + res = read_verify_queue(rvp, rv); + if (res) + return res; + } /* Stash the new IO. */ rv->io_start = start;

From patchwork Mon Aug 26 21:30:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 11115665 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CA7D914E5 for ; Mon, 26 Aug 2019 21:30:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A92E821881 for ; Mon, 26 Aug 2019 21:30:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="YbmAFoF1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728433AbfHZVa2 (ORCPT ); Mon, 26 Aug 2019 17:30:28 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:34042 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728402AbfHZVa2 (ORCPT ); Mon, 26 Aug 2019 17:30:28 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLDlZl000848; Mon, 26 Aug 2019 21:30:26 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=cGTv/Iq0TVJkO8gRRDBYnEryVKmR+1L1vwLi0Fq8xXA=; b=YbmAFoF1n+TV7K1z7Q97//awEEtvqS2+cFOPnXUsPbwpLvSm4DFrM8ywT02dVaAlHEO/ hm78OEL/sbt2dM61zOKD1l2R1YAmkZZA/XksyeHZpucrtc50KkBW1VtggyhiWpur2Uc3 snmmdIdw8tyE8ttgcJmXHjKpqT+TgDxQi3M/vFS3d68kyNG8wgNjn659LgXyj2BPyGvd b5LJ7gA/zpR+rWs9mjunVGRSRPKKMhkeEUiahmFBiltwJmBC6J5GR3HPoD+5l5FLeNj8 U+mbT6U4lybo8hA8y4UdCD8+Hj6ihcE38xu709yjADI5otneXJcJZJyR2OdNzrzx5bXH lQ== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by aserp2120.oracle.com with ESMTP id 2umpxx05f1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:26 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com 
(8.16.0.27/8.16.0.27) with SMTP id x7QLItF9185000; Mon, 26 Aug 2019 21:30:25 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3020.oracle.com with ESMTP id 2umj2xvxsv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:25 +0000 Received: from abhmp0012.oracle.com (abhmp0012.oracle.com [141.146.116.18]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x7QLUOso004638; Mon, 26 Aug 2019 21:30:25 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:30:23 -0700 Subject: [PATCH 05/11] xfs_scrub: only call read_verify_force_io once per pool From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Mon, 26 Aug 2019 14:30:22 -0700 Message-ID: <156685502209.2841898.17592574499659592500.stgit@magnolia> In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia> References: <156685499099.2841898.18430382226915450537.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: 
Darrick J. Wong

There's no reason we need to call read_verify_force_io every AG; we can
just let the request aggregation code do its thing and push when we're
totally done browsing the fsmap information.

Signed-off-by: Darrick J. Wong
---
 scrub/phase6.c | 16 +++++-----------
 scrub/read_verify.c | 26 +++++++++++++++++---------
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/scrub/phase6.c b/scrub/phase6.c
index 010cceef..86d848a0 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -411,7 +411,7 @@ xfs_check_rmap(
 	 */
 	if (map->fmr_flags & (FMR_OF_PREALLOC | FMR_OF_ATTR_FORK |
 			      FMR_OF_EXTENT_MAP | FMR_OF_SPECIAL_OWNER))
-		goto out;
+		return true;
 
 	/* XXX: Filter out directory data blocks. */
 
@@ -423,16 +423,6 @@ xfs_check_rmap(
 		return false;
 	}
 
-out:
-	/* Is this the last extent? Fire off the read. */
-	if (map->fmr_flags & FMR_OF_LAST) {
-		ret = read_verify_force_io(rvp);
-		if (ret) {
-			str_liberror(ctx, ret, descr);
-			return false;
-		}
-	}
-
 	return true;
 }
 
@@ -448,6 +438,10 @@ clean_pool(
 	if (!rvp)
 		return 0;
 
+	ret = read_verify_force_io(rvp);
+	if (ret)
+		return ret;
+
 	ret = read_verify_pool_flush(rvp);
 	if (ret)
 		goto out_destroy;
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 3bc56bdc..7cfe834c 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -282,22 +282,30 @@ read_verify_schedule_io(
 	return 0;
 }
 
+/* Force any per-thread stashed IOs into the verifier. */
+static int
+force_one_io(
+	struct ptvar *ptv,
+	void *data,
+	void *foreach_arg)
+{
+	struct read_verify_pool *rvp = foreach_arg;
+	struct read_verify *rv = data;
+
+	if (rv->io_length == 0)
+		return 0;
+
+	return read_verify_queue(rvp, rv);
+}
+
 /* Force any stashed IOs into the verifier. */
 int
 read_verify_force_io(
 	struct read_verify_pool *rvp)
 {
-	struct read_verify *rv;
-	int ret;
-
 	assert(rvp->readbuf);
-	rv = ptvar_get(rvp->rvstate, &ret);
-	if (ret)
-		return ret;
-	if (rv->io_length == 0)
-		return 0;
 
-	return read_verify_queue(rvp, rv);
+	return ptvar_foreach(rvp->rvstate, force_one_io, rvp);
 }
 
 /* How many bytes has this process verified? */

From patchwork Mon Aug 26 21:30:29 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11115667
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EF0971395 for ; Mon, 26 Aug 2019 21:30:36 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C3D7521881 for ; Mon, 26 Aug 2019 21:30:36 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="WK3DxMh5"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728537AbfHZVag (ORCPT ); Mon, 26 Aug 2019 17:30:36 -0400
Received: from aserp2120.oracle.com ([141.146.126.78]:34186 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728402AbfHZVag (ORCPT ); Mon, 26 Aug 2019 17:30:36 -0400
Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLDl9M000864; Mon, 26 Aug 2019 21:30:34 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=LYq1cn71QfBIPgVRv6wQtDmfV6hDom94Lr5EAinaMs4=; b=WK3DxMh5vN1UVN1fYGVEpUrQU9suZhQj5eUcY4KnyGygNt/YMFfXlgE8bl49k7wFJQfF Ydc2aX2sqxQxOkCJdcsOcD82uZXyhO5njlLoiG6fpOiDIhkEpDKvui32k8sqDCBBKrDp
3hc9IGjouxpYa4qvK9Ac+SGhkSysBNSq3kIPnf6+GEOHezHkDWroDmHZX8wLa8JX1pPs fBGjJ1RORjpgRrjVGX6SLELhAeQjnc699Et0ed5udukMgJK6HmLbdRvfly7R9u4Zi6xp 31xjXx7D26qT2MhvnGISQni04/xK+FU5f23wNYAbUdRwGEuK813qLEaTsw4k5/NtpXtK MQ== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by aserp2120.oracle.com with ESMTP id 2umpxx05fr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:34 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLIR0Q025060; Mon, 26 Aug 2019 21:30:33 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userp3030.oracle.com with ESMTP id 2umj1tk7un-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:33 +0000 Received: from abhmp0015.oracle.com (abhmp0015.oracle.com [141.146.116.21]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x7QLUW11029045; Mon, 26 Aug 2019 21:30:32 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:30:32 -0700 Subject: [PATCH 06/11] xfs_scrub: refactor inode prefix rendering code From: "Darrick J. 
Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Mon, 26 Aug 2019 14:30:29 -0700
Message-ID: <156685502969.2841898.11326261977295341282.stgit@magnolia>
In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia>
References: <156685499099.2841898.18430382226915450537.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Refactor all the places in the code where we try to render an inode
number as a prefix for some sort of status message. This will help make
message prefixes more consistent, which should help users to locate
broken metadata.

Signed-off-by: Darrick J. Wong
---
 scrub/common.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 scrub/common.h | 6 ++++++
 scrub/inodes.c | 4 ++--
 scrub/phase3.c | 8 ++------
 scrub/phase5.c | 8 ++------
 scrub/phase6.c | 6 +++---
 scrub/scrub.c | 20 +++++++++++---------
 7 files changed, 73 insertions(+), 26 deletions(-)

diff --git a/scrub/common.c b/scrub/common.c
index 1cd2b7ba..fdbbf294 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -354,3 +354,50 @@ within_range(
 
 	return true;
 }
+
+/*
+ * Render an inode number as both the raw inode number and as an AG number
+ * and AG inode pair. This is intended for use with status message reporting.
+ * If @format is not NULL, it should provide any desired leading whitespace.
+ *
+ * For example, "inode 287232352 (13/352) : "
+ */
+int
+xfs_scrub_render_ino_suffix(
+	const struct scrub_ctx *ctx,
+	char *buf,
+	size_t buflen,
+	uint64_t ino,
+	uint32_t gen,
+	const char *format,
+	...)
+{
+	va_list args;
+	uint32_t agno;
+	uint32_t agino;
+	int ret;
+
+	agno = xfrog_ino_to_agno(&ctx->mnt, ino);
+	agino = xfrog_ino_to_agino(&ctx->mnt, ino);
+	ret = snprintf(buf, buflen, _("inode %"PRIu64" (%"PRIu32"/%"PRIu32")"),
+			ino, agno, agino);
+	if (ret < 0 || ret >= buflen || format == NULL)
+		return ret;
+
+	va_start(args, format);
+	ret += vsnprintf(buf + ret, buflen - ret, format, args);
+	va_end(args);
+	return ret;
+}
+
+/* Render an inode number for message reporting with no suffix. */
+int
+xfs_scrub_render_ino(
+	const struct scrub_ctx *ctx,
+	char *buf,
+	size_t buflen,
+	uint64_t ino,
+	uint32_t gen)
+{
+	return xfs_scrub_render_ino_suffix(ctx, buf, buflen, ino, gen, NULL);
+}
diff --git a/scrub/common.h b/scrub/common.h
index 33555891..b34cb4a6 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -86,4 +86,10 @@ bool within_range(struct scrub_ctx *ctx, unsigned long long value,
 		unsigned long long desired, unsigned long long abs_threshold,
 		unsigned int n, unsigned int d, const char *descr);
 
+int xfs_scrub_render_ino_suffix(const struct scrub_ctx *ctx, char *buf,
+		size_t buflen, uint64_t ino, uint32_t gen,
+		const char *format, ...);
+int xfs_scrub_render_ino(const struct scrub_ctx *ctx, char *buf,
+		size_t buflen, uint64_t ino, uint32_t gen);
+
 #endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/inodes.c b/scrub/inodes.c
index ef12a692..8f4c5e83 100644
--- a/scrub/inodes.c
+++ b/scrub/inodes.c
@@ -156,8 +156,8 @@ xfs_iterate_inodes_ag(
 				ireq->hdr.ino = inogrp->xi_startino;
 				goto igrp_retry;
 			}
-			snprintf(idescr, DESCR_BUFSZ, "inode %"PRIu64,
-					(uint64_t)bs->bs_ino);
+			xfs_scrub_render_ino(ctx, idescr, DESCR_BUFSZ,
+					bs->bs_ino, bs->bs_gen);
 			str_info(ctx, idescr,
 	_("Changed too many times during scan; giving up."));
 			break;
diff --git a/scrub/phase3.c b/scrub/phase3.c
index 399f0e92..72e67d47 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -48,14 +48,10 @@ xfs_scrub_inode_vfs_error(
 	struct xfs_bulkstat *bstat)
 {
 	char descr[DESCR_BUFSZ];
-	xfs_agnumber_t agno;
-	xfs_agino_t agino;
 	int old_errno = errno;
 
-	agno = xfrog_ino_to_agno(&ctx->mnt, bstat->bs_ino);
-	agino = xfrog_ino_to_agino(&ctx->mnt, bstat->bs_ino);
-	snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (%u/%u)"),
-			(uint64_t)bstat->bs_ino, agno, agino);
+	xfs_scrub_render_ino(ctx, descr, DESCR_BUFSZ, bstat->bs_ino,
+			bstat->bs_gen);
 	errno = old_errno;
 	str_errno(ctx, descr);
 }
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 335b0d19..224081d5 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -234,15 +234,11 @@ xfs_scrub_connections(
 	bool *pmoveon = arg;
 	char descr[DESCR_BUFSZ];
 	bool moveon = true;
-	xfs_agnumber_t agno;
-	xfs_agino_t agino;
 	int fd = -1;
 	int error;
 
-	agno = xfrog_ino_to_agno(&ctx->mnt, bstat->bs_ino);
-	agino = xfrog_ino_to_agino(&ctx->mnt, bstat->bs_ino);
-	snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (%u/%u)"),
-			(uint64_t)bstat->bs_ino, agno, agino);
+	xfs_scrub_render_ino(ctx, descr, DESCR_BUFSZ, bstat->bs_ino,
+			bstat->bs_gen);
 	background_sleep();
 
 	/* Warn about naming problems in xattrs. */
diff --git a/scrub/phase6.c b/scrub/phase6.c
index 86d848a0..fec25c31 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -180,15 +180,15 @@ xfs_report_verify_inode(
 	int fd;
 	int error;
 
-	snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (unlinked)"),
-			(uint64_t)bstat->bs_ino);
-
 	/* Ignore linked files and things we can't open. */
 	if (bstat->bs_nlink != 0)
 		return 0;
 	if (!S_ISREG(bstat->bs_mode) && !S_ISDIR(bstat->bs_mode))
 		return 0;
 
+	xfs_scrub_render_ino_suffix(ctx, descr, DESCR_BUFSZ,
+			bstat->bs_ino, bstat->bs_gen, _(" (unlinked)"));
+
 	/* Try to open the inode. */
 	fd = xfs_open_handle(handle);
 	if (fd < 0) {
diff --git a/scrub/scrub.c b/scrub/scrub.c
index a428b524..82beb7ad 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -92,24 +92,26 @@ static const struct scrub_descr scrubbers[XFS_SCRUB_TYPE_NR] = {
 /* Format a scrub description. */
 static void
 format_scrub_descr(
+	struct scrub_ctx *ctx,
 	char *buf,
 	size_t buflen,
-	struct xfs_scrub_metadata *meta,
-	const struct scrub_descr *sc)
+	struct xfs_scrub_metadata *meta)
 {
-	switch (sc->type) {
+	const struct scrub_descr *sd = &scrubbers[meta->sm_type];
+
+	switch (sd->type) {
 	case ST_AGHEADER:
 	case ST_PERAG:
 		snprintf(buf, buflen, _("AG %u %s"), meta->sm_agno,
-				_(sc->name));
+				_(sd->name));
 		break;
 	case ST_INODE:
-		snprintf(buf, buflen, _("Inode %"PRIu64" %s"),
-				(uint64_t)meta->sm_ino, _(sc->name));
+		xfs_scrub_render_ino_suffix(ctx, buf, buflen,
+				meta->sm_ino, meta->sm_gen, " %s", _(sd->name));
 		break;
 	case ST_FS:
 	case ST_SUMMARY:
-		snprintf(buf, buflen, _("%s"), _(sc->name));
+		snprintf(buf, buflen, _("%s"), _(sd->name));
 		break;
 	case ST_NONE:
 		assert(0);
@@ -191,7 +193,7 @@ xfs_check_metadata(
 
 	assert(!debug_tweak_on("XFS_SCRUB_NO_KERNEL"));
 	assert(meta->sm_type < XFS_SCRUB_TYPE_NR);
-	format_scrub_descr(buf, DESCR_BUFSZ, meta, &scrubbers[meta->sm_type]);
+	format_scrub_descr(ctx, buf, DESCR_BUFSZ, meta);
 
 	dbg_printf("check %s flags %xh\n", buf, meta->sm_flags);
 retry:
@@ -749,7 +751,7 @@ xfs_repair_metadata(
 		return CHECK_RETRY;
 
 	memcpy(&oldm, &meta, sizeof(oldm));
-	format_scrub_descr(buf, DESCR_BUFSZ, &meta, &scrubbers[meta.sm_type]);
+	format_scrub_descr(ctx, buf, DESCR_BUFSZ, &meta);
 
 	if (needs_repair(&meta))
 		str_info(ctx, buf, _("Attempting repair."));

From patchwork Mon Aug 26 21:30:37 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 11115669 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D011014E5 for ; Mon, 26 Aug 2019 21:30:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AB54821872 for ; Mon, 26 Aug 2019 21:30:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="q2i337P+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728649AbfHZVan (ORCPT ); Mon, 26 Aug 2019 17:30:43 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:52298 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728534AbfHZVan (ORCPT ); Mon, 26 Aug 2019 17:30:43 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLQuMk003368; Mon, 26 Aug 2019 21:30:41 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=OYdXw+TsXOiJKJwE+vinik4Ds/Orja/ERK4rSUl3XLo=; b=q2i337P+eI86jHg3zM+TJbEPB+xUYAIxK8XxWEv2KWnY8y7ImdcrUuhUJgxMq3V52iQu OE7cNOuKPbELXlr8Nn6RQR9fgqJuacz8pHT+ELtvpePS1dKFNQCJASQb1olUPKB/KgLU sBBQpChBFGMnIpk/IOi5G8QJOfFFO8ZG31JMCs9T1nO4qq2qktWOJ3Vat10iczmPjeS1 acN7adThi3N/vt9WrwRPd2kcgluE8rdYpiKwQ9zBKyoNQZL4AhSatls+p64/PQr6jxmW DZQRYU/AvR9RJetuv5PVahmImTdGZC59ZTynC6uMFrUgt01zln8YTmsgVoJmbgKN7OE8 3Q== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2120.oracle.com with ESMTP id 2umqbe80h4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:40 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com 
(8.16.0.27/8.16.0.27) with SMTP id x7QLItqT184976; Mon, 26 Aug 2019 21:30:40 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserp3020.oracle.com with ESMTP id 2umj2xvxyq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:39 +0000 Received: from abhmp0020.oracle.com (abhmp0020.oracle.com [141.146.116.26]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x7QLUcs2029105; Mon, 26 Aug 2019 21:30:39 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:30:38 -0700 Subject: [PATCH 07/11] xfs_scrub: record disk LBA size From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Mon, 26 Aug 2019 14:30:37 -0700 Message-ID: <156685503753.2841898.8227493950009379521.stgit@magnolia> In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia> References: <156685499099.2841898.18430382226915450537.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. 
Wong

Remember the size (in bytes) of a logical block on the disk. We'll use
this in subsequent patches to improve the ability of media scans to
report on which files are corrupt.

Signed-off-by: Darrick J. Wong
---
 scrub/disk.c | 7 +++----
 scrub/disk.h | 3 ++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/scrub/disk.c b/scrub/disk.c
index dd109533..b728d1c7 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -193,7 +193,6 @@ disk_open(
 #endif
 	struct disk *disk;
 	bool suspicious_disk = false;
-	int lba_sz;
 	int error;
 
 	disk = calloc(1, sizeof(struct disk));
@@ -205,10 +204,10 @@ disk_open(
 		goto out_free;
 
 	/* Try to get LBA size. */
-	error = ioctl(disk->d_fd, BLKSSZGET, &lba_sz);
+	error = ioctl(disk->d_fd, BLKSSZGET, &disk->d_lbasize);
 	if (error)
-		lba_sz = 512;
-	disk->d_lbalog = log2_roundup(lba_sz);
+		disk->d_lbasize = 512;
+	disk->d_lbalog = log2_roundup(disk->d_lbasize);
 
 	/* Obtain disk's stat info. */
 	error = fstat(disk->d_fd, &disk->d_sb);
diff --git a/scrub/disk.h b/scrub/disk.h
index 74a26d98..36bfb826 100644
--- a/scrub/disk.h
+++ b/scrub/disk.h
@@ -10,7 +10,8 @@
 struct disk {
 	struct stat d_sb;
 	int d_fd;
-	int d_lbalog;
+	unsigned int d_lbalog;
+	unsigned int d_lbasize; /* bytes */
 	unsigned int d_flags;
 	unsigned int d_blksize; /* bytes */
 	uint64_t d_size; /* bytes */

From patchwork Mon Aug 26 21:30:43 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 11115671 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4B40214E5 for ; Mon, 26 Aug 2019 21:30:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2925C21881 for ; Mon, 26 Aug 2019 21:30:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="QakP/NIO" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728660AbfHZVas (ORCPT ); Mon, 26 Aug 2019 17:30:48 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:34366 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728534AbfHZVas (ORCPT ); Mon, 26 Aug 2019 17:30:48 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLDlEB000858; Mon, 26 Aug 2019 21:30:46 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=f3CpY2z6BisrlfWud3QPLJlU2AmSUAesQaVMk2ohINk=; b=QakP/NIOmJs+b7NkZpM4v4ZIWo/YRJPFRL7QCIwtHHSt94RID7dkzlohlnig5Z70RC9R 4uS+8uc6CSTWXK3rrOS+zih0ZjGtekvlvc2ZQsOWqG9jiduFj+NnaHORxk6mkmiTnmWe jw+D60LwfjJusLYAzLNK40Ti8JfCgDWbuUYGfNh+OsnwEduXr+DJxvaaqTvOSv1TMRB2 QqNSY9XFIOg1imI8QiUKucS0OrrQpddutbyF3SzeCyyxvU3G1HdbYtAEHddoLAaaBVEh acbqCLRV26EqW+Kaq62L5sx/xOQgPkZRwtk9XRi3FS1NL0iRmsjvN1EwSZrzkD0FoZeB YA== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by aserp2120.oracle.com with ESMTP id 2umpxx05gg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:46 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com 
(8.16.0.27/8.16.0.27) with SMTP id x7QLIt7t185028; Mon, 26 Aug 2019 21:30:46 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 2umj2xvy1d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:46 +0000 Received: from abhmp0018.oracle.com (abhmp0018.oracle.com [141.146.116.24]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x7QLUjKi029544; Mon, 26 Aug 2019 21:30:45 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:30:44 -0700 Subject: [PATCH 08/11] xfs_scrub: enforce read verify pool minimum io size From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Mon, 26 Aug 2019 14:30:43 -0700 Message-ID: <156685504387.2841898.7841707499738990491.stgit@magnolia> In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia> References: <156685499099.2841898.18430382226915450537.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. 
Wong

Make sure we always issue media verification requests aligned to the
minimum IO size that the caller cares about. Concretely, this means
that we only care about doing IO in filesystem block-sized chunks.

Signed-off-by: Darrick J. Wong
---
 scrub/read_verify.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 7cfe834c..7cac0a0f 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -77,6 +77,15 @@ read_verify_pool_alloc(
 	struct read_verify_pool *rvp;
 	int ret;
 
+	/*
+	 * The minimum IO size must be a multiple of the disk sector size
+	 * and a factor of the max io size.
+	 */
+	if (miniosz % disk->d_lbasize)
+		return EINVAL;
+	if (RVP_IO_MAX_SIZE % miniosz)
+		return EINVAL;
+
 	rvp = calloc(1, sizeof(struct read_verify_pool));
 	if (!rvp)
 		return errno;
@@ -245,6 +254,11 @@ read_verify_schedule_io(
 	int ret;
 
 	assert(rvp->readbuf);
+
+	/* Round up and down to the start of a miniosz chunk. */
+	start &= ~(rvp->miniosz - 1);
+	length = roundup(length, rvp->miniosz);
+
 	rv = ptvar_get(rvp->rvstate, &ret);
 	if (ret)
 		return ret;

From patchwork Mon Aug 26 21:30:50 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 11115673 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 070531395 for ; Mon, 26 Aug 2019 21:30:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D8A7F2186A for ; Mon, 26 Aug 2019 21:30:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Qly/ohL5" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728661AbfHZVaz (ORCPT ); Mon, 26 Aug 2019 17:30:55 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:50448 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728534AbfHZVaz (ORCPT ); Mon, 26 Aug 2019 17:30:55 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLFSOD162375; Mon, 26 Aug 2019 21:30:53 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=lyuSMxcaA4HvNDpSU6V/V2Hwd+1dSiaG5oKRiPlGTeY=; b=Qly/ohL57pbZNQ1szx2aT1biCS/4yRbUadrJh0YOTV9ZfMVWKRF9AQgNvcQAz5MzBzUv e/yxjDB6lButqZlx/MlJWa3poLH3rjp7pWdi4c8qx+LOcUJ6MYvDIjWhvnca5i9OIWC0 wvROdzN/8xCLrS11Z+yorPA8U9x26GbP97bA83VDm8Q69VDk8tQ8Icl9K9VhP2his9YB NkIDeSPQLexmJjpFB5vm5YT3hSxx4Q3HQTMFS7zFlM9WzMzb4buiNd4t3btG84N4eh20 4MTpmXhKDx6eG8kVYXiL6llitJFtFR5zNg34QSCz3sz8G600urtTmVuGYimysi1Upuu9 zw== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2130.oracle.com with ESMTP id 2umq5t8290-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:53 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com 
(8.16.0.27/8.16.0.27) with SMTP id x7QLItfV185080; Mon, 26 Aug 2019 21:30:52 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 2umj2xvy43-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:52 +0000 Received: from abhmp0009.oracle.com (abhmp0009.oracle.com [141.146.116.15]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x7QLUpNq029695; Mon, 26 Aug 2019 21:30:51 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:30:50 -0700 Subject: [PATCH 09/11] xfs_scrub: return bytes verified from a SCSI VERIFY command From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Mon, 26 Aug 2019 14:30:50 -0700 Message-ID: <156685504999.2841898.3845253676802836561.stgit@magnolia> In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia> References: <156685499099.2841898.18430382226915450537.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: 
Darrick J. Wong

Since disk_scsi_verify and pread are interchangeably called from
disk_read_verify(), we must return the number of bytes verified (or -1)
just like what pread returns. This doesn't matter now due to bugs in
scrub, but we're about to fix those bugs.

Signed-off-by: Darrick J. Wong
---
 scrub/disk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scrub/disk.c b/scrub/disk.c
index b728d1c7..e34a8217 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -144,7 +144,7 @@ disk_scsi_verify(
 	iohdr.timeout = 30000; /* 30s */
 
 	error = ioctl(disk->d_fd, SG_IO, &iohdr);
-	if (error)
+	if (error < 0)
 		return error;
 
 	dbg_printf("VERIFY(16) fd %d lba %"PRIu64" len %"PRIu64" info %x "
@@ -163,7 +163,7 @@ disk_scsi_verify(
 		return -1;
 	}
 
-	return error;
+	return blockcount << BBSHIFT;
 }
 #else
 # define disk_scsi_verify(...) (ENOTTY)

From patchwork Mon Aug 26 21:30:56 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 11115675 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 75F981395 for ; Mon, 26 Aug 2019 21:31:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 48F1021872 for ; Mon, 26 Aug 2019 21:31:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="AZBy7XCp" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728695AbfHZVbB (ORCPT ); Mon, 26 Aug 2019 17:31:01 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:34610 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728673AbfHZVbA (ORCPT ); Mon, 26 Aug 2019 17:31:00 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLDmWT000871; Mon, 26 Aug 2019 21:30:59 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=AansQV6jMlZaOuIZ/Wt2Yw3xrn9oKOMothHhmOtXZWg=; b=AZBy7XCp9fHQkXZozOZhD3rbFHldIN74p8uzn45mItXtfKm6OulABpe2P9XiaoaFEiab cv6z/QcELBqmMfAuAtmo6CUZemgio1bYvsX3Thwf5hwnTSRCNbNCXd7u6J8AzMzyXnjv i5DLM4wd7G/rOroNc+JTHAl6arBRjtN2gIi+ZUSiq7T+qH+zLWa7Bv3zXPkXTL2cbuos floMHzfX643WAckujPpLc/GzdRr+zN9AJejZKwfa/KhOktO/NHtYgTG4V5wUU0wFC8Tf jq7q+aaXtJ4lQJZchPGLrmms2yNdKJLY/WGzkAMpjf1gMug+BOPZWAqIRFeC3hX8kRWe lw== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by aserp2120.oracle.com with ESMTP id 2umpxx05hg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:58 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com 
(8.16.0.27/8.16.0.27) with SMTP id x7QLIIDR169934; Mon, 26 Aug 2019 21:30:58 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by userp3020.oracle.com with ESMTP id 2umj278944-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:30:58 +0000 Received: from abhmp0015.oracle.com (abhmp0015.oracle.com [141.146.116.21]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x7QLUva6029784; Mon, 26 Aug 2019 21:30:57 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:30:57 -0700 Subject: [PATCH 10/11] xfs_scrub: fix read verify disk error handling strategy From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Mon, 26 Aug 2019 14:30:56 -0700 Message-ID: <156685505612.2841898.2351403401391746984.stgit@magnolia> In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia> References: <156685499099.2841898.18430382226915450537.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: 
Darrick J. Wong The error handling strategy for media errors is totally bogus. First of all, short reads are entirely unhandled -- when we encounter a short read, we know the disk was able to feed us the beginning of what we asked for, so we need to single-step through the remainder to try to capture the exact error that we hit. Second, an actual IO error causes the entire region to be marked bad even though it could be just a few MB of a multi-gigabyte extent that's bad. Therefore, single-step each block in the IO request until we stop getting IO errors to find out if all the blocks are bad or if it's just that extent. Third, fix the fact that the loop updates its own counter variables with the length fed to read(), which doesn't necessarily have anything to do with the amount of data that the read actually produced. Signed-off-by: Darrick J. Wong --- scrub/read_verify.c | 86 ++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 74 insertions(+), 12 deletions(-) diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 7cac0a0f..bab8411a 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -169,30 +169,92 @@ read_verify( struct read_verify *rv = arg; struct read_verify_pool *rvp; unsigned long long verified = 0; + ssize_t io_max_size; ssize_t sz; ssize_t len; + int io_error; int ret; rvp = (struct read_verify_pool *)wq->wq_ctx; + if (rvp->errors_seen) + return; + + io_max_size = RVP_IO_MAX_SIZE; + while (rv->io_length > 0) { - len = min(rv->io_length, RVP_IO_MAX_SIZE); + io_error = 0; + len = min(rv->io_length, io_max_size); dbg_printf("diskverify %d %"PRIu64" %zu\n", rvp->disk->d_fd, rv->io_start, len); sz = disk_read_verify(rvp->disk, rvp->readbuf, rv->io_start, len); - if (sz < 0) { - dbg_printf("IOERR %d %"PRIu64" %zu\n", - rvp->disk->d_fd, rv->io_start, len); - /* IO error, so try the next logical block. 
*/ - len = rvp->miniosz; - rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, len, - errno, rv->io_end_arg); + if (sz == len && io_max_size < rvp->miniosz) { + /* + * If the verify request was 100% successful and less + * than a single block in length, we were trying to + * read to the end of a block after a short read. That + * suggests there's something funny with this device, + * so single-step our way through the rest of the @rv + * range. + */ + io_max_size = rvp->miniosz; + } else if (sz < 0) { + io_error = errno; + + /* Runtime error, bail out... */ + if (io_error != EIO && io_error != EILSEQ) { + rvp->errors_seen = io_error; + return; + } + + /* + * A direct read encountered an error while performing + * a multi-block read. Reduce the transfer size to a + * single block so that we can identify the exact range + * of bad blocks and good blocks. We single-step all + * the way to the end of the @rv range, (re)starting + * with the block that just failed. + */ + if (io_max_size > rvp->miniosz) { + io_max_size = rvp->miniosz; + continue; + } + + /* + * A direct read hit an error while we were stepping + * through single blocks. Mark everything bad from + * io_start to the next miniosz block. + */ + sz = rvp->miniosz - (rv->io_start % rvp->miniosz); + dbg_printf("IOERR %d @ %"PRIu64" %zu err %d\n", + rvp->disk->d_fd, rv->io_start, sz, + io_error); + rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, sz, + io_error, rv->io_end_arg); + } else if (sz < len) { + /* + * A short direct read suggests that we might have hit + * an IO error midway through the read but still had to + * return the number of bytes that were actually read. + * + * We need to force an EIO, so try reading the rest of + * the block (if it was a partial block read) or the + * next full block. 
+ */ + io_max_size = rvp->miniosz - (sz % rvp->miniosz); + dbg_printf("SHORT %d READ @ %"PRIu64" %zu try for %zd\n", + rvp->disk->d_fd, rv->io_start, sz, + io_max_size); + } else { + /* We should never get back more bytes than we asked. */ + assert(sz == len); } - progress_add(len); - verified += len; - rv->io_start += len; - rv->io_length -= len; + progress_add(sz); + if (io_error == 0) + verified += sz; + rv->io_start += sz; + rv->io_length -= sz; } free(rv); From patchwork Mon Aug 26 21:31:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11115677 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B6B91184E for ; Mon, 26 Aug 2019 21:31:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 957E021881 for ; Mon, 26 Aug 2019 21:31:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="liXP1aw9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728697AbfHZVbI (ORCPT ); Mon, 26 Aug 2019 17:31:08 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:52752 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728673AbfHZVbI (ORCPT ); Mon, 26 Aug 2019 17:31:08 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLQjXC003298; Mon, 26 Aug 2019 21:31:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=fNSAPzY8W1yyqujPnOtiqKr14rhvXtgnEmnhOiH7COQ=; 
b=liXP1aw9gd2d6rkKb7slPHYTn5wQRU7SisOmsg1JRhgYe6uE2DnWDQUNLzKsUXkrsw5B PWfL6DplSX4CXXsBYaStghkwVFFUPxmqAjTkZedtrSVRAV5sMlZGfia5fUe7MyCujYwO ELtNuDhJVxs0ruW3e7ZJDwJRQMDkBZlMJ2wDiKbS6M1Gk7YZ9ZdA5PEuv5gEWafIoknx fBZ9HqDR0ngOdVF2A2DbrZJx1HMV7sqWOGuthrwO4Gw5meZTXIwp9KdCyJ4TZZRWZJm8 tFQT3CCJKRCPfQhwsmqz4YFHhstNWyXKbOn6sKXRm8told0KIAZD3rrtRwXgO+eT5LG9 ew== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2120.oracle.com with ESMTP id 2umqbe80jw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:31:05 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x7QLIuw1185187; Mon, 26 Aug 2019 21:31:04 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 2umj2xvyce-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 26 Aug 2019 21:31:04 +0000 Received: from abhmp0002.oracle.com (abhmp0002.oracle.com [141.146.116.8]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x7QLV3Wg029837; Mon, 26 Aug 2019 21:31:03 GMT Received: from localhost (/10.159.144.227) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 26 Aug 2019 14:31:03 -0700 Subject: [PATCH 11/11] xfs_scrub: simulate errors in the read-verify phase From: "Darrick J. 
Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Mon, 26 Aug 2019 14:31:02 -0700 Message-ID: <156685506243.2841898.2647440855248405584.stgit@magnolia> In-Reply-To: <156685499099.2841898.18430382226915450537.stgit@magnolia> References: <156685499099.2841898.18430382226915450537.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9361 signatures=668684 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1908260198 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add a debugging hook so that we can simulate disk errors during the media scan to test that the code works. Signed-off-by: Darrick J. Wong --- scrub/disk.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++ scrub/xfs_scrub.c | 2 ++ 2 files changed, 69 insertions(+) diff --git a/scrub/disk.c b/scrub/disk.c index e34a8217..2178c528 100644 --- a/scrub/disk.c +++ b/scrub/disk.c @@ -276,6 +276,59 @@ disk_close( #define LBASIZE(d) (1ULL << (d)->d_lbalog) #define BTOLBA(d, bytes) (((uint64_t)(bytes) + LBASIZE(d) - 1) >> (d)->d_lbalog) +/* Simulate disk errors. 
*/ +static int +disk_simulate_read_error( + struct disk *disk, + uint64_t start, + uint64_t *length) +{ + static int64_t interval; + uint64_t start_interval; + + /* Simulated disk errors are disabled. */ + if (interval < 0) + return 0; + + /* Figure out the disk read error interval. */ + if (interval == 0) { + char *p; + + /* Pretend there's bad media every so often, in bytes. */ + p = getenv("XFS_SCRUB_DISK_ERROR_INTERVAL"); + if (p == NULL) { + interval = -1; + return 0; + } + interval = strtoull(p, NULL, 10); + interval &= ~((1U << disk->d_lbalog) - 1); + } + + /* + * We simulate disk errors by pretending that there are media errors at + * predetermined intervals across the disk. If a read verify request + * crosses one of those intervals we shorten it so that the next read + * will start on an interval threshold. If the read verify request + * starts on an interval threshold, we send back EIO as if it had + * failed. + */ + if ((start % interval) == 0) { + dbg_printf("fd %d: simulating disk error at %"PRIu64".\n", + disk->d_fd, start); + return EIO; + } + + start_interval = start / interval; + if (start_interval != (start + *length) / interval) { + *length = ((start_interval + 1) * interval) - start; + dbg_printf( +"fd %d: simulating short read at %"PRIu64" to length %"PRIu64".\n", + disk->d_fd, start, *length); + } + + return 0; +} + /* Read-verify an extent of a disk device. */ ssize_t disk_read_verify( @@ -284,6 +337,20 @@ disk_read_verify( uint64_t start, uint64_t length) { + if (debug) { + int ret; + + ret = disk_simulate_read_error(disk, start, &length); + if (ret) { + errno = ret; + return -1; + } + + /* Don't actually issue the IO */ + if (getenv("XFS_SCRUB_DISK_VERIFY_SKIP")) + return length; + } + /* Convert to logical block size. 
*/ if (disk->d_flags & DISK_FLAG_SCSI_VERIFY) return disk_scsi_verify(disk, BTOLBAT(disk, start), diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c index 71fc274f..d068634b 100644 --- a/scrub/xfs_scrub.c +++ b/scrub/xfs_scrub.c @@ -111,6 +111,8 @@ * XFS_SCRUB_NO_SCSI_VERIFY -- disable SCSI VERIFY (if present) * XFS_SCRUB_PHASE -- run only this scrub phase * XFS_SCRUB_THREADS -- start exactly this number of threads + * XFS_SCRUB_DISK_ERROR_INTERVAL-- simulate a disk error every this many bytes + * XFS_SCRUB_DISK_VERIFY_SKIP -- pretend disk verify read calls succeeded * * Available even in non-debug mode: * SERVICE_MODE -- compress all error codes to 1 for LSB