From patchwork Fri Sep 6 03:37:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11134361 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D8AC4924 for ; Fri, 6 Sep 2019 03:37:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C4DE52082C for ; Fri, 6 Sep 2019 03:37:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="jWeV0qso" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2392143AbfIFDhv (ORCPT ); Thu, 5 Sep 2019 23:37:51 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:43770 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732799AbfIFDhv (ORCPT ); Thu, 5 Sep 2019 23:37:51 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863YZUZ074752; Fri, 6 Sep 2019 03:37:49 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=/CQvP8cCXPALyR4gyixySjSg2/xGE8gDzfxcSCmlA28=; b=jWeV0qsodXUHIit6k1g7OJwzf8ZkG7CrupoRJbxVUzYKN7f4EoYEEdVHfclHmy58hNDp B+vfXfRLtpvFmqsHljwAai3eXRT03qs3jY9ru4M0xV1tVZ/Ytv9gtqKIztNlCe8JYPtm Y0w3aQu8AdOI2JytlIs9emGpgowrSqzsYRS946lOp0DDBr1SFlxH5XamW/klLVKm/3yb jtoOy8efWJZyx7yU4EbuU++7cnf2otm3tccHmofW/0FdFDe+pP44zgPugWDlYxZ3i5/7 e/0aEEZ0BQkdEljoYgiudPoAL3GaD9Z7jluluO8ZE3oioqFz53D6DTC0pmMYl6Mm87X8 9g== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by aserp2120.oracle.com with ESMTP id 2uuf51g393-1 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:37:49 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863XT2i188642; Fri, 6 Sep 2019 03:37:48 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userp3030.oracle.com with ESMTP id 2utpmc764m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:37:48 +0000 Received: from abhmp0005.oracle.com (abhmp0005.oracle.com [141.146.116.11]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x863blaF005296; Fri, 6 Sep 2019 03:37:47 GMT Received: from localhost (/10.159.148.70) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 05 Sep 2019 20:37:47 -0700 Subject: [PATCH 01/11] xfs_scrub: fix handling of read-verify pool runtime errors From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Thu, 05 Sep 2019 20:37:46 -0700 Message-ID: <156774106682.2645135.16924307846920048736.stgit@magnolia> In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia> References: <156774106064.2645135.2756383874064764589.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx 
scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Fix some bogosity with how we handle runtime errors in the read verify pool functions. First of all, memory allocation failures shouldn't be recorded as disk IO errors; we should just complain and abort the phase. Second, we need to collect any other runtime errors in the IO thread and abort the phase instead of silently ignoring them. Signed-off-by: Darrick J. Wong --- scrub/read_verify.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/scrub/read_verify.c b/scrub/read_verify.c index b890c92f..00627307 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -53,6 +53,7 @@ struct read_verify_pool { struct disk *disk; /* which disk? */ read_verify_ioerr_fn_t ioerr_fn; /* io error callback */ size_t miniosz; /* minimum io size, bytes */ + int errors_seen; }; /* @@ -91,6 +92,7 @@ read_verify_pool_init( rvp->ctx = ctx; rvp->disk = disk; rvp->ioerr_fn = ioerr_fn; + rvp->errors_seen = false; error = ptvar_alloc(submitter_threads, sizeof(struct read_verify), &rvp->rvstate); if (error) @@ -149,6 +151,7 @@ read_verify( unsigned long long verified = 0; ssize_t sz; ssize_t len; + int ret; rvp = (struct read_verify_pool *)wq->wq_ctx; while (rv->io_length > 0) { @@ -173,7 +176,12 @@ read_verify( } free(rv); - ptcounter_add(rvp->verified_bytes, verified); + ret = ptcounter_add(rvp->verified_bytes, verified); + if (ret) { + str_liberror(rvp->ctx, ret, + _("updating bytes verified counter")); + rvp->errors_seen = true; + } } /* Queue a read verify request. */ @@ -188,18 +196,25 @@ read_verify_queue( dbg_printf("verify fd %d start %"PRIu64" len %"PRIu64"\n", rvp->disk->d_fd, rv->io_start, rv->io_length); + /* Worker thread saw a runtime error, don't queue more. 
*/ + if (rvp->errors_seen) + return false; + + /* Otherwise clone the request and queue the copy. */ tmp = malloc(sizeof(struct read_verify)); if (!tmp) { - rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, - rv->io_length, errno, rv->io_end_arg); - return true; + str_errno(rvp->ctx, _("allocating read-verify request")); + rvp->errors_seen = true; + return false; } + memcpy(tmp, rv, sizeof(*tmp)); ret = workqueue_add(&rvp->wq, read_verify, 0, tmp); if (ret) { str_liberror(rvp->ctx, ret, _("queueing read-verify work")); free(tmp); + rvp->errors_seen = true; return false; } rv->io_length = 0; From patchwork Fri Sep 6 03:37:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11134363 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 34E69924 for ; Fri, 6 Sep 2019 03:37:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 202C92082C for ; Fri, 6 Sep 2019 03:37:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Dzo1nXfX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387802AbfIFDh5 (ORCPT ); Thu, 5 Sep 2019 23:37:57 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:36756 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732799AbfIFDh5 (ORCPT ); Thu, 5 Sep 2019 23:37:57 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863Xpo7104779; Fri, 6 Sep 2019 03:37:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : 
content-transfer-encoding; s=corp-2019-08-05; bh=xGlsoG01q2F6ikV9a1IVpBELVHNtOOYIm3lMDOmRT5o=; b=Dzo1nXfXaztdaHOu9OBhkiJO8jdFJMtlj4mzHUwtdeqwQaHPATWP/F42Tq61aL+1d+b4 udwBUY2kXxsd0rBTni7U4h/EgaSSkefKcE5ibqfxMOk8ShUzmk1WKYkOKQNOZ9sMJX5D krL6Xvj6FqWh5Bh/U9yC0lcDhlV0nzmqewtyhrlzgAkGKSUjIGDtH4sZnVgskJlO8uM4 fZiziwA80+oVaZFCzy8v0trFCY24zyEb74EsE9ml8BF0ObYxZuigmrcGQKUmZbyIeAmV 5AjzmEO2EeRL70E9kDAHi507OodqDC9BPZSQm2YpdK1e1fv/P3A+BGdxgI2nNGkEXihI VQ== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by userp2120.oracle.com with ESMTP id 2uuf5f8346-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:37:54 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863XbWu069127; Fri, 6 Sep 2019 03:37:54 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by userp3020.oracle.com with ESMTP id 2utvr4jxb2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:37:54 +0000 Received: from abhmp0003.oracle.com (abhmp0003.oracle.com [141.146.116.9]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x863brK9019525; Fri, 6 Sep 2019 03:37:53 GMT Received: from localhost (/10.159.148.70) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 05 Sep 2019 20:37:53 -0700 Subject: [PATCH 02/11] xfs_scrub: abort all read verification work immediately on error From: "Darrick J. 
Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Thu, 05 Sep 2019 20:37:53 -0700 Message-ID: <156774107297.2645135.3766760104052020043.stgit@magnolia> In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia> References: <156774106064.2645135.2756383874064764589.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add a new abort function to the read verify pool code so that the caller can immediately abort all pending verification work if things start going wrong. There's no point in waiting for queued work to run if we've already decided to bail. Signed-off-by: Darrick J. 
Wong --- scrub/phase6.c | 6 +++--- scrub/read_verify.c | 10 ++++++++++ scrub/read_verify.h | 1 + 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/scrub/phase6.c b/scrub/phase6.c index d9285fee..4c81ee7b 100644 --- a/scrub/phase6.c +++ b/scrub/phase6.c @@ -514,16 +514,16 @@ _("Could not create data device media verifier.")); out_rtpool: if (vs.rvp_realtime) { - read_verify_pool_flush(vs.rvp_realtime); + read_verify_pool_abort(vs.rvp_realtime); read_verify_pool_destroy(vs.rvp_realtime); } out_logpool: if (vs.rvp_log) { - read_verify_pool_flush(vs.rvp_log); + read_verify_pool_abort(vs.rvp_log); read_verify_pool_destroy(vs.rvp_log); } out_datapool: - read_verify_pool_flush(vs.rvp_data); + read_verify_pool_abort(vs.rvp_data); read_verify_pool_destroy(vs.rvp_data); out_rbad: bitmap_free(&vs.r_bad); diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 00627307..301e9b48 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -117,6 +117,16 @@ read_verify_pool_init( return NULL; } +/* Abort all verification work. */ +void +read_verify_pool_abort( + struct read_verify_pool *rvp) +{ + if (!rvp->errors_seen) + rvp->errors_seen = ECANCELED; + workqueue_terminate(&rvp->wq); +} + /* Finish up any read verification work. */ void read_verify_pool_flush( diff --git a/scrub/read_verify.h b/scrub/read_verify.h index 5fabe5e0..f0ed8902 100644 --- a/scrub/read_verify.h +++ b/scrub/read_verify.h @@ -19,6 +19,7 @@ struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx, struct disk *disk, size_t miniosz, read_verify_ioerr_fn_t ioerr_fn, unsigned int submitter_threads); +void read_verify_pool_abort(struct read_verify_pool *rvp); void read_verify_pool_flush(struct read_verify_pool *rvp); void read_verify_pool_destroy(struct read_verify_pool *rvp); From patchwork Fri Sep 6 03:37:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 11134365 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EACDC76 for ; Fri, 6 Sep 2019 03:38:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CF4362082C for ; Fri, 6 Sep 2019 03:38:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="QaLr3jb+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732799AbfIFDiE (ORCPT ); Thu, 5 Sep 2019 23:38:04 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:36846 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2392144AbfIFDiE (ORCPT ); Thu, 5 Sep 2019 23:38:04 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863Xqkq104830; Fri, 6 Sep 2019 03:38:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=PseadaLSKThl+8+EgSmYMkxbXL2zMSjWi2BxBTfnWa8=; b=QaLr3jb+Cc/wYFLO/b/Z6bLMqoNwcKJ8AKJ5nZEOdb6nZ1slhk3ed5yBa/8Krtaq8U6d A02j4ZwmPb20MKweqJ5pXLn/Zl4pNX/u0wOFYcbCcUj5pK/2BEHwE/djght8LHICVW3i YGmYdP+Xkbn2RnBDepTtrk2I9r0lCjZBA4qNql2QnXUK1BCRArtZIj5jPXiL8no2/aFk yKZweEH5lpQcFuLua1uq17ong0hGeFhA7g7RZKdkCMuo02Wp9WaJumg1WBmTWRAxl80q ZhCZQ0DDrF/uirZBKW8mkBi1PIjIcmKrDNIdQeWXz0h/OlpGTVhg0ySfkYxBn1B4XTQs ag== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by userp2120.oracle.com with ESMTP id 2uuf5f834d-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:38:01 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) 
with SMTP id x863Xbql069130; Fri, 6 Sep 2019 03:38:00 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userp3020.oracle.com with ESMTP id 2utvr4jxf5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:38:00 +0000 Received: from abhmp0005.oracle.com (abhmp0005.oracle.com [141.146.116.11]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x863c0w1015796; Fri, 6 Sep 2019 03:38:00 GMT Received: from localhost (/10.159.148.70) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 05 Sep 2019 20:37:59 -0700 Subject: [PATCH 03/11] xfs_scrub: fix read-verify pool error communication problems From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Thu, 05 Sep 2019 20:37:59 -0700 Message-ID: <156774107913.2645135.4815104017850485668.stgit@magnolia> In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia> References: <156774106064.2645135.2756383874064764589.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. 
Wong Fix all the places in the read-verify pool functions where we either fail to check for runtime errors or fail to communicate them properly to callers. Then fix all the callers to report the error messages instead of hiding them. Signed-off-by: Darrick J. Wong --- scrub/phase6.c | 89 ++++++++++++++++++++++++++++++++++++--------------- scrub/read_verify.c | 87 ++++++++++++++++++++++---------------------------- scrub/read_verify.h | 16 +++++---- 3 files changed, 109 insertions(+), 83 deletions(-) diff --git a/scrub/phase6.c b/scrub/phase6.c index 4c81ee7b..f6274a49 100644 --- a/scrub/phase6.c +++ b/scrub/phase6.c @@ -387,6 +387,7 @@ xfs_check_rmap( { struct media_verify_state *vs = arg; struct read_verify_pool *rvp; + int ret; rvp = xfs_dev_to_pool(ctx, vs, map->fmr_device); @@ -415,28 +416,48 @@ xfs_check_rmap( /* XXX: Filter out directory data blocks. */ /* Schedule the read verify command for (eventual) running. */ - read_verify_schedule_io(rvp, map->fmr_physical, map->fmr_length, vs); + ret = read_verify_schedule_io(rvp, map->fmr_physical, map->fmr_length, + vs); + if (ret) { + str_liberror(ctx, ret, descr); + return false; + } out: /* Is this the last extent? Fire off the read. */ - if (map->fmr_flags & FMR_OF_LAST) - read_verify_force_io(rvp); + if (map->fmr_flags & FMR_OF_LAST) { + ret = read_verify_force_io(rvp); + if (ret) { + str_liberror(ctx, ret, descr); + return false; + } + } return true; } /* Wait for read/verify actions to finish, then return # bytes checked. 
*/ -static uint64_t +static int clean_pool( - struct read_verify_pool *rvp) + struct read_verify_pool *rvp, + unsigned long long *bytes_checked) { - uint64_t ret; + uint64_t pool_checked; + int ret; if (!rvp) return 0; - read_verify_pool_flush(rvp); - ret = read_verify_bytes(rvp); + ret = read_verify_pool_flush(rvp); + if (ret) + goto out_destroy; + + ret = read_verify_bytes(rvp, &pool_checked); + if (ret) + goto out_destroy; + + *bytes_checked += pool_checked; +out_destroy: read_verify_pool_destroy(rvp); return ret; } @@ -469,43 +490,57 @@ xfs_scan_blocks( goto out_dbad; } - vs.rvp_data = read_verify_pool_init(ctx, ctx->datadev, + ret = read_verify_pool_alloc(ctx, ctx->datadev, ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr, - scrub_nproc(ctx)); - if (!vs.rvp_data) { - str_info(ctx, ctx->mntpoint, -_("Could not create data device media verifier.")); + scrub_nproc(ctx), &vs.rvp_data); + if (ret) { + str_liberror(ctx, ret, _("creating datadev media verifier")); goto out_rbad; } if (ctx->logdev) { - vs.rvp_log = read_verify_pool_init(ctx, ctx->logdev, + ret = read_verify_pool_alloc(ctx, ctx->logdev, ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr, - scrub_nproc(ctx)); - if (!vs.rvp_log) { - str_info(ctx, ctx->mntpoint, - _("Could not create log device media verifier.")); + scrub_nproc(ctx), &vs.rvp_log); + if (ret) { + str_liberror(ctx, ret, + _("creating logdev media verifier")); goto out_datapool; } } if (ctx->rtdev) { - vs.rvp_realtime = read_verify_pool_init(ctx, ctx->rtdev, + ret = read_verify_pool_alloc(ctx, ctx->rtdev, ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr, - scrub_nproc(ctx)); - if (!vs.rvp_realtime) { - str_info(ctx, ctx->mntpoint, - _("Could not create realtime device media verifier.")); + scrub_nproc(ctx), &vs.rvp_realtime); + if (ret) { + str_liberror(ctx, ret, + _("creating rtdev media verifier")); goto out_logpool; } } moveon = xfs_scan_all_spacemaps(ctx, xfs_check_rmap, &vs); if (!moveon) goto out_rtpool; - ctx->bytes_checked += 
clean_pool(vs.rvp_data); - ctx->bytes_checked += clean_pool(vs.rvp_log); - ctx->bytes_checked += clean_pool(vs.rvp_realtime); + + ret = clean_pool(vs.rvp_data, &ctx->bytes_checked); + if (ret) { + str_liberror(ctx, ret, _("flushing datadev verify pool")); + moveon = false; + } + + ret = clean_pool(vs.rvp_log, &ctx->bytes_checked); + if (ret) { + str_liberror(ctx, ret, _("flushing logdev verify pool")); + moveon = false; + } + + ret = clean_pool(vs.rvp_realtime, &ctx->bytes_checked); + if (ret) { + str_liberror(ctx, ret, _("flushing rtdev verify pool")); + moveon = false; + } /* Scan the whole dir tree to see what matches the bad extents. */ - if (!bitmap_empty(vs.d_bad) || !bitmap_empty(vs.r_bad)) + if (moveon && (!bitmap_empty(vs.d_bad) || !bitmap_empty(vs.r_bad))) moveon = xfs_report_verify_errors(ctx, &vs); bitmap_free(&vs.r_bad); diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 301e9b48..8f80dcaf 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -65,37 +65,37 @@ struct read_verify_pool { * @submitter_threads is the number of threads that may be sending verify * requests at any given time. 
*/ -struct read_verify_pool * -read_verify_pool_init( +int +read_verify_pool_alloc( struct scrub_ctx *ctx, struct disk *disk, size_t miniosz, read_verify_ioerr_fn_t ioerr_fn, - unsigned int submitter_threads) + unsigned int submitter_threads, + struct read_verify_pool **prvp) { struct read_verify_pool *rvp; - bool ret; - int error; + int ret; rvp = calloc(1, sizeof(struct read_verify_pool)); if (!rvp) - return NULL; + return errno; - error = posix_memalign((void **)&rvp->readbuf, page_size, + ret = posix_memalign((void **)&rvp->readbuf, page_size, RVP_IO_MAX_SIZE); - if (error || !rvp->readbuf) + if (ret) goto out_free; - error = ptcounter_alloc(nproc, &rvp->verified_bytes); - if (error) + ret = ptcounter_alloc(nproc, &rvp->verified_bytes); + if (ret) goto out_buf; rvp->miniosz = miniosz; rvp->ctx = ctx; rvp->disk = disk; rvp->ioerr_fn = ioerr_fn; - rvp->errors_seen = false; - error = ptvar_alloc(submitter_threads, sizeof(struct read_verify), + rvp->errors_seen = 0; + ret = ptvar_alloc(submitter_threads, sizeof(struct read_verify), &rvp->rvstate); - if (error) + if (ret) goto out_counter; /* Run in the main thread if we only want one thread. */ if (nproc == 1) @@ -104,7 +104,8 @@ read_verify_pool_init( disk_heads(disk)); if (ret) goto out_rvstate; - return rvp; + *prvp = rvp; + return 0; out_rvstate: ptvar_free(rvp->rvstate); @@ -114,7 +115,7 @@ read_verify_pool_init( free(rvp->readbuf); out_free: free(rvp); - return NULL; + return ret; } /* Abort all verification work. */ @@ -128,11 +129,11 @@ read_verify_pool_abort( } /* Finish up any read verification work. */ -void +int read_verify_pool_flush( struct read_verify_pool *rvp) { - workqueue_terminate(&rvp->wq); + return workqueue_terminate(&rvp->wq); } /* Finish up any read verification work and tear it down. 
*/ @@ -187,15 +188,12 @@ read_verify( free(rv); ret = ptcounter_add(rvp->verified_bytes, verified); - if (ret) { - str_liberror(rvp->ctx, ret, - _("updating bytes verified counter")); - rvp->errors_seen = true; - } + if (ret) + rvp->errors_seen = ret; } /* Queue a read verify request. */ -static bool +static int read_verify_queue( struct read_verify_pool *rvp, struct read_verify *rv) @@ -208,34 +206,33 @@ read_verify_queue( /* Worker thread saw a runtime error, don't queue more. */ if (rvp->errors_seen) - return false; + return rvp->errors_seen; /* Otherwise clone the request and queue the copy. */ tmp = malloc(sizeof(struct read_verify)); if (!tmp) { - str_errno(rvp->ctx, _("allocating read-verify request")); - rvp->errors_seen = true; - return false; + rvp->errors_seen = errno; + return errno; } memcpy(tmp, rv, sizeof(*tmp)); ret = workqueue_add(&rvp->wq, read_verify, 0, tmp); if (ret) { - str_liberror(rvp->ctx, ret, _("queueing read-verify work")); free(tmp); - rvp->errors_seen = true; - return false; + rvp->errors_seen = ret; + return ret; } + rv->io_length = 0; - return true; + return 0; } /* * Issue an IO request. We'll batch subsequent requests if they're * within 64k of each other */ -bool +int read_verify_schedule_io( struct read_verify_pool *rvp, uint64_t start, @@ -250,7 +247,7 @@ read_verify_schedule_io( assert(rvp->readbuf); rv = ptvar_get(rvp->rvstate, &ret); if (ret) - return false; + return ret; req_end = start + length; rv_end = rv->io_start + rv->io_length; @@ -277,38 +274,32 @@ read_verify_schedule_io( rv->io_end_arg = end_arg; } - return true; + return 0; } /* Force any stashed IOs into the verifier. 
*/ -bool +int read_verify_force_io( struct read_verify_pool *rvp) { struct read_verify *rv; - bool moveon; int ret; assert(rvp->readbuf); rv = ptvar_get(rvp->rvstate, &ret); if (ret) - return false; + return ret; if (rv->io_length == 0) - return true; + return 0; - moveon = read_verify_queue(rvp, rv); - if (moveon) - rv->io_length = 0; - return moveon; + return read_verify_queue(rvp, rv); } /* How many bytes has this process verified? */ -uint64_t +int read_verify_bytes( - struct read_verify_pool *rvp) + struct read_verify_pool *rvp, + uint64_t *bytes_checked) { - uint64_t ret; - - ptcounter_value(rvp->verified_bytes, &ret); - return ret; + return ptcounter_value(rvp->verified_bytes, bytes_checked); } diff --git a/scrub/read_verify.h b/scrub/read_verify.h index f0ed8902..650c46d4 100644 --- a/scrub/read_verify.h +++ b/scrub/read_verify.h @@ -15,17 +15,17 @@ typedef void (*read_verify_ioerr_fn_t)(struct scrub_ctx *ctx, struct disk *disk, uint64_t start, uint64_t length, int error, void *arg); -struct read_verify_pool *read_verify_pool_init(struct scrub_ctx *ctx, - struct disk *disk, size_t miniosz, - read_verify_ioerr_fn_t ioerr_fn, - unsigned int submitter_threads); +int read_verify_pool_alloc(struct scrub_ctx *ctx, struct disk *disk, + size_t miniosz, read_verify_ioerr_fn_t ioerr_fn, + unsigned int submitter_threads, + struct read_verify_pool **prvp); void read_verify_pool_abort(struct read_verify_pool *rvp); -void read_verify_pool_flush(struct read_verify_pool *rvp); +int read_verify_pool_flush(struct read_verify_pool *rvp); void read_verify_pool_destroy(struct read_verify_pool *rvp); -bool read_verify_schedule_io(struct read_verify_pool *rvp, uint64_t start, +int read_verify_schedule_io(struct read_verify_pool *rvp, uint64_t start, uint64_t length, void *end_arg); -bool read_verify_force_io(struct read_verify_pool *rvp); -uint64_t read_verify_bytes(struct read_verify_pool *rvp); +int read_verify_force_io(struct read_verify_pool *rvp); +int 
read_verify_bytes(struct read_verify_pool *rvp, uint64_t *bytes); #endif /* XFS_SCRUB_READ_VERIFY_H_ */ From patchwork Fri Sep 6 03:38:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11134393 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6683376 for ; Fri, 6 Sep 2019 03:39:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5239220820 for ; Fri, 6 Sep 2019 03:39:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="cY2L5HBe" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404426AbfIFDjX (ORCPT ); Thu, 5 Sep 2019 23:39:23 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:38372 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732014AbfIFDjX (ORCPT ); Thu, 5 Sep 2019 23:39:23 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863dAUb108568; Fri, 6 Sep 2019 03:39:21 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=LdmAp4vvHI/BKFIy2XI79DB9cFAsXq+OkI7afpctqVg=; b=cY2L5HBeXnFRPK3O0v7PKIXjDNPpqeVMSTQ6yoZw8zGshaAa7moL7PxK0AOyGR6Vxejw qlRxR9p705zzKk1F07rBd+qRr03mny19f//iDqnbmnEgFt8F4RRzVEFuAUMFQnrzn9Qh 5OrInnqezkFHUAKEoPV/fBJ8rQ2erX6raGgAVd2VPNLU9mARSojhdtIW1UVL7T/Z4/s/ lGlN90iY0ekdE2RNbMc+uW05Fv3q1Z1GxMRbz2hi3Sarv8Dkf2+bq+kf7ZX8uwUHUuBg NPZoTmzaN+toinf1HlALL3dl8YnhIUoa79sq1LsWpPWhqDqQsh5L2XcdpCu82WJa39W3 jQ== Received: from aserp3020.oracle.com (aserp3020.oracle.com 
[141.146.126.70]) by userp2120.oracle.com with ESMTP id 2uuf5f839u-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:39:21 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863dHKK112785; Fri, 6 Sep 2019 03:39:20 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3020.oracle.com with ESMTP id 2uud7p2s5x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:39:18 +0000 Received: from abhmp0006.oracle.com (abhmp0006.oracle.com [141.146.116.12]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x863c7Yp005509; Fri, 6 Sep 2019 03:38:07 GMT Received: from localhost (/10.159.148.70) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 05 Sep 2019 20:38:07 -0700 Subject: [PATCH 04/11] xfs_scrub: fix queue-and-stash of non-contiguous verify requests From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Thu, 05 Sep 2019 20:38:05 -0700 Message-ID: <156774108546.2645135.14576287125742125024.stgit@magnolia> In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia> References: <156774106064.2645135.2756383874064764589.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060040 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 
mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1906280000 definitions=main-1909060040 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong read_verify_schedule_io is supposed to have the ability to decide that a retained aggregate extent verification request is not sufficiently contiguous with the request that is being scheduled, and therefore it needs to queue the retained request and use the new request to start building a new aggregate request. Unfortunately, it stupidly returns after queueing the IO, so we lose the incoming request. Fix the code so that we return early only if queueing the retained request hits a runtime error. Signed-off-by: Darrick J. Wong --- scrub/read_verify.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/scrub/read_verify.c b/scrub/read_verify.c index 8f80dcaf..980b92b8 100644 --- a/scrub/read_verify.c +++ b/scrub/read_verify.c @@ -265,8 +265,13 @@ read_verify_schedule_io( rv->io_length = max(req_end, rv_end) - rv->io_start; } else { /* Otherwise, issue the stashed IO (if there is one) */ - if (rv->io_length > 0) - return read_verify_queue(rvp, rv); + if (rv->io_length > 0) { + int res; + + res = read_verify_queue(rvp, rv); + if (res) + return res; + } /* Stash the new IO. */ rv->io_start = start; From patchwork Fri Sep 6 03:38:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 11134377
Subject: [PATCH 05/11] xfs_scrub: only call read_verify_force_io once per pool
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:13 -0700
Message-ID: <156774109332.2645135.1085596269768224961.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J.
Wong

There's no reason we need to call read_verify_force_io every AG; we can
just let the request aggregation code do its thing and push when we're
totally done browsing the fsmap information.

Signed-off-by: Darrick J. Wong
---
 scrub/phase6.c | 16 +++++-----------
 scrub/read_verify.c | 26 +++++++++++++++++---------
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/scrub/phase6.c b/scrub/phase6.c
index f6274a49..8063d6ce 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -411,7 +411,7 @@ xfs_check_rmap(
 	 */
 	if (map->fmr_flags & (FMR_OF_PREALLOC | FMR_OF_ATTR_FORK |
 			      FMR_OF_EXTENT_MAP | FMR_OF_SPECIAL_OWNER))
-		goto out;
+		return true;
 
 	/* XXX: Filter out directory data blocks. */
 
@@ -423,16 +423,6 @@ xfs_check_rmap(
 		return false;
 	}
 
-out:
-	/* Is this the last extent? Fire off the read. */
-	if (map->fmr_flags & FMR_OF_LAST) {
-		ret = read_verify_force_io(rvp);
-		if (ret) {
-			str_liberror(ctx, ret, descr);
-			return false;
-		}
-	}
-
 	return true;
 }
 
@@ -448,6 +438,10 @@ clean_pool(
 	if (!rvp)
 		return 0;
 
+	ret = read_verify_force_io(rvp);
+	if (ret)
+		return ret;
+
 	ret = read_verify_pool_flush(rvp);
 	if (ret)
 		goto out_destroy;
diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 980b92b8..73d30817 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -282,22 +282,30 @@ read_verify_schedule_io(
 	return 0;
 }
 
+/* Force any per-thread stashed IOs into the verifier. */
+static int
+force_one_io(
+	struct ptvar *ptv,
+	void *data,
+	void *foreach_arg)
+{
+	struct read_verify_pool *rvp = foreach_arg;
+	struct read_verify *rv = data;
+
+	if (rv->io_length == 0)
+		return 0;
+
+	return read_verify_queue(rvp, rv);
+}
+
 /* Force any stashed IOs into the verifier. */
 int
 read_verify_force_io(
 	struct read_verify_pool *rvp)
 {
-	struct read_verify *rv;
-	int ret;
-
 	assert(rvp->readbuf);
 
-	rv = ptvar_get(rvp->rvstate, &ret);
-	if (ret)
-		return ret;
-	if (rv->io_length == 0)
-		return 0;
-	return read_verify_queue(rvp, rv);
+	return ptvar_foreach(rvp->rvstate, force_one_io, rvp);
 }
 
 /* How many bytes has this process verified? */

From patchwork Fri Sep 6 03:38:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11134395
Subject: [PATCH 06/11] xfs_scrub: refactor inode prefix rendering code
From: "Darrick J.
Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:20 -0700
Message-ID: <156774110045.2645135.12889461548085975502.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Refactor all the places in the code where we try to render an inode
number as a prefix for some sort of status message. This will help make
message prefixes more consistent, which should help users to locate
broken metadata.

Signed-off-by: Darrick J.
Wong
---
 scrub/common.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 scrub/common.h | 6 ++++++
 scrub/inodes.c | 4 ++--
 scrub/phase3.c | 8 ++------
 scrub/phase5.c | 8 ++------
 scrub/phase6.c | 6 +++---
 scrub/scrub.c | 17 +++++++++--------
 7 files changed, 71 insertions(+), 25 deletions(-)

diff --git a/scrub/common.c b/scrub/common.c
index 7db47044..a814f568 100644
--- a/scrub/common.c
+++ b/scrub/common.c
@@ -354,3 +354,50 @@ within_range(
 
 	return true;
 }
+
+/*
+ * Render an inode number as both the raw inode number and as an AG number
+ * and AG inode pair. This is intended for use with status message reporting.
+ * If @format is not NULL, it should provide any desired leading whitespace.
+ *
+ * For example, "inode 287232352 (13/352) : "
+ */
+int
+scrub_render_ino_suffix(
+	const struct scrub_ctx *ctx,
+	char *buf,
+	size_t buflen,
+	uint64_t ino,
+	uint32_t gen,
+	const char *format,
+	...)
+{
+	va_list args;
+	uint32_t agno;
+	uint32_t agino;
+	int ret;
+
+	agno = cvt_ino_to_agno(&ctx->mnt, ino);
+	agino = cvt_ino_to_agino(&ctx->mnt, ino);
+	ret = snprintf(buf, buflen, _("inode %"PRIu64" (%"PRIu32"/%"PRIu32")"),
+			ino, agno, agino);
+	if (ret < 0 || ret >= buflen || format == NULL)
+		return ret;
+
+	va_start(args, format);
+	ret += vsnprintf(buf + ret, buflen - ret, format, args);
+	va_end(args);
+	return ret;
+}
+
+/* Render an inode number for message reporting with no suffix. */
+int
+scrub_render_ino(
+	const struct scrub_ctx *ctx,
+	char *buf,
+	size_t buflen,
+	uint64_t ino,
+	uint32_t gen)
+{
+	return scrub_render_ino_suffix(ctx, buf, buflen, ino, gen, NULL);
+}
diff --git a/scrub/common.h b/scrub/common.h
index 33555891..1b9ad48f 100644
--- a/scrub/common.h
+++ b/scrub/common.h
@@ -86,4 +86,10 @@ bool within_range(struct scrub_ctx *ctx, unsigned long long value,
 		unsigned long long desired, unsigned long long abs_threshold,
 		unsigned int n, unsigned int d, const char *descr);
 
+int scrub_render_ino_suffix(const struct scrub_ctx *ctx, char *buf,
+		size_t buflen, uint64_t ino, uint32_t gen,
+		const char *format, ...);
+int scrub_render_ino(const struct scrub_ctx *ctx, char *buf,
+		size_t buflen, uint64_t ino, uint32_t gen);
+
 #endif /* XFS_SCRUB_COMMON_H_ */
diff --git a/scrub/inodes.c b/scrub/inodes.c
index 37a35a3f..f436beb8 100644
--- a/scrub/inodes.c
+++ b/scrub/inodes.c
@@ -159,8 +159,8 @@ xfs_iterate_inodes_ag(
 					ireq->hdr.ino = inogrp->xi_startino;
 					goto igrp_retry;
 				}
-				snprintf(idescr, DESCR_BUFSZ, "inode %"PRIu64,
-						(uint64_t)bs->bs_ino);
+				scrub_render_ino(ctx, idescr, DESCR_BUFSZ,
+						bs->bs_ino, bs->bs_gen);
 				str_info(ctx, idescr,
 _("Changed too many times during scan; giving up."));
 				break;
diff --git a/scrub/phase3.c b/scrub/phase3.c
index 1e908c2c..48bcc21c 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -48,14 +48,10 @@ xfs_scrub_inode_vfs_error(
 	struct xfs_bulkstat *bstat)
 {
 	char descr[DESCR_BUFSZ];
-	xfs_agnumber_t agno;
-	xfs_agino_t agino;
 	int old_errno = errno;
 
-	agno = cvt_ino_to_agno(&ctx->mnt, bstat->bs_ino);
-	agino = cvt_ino_to_agino(&ctx->mnt, bstat->bs_ino);
-	snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (%u/%u)"),
-			(uint64_t)bstat->bs_ino, agno, agino);
+	scrub_render_ino(ctx, descr, DESCR_BUFSZ, bstat->bs_ino,
+			bstat->bs_gen);
 	errno = old_errno;
 	str_errno(ctx, descr);
 }
diff --git a/scrub/phase5.c b/scrub/phase5.c
index 99cd51b2..997c88d9 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -234,15 +234,11 @@ xfs_scrub_connections(
 	bool *pmoveon = arg;
 	char descr[DESCR_BUFSZ];
 	bool moveon = true;
-	xfs_agnumber_t agno;
-	xfs_agino_t agino;
 	int fd = -1;
 	int error;
 
-	agno = cvt_ino_to_agno(&ctx->mnt, bstat->bs_ino);
-	agino = cvt_ino_to_agino(&ctx->mnt, bstat->bs_ino);
-	snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (%u/%u)"),
-			(uint64_t)bstat->bs_ino, agno, agino);
+	scrub_render_ino(ctx, descr, DESCR_BUFSZ, bstat->bs_ino,
+			bstat->bs_gen);
 	background_sleep();
 
 	/* Warn about naming problems in xattrs. */
diff --git a/scrub/phase6.c b/scrub/phase6.c
index 8063d6ce..4554af9a 100644
--- a/scrub/phase6.c
+++ b/scrub/phase6.c
@@ -180,15 +180,15 @@ xfs_report_verify_inode(
 	int fd;
 	int error;
 
-	snprintf(descr, DESCR_BUFSZ, _("inode %"PRIu64" (unlinked)"),
-			(uint64_t)bstat->bs_ino);
-
 	/* Ignore linked files and things we can't open. */
 	if (bstat->bs_nlink != 0)
 		return 0;
 	if (!S_ISREG(bstat->bs_mode) && !S_ISDIR(bstat->bs_mode))
 		return 0;
 
+	scrub_render_ino_suffix(ctx, descr, DESCR_BUFSZ,
+			bstat->bs_ino, bstat->bs_gen, _(" (unlinked)"));
+
 	/* Try to open the inode. */
 	fd = xfs_open_handle(handle);
 	if (fd < 0) {
diff --git a/scrub/scrub.c b/scrub/scrub.c
index 0e30bb2f..9bd6a682 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -26,11 +26,13 @@
 /* Format a scrub description. */
 static void
 format_scrub_descr(
+	struct scrub_ctx *ctx,
 	char *buf,
 	size_t buflen,
-	struct xfs_scrub_metadata *meta,
-	const struct xfrog_scrub_descr *sc)
+	struct xfs_scrub_metadata *meta)
 {
+	const struct xfrog_scrub_descr *sc = &xfrog_scrubbers[meta->sm_type];
+
 	switch (sc->type) {
 	case XFROG_SCRUB_TYPE_AGHEADER:
 	case XFROG_SCRUB_TYPE_PERAG:
@@ -38,8 +40,9 @@ format_scrub_descr(
 				_(sc->descr));
 		break;
 	case XFROG_SCRUB_TYPE_INODE:
-		snprintf(buf, buflen, _("Inode %"PRIu64" %s"),
-				(uint64_t)meta->sm_ino, _(sc->descr));
+		scrub_render_ino_suffix(ctx, buf, buflen,
+				meta->sm_ino, meta->sm_gen, " %s",
+				_(sc->descr));
 		break;
 	case XFROG_SCRUB_TYPE_FS:
 		snprintf(buf, buflen, _("%s"), _(sc->descr));
@@ -123,8 +126,7 @@ xfs_check_metadata(
 
 	assert(!debug_tweak_on("XFS_SCRUB_NO_KERNEL"));
 	assert(meta->sm_type < XFS_SCRUB_TYPE_NR);
-	format_scrub_descr(buf, DESCR_BUFSZ, meta,
-			&xfrog_scrubbers[meta->sm_type]);
+	format_scrub_descr(ctx, buf, DESCR_BUFSZ, meta);
 
 	dbg_printf("check %s flags %xh\n", buf, meta->sm_flags);
 retry:
@@ -677,8 +679,7 @@ xfs_repair_metadata(
 		return CHECK_RETRY;
 
 	memcpy(&oldm, &meta, sizeof(oldm));
-	format_scrub_descr(buf, DESCR_BUFSZ, &meta,
-			&xfrog_scrubbers[meta.sm_type]);
+	format_scrub_descr(ctx, buf, DESCR_BUFSZ, &meta);
 
 	if (needs_repair(&meta))
 		str_info(ctx, buf, _("Attempting repair."));

From patchwork Fri Sep 6 03:38:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 11134375
Subject: [PATCH 07/11] xfs_scrub: record disk LBA size
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:26 -0700
Message-ID: <156774110666.2645135.15931272284847388930.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J.
Wong

Remember the size (in bytes) of a logical block on the disk. We'll use
this in subsequent patches to improve the ability of media scans to
report on which files are corrupt.

Signed-off-by: Darrick J. Wong
---
 scrub/disk.c | 7 +++----
 scrub/disk.h | 3 ++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/scrub/disk.c b/scrub/disk.c
index dcdd5ba8..d2101cc6 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -193,7 +193,6 @@ disk_open(
 #endif
 	struct disk *disk;
 	bool suspicious_disk = false;
-	int lba_sz;
 	int error;
 
 	disk = calloc(1, sizeof(struct disk));
@@ -205,10 +204,10 @@ disk_open(
 		goto out_free;
 
 	/* Try to get LBA size. */
-	error = ioctl(disk->d_fd, BLKSSZGET, &lba_sz);
+	error = ioctl(disk->d_fd, BLKSSZGET, &disk->d_lbasize);
 	if (error)
-		lba_sz = 512;
-	disk->d_lbalog = log2_roundup(lba_sz);
+		disk->d_lbasize = 512;
+	disk->d_lbalog = log2_roundup(disk->d_lbasize);
 
 	/* Obtain disk's stat info. */
 	error = fstat(disk->d_fd, &disk->d_sb);
diff --git a/scrub/disk.h b/scrub/disk.h
index 74a26d98..36bfb826 100644
--- a/scrub/disk.h
+++ b/scrub/disk.h
@@ -10,7 +10,8 @@
 struct disk {
 	struct stat d_sb;
 	int d_fd;
-	int d_lbalog;
+	unsigned int d_lbalog;
+	unsigned int d_lbasize; /* bytes */
 	unsigned int d_flags;
 	unsigned int d_blksize; /* bytes */
 	uint64_t d_size; /* bytes */

From patchwork Fri Sep 6 03:38:32 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 11134379
Subject: [PATCH 08/11] xfs_scrub: enforce read verify pool minimum io size
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:32 -0700
Message-ID: <156774111276.2645135.1979781895407724434.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J.
Wong

Make sure we always issue media verification requests aligned to the
minimum IO size that the caller cares about. Concretely, this means
that we only care about doing IO in filesystem block-sized chunks.

Signed-off-by: Darrick J. Wong
---
 scrub/read_verify.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 73d30817..9d9be68d 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -77,6 +77,15 @@ read_verify_pool_alloc(
 	struct read_verify_pool *rvp;
 	int ret;
 
+	/*
+	 * The minimum IO size must be a multiple of the disk sector size
+	 * and a factor of the max io size.
+	 */
+	if (miniosz % disk->d_lbasize)
+		return EINVAL;
+	if (RVP_IO_MAX_SIZE % miniosz)
+		return EINVAL;
+
 	rvp = calloc(1, sizeof(struct read_verify_pool));
 	if (!rvp)
 		return errno;
@@ -245,6 +254,11 @@ read_verify_schedule_io(
 	int ret;
 
 	assert(rvp->readbuf);
+
+	/* Round up and down to the start of a miniosz chunk. */
+	start &= ~(rvp->miniosz - 1);
+	length = roundup(length, rvp->miniosz);
+
 	rv = ptvar_get(rvp->rvstate, &ret);
 	if (ret)
 		return ret;

From patchwork Fri Sep 6 03:38:39 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 11134381
Subject: [PATCH 09/11] xfs_scrub: return bytes verified from a SCSI VERIFY command
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:39 -0700
Message-ID: <156774111898.2645135.3530205346306381964.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick
J. Wong

Since disk_scsi_verify and pread are interchangeably called from
disk_read_verify(), we must return the number of bytes verified (or -1)
just like what pread returns. This doesn't matter now due to bugs in
scrub, but we're about to fix those bugs.

Signed-off-by: Darrick J. Wong
---
 scrub/disk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scrub/disk.c b/scrub/disk.c
index d2101cc6..bf9c795a 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -144,7 +144,7 @@ disk_scsi_verify(
 	iohdr.timeout = 30000; /* 30s */
 
 	error = ioctl(disk->d_fd, SG_IO, &iohdr);
-	if (error)
+	if (error < 0)
 		return error;
 
 	dbg_printf("VERIFY(16) fd %d lba %"PRIu64" len %"PRIu64" info %x "
@@ -163,7 +163,7 @@ disk_scsi_verify(
 		return -1;
 	}
 
-	return error;
+	return blockcount << BBSHIFT;
 }
 #else
 # define disk_scsi_verify(...) (ENOTTY)

From patchwork Fri Sep 6 03:38:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 11134391 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7C62176 for ; Fri, 6 Sep 2019 03:39:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5F5CC20820 for ; Fri, 6 Sep 2019 03:39:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="REIEMFJq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387769AbfIFDjV (ORCPT ); Thu, 5 Sep 2019 23:39:21 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:38322 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732014AbfIFDjV (ORCPT ); Thu, 5 Sep 2019 23:39:21 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863dD0P108577; Fri, 6 Sep 2019 03:39:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=Q+EV+14rIkDjV7nRhr7orpb8jHt3XCVs4IJD5vJVHjo=; b=REIEMFJqPPqu/B/Dx//FOjTg4riR9IBXfqUrlO3FhzijJNFAya1sR1E0L4pwz7bCRcfl fgKGNcW1+zqaZYre4aUBtVEDpKzM325EO0yJR0S+V8G0rn/KTEPXuCRmlPaI3A0W2G6Z LfAZZ+CbufoQ4FwaOrMbBA+OA7yjdkpyaT7od56OjBmjA5WGLAAMuusk5AGrrazvSkRg URBOfXOrGccGW1if9fT9Fl3M+XcYKayhFPsbuQZAN+X9YhUZuihv6VZZoyjZBtEpci/9 PBR0w2qCcd7YNxmrUMAiiSVOUiA+GoXLPjAA+5TrDBbVIqlWmZ1YzUiTIxg8aOr/OJdV Og== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2120.oracle.com with ESMTP id 2uuf5f839k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 06 Sep 2019 03:39:19 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) 
 with SMTP id x863dIRx112883;
 Fri, 6 Sep 2019 03:39:18 GMT
Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236])
 by aserp3020.oracle.com with ESMTP id 2uud7p2sds-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Fri, 06 Sep 2019 03:39:17 +0000
Received: from abhmp0015.oracle.com (abhmp0015.oracle.com [141.146.116.21])
 by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x863cjGF020665;
 Fri, 6 Sep 2019 03:38:45 GMT
Received: from localhost (/10.159.148.70)
 by default (Oracle Beehive Gateway v4.0) with ESMTP ;
 Thu, 05 Sep 2019 20:38:45 -0700
Subject: [PATCH 10/11] xfs_scrub: fix read verify disk error handling strategy
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:45 -0700
Message-ID: <156774112526.2645135.10744599143310432497.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371
 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2
 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0
 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-1906280000 definitions=main-1909060040
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371
 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0
 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0
 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0
 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-1906280000 definitions=main-1909060040
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

The error handling strategy for media errors is totally bogus. First of
all, short reads are entirely unhandled -- when we encounter a short
read, we know the disk was able to feed us the beginning of what we
asked for, so we need to single-step through the remainder to try to
capture the exact error that we hit.

Second, an actual IO error causes the entire region to be marked bad
even though it could be just a few MB of a multi-gigabyte extent that's
bad. Therefore, single-step each block in the IO request until we stop
getting IO errors to find out if all the blocks are bad or if it's just
that extent.

Third, fix the fact that the loop updates its own counter variables with
the length fed to read(), which doesn't necessarily have anything to do
with the amount of data that the read actually produced.

Signed-off-by: Darrick J. Wong
---
 scrub/read_verify.c |   86 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 74 insertions(+), 12 deletions(-)

diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 9d9be68d..3dac10ce 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -169,30 +169,92 @@ read_verify(
 	struct read_verify		*rv = arg;
 	struct read_verify_pool		*rvp;
 	unsigned long long		verified = 0;
+	ssize_t				io_max_size;
 	ssize_t				sz;
 	ssize_t				len;
+	int				io_error;
 	int				ret;
 
 	rvp = (struct read_verify_pool *)wq->wq_ctx;
+	if (rvp->errors_seen)
+		return;
+
+	io_max_size = RVP_IO_MAX_SIZE;
+
 	while (rv->io_length > 0) {
-		len = min(rv->io_length, RVP_IO_MAX_SIZE);
+		io_error = 0;
+		len = min(rv->io_length, io_max_size);
 		dbg_printf("diskverify %d %"PRIu64" %zu\n", rvp->disk->d_fd,
 				rv->io_start, len);
 		sz = disk_read_verify(rvp->disk, rvp->readbuf, rv->io_start,
 				len);
-		if (sz < 0) {
-			dbg_printf("IOERR %d %"PRIu64" %zu\n",
-					rvp->disk->d_fd, rv->io_start, len);
-			/* IO error, so try the next logical block. */
-			len = rvp->miniosz;
-			rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, len,
-					errno, rv->io_end_arg);
+		if (sz == len && io_max_size < rvp->miniosz) {
+			/*
+			 * If the verify request was 100% successful and less
+			 * than a single block in length, we were trying to
+			 * read to the end of a block after a short read. That
+			 * suggests there's something funny with this device,
+			 * so single-step our way through the rest of the @rv
+			 * range.
+			 */
+			io_max_size = rvp->miniosz;
+		} else if (sz < 0) {
+			io_error = errno;
+
+			/* Runtime error, bail out... */
+			if (io_error != EIO && io_error != EILSEQ) {
+				rvp->errors_seen = io_error;
+				return;
+			}
+
+			/*
+			 * A direct read encountered an error while performing
+			 * a multi-block read. Reduce the transfer size to a
+			 * single block so that we can identify the exact range
+			 * of bad blocks and good blocks. We single-step all
+			 * the way to the end of the @rv range, (re)starting
+			 * with the block that just failed.
+			 */
+			if (io_max_size > rvp->miniosz) {
+				io_max_size = rvp->miniosz;
+				continue;
+			}
+
+			/*
+			 * A direct read hit an error while we were stepping
+			 * through single blocks. Mark everything bad from
+			 * io_start to the next miniosz block.
+			 */
+			sz = rvp->miniosz - (rv->io_start % rvp->miniosz);
+			dbg_printf("IOERR %d @ %"PRIu64" %zu err %d\n",
+					rvp->disk->d_fd, rv->io_start, sz,
+					io_error);
+			rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, sz,
+					io_error, rv->io_end_arg);
+		} else if (sz < len) {
+			/*
+			 * A short direct read suggests that we might have hit
+			 * an IO error midway through the read but still had to
+			 * return the number of bytes that were actually read.
+			 *
+			 * We need to force an EIO, so try reading the rest of
+			 * the block (if it was a partial block read) or the
+			 * next full block.
+			 */
+			io_max_size = rvp->miniosz - (sz % rvp->miniosz);
+			dbg_printf("SHORT %d READ @ %"PRIu64" %zu try for %zd\n",
+					rvp->disk->d_fd, rv->io_start, sz,
+					io_max_size);
+		} else {
+			/* We should never get back more bytes than we asked. */
+			assert(sz == len);
 		}
 
-		progress_add(len);
-		verified += len;
-		rv->io_start += len;
-		rv->io_length -= len;
+		progress_add(sz);
+		if (io_error == 0)
+			verified += sz;
+		rv->io_start += sz;
+		rv->io_length -= sz;
 	}
 
 	free(rv);

From patchwork Fri Sep 6 03:38:51 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11134383
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CAB63924
 for ; Fri, 6 Sep 2019 03:38:56 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.kernel.org (Postfix) with ESMTP id B61A52082C
 for ; Fri, 6 Sep 2019 03:38:56 +0000 (UTC)
Authentication-Results: mail.kernel.org;
 dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com
 header.b="Od3D8M+5"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2392146AbfIFDi4 (ORCPT );
 Thu, 5 Sep 2019 23:38:56 -0400
Received: from aserp2120.oracle.com ([141.146.126.78]:45094 "EHLO
 aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1731760AbfIFDi4 (ORCPT );
 Thu, 5 Sep 2019 23:38:56 -0400
Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1])
 by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863YcF1074767;
 Fri, 6 Sep 2019 03:38:54 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com;
 h=subject : from : to : cc : date : message-id : in-reply-to : references :
 mime-version : content-type : content-transfer-encoding;
 s=corp-2019-08-05;
 bh=DHp6NeqHX2UVAq8iW1ZZUFju6HMRz5iZsEGCXgnurz0=;
 b=Od3D8M+5uPMxQuIQPfLpQ1Hm87CfBO4uZN4GkdCRUCySVwJFX+vbulgxmf1lFJ0I0naU
 fI3i3HioQIyOtLYFtSI6QeqrqQVUcgbNXQN1BQGe4TqjuqKljYTpVrBHHZz8y9+EYuxq
 XKJ80irHoEt1GBknDyPeUv1wY/kr/D4H/ru5PZdAtXWOG3wk9NnZM6x6S/I3EGWMnHoq
 gk1N5boHQ1BAmMFypbqzX3ZAUzrioaR/aJTz8JScj4ugLdugFCACt//QqqqjMxo7Ev2z
 yUoB8zizdg3dP/fKWWCVNJvSZtVqrI1mviOG9J8z6aSFSR+xIoixP3lCQZq2EKYHhdG/
 vQ==
Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79])
 by aserp2120.oracle.com with ESMTP id 2uuf51g3cu-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Fri, 06 Sep 2019 03:38:54 +0000
Received: from pps.filterd (userp3020.oracle.com [127.0.0.1])
 by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x863cP8X077906;
 Fri, 6 Sep 2019 03:38:53 GMT
Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235])
 by userp3020.oracle.com with ESMTP id 2utvr4jy7b-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Fri, 06 Sep 2019 03:38:53 +0000
Received: from abhmp0020.oracle.com (abhmp0020.oracle.com [141.146.116.26])
 by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x863cqph005740;
 Fri, 6 Sep 2019 03:38:52 GMT
Received: from localhost (/10.159.148.70)
 by default (Oracle Beehive Gateway v4.0) with ESMTP ;
 Thu, 05 Sep 2019 20:38:52 -0700
Subject: [PATCH 11/11] xfs_scrub: simulate errors in the read-verify phase
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Thu, 05 Sep 2019 20:38:51 -0700
Message-ID: <156774113148.2645135.9143982725131395334.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>
References: <156774106064.2645135.2756383874064764589.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371
 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0
 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0
 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-1906280000 definitions=main-1909060040
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9371
 signatures=668685
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0
 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0
 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0
 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-1906280000 definitions=main-1909060039
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Add a debugging hook so that we can simulate disk errors during the
media scan to test that the code works.

Signed-off-by: Darrick J. Wong
---
 scrub/disk.c      |   67 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 scrub/xfs_scrub.c |    2 ++
 2 files changed, 69 insertions(+)

diff --git a/scrub/disk.c b/scrub/disk.c
index bf9c795a..214a5346 100644
--- a/scrub/disk.c
+++ b/scrub/disk.c
@@ -276,6 +276,59 @@ disk_close(
 #define LBASIZE(d)		(1ULL << (d)->d_lbalog)
 #define BTOLBA(d, bytes)	(((uint64_t)(bytes) + LBASIZE(d) - 1) >> (d)->d_lbalog)
 
+/* Simulate disk errors. */
+static int
+disk_simulate_read_error(
+	struct disk		*disk,
+	uint64_t		start,
+	uint64_t		*length)
+{
+	static int64_t		interval;
+	uint64_t		start_interval;
+
+	/* Simulated disk errors are disabled. */
+	if (interval < 0)
+		return 0;
+
+	/* Figure out the disk read error interval. */
+	if (interval == 0) {
+		char		*p;
+
+		/* Pretend there's bad media every so often, in bytes. */
+		p = getenv("XFS_SCRUB_DISK_ERROR_INTERVAL");
+		if (p == NULL) {
+			interval = -1;
+			return 0;
+		}
+		interval = strtoull(p, NULL, 10);
+		interval &= ~((1U << disk->d_lbalog) - 1);
+	}
+
+	/*
+	 * We simulate disk errors by pretending that there are media errors at
+	 * predetermined intervals across the disk. If a read verify request
+	 * crosses one of those intervals we shorten it so that the next read
+	 * will start on an interval threshold. If the read verify request
+	 * starts on an interval threshold, we send back EIO as if it had
+	 * failed.
+	 */
+	if ((start % interval) == 0) {
+		dbg_printf("fd %d: simulating disk error at %"PRIu64".\n",
+				disk->d_fd, start);
+		return EIO;
+	}
+
+	start_interval = start / interval;
+	if (start_interval != (start + *length) / interval) {
+		*length = ((start_interval + 1) * interval) - start;
+		dbg_printf(
+"fd %d: simulating short read at %"PRIu64" to length %"PRIu64".\n",
+				disk->d_fd, start, *length);
+	}
+
+	return 0;
+}
+
 /* Read-verify an extent of a disk device. */
 ssize_t
 disk_read_verify(
@@ -284,6 +337,20 @@ disk_read_verify(
 	uint64_t		start,
 	uint64_t		length)
 {
+	if (debug) {
+		int		ret;
+
+		ret = disk_simulate_read_error(disk, start, &length);
+		if (ret) {
+			errno = ret;
+			return -1;
+		}
+
+		/* Don't actually issue the IO */
+		if (getenv("XFS_SCRUB_DISK_VERIFY_SKIP"))
+			return length;
+	}
+
 	/* Convert to logical block size. */
 	if (disk->d_flags & DISK_FLAG_SCSI_VERIFY)
 		return disk_scsi_verify(disk, BTOLBAT(disk, start),
diff --git a/scrub/xfs_scrub.c b/scrub/xfs_scrub.c
index 05478093..b6a01274 100644
--- a/scrub/xfs_scrub.c
+++ b/scrub/xfs_scrub.c
@@ -111,6 +111,8 @@
  * XFS_SCRUB_NO_SCSI_VERIFY	-- disable SCSI VERIFY (if present)
  * XFS_SCRUB_PHASE		-- run only this scrub phase
  * XFS_SCRUB_THREADS		-- start exactly this number of threads
+ * XFS_SCRUB_DISK_ERROR_INTERVAL-- simulate a disk error every this many bytes
+ * XFS_SCRUB_DISK_VERIFY_SKIP	-- pretend disk verify read calls succeeded
  *
  * Available even in non-debug mode:
  * SERVICE_MODE			-- compress all error codes to 1 for LSB