From patchwork Thu Oct 26 14:48:16 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 10028471
From: Brian Foster
To: fstests@vger.kernel.org
Cc: linux-xfs@vger.kernel.org
Subject: [PATCH] tests/generic: test writepage cached mapping validity
Date: Thu, 26 Oct 2017 10:48:16 -0400
Message-Id: <20171026144816.9259-1-bfoster@redhat.com>
X-Mailing-List: fstests@vger.kernel.org

XFS has a bug where page writeback can end up sending data to the wrong
location due to a stale, cached file mapping. Add a test to trigger this
problem by racing background writeback with a truncate/rewrite of the
final page of the file.

Signed-off-by: Brian Foster
---

Here's a new version of the writepages test I previously posted as RFC.
This variant does not require an artificial delay to reproduce, so I've
dropped the need for the error injection tag.

I have been playing a bit with the file size and iteration count of the
test. I started with something that ran a decent bit longer (~2m) as was
necessary to reproduce on my dev/debug vm, but recently trimmed the file
size and iteration count to something that runs much quicker (~10s) and
reproduces nearly 100% of the time on my actual test hardware. The
tradeoff is the reproducibility is much lower on my debug vm (~20-25%
perhaps). The test still does reproduce when run over 10-15 iters, so I
opted for the quicker test.

In all, I am a bit curious about whether this reproduces reliably on
others' test setups.
If not, does tweaking the size/iterations improve the reproducibility?

Brian

v1:
- New test algorithm that does not require artificial delay.
- Created as generic test.
rfc: https://marc.info/?l=linux-xfs&m=150886719725497&w=2

 tests/generic/999     | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/999.out |  2 ++
 tests/generic/group   |  1 +
 3 files changed, 97 insertions(+)
 create mode 100755 tests/generic/999
 create mode 100644 tests/generic/999.out

diff --git a/tests/generic/999 b/tests/generic/999
new file mode 100755
index 0000000..9e56a1e
--- /dev/null
+++ b/tests/generic/999
@@ -0,0 +1,94 @@
+#! /bin/bash
+# FS QA Test 999
+#
+# Test XFS page writeback code for races with the cached file mapping. XFS
+# caches the file -> block mapping for a full extent once it is initially looked
+# up. The cached mapping is used for all subsequent pages in the same writeback
+# cycle that cover the associated extent. Under certain conditions, it is
+# possible for concurrent operations on the file to invalidate the cached
+# mapping without the knowledge of writeback. Writeback ends up sending I/O to a
+# partly stale mapping and potentially leaving delalloc blocks in the current
+# mapping unconverted.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017 Red Hat, Inc. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_test_program "feature"
+
+_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
+_scratch_mount || _fail "mount failed"
+
+file=$SCRATCH_MNT/file
+filesize=$((1024 * 1024 * 32))
+pagesize=`src/feature -s`
+truncsize=$((filesize - pagesize))
+
+for i in $(seq 0 15); do
+	# Truncate the file and fsync to persist the final size on-disk. This is
+	# required so the subsequent truncate will not wait on writeback.
+	$XFS_IO_PROG -fc "truncate 0" $file
+	$XFS_IO_PROG -c "truncate $filesize" -c fsync $file
+
+	# create a small enough delalloc extent to likely be contiguous
+	$XFS_IO_PROG -c "pwrite 0 $filesize" $file >> $seqres.full 2>&1
+
+	# Start writeback and a racing truncate and rewrite of the final page.
+	$XFS_IO_PROG -c "sync_range -w 0 0" $file &
+	sync_pid=$!
+	$XFS_IO_PROG -c "truncate $truncsize" \
+		     -c "pwrite $truncsize $pagesize" $file >> $seqres.full 2>&1
+
+	# If the test fails, the most likely outcome is an sb_fdblocks mismatch
+	# and/or an associated delalloc assert failure on inode reclaim. Cycle
+	# the mount to trigger detection.
+	wait $sync_pid
+	_scratch_cycle_mount || _fail "mount failed"
+done
+
+echo Silence is golden
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/999.out b/tests/generic/999.out
new file mode 100644
index 0000000..3b276ca
--- /dev/null
+++ b/tests/generic/999.out
@@ -0,0 +1,2 @@
+QA output created by 999
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index fbe0a7f..89342da 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -468,3 +468,4 @@
 463 auto quick clone dangerous
 464 auto rw
 465 auto rw quick aio
+999 auto quick
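
For anyone experimenting with the file size, the offset arithmetic the test
depends on can be checked outside of fstests. What follows is only a sketch
and not part of the patch: it assumes `getconf PAGESIZE` as a stand-in for
`src/feature -s`, touches no filesystem and does not attempt to reproduce
the race. It only shows that the racing truncate removes exactly the final
page and that the rewrite restores the original file size.

```shell
#!/bin/bash
# Sketch only: mirrors the size arithmetic used by the test.
# Assumption: getconf PAGESIZE stands in for fstests' src/feature -s.

filesize=$((1024 * 1024 * 32))
pagesize=$(getconf PAGESIZE)
truncsize=$((filesize - pagesize))

# The racing truncate cuts the file back by exactly one page ...
echo "truncate to:   $truncsize"
# ... and the pwrite of that page extends it back to the original size.
echo "rewrite range: [$truncsize, $((truncsize + pagesize)))"
echo "pages in file: $((filesize / pagesize))"
```

If filesize is changed, it probably should remain a multiple of the page
size so that the truncate and rewrite still cover exactly the last page.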