From patchwork Wed Nov 26 15:30:39 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Filipe Manana
X-Patchwork-Id: 5386421
From: Filipe Manana
To: fstests@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org, Filipe Manana
Subject: [PATCH] fstests: add btrfs test to stress chunk allocation/removal and fstrim
Date: Wed, 26 Nov 2014 15:30:39 +0000
Message-Id: <1417015839-26985-1-git-send-email-fdmanana@suse.com>
X-Mailer: git-send-email 2.1.3
X-Mailing-List: fstests@vger.kernel.org

Stress btrfs' block group allocation and deallocation while running
fstrim in parallel. Part of the goal is also to get data block groups
deallocated so that new metadata block groups, using the same physical
device space ranges, get allocated while fstrim is running. This caused
several issues, including invalid memory accesses, kernel crashes,
metadata or data corruption, free space cache inconsistencies and free
space leaks.

Signed-off-by: Filipe Manana
---
 tests/btrfs/082     | 148 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/082.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 151 insertions(+)
 create mode 100755 tests/btrfs/082
 create mode 100644 tests/btrfs/082.out

diff --git a/tests/btrfs/082 b/tests/btrfs/082
new file mode 100755
index 0000000..8ac9f06
--- /dev/null
+++ b/tests/btrfs/082
@@ -0,0 +1,148 @@
+#! /bin/bash
+# FSQA Test No. 082
+#
+# Stress btrfs' block group allocation and deallocation while running fstrim in
+# parallel. Part of the goal is also to get data block groups deallocated so
+# that new metadata block groups, using the same physical device space ranges,
+# get allocated while fstrim is running. This caused several issues, including
+# invalid memory accesses, kernel crashes, metadata or data corruption,
+# free space cache inconsistencies and free space leaks.
+#
+# These issues were fixed by the following btrfs linux kernel patches:
+#
+#   Btrfs: fix invalid block group rbtree access after bg is removed
+#   Btrfs: fix crash caused by block group removal
+#   Btrfs: fix freeing used extents after removing empty block group
+#   Btrfs: fix race between fs trimming and block group remove/allocation
+#   Btrfs: fix race between writing free space cache and trimming
+#   Btrfs: make btrfs_abort_transaction consider existence of new block groups
+#
+# The issues were found on a qemu/kvm guest with 4 virtual CPUs, 4GB of RAM and
+# scsi-hd devices with discard support enabled (that means hole punching in the
+# disk's image file is performed by the host).
+#
+#-----------------------------------------------------------------------
+#
+# Copyright (C) 2014 SUSE Linux Products GmbH. All Rights Reserved.
+# Author: Filipe Manana
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	rm -fr $tmp
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_need_to_be_root
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch_nocheck
+_require_fstrim
+
+rm -f $seqres.full
+
+# Keep allocating and deallocating 2G of data space with the goal of creating
+# and deleting 2 block groups constantly. The intention is to race with the
+# fstrim loop below.
+fallocate_loop()
+{
+	local name=$1
+	while true; do
+		$XFS_IO_PROG -f -c "falloc -k 0 2G" \
+			$SCRATCH_MNT/$name &> /dev/null
+		sleep 3
+		$XFS_IO_PROG -c "truncate 0" \
+			$SCRATCH_MNT/$name &> /dev/null
+		sleep 3
+	done
+}
+
+trim_loop()
+{
+	while true; do
+		$FSTRIM_PROG $SCRATCH_MNT
+	done
+}
+
+# Create a bunch of small files that get their single extent inlined in the
+# btree, so that we consume a lot of metadata space and get a chance of a
+# data block group getting deleted and reused for metadata later. Sometimes
+# the creation of all these files succeeds, other times we get ENOSPC failures
+# at some point - this depends on how fast the btrfs' cleaner kthread is
+# notified about empty block groups, how fast it deletes them and how fast
+# the fallocate calls happen. So we don't really care if they all succeed or
+# not, the goal is just to keep metadata space usage growing while data block
+# groups are deleted.
+create_files()
+{
+	local prefix=$1
+
+	for ((i = 1; i <= 400000; i++)); do
+		echo "Creating file ${prefix}_$i" >>$seqres.full 2>&1
+		$XFS_IO_PROG -f -c "pwrite -S 0xaa 0 3900" \
+			$SCRATCH_MNT/"${prefix}_$i" >>$seqres.full 2>&1
+		ret=$?
+		if [ $ret -ne 0 ]; then
+			break
+		fi
+	done
+
+}
+
+fsz=`expr 40 \* 1024 \* 1024 \* 1024`
+_scratch_mkfs_sized $fsz >>$seqres.full 2>&1 || \
+	_fail "size=$fsz mkfs failed"
+_scratch_mount
+
+for ((i = 0; i < 4; i++)); do
+	trim_loop &
+	trim_pids[$i]=$!
+done
+
+fallocate_loop "falloc_file" &
+fallocate_pid=$!
+
+create_files "foobar"
+
+kill $fallocate_pid
+kill ${trim_pids[@]}
+wait
+
+# Sleep a bit, otherwise umount fails often with EBUSY (TODO: investigate why).
+sleep 3
+
+# Check for fs consistency. The trimming was racy and caused some btree nodes
+# to get full of zeroes on disk, which obviously caused fs metadata corruption.
+# The race often led to missing free space entries in a block group's free
+# space cache too.
+_check_scratch_fs
+
+echo "Silence is golden"
+status=0
+exit
diff --git a/tests/btrfs/082.out b/tests/btrfs/082.out
new file mode 100644
index 0000000..2977f14
--- /dev/null
+++ b/tests/btrfs/082.out
@@ -0,0 +1,2 @@
+QA output created by 082
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index e79b848..6608005 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -84,3 +84,4 @@
 079 auto
 080 auto
 081 auto quick
+082 auto
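
Note (not part of the patch): the test's control flow is to start looping
workers in the background, do the real work in the foreground, then kill and
reap the workers. A minimal standalone sketch of that pattern follows. It is
an illustration only: worker_loop, worker_pids and alive are hypothetical
names, the workers merely sleep instead of calling fstrim or xfs_io, and a
plain string of PIDs stands in for the bash array used in the test so that
the sketch also runs under a POSIX sh.

```shell
#!/bin/sh
# Hypothetical stand-in for trim_loop/fallocate_loop: spin until killed.
worker_loop()
{
	while true; do
		sleep 0.1
	done
}

# Start four background workers and remember their PIDs
# (the test stores them in the trim_pids array instead).
worker_pids=""
i=0
while [ "$i" -lt 4 ]; do
	worker_loop &
	worker_pids="$worker_pids $!"
	i=$((i + 1))
done

# Foreground work would go here (create_files in the test).
sleep 0.3

# Stop the workers, then reap them so none outlives the script.
kill $worker_pids
wait 2>/dev/null

# kill -0 sends no signal, it only probes whether the PID still exists.
alive=0
for pid in $worker_pids; do
	if kill -0 "$pid" 2>/dev/null; then
		alive=$((alive + 1))
	fi
done
echo "workers still running: $alive"
```

The kill goes to the looping subshell only. A command that subshell had
already started is not signalled by it, which is a property of this sketch
and is not offered as an explanation for the EBUSY noted in the test.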