[v2,2/3] btrfs: qgroup: Try to flush qgroup space when we get -EDQUOT

[PROBLEM]
There are known problems related to how btrfs handles qgroup reserved
space.
One of the most obvious cases is the test case btrfs/153, which does
fallocate, then writes into the preallocated range.

  btrfs/153 1s ... - output mismatch (see xfstests-dev/results//btrfs/153.out.bad)
      --- tests/btrfs/153.out     2019-10-22 15:18:14.068965341 +0800
      +++ xfstests-dev/results//btrfs/153.out.bad      2020-07-01 20:24:40.730000089 +0800
      @@ -1,2 +1,5 @@
       QA output created by 153
      +pwrite: Disk quota exceeded
      +/mnt/scratch/testfile2: Disk quota exceeded
      +/mnt/scratch/testfile2: Disk quota exceeded
       Silence is golden
      ...
      (Run 'diff -u xfstests-dev/tests/btrfs/153.out xfstests-dev/results//btrfs/153.out.bad'  to see the entire diff)

[CAUSE]
Since commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to"),
we always reserve space no matter if it's COW or not.

Such behavior change is mostly for performance, and reverting it is not
a good idea anyway.

For a preallocated extent, we have already reserved qgroup data space,
and since we also reserve qgroup data space at buffered write time,
writing into preallocated space needs twice the space.

This leads to the -EDQUOT in the buffered write routine.
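
The double reservation can be illustrated with a small userspace model.
This is only a sketch: the toy_* names and the byte counts are
hypothetical and do not correspond to the real btrfs qgroup code, which
tracks reservations per extent range.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy model of a qgroup: a limit and the bytes currently reserved. */
struct toy_qgroup {
	uint64_t limit;
	uint64_t reserved;
};

/* Reserve @bytes; fail with -EDQUOT if that would exceed the limit. */
static int toy_qgroup_reserve(struct toy_qgroup *qg, uint64_t bytes)
{
	assert(qg->reserved <= qg->limit);
	if (bytes > qg->limit - qg->reserved)
		return -EDQUOT;
	qg->reserved += bytes;
	return 0;
}

/*
 * Model fallocate() of @len bytes followed by a buffered write over the
 * same range. Both steps reserve @len, so the pair needs 2 * @len.
 */
static int toy_prealloc_then_write(struct toy_qgroup *qg, uint64_t len)
{
	int ret = toy_qgroup_reserve(qg, len);	/* reservation for the prealloc */

	if (ret)
		return ret;
	return toy_qgroup_reserve(qg, len);	/* second reservation at write time */
}
```

With a limit of 100 and a length of 60 in this model, the prealloc
succeeds but the write fails with -EDQUOT, although the file itself
only needs 60.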

And we can't follow the same solution as the data/meta space check:
unlike those, qgroup reserved space is shared between data and metadata.
The -EDQUOT can happen at the metadata reservation, so doing the NODATACOW
check after a qgroup reservation failure is not a solution.

[FIX]
To solve the problem, we don't return -EDQUOT directly. Instead, every
time we get -EDQUOT, we try to flush qgroup space by:
- Flush all inodes of the root
  NODATACOW writes will free the qgroup reserved space at
  run_delalloc_range().
  However, we don't have the infrastructure to flush only NODATACOW
  inodes, so we flush all inodes anyway.

- Wait for ordered extents
  This converts the preallocated metadata space into per-trans
  metadata, which can be freed by a later transaction commit.

- Commit transaction
  This will free all per-trans metadata space.

Also, we don't want concurrent flushes to race with each other, so we
introduce a per-root mutex: if a qgroup flush is already running, we wait
for it to end and don't start another flush.
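
The retry-after-flush flow and the mutex behavior can be sketched in
userspace as below. This is a simplified model under stated assumptions,
not the patch itself: the toy_* names are hypothetical, a pthread mutex
stands in for the per-root kernel mutex, and the three flush steps are
collapsed into releasing a "flushable" byte count.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Toy model of a root with qgroup accounting (all fields hypothetical). */
struct toy_root {
	pthread_mutex_t flush_mutex;	/* stand-in for the per-root mutex */
	uint64_t limit;
	uint64_t reserved;
	uint64_t flushable;	/* reserved bytes a flush can release */
	unsigned int flushes;	/* number of flushes actually performed */
};

/* Counters are unsynchronized here; the model is single-threaded. */
static int toy_reserve(struct toy_root *root, uint64_t bytes)
{
	if (bytes > root->limit - root->reserved)
		return -EDQUOT;
	root->reserved += bytes;
	return 0;
}

static void toy_flush(struct toy_root *root)
{
	if (pthread_mutex_trylock(&root->flush_mutex) != 0) {
		/* A flush is already running: wait for it, don't re-flush. */
		pthread_mutex_lock(&root->flush_mutex);
		pthread_mutex_unlock(&root->flush_mutex);
		return;
	}
	/*
	 * Stand-in for: flush all inodes, wait for ordered extents,
	 * commit the transaction.
	 */
	assert(root->flushable <= root->reserved);
	root->reserved -= root->flushable;
	root->flushable = 0;
	root->flushes++;
	pthread_mutex_unlock(&root->flush_mutex);
}

/* Try to reserve; on -EDQUOT flush once and retry once. */
static int toy_reserve_with_retry(struct toy_root *root, uint64_t bytes)
{
	int ret = toy_reserve(root, bytes);

	if (ret != -EDQUOT)
		return ret;
	toy_flush(root);
	return toy_reserve(root, bytes);
}
```

In this model the caller sees -EDQUOT only if the reservation still
fails after one flush, which mirrors the intent described above.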

Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/ctree.h   |   1 +
 fs/btrfs/disk-io.c |   1 +
 fs/btrfs/qgroup.c  | 118 ++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 108 insertions(+), 12 deletions(-)

Message ID	20200703061902.33350-3-wqu@suse.com (mailing list archive)
State	New, archived
From	Qu Wenruo <wqu@suse.com>
To	linux-btrfs@vger.kernel.org
Date	Fri, 3 Jul 2020 14:19:01 +0800
Series	btrfs: qgroup: Fix the long existing regression of btrfs/153
  [v2,0/3] btrfs: qgroup: Fix the long existing regression of btrfs/153
  [v2,1/3] btrfs: qgroup: Allow btrfs_qgroup_reserve_data() to revert EXTENT_QGROUP_RESERVED bits whe…
  [v2,2/3] btrfs: qgroup: Try to flush qgroup space when we get -EDQUOT
  [v2,3/3] btrfs: qgroup: remove the ASYNC_COMMIT mechanism in favor of qgroup reserve retry-after-ED…
