
[v2,0/2] btrfs: scrub: update last_physical more frequently

Message ID cover.1721627526.git.wqu@suse.com (mailing list archive)
Series btrfs: scrub: update last_physical more frequently

Message

Qu Wenruo July 22, 2024, 5:55 a.m. UTC
[CHANGELOG]
v2:
- Rebased to the latest for-next branch
  No conflict at all

- Reword the second patch
  There are two problems: one is serious (no last_physical update at
  all for almost-full data chunks), the other is merely inconvenient
  (slow updates in "btrfs scrub status").

- Add the missing spinlock
  It's mentioned in the commit messages but not in the code

There is a report on the mailing list that scrub only updates its
@last_physical at the end of a chunk.
In fact, it can be worse if there is a used stripe (i.e. some extents
exist in that stripe) at the chunk boundary, as in that case the
@last_physical update for the whole chunk is skipped.
And for large fses, there are bound to be several such almost-full data
chunks.

With @last_physical not updated for a long time, if we cancel the scrub
halfway and later resume it, the resumed scrub would start from
@last_physical, meaning a lot of already scrubbed extents would be
re-scrubbed, wasting quite some IO and CPU.

This patchset fixes it by updating @last_physical for each finished
stripe (including the P/Q stripes of RAID56, and all data stripes for
all profiles), so that even if the scrub is cancelled, we re-scrub at
most one stripe.

Qu Wenruo (2):
  btrfs: extract the stripe length calculation into a helper
  btrfs: scrub: update last_physical after scrubbing one stripe

 fs/btrfs/scrub.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)
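
As a rough, user-space illustration of what the first patch's helper
conceptually computes (the name and signature below are assumptions
for this sketch, not the actual kernel code): a scrub stripe is at
most BTRFS_STRIPE_LEN (64 KiB) long, but is clamped so it never runs
past the end of the block group.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define BTRFS_STRIPE_LEN        (64ULL * 1024)  /* 64 KiB */

/*
 * Hypothetical stand-in for the extracted helper: the stripe starting
 * at @logical is at most BTRFS_STRIPE_LEN long, but never runs past
 * the end of the block group (@bg_end, in bytes).
 */
static uint64_t stripe_length(uint64_t logical, uint64_t bg_end)
{
        uint64_t remaining = bg_end - logical;

        return remaining < BTRFS_STRIPE_LEN ? remaining : BTRFS_STRIPE_LEN;
}

int main(void)
{
        uint64_t bg_start = 1024ULL * 1024 * 1024;              /* 1 GiB */
        uint64_t bg_end = bg_start + 256ULL * 1024 * 1024;      /* 256 MiB chunk */

        /* A stripe in the middle of the chunk gets the full 64 KiB ... */
        printf("middle: %" PRIu64 "\n", stripe_length(bg_start, bg_end));
        /* ... while the one at the chunk boundary is shortened. */
        printf("tail:   %" PRIu64 "\n",
               stripe_length(bg_end - 16 * 1024, bg_end));
        return 0;
}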

Comments

David Sterba July 25, 2024, 10:09 p.m. UTC | #1
On Mon, Jul 22, 2024 at 03:25:47PM +0930, Qu Wenruo wrote:
> Qu Wenruo (2):
>   btrfs: extract the stripe length calculation into a helper
>   btrfs: scrub: update last_physical after scrubbing one stripe

Reviewed-by: David Sterba <dsterba@suse.com>