From patchwork Sun Nov 24 23:57:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13884303
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 7/7] btrfs: allow buffered write to skip full page if it's sector aligned
Date: Mon, 25 Nov 2024 10:27:27 +1030
Message-ID: <2cf9783e7f152681164caa6abbc9535bfff9c6f8.1732492421.git.wqu@suse.com>
X-Mailer: git-send-email 2.47.0
X-Mailing-List: linux-btrfs@vger.kernel.org

[BUG]
Since the support of sector size < page size for btrfs, test case
generic/563 fails with 4K sector size and 64K page size:

    --- tests/generic/563.out	2024-04-25 18:13:45.178550333 +0930
    +++ /home/adam/xfstests-dev/results//generic/563.out.bad	2024-09-30 09:09:16.155312379 +0930
    @@ -3,7 +3,8 @@
     read is in range
     write is in range
     write -> read/write
    -read is in range
    +read has value of 8388608
    +read is NOT in range -33792 .. 33792
     write is in range
    ...

[CAUSE]
The test case creates an 8MiB file, then does buffered writes into it
using 4K block size, to overwrite the whole file.

On 4K page sized systems, since each write covers a full sector and a
full page, btrfs will not bother reading the page, just like what XFS
and EXT4 do.

But on 64K page sized systems, although the write is sector aligned,
it's not page aligned, thus btrfs still fails the page alignment check
and reads the full page.
This causes extra data reads, and fails the test case.

[FIX]
To skip the full page read, we need to do the following modifications:

- Do not trigger full page read as long as the buffered write is sector
  aligned
  This is pretty simple, by modifying the check inside
  prepare_uptodate_folio().

- Skip already uptodate sectors during full page read
  Otherwise we can hit the following data corruption:

   0       32K        64K
   |///////|          |

  Where the file range [0, 32K) is dirtied by buffered write, and the
  remaining range [32K, 64K) is not.

  When reading the full page, since [0, 32K) is only dirtied but not
  written back, there is no data extent map for it, only a hole
  covering [0, 64K).

  If we continue reading the full page range [0, 64K), the dirtied
  range will be filled with 0 (since there is only a hole covering the
  whole range).
  This causes the dirtied range to get lost.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 4 ++++
 fs/btrfs/file.c      | 5 +++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 624abe04401c..806315e82db7 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -991,6 +991,10 @@ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
 			end_folio_read(folio, true, cur, end - cur + 1);
 			break;
 		}
+		if (btrfs_folio_test_uptodate(fs_info, folio, cur, blocksize)) {
+			end_folio_read(folio, true, cur, blocksize);
+			continue;
+		}
 		em = __get_extent_map(inode, folio, cur, end - cur + 1, em_cached);
 		if (IS_ERR(em)) {
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e2c6165eba21..6f485b8bda66 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -841,14 +841,15 @@ static int prepare_uptodate_folio(struct inode *inode, struct folio *folio, u64
 {
 	u64 clamp_start = max_t(u64, pos, folio_pos(folio));
 	u64 clamp_end = min_t(u64, pos + len, folio_pos(folio) + folio_size(folio));
+	const u32 sectorsize = inode_to_fs_info(inode)->sectorsize;
 	int ret = 0;
 
 	if (folio_test_uptodate(folio))
 		return 0;
 
 	if (!force_uptodate &&
-	    IS_ALIGNED(clamp_start, PAGE_SIZE) &&
-	    IS_ALIGNED(clamp_end, PAGE_SIZE))
+	    IS_ALIGNED(clamp_start, sectorsize) &&
+	    IS_ALIGNED(clamp_end, sectorsize))
 		return 0;
 
 	ret = btrfs_read_folio(NULL, folio);
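
For illustration only, the two changes can be modelled in userspace
roughly as below. This is a sketch, not kernel code: struct page_model,
can_skip_read(), read_full_page() and buffered_write() are hypothetical
names, and the 64K page / 4K sector sizes and the "whole range is a
hole" read are assumptions taken from the scenario in the commit
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes for the sketch: 64K page, 4K sector. */
#define PAGE_SZ    65536u
#define SECTOR_SZ  4096u
#define NR_SECTORS (PAGE_SZ / SECTOR_SZ)

/* Hypothetical stand-in for a folio with per-sector uptodate state. */
struct page_model {
	uint8_t data[PAGE_SZ];
	bool uptodate[NR_SECTORS];
};

/* True if [start, end) is aligned to @align (a power of two). */
static bool can_skip_read(uint64_t start, uint64_t end, uint32_t align)
{
	uint64_t mask = (uint64_t)align - 1;

	return (start & mask) == 0 && (end & mask) == 0;
}

/*
 * Model a full page read where the range on disk is one big hole, so
 * every sector that is actually read comes back as zeros.
 * With @skip_uptodate false, already uptodate (dirty) sectors are
 * overwritten too, which models the lost-data case.
 */
static void read_full_page(struct page_model *p, bool skip_uptodate)
{
	for (uint32_t i = 0; i < NR_SECTORS; i++) {
		if (skip_uptodate && p->uptodate[i])
			continue;
		memset(p->data + (size_t)i * SECTOR_SZ, 0, SECTOR_SZ);
		p->uptodate[i] = true;
	}
}

/* Model a sector-aligned buffered write that fills the range with @byte. */
static void buffered_write(struct page_model *p, uint32_t off, uint32_t len,
			   uint8_t byte)
{
	assert(off % SECTOR_SZ == 0 && len % SECTOR_SZ == 0);
	assert(off + len <= PAGE_SZ);

	memset(p->data + off, byte, len);
	for (uint32_t i = off / SECTOR_SZ; i < (off + len) / SECTOR_SZ; i++)
		p->uptodate[i] = true;
}
```

In this model a 4K write at offset 4K fails can_skip_read() against
PAGE_SZ but passes against SECTOR_SZ. After buffered_write() on
[0, 32K), read_full_page() keeps the written bytes only when
skip_uptodate is true.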