From patchwork Tue Jun 25 10:18:51 2024
X-Patchwork-Submitter: Jan Kara <jack@suse.cz>
X-Patchwork-Id: 13710839
From: Jan Kara <jack@suse.cz>
Cc: Andrew Morton, Matthew Wilcox, Jan Kara
Subject: [PATCH 01/10] readahead: Make sure sync readahead reads needed page
Date: Tue, 25 Jun 2024 12:18:51 +0200
Message-Id: <20240625101909.12234-1-jack@suse.cz>
In-Reply-To: <20240625100859.15507-1-jack@suse.cz>
References: <20240625100859.15507-1-jack@suse.cz>

page_cache_sync_ra() is called when a folio we want to read is not in the
page cache. It is expected to create the folio (and perhaps the following
folios as well) and submit reads for them unless some error happens.
However, if index == ra->start + ra->size, ondemand_readahead() will
treat the call as another async readahead hit. Thus ra->start will be
advanced and we create pages and queue reads from ra->start + ra->size
onward. Consequently the page at 'index' is not created and
filemap_get_pages() always has to go through the filemap_create_folio()
path.

This behavior has particularly unfortunate consequences when we have two
IO threads sequentially reading from a shared file (as is the case when
NFS serves sequential reads). In that case what can happen is:

suppose ra->size == ra->async_size == 128, ra->start = 512

T1                                      T2
reads 128 pages at index 512
  - hits async readahead mark
    filemap_readahead()
      ondemand_readahead()
        if (index == expected ...)
          ra->start = 512 + 128 = 640
          ra->size = 128
          ra->async_size = 128
        page_cache_ra_order()
          blocks in ra_alloc_folio()
                                        reads 128 pages at index 640
                                          - no page found
                                          page_cache_sync_readahead()
                                            ondemand_readahead()
                                              if (index == expected ...)
                                                ra->start = 640 + 128 = 768
                                                ra->size = 128
                                                ra->async_size = 128
                                            page_cache_ra_order()
                                              submits reads from 768
                                          - still no page found at index 640
                                            filemap_create_folio()
                                          - goes on to index 641
                                          page_cache_sync_readahead()
                                            ondemand_readahead()
                                              - finds ra is confused,
                                                trims it to a small size
          finds pages were already inserted

And as a result read performance suffers.

Fix the problem by triggering the async readahead case in
ondemand_readahead() only if we are calling the function because we hit
the readahead marker. In any other case we need to read the folio at
'index' and thus we cannot really use the current ra state.

Note that the above situation could be viewed as a special case of
file->f_ra state corruption. In fact, two threads reading through the
shared file can also seemingly corrupt file->f_ra in interesting ways due
to concurrent access. I never saw that in practice, and the fix is going
to be much more complex, so for now at least fix this practical problem
while we ponder the theoretically correct solution.
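To make T2's step in the timeline concrete, here is a minimal user-space
sketch of the old and new branch conditions. It is not kernel code:
struct ra_model and all helper names are hypothetical stand-ins, and the
window size is held constant where the kernel would call
get_next_ra_size().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three file_ra_state fields used here. */
struct ra_model {
	unsigned long start;      /* first page index of the current window */
	unsigned long size;       /* window length in pages */
	unsigned long async_size; /* pages left when the marker is hit */
};

/* Round x down to a multiple of 1 << order. */
static unsigned long expected_index(const struct ra_model *ra,
				    unsigned int order)
{
	unsigned long align = 1UL << order;

	return (ra->start + ra->size - ra->async_size) & ~(align - 1);
}

/* Old condition: ignores whether a folio was found at 'index'. */
static bool old_takes_async_branch(const struct ra_model *ra,
				   unsigned long index, bool have_folio,
				   unsigned int order)
{
	(void)have_folio;
	return index == expected_index(ra, order) ||
	       index == ra->start + ra->size;
}

/* New condition: only a readahead-marker hit (folio present) qualifies. */
static bool new_takes_async_branch(const struct ra_model *ra,
				   unsigned long index, bool have_folio,
				   unsigned int order)
{
	return have_folio && index == expected_index(ra, order);
}

/*
 * Body of the branch, simplified: the size stays constant here
 * (assumption), whereas the kernel recomputes it.
 */
static void advance_window(struct ra_model *ra)
{
	ra->start += ra->size;
	ra->async_size = ra->size;
}

/* True if 'index' lies inside [start, start + size). */
static bool window_covers(const struct ra_model *ra, unsigned long index)
{
	return index >= ra->start && index < ra->start + ra->size;
}
```

With start = 640, size = async_size = 128 and a sync miss at index 640
(no folio), the old condition holds, the window moves to start at 768,
and it no longer covers index 640. The new condition is false for the
same call, so the caller falls through to the paths that read 'index'
itself.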

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/readahead.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index c1b23989d9ca..af0fbd302a38 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -580,7 +580,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 */
 	expected = round_down(ra->start + ra->size - ra->async_size,
 			1UL << order);
-	if (index == expected || index == (ra->start + ra->size)) {
+	if (folio && index == expected) {
 		ra->start += ra->size;
 		ra->size = get_next_ra_size(ra, max_pages);
 		ra->async_size = ra->size;