From patchwork Wed Jun 28 11:02:20 2023
X-Patchwork-Submitter: Haibo Li
X-Patchwork-Id: 13295575
From: Haibo Li
CC: "Matthew Wilcox (Oracle)", Andrew Morton, Matthias Brugger,
 AngeloGioacchino Del Regno, Haibo Li
Subject: [PATCH] mm/filemap.c:fix update prev_pos after one read request done
Date: Wed, 28 Jun 2023 19:02:20 +0800
Message-ID: <20230628110220.120134-1-haibo.li@mediatek.com>
X-Mailer: git-send-email 2.25.1

ra->prev_pos tracks the last byte visited by the previous read request.
ondemand_readahead uses it to check whether a read is sequential, so it
affects the readahead window.

Commit 06c0444290ce ("mm/filemap.c: generic_file_buffered_read() now
uses find_get_pages_contig") changed the update logic of prev_pos.
prev_pos is now updated after every return from filemap_get_pages, but
the read request from the user may not be fully completed at that point.
The updated prev_pos then affects the subsequent readahead window.

The real problem is a performance drop in fsck_msdos between linux-5.4
and linux-5.15 (also linux-6.4). Compared to linux-5.4, it takes about
110% of the time and reads 140% of the pages. The read pattern of
fsck_msdos is not fully sequential.

A simplified read pattern of fsck_msdos looks like this:
 1. read at page offset 0xa, size 0x1000
 2. read at another page offset such as 0x20, size 0x1000
 3. read at page offset 0xa, size 0x4000
 4. read at page offset 0xe, size 0x1000

Here is the read status on linux-6.4:
 1. after read at page offset 0xa, size 0x1000
    -> page ofs 0xa goes into the pagecache
 2. after read at page offset 0x20, size 0x1000
    -> page ofs 0x20 goes into the pagecache
 3. read at page offset 0xa, size 0x4000
    -> filemap_get_pages reads ofs 0xa from the pagecache and returns
    -> prev_pos is updated to 0xb and the loop continues
    -> filemap_get_pages tries to read ofs 0xb, size 0x3000
    -> initial_readahead case in ondemand_readahead, since prev_pos is
       the same as the request ofs
    -> reads 8 pages, with an async size of 5 pages
       (PageReadahead flag at page 0xe)
 4. read at page offset 0xe, size 0x1000
    -> hits page 0xe with the PageReadahead flag set and doubles ra_size
    -> reads 16 pages, with an async size of 16 pages
 Now it has read 24 pages while actually using 5 pages.

On linux-5.4:
 1. the same as 6.4
 2. the same as 6.4
 3. read at page offset 0xa, size 0x4000
    -> reads ofs 0xa from the pagecache
    -> reads ofs 0xb, size 0x3000 using page_cache_sync_readahead,
       which reads 3 pages
    -> prev_pos is updated to 0xd before generic_file_buffered_read
       returns
 4. read at page offset 0xe, size 0x1000
    -> initial_readahead case in ondemand_readahead, since
       request ofs - prev_pos == 1
    -> reads 4 pages, with an async size of 3 pages
 Now it has read 7 pages while actually using 5 pages.

In the demo above, the initial_readahead case is triggered by the
offset of the user request on linux-5.4, while on linux-6.4 it may be
triggered by the update logic of prev_pos.

To fix the performance drop, update prev_pos only after one read
request has finished.

Signed-off-by: Haibo Li
Reviewed-by: Jan Kara
---
 mm/filemap.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 83dda76d1fc3..16b2054eee71 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2670,6 +2670,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 	int i, error = 0;
 	bool writably_mapped;
 	loff_t isize, end_offset;
+	loff_t last_pos = ra->prev_pos;
 
 	if (unlikely(iocb->ki_pos >= inode->i_sb->s_maxbytes))
 		return 0;
@@ -2721,8 +2722,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		 * When a read accesses the same folio several times, only
 		 * mark it as accessed the first time.
 		 */
-		if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
-							fbatch.folios[0]))
+		if (!pos_same_folio(iocb->ki_pos, last_pos - 1,
+				    fbatch.folios[0]))
 			folio_mark_accessed(fbatch.folios[0]);
 
 		for (i = 0; i < folio_batch_count(&fbatch); i++) {
@@ -2749,7 +2750,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 
 			already_read += copied;
 			iocb->ki_pos += copied;
-			ra->prev_pos = iocb->ki_pos;
+			last_pos = iocb->ki_pos;
 
 			if (copied < bytes) {
 				error = -EFAULT;
@@ -2763,7 +2764,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 	} while (iov_iter_count(iter) && iocb->ki_pos < isize && !error);
 
 	file_accessed(filp);
-
+	ra->prev_pos = last_pos;
 	return already_read ? already_read : error;
 }
 EXPORT_SYMBOL_GPL(filemap_read);
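
For illustration only, the effect of the update timing on step 3 of the
demo can be sketched as a small userspace model. This is not kernel code
and not part of the patch: struct ra_model, looks_sequential() and
mid_request_sequential_hits() are hypothetical names, 4 KiB pages are
assumed, and only the "requested index equals or directly follows the
previous index" test is modeled, evaluated on every batch after the
first one rather than only on a cache miss.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, matching the demo offsets */

/* Hypothetical stand-in for the readahead state; only prev_pos is modeled. */
struct ra_model {
	int64_t prev_pos;
};

/*
 * Sequential test: the requested page index equals the previous index or
 * directly follows it. The subtraction is unsigned, so a backward seek
 * wraps to a large value and is not treated as sequential.
 */
static bool looks_sequential(uint64_t index, int64_t prev_pos)
{
	uint64_t prev_index = (uint64_t)prev_pos >> PAGE_SHIFT;

	return index - prev_index <= 1;
}

/*
 * Model one read request served in n batches of batch_pages[i] pages,
 * starting at byte offset pos. Returns how many batches after the first
 * one pass the sequential test.
 *
 * per_batch == true : prev_pos is stored after every batch
 * per_batch == false: prev_pos is stored once, after the whole request
 */
static int mid_request_sequential_hits(struct ra_model *ra, int64_t pos,
				       const uint64_t *batch_pages, size_t n,
				       bool per_batch)
{
	int64_t last_pos = ra->prev_pos;
	int hits = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t index = (uint64_t)pos >> PAGE_SHIFT;

		if (i > 0 && looks_sequential(index, ra->prev_pos))
			hits++;

		pos += (int64_t)(batch_pages[i] << PAGE_SHIFT);
		last_pos = pos;
		if (per_batch)
			ra->prev_pos = pos;
	}
	ra->prev_pos = last_pos;
	return hits;
}
```

With prev_pos assumed to be 0x21000 after step 2, a request at page
offset 0xa served as batches of 1 and 3 pages passes the sequential test
in the middle of the request when prev_pos is stored per batch, and does
not when it is stored once at the end. Both variants leave prev_pos at
the same final value in this model.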