From patchwork Wed Nov 6 09:21:14 2024
X-Patchwork-Submitter: Yafang Shao
X-Patchwork-Id: 13864198
From: Yafang Shao
To: willy@infradead.org, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, Yafang Shao
Subject: [PATCH] mm/readahead: Fix large folio support in async readahead
Date: Wed, 6 Nov 2024 17:21:14 +0800
Message-Id: <20241106092114.8408-1-laoar.shao@gmail.com>

When testing large folio support with XFS on our servers, we observed that
only a few large folios are mapped when reading large files via mmap.
After a thorough analysis, I identified that it was caused by the
`/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
parameter is set to 128KB. After I tuned it to 2MB, large folios worked as
expected. However, I believe large folio behavior should not depend on the
value of read_ahead_kb. It would be more robust if the kernel could adapt
to it automatically.

With `/sys/block/*/queue/read_ahead_kb` set to a non-2MB aligned size, this
issue can be verified with a simple test case, as shown below:

      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/mman.h>
      #include <unistd.h>

      #define LEN (1024 * 1024 * 1024) // 1GB file

      int main(int argc, char *argv[])
      {
          char *addr;
          int fd, i;

          fd = open("data", O_RDWR);
          if (fd < 0) {
              perror("open");
              exit(-1);
          }

          addr = mmap(NULL, LEN, PROT_READ|PROT_WRITE,
                      MAP_SHARED, fd, 0);
          if (addr == MAP_FAILED) {
              perror("mmap");
              exit(-1);
          }

          if (madvise(addr, LEN, MADV_HUGEPAGE)) {
              perror("madvise");
              exit(-1);
          }

          // Touch one byte in every 4KB page to fault the file in
          for (i = 0; i < LEN / 4096; i++)
              memset(addr + i * 4096, 1, 1);

          while (1) {} // Verifiable with /proc/meminfo

          munmap(addr, LEN);
          close(fd);
          exit(0);
      }

When large folio support is enabled and read_ahead_kb is set to a smaller
value, ra->size (4MB) may exceed the maximum allowed size (e.g., 128KB). To
address this, we need to add a conditional check for such cases. However,
this alone is insufficient, as users might set read_ahead_kb to a larger,
non-hugepage-aligned value (e.g., 4MB + 128KB). In these instances, it is
essential to explicitly align ra->size with the hugepage size.
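
To make the arithmetic concrete, here is a minimal userspace sketch of the
two changes. It is not kernel code. Sizes are in pages, assuming 4KB pages
and a 2MB hugepage (order 9). The names align_up(), next_ra_size_old(),
next_ra_size_new() and async_ra_size() are hypothetical stand-ins, and the
`cur < max / 16` branch is not visible in the diff context; it is assumed
from the existing get_next_ra_size().

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the kernel's ALIGN(): round x up to
 * a multiple of a, where a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Ramp-up before the patch: the result is always clamped to max.
 * The first branch is an assumption about the unpatched function. */
static unsigned long next_ra_size_old(unsigned long cur, unsigned long max)
{
	if (cur < max / 16)
		return 4 * cur;
	if (cur <= max / 2)
		return 2 * cur;
	return max;
}

/* Ramp-up with the patch: a window that is already larger than max
 * (e.g. one set up for a hugepage) is kept instead of being shrunk. */
static unsigned long next_ra_size_new(unsigned long cur, unsigned long max)
{
	if (cur < max / 16)
		return 4 * cur;
	if (cur <= max / 2)
		return 2 * cur;
	if (cur > max)
		return cur;
	return max;
}

/* Async readahead with the patch: the next size is rounded up to a
 * multiple of the folio size, 1 << order pages. */
static unsigned long async_ra_size(unsigned long cur, unsigned long max,
				   unsigned int order)
{
	return align_up(next_ra_size_new(cur, max), 1UL << order);
}
```

Under these assumptions, with ra->size at 1024 pages (4MB) and
read_ahead_kb at 128KB (32 pages), next_ra_size_old(1024, 32) gives 32
pages, while async_ra_size(1024, 32, 9) keeps 1024 pages. With
read_ahead_kb at 4MB + 128KB (1056 pages), next_ra_size_new(1024, 1056)
gives 1056 pages, and async_ra_size(1024, 1056, 9) rounds that up to 1536
pages, a multiple of the 512-page hugepage.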

Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings")
Signed-off-by: Yafang Shao
Cc: Matthew Wilcox
---
 mm/readahead.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Changes:
RFC->v1:
- Simplify the code as suggested by Matthew

RFC: https://lore.kernel.org/linux-mm/20241104143015.34684-1-laoar.shao@gmail.com/

diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..9e2c6168ebfa 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -385,6 +385,8 @@ static unsigned long get_next_ra_size(struct file_ra_state *ra,
 		return 4 * cur;
 	if (cur <= max / 2)
 		return 2 * cur;
+	if (cur > max)
+		return cur;
 	return max;
 }
 
@@ -642,7 +644,7 @@ void page_cache_async_ra(struct readahead_control *ractl,
 			1UL << order);
 	if (index == expected) {
 		ra->start += ra->size;
-		ra->size = get_next_ra_size(ra, max_pages);
+		ra->size = ALIGN(get_next_ra_size(ra, max_pages), 1 << order);
 		ra->async_size = ra->size;
 		goto readit;
 	}