From patchwork Thu Oct 29 19:33:47 2020
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11867327
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)"
Subject: [PATCH 01/19] XArray: Expose xas_destroy
Date: Thu, 29 Oct 2020 19:33:47 +0000
Message-Id: <20201029193405.29125-2-willy@infradead.org>
In-Reply-To: <20201029193405.29125-1-willy@infradead.org>
References: <20201029193405.29125-1-willy@infradead.org>

This proves to be useful functionality for the THP page cache.

Signed-off-by: Matthew Wilcox (Oracle)
---
 include/linux/xarray.h | 1 +
 lib/xarray.c           | 7 ++++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 92c0160b3352..4d40279f49d1 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1503,6 +1503,7 @@ void *xas_find_marked(struct xa_state *, unsigned long max, xa_mark_t);
 void xas_init_marks(const struct xa_state *);
 
 bool xas_nomem(struct xa_state *, gfp_t);
+void xas_destroy(struct xa_state *);
 void xas_pause(struct xa_state *);
 
 void xas_create_range(struct xa_state *);
diff --git a/lib/xarray.c b/lib/xarray.c
index fb3a0ccebb7e..fc70e37c4c17 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -258,13 +258,14 @@ static void xa_node_free(struct xa_node *node)
 	call_rcu(&node->rcu_head, radix_tree_node_rcu_free);
 }
 
-/*
+/**
 * xas_destroy() - Free any resources allocated during the XArray operation.
 * @xas: XArray operation state.
 *
- * This function is now internal-only.
+ * Usually xas_destroy() is called by xas_nomem(), but some users want to
+ * unconditionally release any memory that was allocated.
 */
-static void xas_destroy(struct xa_state *xas)
+void xas_destroy(struct xa_state *xas)
 {
	struct xa_node *next, *node = xas->xa_alloc;

From patchwork Thu Oct 29 19:33:48 2020
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11867311
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)"
Subject: [PATCH 02/19] mm: Use multi-index entries in the page cache
Date: Thu, 29 Oct 2020 19:33:48 +0000
Message-Id: <20201029193405.29125-3-willy@infradead.org>
In-Reply-To: <20201029193405.29125-1-willy@infradead.org>
References: <20201029193405.29125-1-willy@infradead.org>

We currently store order-N THPs as 2^N consecutive entries.  While this
consumes rather more memory than necessary, it also turns out to be buggy.
A writeback operation which starts in the middle of a dirty THP will not
notice as the dirty bit is only set on the head index.  With multi-index
entries, the dirty bit will be found no matter where in the THP the
iteration starts.  This does end up simplifying the page cache slightly,
although not as much as I had hoped.
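As a rough sketch (not part of this patch), storing a THP as a single
multi-index entry means calling xas_set_order() before the store; the
helper name cache_store_thp() below is hypothetical, and it follows the
usual xas_nomem() retry idiom:

#include <linux/xarray.h>
#include <linux/pagemap.h>
#include <linux/huge_mm.h>

/* Store one entry covering all 2^order indices of @head, rather than
 * 2^order identical entries.  A lookup or mark check at any index inside
 * the range then sees this single entry (and hence its dirty mark). */
static int cache_store_thp(struct address_space *mapping, struct page *head,
		gfp_t gfp)
{
	XA_STATE(xas, &mapping->i_pages, head->index);

	xas_set_order(&xas, head->index, thp_order(head));
	do {
		xas_lock_irq(&xas);
		xas_store(&xas, head);
		xas_unlock_irq(&xas);
	} while (xas_nomem(&xas, gfp));

	return xas_error(&xas);
}
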
Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/pagemap.h | 10 ------- mm/filemap.c | 62 ++++++++++++++++++++++++----------------- mm/huge_memory.c | 19 ++++++++++--- mm/khugepaged.c | 12 +++++++- mm/migrate.c | 8 ------ mm/shmem.c | 11 ++------ 6 files changed, 65 insertions(+), 57 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 62b759f92e36..00288ed24698 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -912,16 +912,6 @@ static inline unsigned int __readahead_batch(struct readahead_control *rac, VM_BUG_ON_PAGE(PageTail(page), page); array[i++] = page; rac->_batch_count += thp_nr_pages(page); - - /* - * The page cache isn't using multi-index entries yet, - * so the xas cursor needs to be manually moved to the - * next index. This can be removed once the page cache - * is converted. - */ - if (PageHead(page)) - xas_set(&xas, rac->_index + rac->_batch_count); - if (i == array_sz) break; } diff --git a/mm/filemap.c b/mm/filemap.c index 5c4db536fff4..8537ee86f99f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -127,13 +127,12 @@ static void page_cache_delete(struct address_space *mapping, /* hugetlb pages are represented by a single entry in the xarray */ if (!PageHuge(page)) { - xas_set_order(&xas, page->index, compound_order(page)); - nr = compound_nr(page); + xas_set_order(&xas, page->index, thp_order(page)); + nr = thp_nr_pages(page); } VM_BUG_ON_PAGE(!PageLocked(page), page); VM_BUG_ON_PAGE(PageTail(page), page); - VM_BUG_ON_PAGE(nr != 1 && shadow, page); xas_store(&xas, shadow); xas_init_marks(&xas); @@ -311,19 +310,12 @@ static void page_cache_delete_batch(struct address_space *mapping, WARN_ON_ONCE(!PageLocked(page)); - if (page->index == xas.xa_index) - page->mapping = NULL; + page->mapping = NULL; /* Leave page->index set: truncation lookup relies on it */ - /* - * Move to the next page in the vector if this is a regular - * page or the index is of the last sub-page of this compound - * page. 
- */ - if (page->index + compound_nr(page) - 1 == xas.xa_index) - i++; + i++; xas_store(&xas, NULL); - total_pages++; + total_pages += thp_nr_pages(page); } mapping->nrpages -= total_pages; } @@ -1956,20 +1948,24 @@ unsigned find_lock_entries(struct address_space *mapping, pgoff_t start, indices[pvec->nr] = xas.xa_index; if (!pagevec_add(pvec, page)) break; - goto next; + continue; unlock: unlock_page(page); put: put_page(page); -next: - if (!xa_is_value(page) && PageTransHuge(page)) - xas_set(&xas, page->index + thp_nr_pages(page)); } rcu_read_unlock(); return pagevec_count(pvec); } +static inline bool thp_last_tail(struct page *head, pgoff_t index) +{ + if (!PageTransCompound(head) || PageHuge(head)) + return true; + return index == head->index + thp_nr_pages(head) - 1; +} + /** * find_get_pages_range - gang pagecache lookup * @mapping: The address_space to search @@ -2008,11 +2004,17 @@ unsigned find_get_pages_range(struct address_space *mapping, pgoff_t *start, if (xa_is_value(page)) continue; +again: pages[ret] = find_subpage(page, xas.xa_index); if (++ret == nr_pages) { *start = xas.xa_index + 1; goto out; } + if (!thp_last_tail(page, xas.xa_index)) { + xas.xa_index++; + page_ref_inc(page); + goto again; + } } /* @@ -3018,6 +3020,12 @@ void filemap_map_pages(struct vm_fault *vmf, struct page *head, *page; unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss); + max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); + if (max_idx == 0) + return; + if (end_pgoff >= max_idx) + end_pgoff = max_idx - 1; + rcu_read_lock(); xas_for_each(&xas, head, end_pgoff) { if (xas_retry(&xas, head)) @@ -3037,20 +3045,16 @@ void filemap_map_pages(struct vm_fault *vmf, /* Has the page moved or been split? */ if (unlikely(head != xas_reload(&xas))) goto skip; - page = find_subpage(head, xas.xa_index); - - if (!PageUptodate(head) || - PageReadahead(page) || - PageHWPoison(page)) + if (!PageUptodate(head) || PageReadahead(head)) goto skip; if (!trylock_page(head)) goto skip; - if (head->mapping != mapping || !PageUptodate(head)) goto unlock; - max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); - if (xas.xa_index >= max_idx) + page = find_subpage(head, xas.xa_index); +again: + if (PageHWPoison(page)) goto unlock; if (mmap_miss > 0) @@ -3062,6 +3066,14 @@ void filemap_map_pages(struct vm_fault *vmf, last_pgoff = xas.xa_index; if (alloc_set_pte(vmf, page)) goto unlock; + if (!thp_last_tail(head, xas.xa_index)) { + xas.xa_index++; + page++; + page_ref_inc(head); + if (xas.xa_index >= end_pgoff) + goto unlock; + goto again; + } unlock_page(head); goto next; unlock: diff --git a/mm/huge_memory.c b/mm/huge_memory.c index f99167d74cbc..0e900e594e77 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2626,6 +2626,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) struct page *head = compound_head(page); struct pglist_data *pgdata = NODE_DATA(page_to_nid(head)); struct deferred_split *ds_queue = get_deferred_split_queue(head); + XA_STATE(xas, &head->mapping->i_pages, head->index); struct anon_vma *anon_vma = NULL; struct address_space *mapping = NULL; int count, mapcount, extra_pins, ret; @@ -2690,19 +2691,28 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) unmap_page(head); VM_BUG_ON_PAGE(compound_mapcount(head), head); + if (mapping) { + xas_split_alloc(&xas, head, thp_order(head), + mapping_gfp_mask(mapping) & GFP_RECLAIM_MASK); + if (xas_error(&xas)) { + ret = xas_error(&xas); + goto out_unlock; + } + } + /* prevent PageLRU to go away 
from under us, and freeze lru stats */ spin_lock_irqsave(&pgdata->lru_lock, flags); if (mapping) { - XA_STATE(xas, &mapping->i_pages, page_index(head)); - /* * Check if the head page is present in page cache. * We assume all tail are present too, if head is there. */ - xa_lock(&mapping->i_pages); + xas_lock(&xas); + xas_reset(&xas); if (xas_load(&xas) != head) goto fail; + xas_split(&xas, head, thp_order(head)); } /* Prevent deferred_split_scan() touching ->_refcount */ @@ -2735,7 +2745,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) } spin_unlock(&ds_queue->split_queue_lock); fail: if (mapping) - xa_unlock(&mapping->i_pages); + xas_unlock(&xas); spin_unlock_irqrestore(&pgdata->lru_lock, flags); remap_page(head, thp_nr_pages(head)); ret = -EBUSY; @@ -2749,6 +2759,7 @@ fail: if (mapping) if (mapping) i_mmap_unlock_read(mapping); out: + xas_destroy(&xas); count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED); return ret; } diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 2cb93aa8bf84..230e62a92ae7 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1645,7 +1645,10 @@ static void collapse_file(struct mm_struct *mm, } count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC); - /* This will be less messy when we use multi-index entries */ + /* + * Ensure we have slots for all the pages in the range. This is + * almost certainly a no-op because most of the pages must be present + */ do { xas_lock_irq(&xas); xas_create_range(&xas); @@ -1851,6 +1854,9 @@ static void collapse_file(struct mm_struct *mm, __mod_lruvec_page_state(new_page, NR_SHMEM, nr_none); } + /* Join all the small entries into a single multi-index entry */ + xas_set_order(&xas, start, HPAGE_PMD_ORDER); + xas_store(&xas, new_page); xa_locked: xas_unlock_irq(&xas); xa_unlocked: @@ -1972,6 +1978,10 @@ static void khugepaged_scan_file(struct mm_struct *mm, continue; } + /* + * XXX: khugepaged should compact smaller compound pages + * into a PMD sized page + */ if (PageTransCompound(page)) { result = SCAN_PAGE_COMPOUND; break; diff --git a/mm/migrate.c b/mm/migrate.c index d1ca7bdc80ca..39663dfbc273 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -460,14 +460,6 @@ int migrate_page_move_mapping(struct address_space *mapping, } xas_store(&xas, newpage); - if (PageTransHuge(page)) { - int i; - - for (i = 1; i < HPAGE_PMD_NR; i++) { - xas_next(&xas); - xas_store(&xas, newpage); - } - } /* * Drop cache reference from old page by unfreezing diff --git a/mm/shmem.c b/mm/shmem.c index d1068c6d731d..e9ab59caae50 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -670,7 +670,6 @@ static int shmem_add_to_page_cache(struct page *page, struct mm_struct *charge_mm) { XA_STATE_ORDER(xas, &mapping->i_pages, index, compound_order(page)); - unsigned long i = 0; unsigned long nr = compound_nr(page); int error; @@ -700,17 +699,11 @@ static int shmem_add_to_page_cache(struct page *page, void *entry; xas_lock_irq(&xas); entry = xas_find_conflict(&xas); - if (entry != expected) + if (entry != expected) { xas_set_err(&xas, -EEXIST); - xas_create_range(&xas); - if (xas_error(&xas)) goto unlock; -next: - xas_store(&xas, page); - if (++i < nr) { - xas_next(&xas); - goto next; } + xas_store(&xas, page); if (PageTransHuge(page)) { count_vm_event(THP_FILE_ALLOC); __inc_lruvec_page_state(page, NR_SHMEM_THPS); From patchwork Thu Oct 29 19:33:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867315 Return-Path: Received: from 
mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 73B8792C for ; Thu, 29 Oct 2020 19:34:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4B2E6206DD for ; Thu, 29 Oct 2020 19:34:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="GjtdCwcf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726251AbgJ2TeM (ORCPT ); Thu, 29 Oct 2020 15:34:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725828AbgJ2TeK (ORCPT ); Thu, 29 Oct 2020 15:34:10 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD7A7C0613CF for ; Thu, 29 Oct 2020 12:34:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=s1cCfqcsILzWonrmhPADFuQaWV/n44mIEDAgMnBPBWs=; b=GjtdCwcfhF/TVVIqPnUDC25++f BnD+8bDa4SOZWlnGoG3Ygs+kQnhqaxR+FPpm1Q9/3Izljt77lSRroTSVmH1B8H898vyRAPmXlXLbQ ee4kUxtRQH5hJpM+v3uC/KOo5Y1sI4U7rvfXDEuDOql7UQqlMI7ccJ0UMs3WZcwAooYnuQVqjvNkw Ar0YofCqerzlrNdX0ZQ9nci8/vB9jlzwQaBlmYUzxtx8suGfVO27aAHOlQe6m2gM39SSzEu+XJnOe yp39RaqKFYcT057V9xRdMwE4ABGftx9NKzrIQjaMOLwYHPu/St5IPisjwfYGIxtsk3bJmOwNgGJlC VU5KiK4Q==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgW-0007bG-6y; Thu, 29 Oct 2020 19:34:08 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 03/19] mm: Support arbitrary THP sizes Date: Thu, 29 Oct 2020 19:33:49 +0000 Message-Id: <20201029193405.29125-4-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Use the compound size of the page instead of assuming PTE or PMD size. 
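For illustration (not part of this patch), a caller that used to assume
HPAGE_PMD_NR tail pages can now derive the geometry from the page itself;
thp_file_end() is a hypothetical helper:

#include <linux/huge_mm.h>
#include <linux/mm.h>

/* End offset in the file of the THP containing @page, whatever its
 * compound order - no HPAGE_PMD_NR assumption. */
static loff_t thp_file_end(struct page *page)
{
	struct page *head = thp_head(page);

	return ((loff_t)head->index + thp_nr_pages(head)) << PAGE_SHIFT;
}
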
Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/huge_mm.h | 8 ++------ include/linux/mm.h | 42 ++++++++++++++++++++--------------------- 2 files changed, 23 insertions(+), 27 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index c2ecb6036ad8..60a907a19f7d 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -272,9 +272,7 @@ static inline struct page *thp_head(struct page *page) static inline unsigned int thp_order(struct page *page) { VM_BUG_ON_PGFLAGS(PageTail(page), page); - if (PageHead(page)) - return HPAGE_PMD_ORDER; - return 0; + return compound_order(page); } /** @@ -284,9 +282,7 @@ static inline unsigned int thp_order(struct page *page) static inline int thp_nr_pages(struct page *page) { VM_BUG_ON_PGFLAGS(PageTail(page), page); - if (PageHead(page)) - return HPAGE_PMD_NR; - return 1; + return compound_nr(page); } struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr, diff --git a/include/linux/mm.h b/include/linux/mm.h index 4e75f1c64534..3a73161a25a4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -685,6 +685,27 @@ int vma_is_stack_for_current(struct vm_area_struct *vma); struct mmu_gather; struct inode; +static inline unsigned int compound_order(struct page *page) +{ + if (!PageHead(page)) + return 0; + return page[1].compound_order; +} + +/* Returns the number of pages in this potentially compound page. */ +static inline unsigned long compound_nr(struct page *page) +{ + if (!PageHead(page)) + return 1; + return page[1].compound_nr; +} + +static inline void set_compound_order(struct page *page, unsigned int order) +{ + page[1].compound_order = order; + page[1].compound_nr = 1U << order; +} + #include /* @@ -901,13 +922,6 @@ static inline void destroy_compound_page(struct page *page) compound_page_dtors[page[1].compound_dtor](page); } -static inline unsigned int compound_order(struct page *page) -{ - if (!PageHead(page)) - return 0; - return page[1].compound_order; -} - static inline bool hpage_pincount_available(struct page *page) { /* @@ -931,20 +945,6 @@ static inline int compound_pincount(struct page *page) return head_compound_pincount(page); } -static inline void set_compound_order(struct page *page, unsigned int order) -{ - page[1].compound_order = order; - page[1].compound_nr = 1U << order; -} - -/* Returns the number of pages in this potentially compound page. */ -static inline unsigned long compound_nr(struct page *page) -{ - if (!PageHead(page)) - return 1; - return page[1].compound_nr; -} - /* Returns the number of bytes in this potentially compound page. 
*/ static inline unsigned long page_size(struct page *page) { From patchwork Thu Oct 29 19:33:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867317 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B89931130 for ; Thu, 29 Oct 2020 19:34:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8BB3420782 for ; Thu, 29 Oct 2020 19:34:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="jbojzSwR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726253AbgJ2TeM (ORCPT ); Thu, 29 Oct 2020 15:34:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726184AbgJ2TeK (ORCPT ); Thu, 29 Oct 2020 15:34:10 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1351BC0613D3 for ; Thu, 29 Oct 2020 12:34:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=XOaTjVTgn8Ke6TW/15Kdkt6j4VR7ZhSGr7eWJqQULJg=; b=jbojzSwRofPkOkoavKlBB6etKE oKwiwOJBHnt8IE7ND7Yfl9vL32xfhR4qnQEv+sz7DwBMfod9FW5ivZq1oCFpNvgf5Vbc+FQ4sksOS v2hYpQMpqWPu02JNBWDsrjqzoSKEEC49u5htHt1SjBphd3CH7dqgdTG7eMiI14aG9YKNAWjlgyBsw pRVkh5j/zPXPkz/VlK9/HecZYO1l673HNjok7L7nieESidCigtnUqOTZtP9jKfLtL3HY4F2CGzE0f rgHmsLdAVbFeoP9ygr9SNWgf1wqT1V6GBGekXNJ11BJNfgg0QXAOTBmJC4Os59yfwUtVBNhTXN4ck 1dMteOFA==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgW-0007bN-FB; Thu, 29 Oct 2020 19:34:08 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 04/19] mm: Change NR_FILE_THPS to account in base pages Date: Thu, 29 Oct 2020 19:33:50 +0000 Message-Id: <20201029193405.29125-5-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org With variable sized THPs, we have to account in number of base pages, not in number of PMD-sized pages. 
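A hedged sketch (not taken from this patch) of what the accounting looks
like once the counter is in base pages; account_file_thp() is a
hypothetical wrapper around the existing __mod_lruvec_page_state():

#include <linux/memcontrol.h>
#include <linux/huge_mm.h>

/* Add or remove a file THP from NR_FILE_THPS in units of base pages, so
 * a 128KiB THP contributes 32 rather than being rounded to a whole PMD. */
static void account_file_thp(struct page *head, bool add)
{
	int nr = thp_nr_pages(head);

	__mod_lruvec_page_state(head, NR_FILE_THPS, add ? nr : -nr);
}

Readers of the counter (meminfo, the per-node sysfs files) then report it
without multiplying by HPAGE_PMD_NR, as the diff below does.
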
Signed-off-by: Matthew Wilcox (Oracle) --- drivers/base/node.c | 3 +-- fs/proc/meminfo.c | 2 +- include/linux/mmzone.h | 2 +- mm/filemap.c | 2 +- mm/huge_memory.c | 3 ++- mm/khugepaged.c | 3 ++- 6 files changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 6ffa470e2984..6bcc2e7da775 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -467,8 +467,7 @@ static ssize_t node_read_meminfo(struct device *dev, HPAGE_PMD_NR), nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR), - nid, K(node_page_state(pgdat, NR_FILE_THPS) * - HPAGE_PMD_NR), + nid, K(node_page_state(pgdat, NR_FILE_THPS)), nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED) * HPAGE_PMD_NR) #endif diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index 887a5532e449..d1ccee90f0e1 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -135,7 +135,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v) show_val_kb(m, "ShmemPmdMapped: ", global_node_page_state(NR_SHMEM_PMDMAPPED) * HPAGE_PMD_NR); show_val_kb(m, "FileHugePages: ", - global_node_page_state(NR_FILE_THPS) * HPAGE_PMD_NR); + global_node_page_state(NR_FILE_THPS)); show_val_kb(m, "FilePmdMapped: ", global_node_page_state(NR_FILE_PMDMAPPED) * HPAGE_PMD_NR); #endif diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index fb3bf696c05e..b0e2c546d5f5 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -193,7 +193,7 @@ enum node_stat_item { NR_SHMEM, /* shmem pages (included tmpfs/GEM pages) */ NR_SHMEM_THPS, NR_SHMEM_PMDMAPPED, - NR_FILE_THPS, + NR_FILE_THPS, /* Accounted in base pages */ NR_FILE_PMDMAPPED, NR_ANON_THPS, NR_VMSCAN_WRITE, diff --git a/mm/filemap.c b/mm/filemap.c index 8537ee86f99f..a68516ddeddc 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -194,7 +194,7 @@ static void unaccount_page_cache_page(struct address_space *mapping, if (PageTransHuge(page)) __dec_lruvec_page_state(page, NR_SHMEM_THPS); } else if (PageTransHuge(page)) { - __dec_lruvec_page_state(page, NR_FILE_THPS); + __mod_lruvec_page_state(page, NR_FILE_THPS, -nr); filemap_nr_thps_dec(mapping); } diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 0e900e594e77..03374ec61b45 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2729,7 +2729,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) if (PageSwapBacked(head)) __dec_lruvec_page_state(head, NR_SHMEM_THPS); else - __dec_lruvec_page_state(head, NR_FILE_THPS); + __mod_lruvec_page_state(head, NR_FILE_THPS, + -thp_nr_pages(head)); } __split_huge_page(page, list, end, flags); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 230e62a92ae7..74c90f5d352f 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1844,7 +1844,8 @@ static void collapse_file(struct mm_struct *mm, if (is_shmem) __inc_lruvec_page_state(new_page, NR_SHMEM_THPS); else { - __inc_lruvec_page_state(new_page, NR_FILE_THPS); + __mod_lruvec_page_state(new_page, NR_FILE_THPS, + thp_nr_pages(new_page)); filemap_nr_thps_inc(mapping); } From patchwork Thu Oct 29 19:33:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867319 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 05731697 for ; Thu, 29 Oct 2020 19:34:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CF44520796 for 
; Thu, 29 Oct 2020 19:34:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="ZQaQfChy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726259AbgJ2TeN (ORCPT ); Thu, 29 Oct 2020 15:34:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726203AbgJ2TeL (ORCPT ); Thu, 29 Oct 2020 15:34:11 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55DFFC0613D4 for ; Thu, 29 Oct 2020 12:34:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=dlYJ9D+4FGM52whGaN4YYn56NteR9ZTeFStm51rap7s=; b=ZQaQfChyypFbOCaeq1kooxG0AK GZwqatG8XukWqu+4VmwtRNtkd/wsOnJ7vSQB03Bo7a2pHeHYRAxsDgg0NWXQgoQ3d+A9B+1vN2l1F iPnHFN7a/OLWIEdJkanthQ2s0ZJW9v3GWVK5gnSQgN59fYlgJqVAsFT8g2tf4OUI2ReSunW5idUZX v3egWpe35i58nEvsIkgXdtoQoQDqu0PuJGGVU8fEqWKhhyx0AS6e3oTM9wkppsGdhbO00ew1nettS IIC2YsjcFrfRlOhhGHdr3CWG3qlb0i52R1FKyWYclfhLfvK8LwH+tby4DOuUiQU7Pp9DFLdoZqFr3 2bI0gTrg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgW-0007bY-O2; Thu, 29 Oct 2020 19:34:08 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 05/19] mm/filemap: Rename generic_file_buffered_read subfunctions Date: Thu, 29 Oct 2020 19:33:51 +0000 Message-Id: <20201029193405.29125-6-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The recent split of generic_file_buffered_read() created some very long function names which are hard to distinguish from each other. 
Rename as follows: generic_file_buffered_read_readpage -> gfbr_read_page generic_file_buffered_read_pagenotuptodate -> gfbr_update_page generic_file_buffered_read_no_cached_page -> gfbr_create_page generic_file_buffered_read_get_pages -> gfbr_get_pages Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Kent Overstreet --- mm/filemap.c | 43 ++++++++++++++----------------------------- 1 file changed, 14 insertions(+), 29 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index a68516ddeddc..7bc791b47a68 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2176,11 +2176,8 @@ static int lock_page_for_iocb(struct kiocb *iocb, struct page *page) return lock_page_killable(page); } -static struct page * -generic_file_buffered_read_readpage(struct kiocb *iocb, - struct file *filp, - struct address_space *mapping, - struct page *page) +static struct page *gfbr_read_page(struct kiocb *iocb, struct file *filp, + struct address_space *mapping, struct page *page) { struct file_ra_state *ra = &filp->f_ra; int error; @@ -2231,12 +2228,9 @@ generic_file_buffered_read_readpage(struct kiocb *iocb, return page; } -static struct page * -generic_file_buffered_read_pagenotuptodate(struct kiocb *iocb, - struct file *filp, - struct iov_iter *iter, - struct page *page, - loff_t pos, loff_t count) +static struct page *gfbr_update_page(struct kiocb *iocb, struct file *filp, + struct iov_iter *iter, struct page *page, loff_t pos, + loff_t count) { struct address_space *mapping = filp->f_mapping; struct inode *inode = mapping->host; @@ -2299,12 +2293,10 @@ generic_file_buffered_read_pagenotuptodate(struct kiocb *iocb, return page; } - return generic_file_buffered_read_readpage(iocb, filp, mapping, page); + return gfbr_read_page(iocb, filp, mapping, page); } -static struct page * -generic_file_buffered_read_no_cached_page(struct kiocb *iocb, - struct iov_iter *iter) +static struct page *gfbr_create_page(struct kiocb *iocb, struct iov_iter *iter) { struct file *filp = iocb->ki_filp; struct address_space *mapping = filp->f_mapping; @@ -2315,10 +2307,6 @@ generic_file_buffered_read_no_cached_page(struct kiocb *iocb, if (iocb->ki_flags & IOCB_NOIO) return ERR_PTR(-EAGAIN); - /* - * Ok, it wasn't cached, so we need to create a new - * page.. - */ page = page_cache_alloc(mapping); if (!page) return ERR_PTR(-ENOMEM); @@ -2330,13 +2318,11 @@ generic_file_buffered_read_no_cached_page(struct kiocb *iocb, return error != -EEXIST ? 
ERR_PTR(error) : NULL; } - return generic_file_buffered_read_readpage(iocb, filp, mapping, page); + return gfbr_read_page(iocb, filp, mapping, page); } -static int generic_file_buffered_read_get_pages(struct kiocb *iocb, - struct iov_iter *iter, - struct page **pages, - unsigned int nr) +static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, + struct page **pages, unsigned int nr) { struct file *filp = iocb->ki_filp; struct address_space *mapping = filp->f_mapping; @@ -2363,7 +2349,7 @@ static int generic_file_buffered_read_get_pages(struct kiocb *iocb, if (nr_got) goto got_pages; - pages[0] = generic_file_buffered_read_no_cached_page(iocb, iter); + pages[0] = gfbr_create_page(iocb, iter); err = PTR_ERR_OR_ZERO(pages[0]); if (!IS_ERR_OR_NULL(pages[0])) nr_got = 1; @@ -2397,8 +2383,8 @@ static int generic_file_buffered_read_get_pages(struct kiocb *iocb, break; } - page = generic_file_buffered_read_pagenotuptodate(iocb, - filp, iter, page, pg_pos, pg_count); + page = gfbr_update_page(iocb, filp, iter, page, + pg_pos, pg_count); if (IS_ERR_OR_NULL(page)) { for (j = i + 1; j < nr_got; j++) put_page(pages[j]); @@ -2474,8 +2460,7 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb, iocb->ki_flags |= IOCB_NOWAIT; i = 0; - pg_nr = generic_file_buffered_read_get_pages(iocb, iter, - pages, nr_pages); + pg_nr = gfbr_get_pages(iocb, iter, pages, nr_pages); if (pg_nr < 0) { error = pg_nr; break; From patchwork Thu Oct 29 19:33:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867325 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8129416C1 for ; Thu, 29 Oct 2020 19:34:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5923620782 for ; Thu, 29 Oct 2020 19:34:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="doC9htAO" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725914AbgJ2TeO (ORCPT ); Thu, 29 Oct 2020 15:34:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56540 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726208AbgJ2TeL (ORCPT ); Thu, 29 Oct 2020 15:34:11 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93E11C0613D5 for ; Thu, 29 Oct 2020 12:34:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=m3kX9V2viinw8plXiDNypMDhWT98QwXvBE2VVZQYm9o=; b=doC9htAOcjqC5YmMj3f1l9fEap Yw01mYv9o8B6B3qb9NBzrw+7KgXRToZ8s6yVfYwM0dsNwWxTtvZnkWp2mnB8VxPpL/xiha4WVIv0+ SQM1fQlGcIXsPhoNsKbJ2gREyyBkbjc/l5HVrBXJtBCPVaj+J9irUeRlXnE5nJXI/rRvmTm3PcXTa rx/MhfqFeTzdZelRtjZFDxcAZy7G9p1s6Zox+D7DdFRZ+pYNOv/ShiuqAGOX6thgpuXd4cKC6b0yt md6Ymd+SyowMmar9IJ0jJeSrPPTLp/RQNLddGd1P1U04YVQVfrZQP+YmQ7gQ+O8JL1JXIGutPRuyA 5RwiUBqA==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgX-0007bf-0C; Thu, 29 Oct 2020 19:34:09 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, 
linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 06/19] mm/filemap: Change calling convention for gfbr_ functions Date: Thu, 29 Oct 2020 19:33:52 +0000 Message-Id: <20201029193405.29125-7-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org gfbr_update_page() would prefer to have mapping passed to it than filp, as would gfbr_create_page(). That makes gfbr_read_page() retrieve the file pointer from the iocb. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Kent Overstreet --- mm/filemap.c | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 7bc791b47a68..1bfd87d85bfd 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2176,9 +2176,10 @@ static int lock_page_for_iocb(struct kiocb *iocb, struct page *page) return lock_page_killable(page); } -static struct page *gfbr_read_page(struct kiocb *iocb, struct file *filp, +static struct page *gfbr_read_page(struct kiocb *iocb, struct address_space *mapping, struct page *page) { + struct file *filp = iocb->ki_filp; struct file_ra_state *ra = &filp->f_ra; int error; @@ -2228,11 +2229,10 @@ static struct page *gfbr_read_page(struct kiocb *iocb, struct file *filp, return page; } -static struct page *gfbr_update_page(struct kiocb *iocb, struct file *filp, - struct iov_iter *iter, struct page *page, loff_t pos, - loff_t count) +static struct page *gfbr_update_page(struct kiocb *iocb, + struct address_space *mapping, struct iov_iter *iter, + struct page *page, loff_t pos, loff_t count) { - struct address_space *mapping = filp->f_mapping; struct inode *inode = mapping->host; int error; @@ -2293,13 +2293,12 @@ static struct page *gfbr_update_page(struct kiocb *iocb, struct file *filp, return page; } - return gfbr_read_page(iocb, filp, mapping, page); + return gfbr_read_page(iocb, mapping, page); } -static struct page *gfbr_create_page(struct kiocb *iocb, struct iov_iter *iter) +static struct page *gfbr_create_page(struct kiocb *iocb, + struct address_space *mapping, struct iov_iter *iter) { - struct file *filp = iocb->ki_filp; - struct address_space *mapping = filp->f_mapping; pgoff_t index = iocb->ki_pos >> PAGE_SHIFT; struct page *page; int error; @@ -2318,7 +2317,7 @@ static struct page *gfbr_create_page(struct kiocb *iocb, struct iov_iter *iter) return error != -EEXIST ? 
ERR_PTR(error) : NULL; } - return gfbr_read_page(iocb, filp, mapping, page); + return gfbr_read_page(iocb, mapping, page); } static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, @@ -2349,7 +2348,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, if (nr_got) goto got_pages; - pages[0] = gfbr_create_page(iocb, iter); + pages[0] = gfbr_create_page(iocb, mapping, iter); err = PTR_ERR_OR_ZERO(pages[0]); if (!IS_ERR_OR_NULL(pages[0])) nr_got = 1; @@ -2383,7 +2382,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, break; } - page = gfbr_update_page(iocb, filp, iter, page, + page = gfbr_update_page(iocb, mapping, iter, page, pg_pos, pg_count); if (IS_ERR_OR_NULL(page)) { for (j = i + 1; j < nr_got; j++) From patchwork Thu Oct 29 19:33:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867361 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7E561697 for ; Thu, 29 Oct 2020 19:34:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5734F20EDD for ; Thu, 29 Oct 2020 19:34:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="cgdijzCx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726581AbgJ2Teb (ORCPT ); Thu, 29 Oct 2020 15:34:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56544 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726212AbgJ2TeL (ORCPT ); Thu, 29 Oct 2020 15:34:11 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D176DC0613D6 for ; Thu, 29 Oct 2020 12:34:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=5Qi+fkVc5y2m3fg4x8n5YvSh2yc6fBKTXrBLflPtfFo=; b=cgdijzCxM0jsbGOUGge5xzXqxe pjiKNnaXc3NeGn6Y9ZsbMPzl3h6AUlFtKDhYbT9JE/1mw4hoEM1yCsMVuIomzvrFZULUGhCwGPmNV HQxBdil412iT477HB03yMlb+wSEuHSMTMiiKKCDai6JexnnPjf0x+pgsEXOtIA3srgZx1gEKOc6Yv xCc4+C3DJQ8MesMu6VHhTLTZ5eTGxroqTX8j3JPz8CHa2aBsSL93VpY7WEmAapLdQO/ilsSonqioq /TFR3xNUasJUuLriGHbkxPKD7ApX71/nzu3DmIIZU87H2xZCjcDti2ZkI7oOVmerilThG4K3G00HL TkNn+MnQ==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgX-0007bn-AZ; Thu, 29 Oct 2020 19:34:09 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 07/19] mm/filemap: Use head pages in generic_file_buffered_read Date: Thu, 29 Oct 2020 19:33:53 +0000 Message-Id: <20201029193405.29125-8-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add mapping_get_read_heads() which returns the head pages which represent a contiguous array of bytes in the file. 
It also stops when encountering a page marked as Readahead or !Uptodate (but does return that page) so it can be handled appropriately by gfbr_get_pages(). Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 78 ++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 61 insertions(+), 17 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 1bfd87d85bfd..c0161f42f37d 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2166,6 +2166,48 @@ static void shrink_readahead_size_eio(struct file_ra_state *ra) ra->ra_pages /= 4; } +static unsigned mapping_get_read_heads(struct address_space *mapping, + pgoff_t index, unsigned int nr_pages, struct page **pages) +{ + XA_STATE(xas, &mapping->i_pages, index); + struct page *head; + unsigned int ret = 0; + + if (unlikely(!nr_pages)) + return 0; + + rcu_read_lock(); + for (head = xas_load(&xas); head; head = xas_next(&xas)) { + if (xas_retry(&xas, head)) + continue; + if (xa_is_value(head)) + break; + if (!page_cache_get_speculative(head)) + goto retry; + + /* Has the page moved or been split? */ + if (unlikely(head != xas_reload(&xas))) + goto put_page; + + pages[ret++] = head; + if (ret == nr_pages) + break; + if (!PageUptodate(head)) + break; + if (PageReadahead(head)) + break; + xas.xa_index = head->index + thp_nr_pages(head) - 1; + xas.xa_offset = (xas.xa_index >> xas.xa_shift) & XA_CHUNK_MASK; + continue; +put_page: + put_page(head); +retry: + xas_reset(&xas); + } + rcu_read_unlock(); + return ret; +} + static int lock_page_for_iocb(struct kiocb *iocb, struct page *page) { if (iocb->ki_flags & IOCB_WAITQ) @@ -2328,14 +2370,14 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, struct file_ra_state *ra = &filp->f_ra; pgoff_t index = iocb->ki_pos >> PAGE_SHIFT; pgoff_t last_index = (iocb->ki_pos + iter->count + PAGE_SIZE-1) >> PAGE_SHIFT; - int i, j, nr_got, err = 0; + int i, nr_got, err = 0; nr = min_t(unsigned long, last_index - index, nr); find_page: if (fatal_signal_pending(current)) return -EINTR; - nr_got = find_get_pages_contig(mapping, index, nr, pages); + nr_got = mapping_get_read_heads(mapping, index, nr, pages); if (nr_got) goto got_pages; @@ -2344,7 +2386,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, page_cache_sync_readahead(mapping, ra, filp, index, last_index - index); - nr_got = find_get_pages_contig(mapping, index, nr, pages); + nr_got = mapping_get_read_heads(mapping, index, nr, pages); if (nr_got) goto got_pages; @@ -2355,15 +2397,14 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, got_pages: for (i = 0; i < nr_got; i++) { struct page *page = pages[i]; - pgoff_t pg_index = index + i; + pgoff_t pg_index = page->index; loff_t pg_pos = max(iocb->ki_pos, (loff_t) pg_index << PAGE_SHIFT); loff_t pg_count = iocb->ki_pos + iter->count - pg_pos; if (PageReadahead(page)) { if (iocb->ki_flags & IOCB_NOIO) { - for (j = i; j < nr_got; j++) - put_page(pages[j]); + put_page(page); nr_got = i; err = -EAGAIN; break; @@ -2375,8 +2416,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, if (!PageUptodate(page)) { if ((iocb->ki_flags & IOCB_NOWAIT) || ((iocb->ki_flags & IOCB_WAITQ) && i)) { - for (j = i; j < nr_got; j++) - put_page(pages[j]); + put_page(page); nr_got = i; err = -EAGAIN; break; @@ -2385,8 +2425,6 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, page = gfbr_update_page(iocb, mapping, iter, page, pg_pos, pg_count); if (IS_ERR_OR_NULL(page)) { - for (j = i + 1; j < nr_got; j++) - put_page(pages[j]); nr_got 
= i; err = PTR_ERR_OR_ZERO(page); break; @@ -2500,20 +2538,26 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb, mark_page_accessed(pages[i]); for (i = 0; i < pg_nr; i++) { - unsigned int offset = iocb->ki_pos & ~PAGE_MASK; - unsigned int bytes = min_t(loff_t, end_offset - iocb->ki_pos, - PAGE_SIZE - offset); - unsigned int copied; + struct page *page = pages[i]; + size_t page_size = thp_size(page); + size_t offset = iocb->ki_pos & (page_size - 1); + size_t bytes = min_t(loff_t, end_offset - iocb->ki_pos, + page_size - offset); + size_t copied; /* * If users can be writing to this page using arbitrary * virtual addresses, take care about potential aliasing * before reading the page on the kernel side. */ - if (writably_mapped) - flush_dcache_page(pages[i]); + if (writably_mapped) { + int j; + + for (j = 0; j < thp_nr_pages(page); j++) + flush_dcache_page(page + j); + } - copied = copy_page_to_iter(pages[i], offset, bytes, iter); + copied = copy_page_to_iter(page, offset, bytes, iter); written += copied; iocb->ki_pos += copied; From patchwork Thu Oct 29 19:33:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867353 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E053F92C for ; Thu, 29 Oct 2020 19:34:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B92F4214DB for ; Thu, 29 Oct 2020 19:34:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="p/v9M6ww" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726575AbgJ2Te3 (ORCPT ); Thu, 29 Oct 2020 15:34:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726133AbgJ2TeM (ORCPT ); Thu, 29 Oct 2020 15:34:12 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED1EEC0613D3 for ; Thu, 29 Oct 2020 12:34:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=3hbUZbM6zGykKpCg4QK7R1IyrA2YegBa7tJLLyOCQP8=; b=p/v9M6wwP8wJfhOVYcSHALP0Io 3ej4nUuX+G7NfN+JzKtFwVXYKBam//vH81nE63uNuDSeBOw5rj4OC224lElZqXsWNO64xOKWA8o6F IlDGD2L12OB1yA/v0dTtDxQfM2DdX61F3ztgR6qe0LVOSK1jDZJeXExzanTsNNPwmvlTMDkIDCHQF jO08whbEyiEI5CAPCZtDl5MC4VAbe7QPl2T6cSoV/wElmRuSXTWsy/ZFbqCvplBzncx9zl30DI8t1 cXfYXtvBCBDiuDuCbux4HV+2XwCbTipydb+5yHYGR7r9qdQId48QcUKTafbUXV0us+YseE+osJHe0 s3l/dKMg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgX-0007bu-IX; Thu, 29 Oct 2020 19:34:09 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , "Kirill A . 
Shutemov" Subject: [PATCH 08/19] mm/filemap: Add __page_cache_alloc_order Date: Thu, 29 Oct 2020 19:33:54 +0000 Message-Id: <20201029193405.29125-9-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This new function allows page cache pages to be allocated that are larger than an order-0 page. Signed-off-by: Matthew Wilcox (Oracle) Acked-by: Kirill A. Shutemov --- include/linux/pagemap.h | 24 +++++++++++++++++++++--- mm/filemap.c | 16 ++++++++++------ 2 files changed, 31 insertions(+), 9 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 00288ed24698..bb6692001028 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -284,15 +284,33 @@ static inline void *detach_page_private(struct page *page) return data; } +static inline gfp_t thp_gfpmask(gfp_t gfp) +{ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + /* We'd rather allocate smaller pages than stall a page fault */ + gfp |= GFP_TRANSHUGE_LIGHT; + gfp &= ~__GFP_DIRECT_RECLAIM; +#endif + return gfp; +} + #ifdef CONFIG_NUMA -extern struct page *__page_cache_alloc(gfp_t gfp); +extern struct page *__page_cache_alloc_order(gfp_t gfp, unsigned int order); #else -static inline struct page *__page_cache_alloc(gfp_t gfp) +static inline +struct page *__page_cache_alloc_order(gfp_t gfp, unsigned int order) { - return alloc_pages(gfp, 0); + if (order == 0) + return alloc_pages(gfp, 0); + return thp_prep(alloc_pages(thp_gfpmask(gfp), order)); } #endif +static inline struct page *__page_cache_alloc(gfp_t gfp) +{ + return __page_cache_alloc_order(gfp, 0); +} + static inline struct page *page_cache_alloc(struct address_space *x) { return __page_cache_alloc(mapping_gfp_mask(x)); diff --git a/mm/filemap.c b/mm/filemap.c index c0161f42f37d..64fe0018ee17 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -934,24 +934,28 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping, EXPORT_SYMBOL_GPL(add_to_page_cache_lru); #ifdef CONFIG_NUMA -struct page *__page_cache_alloc(gfp_t gfp) +struct page *__page_cache_alloc_order(gfp_t gfp, unsigned int order) { int n; struct page *page; + if (order > 0) + gfp = thp_gfpmask(gfp); + if (cpuset_do_page_mem_spread()) { unsigned int cpuset_mems_cookie; do { cpuset_mems_cookie = read_mems_allowed_begin(); n = cpuset_mem_spread_node(); - page = __alloc_pages_node(n, gfp, 0); + page = __alloc_pages_node(n, gfp, order); } while (!page && read_mems_allowed_retry(cpuset_mems_cookie)); - - return page; + } else { + page = alloc_pages(gfp, order); } - return alloc_pages(gfp, 0); + + return thp_prep(page); } -EXPORT_SYMBOL(__page_cache_alloc); +EXPORT_SYMBOL(__page_cache_alloc_order); #endif /* From patchwork Thu Oct 29 19:33:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 91A50697 for ; Thu, 29 Oct 2020 19:34:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6A52720825 for ; Thu, 29 Oct 2020 19:34:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org 
header.b="GG08yJaq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726579AbgJ2Tea (ORCPT ); Thu, 29 Oct 2020 15:34:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726214AbgJ2TeL (ORCPT ); Thu, 29 Oct 2020 15:34:11 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F822C0613D2 for ; Thu, 29 Oct 2020 12:34:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=2Bn5b1P1ATGl/MqPbstL/ShGH6FmIS/ZdMSjQOY1lJ4=; b=GG08yJaqKcj195RpDR4Pd6vfkR aKZdXWtCjSMaAt8/DxgYYwzJoEUHTh6Mo3BGXx587A3pM9VPifVfz70pzEuLyJ3tEyDpU8EsjvZns 1MQpV1Eb87bpv64/Pyl5Iz4nWnMm7gInAI5dNaGqWXF2CKJcbLKnIVuX1IN+9CpDy+H+cF63BpRAy KrBz5WxU75R48VKwmCaJG2wPop9AtlfB5mCVfxrsMWHliVanDoqRlkVnEfGO+L6V9NwNy3+2dNEAx L0yEknmsmX5I1kKvCvhIDSYp2ZyUiaEkC2ArZxVTUfE5FOURW8rhG/9G2eaIA2ScRm5kWCC9xoEDl QSvlDUxg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgX-0007c1-S2; Thu, 29 Oct 2020 19:34:09 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 09/19] mm/filemap: Allow THPs to be added to the page cache Date: Thu, 29 Oct 2020 19:33:55 +0000 Message-Id: <20201029193405.29125-10-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We return -EEXIST if there are any non-shadow entries in the page cache in the range covered by the THP. If there are multiple shadow entries in the range, we set *shadowp to one of them (currently the one at the highest index). If that turns out to be the wrong answer, we can implement something more complex. This is mostly modelled after the equivalent function in the shmem code. 
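As a simplified sketch (not the actual __add_to_page_cache_locked()
change), the conflict policy described above looks roughly like this;
check_thp_conflicts() is a hypothetical helper, and locking and the retry
loop are omitted:

#include <linux/xarray.h>

/* Walk the slots the THP would occupy: any real page means -EEXIST;
 * shadow (value) entries are tolerated and one is reported via @shadowp. */
static int check_thp_conflicts(struct xa_state *xas, void **shadowp)
{
	void *entry;

	xas_for_each_conflict(xas, entry) {
		if (!xa_is_value(entry))
			return -EEXIST;
		if (shadowp)
			*shadowp = entry;
	}
	return 0;
}
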
Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 64fe0018ee17..dabc26cf0067 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -811,23 +811,25 @@ noinline int __add_to_page_cache_locked(struct page *page, { XA_STATE(xas, &mapping->i_pages, offset); int huge = PageHuge(page); - int error; + unsigned int nr = 1; VM_BUG_ON_PAGE(!PageLocked(page), page); VM_BUG_ON_PAGE(PageSwapBacked(page), page); mapping_set_update(&xas, mapping); - get_page(page); - page->mapping = mapping; - page->index = offset; - if (!huge) { - error = mem_cgroup_charge(page, current->mm, gfp); + int error = mem_cgroup_charge(page, current->mm, gfp); + if (error) - goto error; + return error; + xas_set_order(&xas, offset, thp_order(page)); + nr = thp_nr_pages(page); } gfp &= GFP_RECLAIM_MASK; + page_ref_add(page, nr); + page->mapping = mapping; + page->index = xas.xa_index; do { unsigned int order = xa_get_order(xas.xa, xas.xa_index); @@ -851,6 +853,8 @@ noinline int __add_to_page_cache_locked(struct page *page, /* entry may have been split before we acquired lock */ order = xa_get_order(xas.xa, xas.xa_index); if (order > thp_order(page)) { + /* How to handle large swap entries? */ + BUG_ON(shmem_mapping(mapping)); xas_split(&xas, old, order); xas_reset(&xas); } @@ -860,27 +864,30 @@ noinline int __add_to_page_cache_locked(struct page *page, if (xas_error(&xas)) goto unlock; - mapping->nrpages++; + mapping->nrpages += nr; /* hugetlb pages do not participate in page cache accounting */ - if (!huge) - __inc_lruvec_page_state(page, NR_FILE_PAGES); + if (!huge) { + __mod_lruvec_page_state(page, NR_FILE_PAGES, nr); + if (nr > 1) + __mod_node_page_state(page_pgdat(page), + NR_FILE_THPS, nr); + } unlock: xas_unlock_irq(&xas); } while (xas_nomem(&xas, gfp)); - if (xas_error(&xas)) { - error = xas_error(&xas); + if (xas_error(&xas)) goto error; - } trace_mm_filemap_add_to_page_cache(page); return 0; error: page->mapping = NULL; /* Leave page->index set: truncation relies upon it */ - put_page(page); - return error; + page_ref_sub(page, nr); + VM_BUG_ON_PAGE(page_count(page) <= 0, page); + return xas_error(&xas); } ALLOW_ERROR_INJECTION(__add_to_page_cache_locked, ERRNO); From patchwork Thu Oct 29 19:33:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867321 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 423BC16C0 for ; Thu, 29 Oct 2020 19:34:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1B28120782 for ; Thu, 29 Oct 2020 19:34:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="Iw/i9Ycu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726458AbgJ2TeP (ORCPT ); Thu, 29 Oct 2020 15:34:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726237AbgJ2TeM (ORCPT ); Thu, 29 Oct 2020 15:34:12 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B32D7C0613CF for ; Thu, 29 Oct 2020 12:34:11 
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=lGF4HW3xTcYw5c4m1uJQR09qdQYmGLF0sDqvoEBnKwc=; b=Iw/i9Ycuo/RjVb7jovRKBRt3Eu g5R4Efdp+GOX3ielnOCclp45qFbW3YsCuCrbikcdZBo3L6W9xy2O7WMI2m6xI1yrLNj3WXyUTXt1v hfz6I58c39cVYrMCL9WNP3yvrbX4IwNUMlU2y5XltGV8eOr/9msH86oD/qNhhDh29Tt5mNLnKYccE tFViHnnDwYtNpmrJalI3BpozUl1m7VgOp29aNTh575wdE9KC2TCaP5rdiMZEsPFpC+iId6uw2YQp4 fQOiqw35i7gmozKL7dSg3jwil7aQKBY1HzIloSTksrEjwQWlIQ+a9FJFt21eiyaIMpiyXVOXl5ih/ gGNDLNWw==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgY-0007c6-3K; Thu, 29 Oct 2020 19:34:10 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 10/19] mm/vmscan: Optimise shrink_page_list for smaller THPs Date: Thu, 29 Oct 2020 19:33:56 +0000 Message-Id: <20201029193405.29125-11-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org A THP which is smaller than a PMD does not need to do the extra work in try_to_unmap() of trying to split a PMD entry. Signed-off-by: Matthew Wilcox (Oracle) --- mm/vmscan.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 1c2e6ca92a45..9e140d19611a 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1300,7 +1300,8 @@ static unsigned int shrink_page_list(struct list_head *page_list, enum ttu_flags flags = ttu_flags | TTU_BATCH_FLUSH; bool was_swapbacked = PageSwapBacked(page); - if (unlikely(PageTransHuge(page))) + if (PageTransHuge(page) && + thp_order(page) >= HPAGE_PMD_ORDER) flags |= TTU_SPLIT_HUGE_PMD; if (!try_to_unmap(page, flags)) { From patchwork Thu Oct 29 19:33:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867333 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D81941130 for ; Thu, 29 Oct 2020 19:34:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AE44320838 for ; Thu, 29 Oct 2020 19:34:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="uWdBNNjA" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726483AbgJ2TeS (ORCPT ); Thu, 29 Oct 2020 15:34:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726449AbgJ2TeP (ORCPT ); Thu, 29 Oct 2020 15:34:15 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1101BC0613CF for ; Thu, 29 Oct 2020 12:34:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: 
Content-Type:Content-ID:Content-Description; bh=i8c7CZBKt8Ob5/OYaiIUyfJTleCuj0emLFJGvRTncuU=; b=uWdBNNjAXs1rCJYYTItaVeSM3U K1J/DzxuCFj1zw6tNJfm7VjwBaTj3OtYg4GL/Mrb3ufFvBjbzQqtwdjMU3PBuieBIGRo84PkFy2Qc f96K75ZsLZsflCQiw8eC+3IpYyiMJa9GZ3DnhjawTbntVRHZ0IuUOcRaJKE670DvTd6/zgkJTT1p5 L0qTFx3v2A7c0BtDAA7urISjaH8w7iyYAOdDFZUOx5/C6wk1NRu0OG1IQhjXzQhYfW2jbQEa2ERNK GTRs8aCcvCdY/PqCekNMqbr6nsoQ3c/tI8WH+Ocsi43H6fHdK0Jy9V9kIdIcvrKb8bRELH8gouo3L FyOqlz9w==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgY-0007cG-FY; Thu, 29 Oct 2020 19:34:10 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 11/19] mm/filemap: Allow PageReadahead to be set on head pages Date: Thu, 29 Oct 2020 19:33:57 +0000 Message-Id: <20201029193405.29125-12-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Adjust the callers to only call PageReadahead on the head page. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/page-flags.h | 4 ++-- mm/filemap.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 00d8efd72496..8b523d25fccf 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -380,8 +380,8 @@ PAGEFLAG(MappedToDisk, mappedtodisk, PF_NO_TAIL) /* PG_readahead is only used for reads; PG_reclaim is only for writes */ PAGEFLAG(Reclaim, reclaim, PF_NO_TAIL) TESTCLEARFLAG(Reclaim, reclaim, PF_NO_TAIL) -PAGEFLAG(Readahead, reclaim, PF_NO_COMPOUND) - TESTCLEARFLAG(Readahead, reclaim, PF_NO_COMPOUND) +PAGEFLAG(Readahead, reclaim, PF_ONLY_HEAD) + TESTCLEARFLAG(Readahead, reclaim, PF_ONLY_HEAD) #ifdef CONFIG_HIGHMEM /* diff --git a/mm/filemap.c b/mm/filemap.c index dabc26cf0067..91145e33635d 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2892,10 +2892,10 @@ static struct file *do_async_mmap_readahead(struct vm_fault *vmf, mmap_miss = READ_ONCE(ra->mmap_miss); if (mmap_miss) WRITE_ONCE(ra->mmap_miss, --mmap_miss); - if (PageReadahead(page)) { + if (PageReadahead(thp_head(page))) { fpin = maybe_unlock_mmap_for_io(vmf, fpin); page_cache_async_readahead(mapping, ra, file, - page, offset, ra->ra_pages); + thp_head(page), offset, ra->ra_pages); } return fpin; } From patchwork Thu Oct 29 19:33:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867339 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D584F697 for ; Thu, 29 Oct 2020 19:34:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A95152151B for ; Thu, 29 Oct 2020 19:34:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="kqmrBljn" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726518AbgJ2TeT (ORCPT ); Thu, 29 Oct 2020 15:34:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726461AbgJ2TeP (ORCPT ); Thu, 29 Oct 2020 15:34:15 -0400 
Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7B43FC0613D2 for ; Thu, 29 Oct 2020 12:34:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=9jHZ5GBv8es5xhWOYPp8bxpNJzE+ReAs7wgmc5S+gNQ=; b=kqmrBljnmjaTQvkFMi1WHooSuA 0QUGbNxPu826EVcm6Lisk4YJeN/cmFioWhhmvI5Ht9FRuJ5FI48bubHzDj43/Jm2yluBYF9Dw1m2Q NCpx86vk0VZNMZ36YaDN66SivObYsJFcvyP4BYwdcsKH2yc+W7Q+oS79RrmAxmK9dOZs+a5sOyCco p0vD8Sxh0gr50U27O6CEW8neQxQNZWqg5MxMHHQ5eaIkqnzRIZs8f54ATCKtCazKyGfoyhnWYDUjz 2RC4RB+Nn5ib2cjZscK1pg4cXzxUvaJnsb996ggqJLTXoHbeY6rI5vVeBMf8tCUtkoR1SIqSOXdkf MzdUImBQ==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgY-0007cM-NC; Thu, 29 Oct 2020 19:34:10 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 12/19] mm: Pass a sleep state to put_and_wait_on_page_locked Date: Thu, 29 Oct 2020 19:33:58 +0000 Message-Id: <20201029193405.29125-13-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This is prep work for the next patch, but I think at least one of the current callers would prefer a killable sleep to an uninterruptible one. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/pagemap.h | 3 +-- mm/filemap.c | 7 +++++-- mm/huge_memory.c | 4 ++-- mm/migrate.c | 4 ++-- 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index bb6692001028..b5baaf1347e2 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -699,8 +699,7 @@ static inline int wait_on_page_locked_killable(struct page *page) return wait_on_page_bit_killable(compound_head(page), PG_locked); } -extern void put_and_wait_on_page_locked(struct page *page); - +int put_and_wait_on_page_locked(struct page *page, int state); void wait_on_page_writeback(struct page *page); extern void end_page_writeback(struct page *page); void wait_for_stable_page(struct page *page); diff --git a/mm/filemap.c b/mm/filemap.c index 91145e33635d..215729048cbd 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1369,20 +1369,23 @@ static int wait_on_page_locked_async(struct page *page, /** * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked * @page: The page to wait for. + * @state: The sleep state (TASK_KILLABLE, TASK_UNINTERRUPTIBLE, etc). * * The caller should hold a reference on @page. They expect the page to * become unlocked relatively soon, but do not wish to hold up migration * (for example) by holding the reference while waiting for the page to * come unlocked. After this function returns, the caller should not * dereference @page. + * + * Return: 0 if the page was unlocked or -EINTR if interrupted by a signal. 
*/ -void put_and_wait_on_page_locked(struct page *page) +int put_and_wait_on_page_locked(struct page *page, int state) { wait_queue_head_t *q; page = compound_head(page); q = page_waitqueue(page); - wait_on_page_bit_common(q, page, PG_locked, TASK_UNINTERRUPTIBLE, DROP); + return wait_on_page_bit_common(q, page, PG_locked, state, DROP); } /** diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 03374ec61b45..580afce93d4a 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1432,7 +1432,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd) if (!get_page_unless_zero(page)) goto out_unlock; spin_unlock(vmf->ptl); - put_and_wait_on_page_locked(page); + put_and_wait_on_page_locked(page, TASK_UNINTERRUPTIBLE); goto out; } @@ -1468,7 +1468,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd) if (!get_page_unless_zero(page)) goto out_unlock; spin_unlock(vmf->ptl); - put_and_wait_on_page_locked(page); + put_and_wait_on_page_locked(page, TASK_UNINTERRUPTIBLE); goto out; } diff --git a/mm/migrate.c b/mm/migrate.c index 39663dfbc273..a50bbb0e029b 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -335,7 +335,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep, if (!get_page_unless_zero(page)) goto out; pte_unmap_unlock(ptep, ptl); - put_and_wait_on_page_locked(page); + put_and_wait_on_page_locked(page, TASK_UNINTERRUPTIBLE); return; out: pte_unmap_unlock(ptep, ptl); @@ -369,7 +369,7 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd) if (!get_page_unless_zero(page)) goto unlock; spin_unlock(ptl); - put_and_wait_on_page_locked(page); + put_and_wait_on_page_locked(page, TASK_UNINTERRUPTIBLE); return; unlock: spin_unlock(ptl); From patchwork Thu Oct 29 19:33:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867351 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9503C92C for ; Thu, 29 Oct 2020 19:34:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6A8F4214DB for ; Thu, 29 Oct 2020 19:34:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="b0ecZ0DX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726571AbgJ2Te2 (ORCPT ); Thu, 29 Oct 2020 15:34:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725828AbgJ2TeM (ORCPT ); Thu, 29 Oct 2020 15:34:12 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7B3DBC0613D3 for ; Thu, 29 Oct 2020 12:34:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=toB6t8pYm3LNmloiIQYto1Jnb9pOJ7DO7/Ur4ZlMNcE=; b=b0ecZ0DXNvHzbtvr3lRs9mLhr1 3gFIpggSVNuTQP6apiQvXyXhbVeyFuUt8FzveMDCuYRrxSY/ySB/u09BhV7CNCROjkJq8FfaULIrW PVzglXmnsWoXRa4okzDML9C1j6nZN05GcGhmPvuT7tSdkIwfEzSEl7qWfLBpu1/rOhdlJBDrLh+TM DToJBTbso6oLe34BpzzfzJ+bqb+G2oeM8KQAkfyVhXKTa3sDyapEvIAaN9gEnDF6o7aXsHUt4qCXo 
QAqvtOXrdwqW7z2ZrK7RHXrswqSbLmchxoUFVsuLHHpvlOP9OVfyBv7EcpdTTuMOckXiyFkwmPDFb vIeyp1sg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgY-0007cU-V3; Thu, 29 Oct 2020 19:34:11 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 13/19] mm/filemap: Support readpage splitting a page Date: Thu, 29 Oct 2020 19:33:59 +0000 Message-Id: <20201029193405.29125-14-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We need to tell readpage which subpage we're actually interested in (by passing the subpage to gfbr_read_page()), and if it does split the THP, we need to update the page in the page array to be the subpage. For page splitting to succeed, the thread asking to split the page has to be the only one with a reference to the page. Calling wait_on_page_locked() while holding a reference to the page will effectively prevent this from happening with sufficient threads waiting on the same page. Use put_and_wait_on_page_locked() to sleep without holding a reference to the page, then retry the page lookup after the page is unlocked. Since we now get the page lock a little earlier in gfbr_update_page(), we can eliminate a number of duplicate checks. The original intent (commit ebded02788b5 ("avoid unnecessary calls to lock_page when waiting for IO to complete during a read") behind getting the page lock later was to avoid re-locking the page after it has been brought uptodate by another thread. We will still avoid that because we go through the normal lookup path again after the winning thread has brought the page uptodate. Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 76 +++++++++++++++++----------------------------------- 1 file changed, 24 insertions(+), 52 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 215729048cbd..87f89e5dd64e 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1358,14 +1358,6 @@ static int __wait_on_page_locked_async(struct page *page, return ret; } -static int wait_on_page_locked_async(struct page *page, - struct wait_page_queue *wait) -{ - if (!PageLocked(page)) - return 0; - return __wait_on_page_locked_async(compound_head(page), wait, false); -} - /** * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked * @page: The page to wait for. @@ -2259,6 +2251,7 @@ static struct page *gfbr_read_page(struct kiocb *iocb, return error != AOP_TRUNCATED_PAGE ? ERR_PTR(error) : NULL; } + page = thp_head(page); if (!PageUptodate(page)) { error = lock_page_for_iocb(iocb, page); if (unlikely(error)) { @@ -2292,64 +2285,42 @@ static struct page *gfbr_update_page(struct kiocb *iocb, struct inode *inode = mapping->host; int error; - /* - * See comment in do_read_cache_page on why - * wait_on_page_locked is used to avoid unnecessarily - * serialisations and why it's safe. 
- */ if (iocb->ki_flags & IOCB_WAITQ) { - error = wait_on_page_locked_async(page, - iocb->ki_waitq); - } else { - error = wait_on_page_locked_killable(page); - } - if (unlikely(error)) { - put_page(page); - return ERR_PTR(error); + error = lock_page_async(page, iocb->ki_waitq); + if (error) { + put_page(page); + return ERR_PTR(error); + } + } else if (!trylock_page(page)) { + put_and_wait_on_page_locked(page, TASK_KILLABLE); + return NULL; } + if (PageUptodate(page)) - return page; + goto uptodate; if (inode->i_blkbits == PAGE_SHIFT || !mapping->a_ops->is_partially_uptodate) - goto page_not_up_to_date; + goto readpage; /* pipes can't handle partially uptodate pages */ if (unlikely(iov_iter_is_pipe(iter))) - goto page_not_up_to_date; - if (!trylock_page(page)) - goto page_not_up_to_date; - /* Did it get truncated before we got the lock? */ + goto readpage; if (!page->mapping) - goto page_not_up_to_date_locked; + goto truncated; if (!mapping->a_ops->is_partially_uptodate(page, - pos & ~PAGE_MASK, count)) - goto page_not_up_to_date_locked; + pos & (thp_size(page) - 1), count)) + goto readpage; +uptodate: unlock_page(page); return page; -page_not_up_to_date: - /* Get exclusive access to the page ... */ - error = lock_page_for_iocb(iocb, page); - if (unlikely(error)) { - put_page(page); - return ERR_PTR(error); - } - -page_not_up_to_date_locked: - /* Did it get truncated before we got the lock? */ - if (!page->mapping) { - unlock_page(page); - put_page(page); - return NULL; - } - - /* Did somebody else fill it already? */ - if (PageUptodate(page)) { - unlock_page(page); - return page; - } - +readpage: + page += (pos / PAGE_SIZE) - page->index; return gfbr_read_page(iocb, mapping, page); +truncated: + unlock_page(page); + put_page(page); + return NULL; } static struct page *gfbr_create_page(struct kiocb *iocb, @@ -2443,6 +2414,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter, err = PTR_ERR_OR_ZERO(page); break; } + pages[i] = page; } } From patchwork Thu Oct 29 19:34:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867329 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1261A1130 for ; Thu, 29 Oct 2020 19:34:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DFD4520782 for ; Thu, 29 Oct 2020 19:34:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="g0vNUxW0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726473AbgJ2TeQ (ORCPT ); Thu, 29 Oct 2020 15:34:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56556 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726254AbgJ2TeN (ORCPT ); Thu, 29 Oct 2020 15:34:13 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4F74C0613CF for ; Thu, 29 Oct 2020 12:34:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=/n41mDNN8SRDTuHLwr+ITcPUNk1MPlDk+yJjScFNfLA=; 
b=g0vNUxW0T6MxqaqqEh+HadhWEZ +OV4xwepyUQjVS4tvFfVvJxH3GRpbKulJXVI440cSz7IJtl2G5khv3kgAHg4JI7gh3nz4Qe1qlv6m 2WNKqubJTSqbmT+OFXx+9cn9mk8AzjLUmQ2UVlIjw/xNl6m4ZCADn/VvgJLnW2o9AAV+7OIkA7IYb UUonl2o+ZxMNxC0yB4Q1OJxQew8HS40zTkm5xhWahODZRoby6bpoDW79RwQQQDsl5UvZgQXKsZzPx wvGiwdfcTRDdG0m9EY+2WXyAXjd7F+GpunIdkS9tQGUcxEDLDlJwxza9z+BeyCuIfRohi+Uyqj2jV sodeo9Kw==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgZ-0007cd-9Z; Thu, 29 Oct 2020 19:34:11 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 14/19] mm/filemap: Inline __wait_on_page_locked_async into caller Date: Thu, 29 Oct 2020 19:34:00 +0000 Message-Id: <20201029193405.29125-15-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The previous patch removed wait_on_page_locked_async(), so inline __wait_on_page_locked_async into __lock_page_async(). Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 53 ++++++++++++++++++++++------------------------------ 1 file changed, 22 insertions(+), 31 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 87f89e5dd64e..211a7c1fab3f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1328,36 +1328,6 @@ int wait_on_page_bit_killable(struct page *page, int bit_nr) } EXPORT_SYMBOL(wait_on_page_bit_killable); -static int __wait_on_page_locked_async(struct page *page, - struct wait_page_queue *wait, bool set) -{ - struct wait_queue_head *q = page_waitqueue(page); - int ret = 0; - - wait->page = page; - wait->bit_nr = PG_locked; - - spin_lock_irq(&q->lock); - __add_wait_queue_entry_tail(q, &wait->wait); - SetPageWaiters(page); - if (set) - ret = !trylock_page(page); - else - ret = PageLocked(page); - /* - * If we were succesful now, we know we're still on the - * waitqueue as we're still under the lock. This means it's - * safe to remove and return success, we know the callback - * isn't going to trigger. - */ - if (!ret) - __remove_wait_queue(q, &wait->wait); - else - ret = -EIOCBQUEUED; - spin_unlock_irq(&q->lock); - return ret; -} - /** * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked * @page: The page to wait for. @@ -1525,7 +1495,28 @@ EXPORT_SYMBOL_GPL(__lock_page_killable); int __lock_page_async(struct page *page, struct wait_page_queue *wait) { - return __wait_on_page_locked_async(page, wait, true); + struct wait_queue_head *q = page_waitqueue(page); + int ret = 0; + + wait->page = page; + wait->bit_nr = PG_locked; + + spin_lock_irq(&q->lock); + __add_wait_queue_entry_tail(q, &wait->wait); + SetPageWaiters(page); + ret = !trylock_page(page); + /* + * If we were succesful now, we know we're still on the + * waitqueue as we're still under the lock. This means it's + * safe to remove and return success, we know the callback + * isn't going to trigger. 
+ */ + if (!ret) + __remove_wait_queue(q, &wait->wait); + else + ret = -EIOCBQUEUED; + spin_unlock_irq(&q->lock); + return ret; } /* From patchwork Thu Oct 29 19:34:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867349 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 53DE01130 for ; Thu, 29 Oct 2020 19:34:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2B48F20EDD for ; Thu, 29 Oct 2020 19:34:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="UBv1pHc9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726482AbgJ2Te1 (ORCPT ); Thu, 29 Oct 2020 15:34:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726256AbgJ2TeN (ORCPT ); Thu, 29 Oct 2020 15:34:13 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24298C0613D2 for ; Thu, 29 Oct 2020 12:34:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=wmxCHbi3U3r2Cgj8P7MXYuiRGLaiVPEgjTYbK8P//3o=; b=UBv1pHc9YqhyysyzsEv48fXnR5 OsvB0zVLCy1RrfHvu5sor3juarZomqtlTo6qP9+dl4UXMlbLqBh6jXcMvlNMY0c1PcUVxM8hfdDDf bSm+ZVY3xoNuC3O95k7Lday+o2ssEVKRNXi9a4+x3RBLDd+hFLvEUdsp8xOwuBgji8n32avKg7YtH ANeDZoy65mZsMbciBszt4BatDfn0IkgZmwImOkCVYlHH+Pheu6doQAv/jSqOcIHCoUzfOUsYGnazZ MAp80CGE5Sj4INiR2kj0g/wdOcUXQqW3LdSmTGxr0hedawo7cgBthiR711qQhWIvgd98l4D9o+3If SBX0EbRg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgZ-0007cj-I3; Thu, 29 Oct 2020 19:34:11 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 15/19] mm/readahead: Add THP readahead Date: Thu, 29 Oct 2020 19:34:01 +0000 Message-Id: <20201029193405.29125-16-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org If the filesystem supports THPs, allocate larger pages in the readahead code when it seems worth doing. The heuristic for choosing larger page sizes will surely need some tuning, but this aggressive ramp-up seems good for testing. 
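To make the ramp-up concrete, here is a small stand-alone sketch (editorial, not part of the patch) that models how the allocation order grows across successive readahead rounds. ramp_order() and ra_size are illustrative names of my own; HPAGE_PMD_ORDER is assumed to be 9, i.e. 2MB PMDs with 4kB base pages as on x86-64.

#include <stdio.h>

#define HPAGE_PMD_ORDER 9       /* assumed: 2MB PMD, 4kB base pages */

/*
 * Grow the allocation order by two each round, capped at PMD order and
 * at the size of the readahead window (ra_size, in base pages).
 */
static unsigned int ramp_order(unsigned int order, unsigned long ra_size)
{
        if (order < HPAGE_PMD_ORDER) {
                order += 2;
                if (order > HPAGE_PMD_ORDER)
                        order = HPAGE_PMD_ORDER;
                while ((1UL << order) > ra_size)
                        order--;
        }
        return order;
}

int main(void)
{
        unsigned long ra_size = 256;    /* a 1MB window of 4kB pages */
        unsigned int order = 0;
        int round;

        for (round = 1; round <= 6; round++) {
                order = ramp_order(order, ra_size);
                printf("round %d: order %u (%lu pages per allocation)\n",
                       round, order, 1UL << order);
        }
        return 0;
}

Starting from order 0 this reaches order 8 after four rounds and then stays pinned by the window size; with a full PMD-sized window it would reach HPAGE_PMD_ORDER instead.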
Signed-off-by: Matthew Wilcox (Oracle) --- mm/readahead.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 94 insertions(+), 6 deletions(-) diff --git a/mm/readahead.c b/mm/readahead.c index c5b0457415be..dc9876104ee8 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -149,7 +149,7 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages, blk_finish_plug(&plug); - BUG_ON(!list_empty(pages)); + BUG_ON(pages && !list_empty(pages)); BUG_ON(readahead_count(rac)); out: @@ -429,11 +429,99 @@ static int try_context_readahead(struct address_space *mapping, return 1; } +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +static inline int ra_alloc_page(struct readahead_control *ractl, pgoff_t index, + pgoff_t mark, unsigned int order, gfp_t gfp) +{ + int err; + struct page *page = __page_cache_alloc_order(gfp, order); + + if (!page) + return -ENOMEM; + if (mark - index < (1UL << order)) + SetPageReadahead(page); + err = add_to_page_cache_lru(page, ractl->mapping, index, gfp); + if (err) + put_page(page); + else + ractl->_nr_pages += 1UL << order; + return err; +} + +static void page_cache_ra_order(struct readahead_control *ractl, + struct file_ra_state *ra, unsigned int new_order) +{ + struct address_space *mapping = ractl->mapping; + pgoff_t index = readahead_index(ractl); + pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT; + pgoff_t mark = index + ra->size - ra->async_size; + int err = 0; + gfp_t gfp = readahead_gfp_mask(mapping); + + if (!mapping_thp_support(mapping) || ra->size < 4) + goto fallback; + + limit = min(limit, index + ra->size - 1); + + /* Grow page size up to PMD size */ + if (new_order < HPAGE_PMD_ORDER) { + new_order += 2; + if (new_order > HPAGE_PMD_ORDER) + new_order = HPAGE_PMD_ORDER; + while ((1 << new_order) > ra->size) + new_order--; + } + + while (index <= limit) { + unsigned int order = new_order; + + /* Align with smaller pages if needed */ + if (index & ((1UL << order) - 1)) { + order = __ffs(index); + if (order == 1) + order = 0; + } + /* Don't allocate pages past EOF */ + while (index + (1UL << order) - 1 > limit) { + if (--order == 1) + order = 0; + } + err = ra_alloc_page(ractl, index, mark, order, gfp); + if (err) + break; + index += 1UL << order; + } + + if (index > limit) { + ra->size += index - limit - 1; + ra->async_size += index - limit - 1; + } + + read_pages(ractl, NULL, false); + + /* + * If there were already pages in the page cache, then we may have + * left some gaps. Let the regular readahead code take care of this + * situation. + */ + if (!err) + return; +fallback: + do_page_cache_ra(ractl, ra->size, ra->async_size); +} +#else +static void page_cache_ra_order(struct readahead_control *ractl, + struct file_ra_state *ra, unsigned int order) +{ + do_page_cache_ra(ractl, ra->size, ra->async_size); +} +#endif + /* * A minimal readahead algorithm for trivial sequential/random reads. */ static void ondemand_readahead(struct readahead_control *ractl, - struct file_ra_state *ra, bool hit_readahead_marker, + struct file_ra_state *ra, struct page *page, unsigned long req_size) { struct backing_dev_info *bdi = inode_to_bdi(ractl->mapping->host); @@ -473,7 +561,7 @@ static void ondemand_readahead(struct readahead_control *ractl, * Query the pagecache for async_size, which normally equals to * readahead size. Ramp it up and use it as the new readahead size. 
*/ - if (hit_readahead_marker) { + if (page) { pgoff_t start; rcu_read_lock(); @@ -546,7 +634,7 @@ static void ondemand_readahead(struct readahead_control *ractl, } ractl->_index = ra->start; - do_page_cache_ra(ractl, ra->size, ra->async_size); + page_cache_ra_order(ractl, ra, page ? thp_order(page) : 0); } void page_cache_sync_ra(struct readahead_control *ractl, @@ -574,7 +662,7 @@ void page_cache_sync_ra(struct readahead_control *ractl, } /* do read-ahead */ - ondemand_readahead(ractl, ra, false, req_count); + ondemand_readahead(ractl, ra, NULL, req_count); } EXPORT_SYMBOL_GPL(page_cache_sync_ra); @@ -604,7 +692,7 @@ void page_cache_async_ra(struct readahead_control *ractl, return; /* do read-ahead */ - ondemand_readahead(ractl, ra, true, req_count); + ondemand_readahead(ractl, ra, page, req_count); } EXPORT_SYMBOL_GPL(page_cache_async_ra); From patchwork Thu Oct 29 19:34:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867347 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AB7DD697 for ; Thu, 29 Oct 2020 19:34:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 824EE20EDD for ; Thu, 29 Oct 2020 19:34:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="XBvRw9or" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726200AbgJ2Te0 (ORCPT ); Thu, 29 Oct 2020 15:34:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726184AbgJ2TeN (ORCPT ); Thu, 29 Oct 2020 15:34:13 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70B91C0613D3 for ; Thu, 29 Oct 2020 12:34:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=JtnSk9FEEgyaxqeuAWYlzwggtFwYiu6sjhMbVnBCTYw=; b=XBvRw9orNd8h9rQjHnBMKhkep1 /jl88pKbWM5jCvFqJfh85I401AZooE3a/C4ejwpp4qhiotrF8dlXGRxLt50GKm1fDAafH/N692M7U 58kez5ZDeU2JvNv7Q5BwvMb26oOJX+UaSRBc+beO+4AG1wEuACpXZlprPy8PS1X8iUi5WbqqqCsY9 m5OmbxHlIeKL7a04gXeGOsiI2S4GGrMdWUZnUX3nxD9BvREQtyJXB3xqjTJqRsKbvQnmQ7xOaIRI2 UlCkxoWNEzimRE6FXMrbj/lkxAZqf4uQfizyDRzR0KAlzU0JooGofizcENtYR8EiKqqVkDb44/c/a bmmlq3Tw==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDgZ-0007cq-Ok; Thu, 29 Oct 2020 19:34:11 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: William Kucharski , Matthew Wilcox Subject: [PATCH 16/19] mm/readahead: Align THP mappings for non-DAX Date: Thu, 29 Oct 2020 19:34:02 +0000 Message-Id: <20201029193405.29125-17-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: William Kucharski When we have the opportunity to use transparent huge pages to map a file, we want to 
follow the same rules as DAX. Signed-off-by: William Kucharski Signed-off-by: Matthew Wilcox (Oracle) --- mm/huge_memory.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 580afce93d4a..660fe8c29cd9 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -556,13 +556,10 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long ret; loff_t off = (loff_t)pgoff << PAGE_SHIFT; - if (!IS_DAX(filp->f_mapping->host) || !IS_ENABLED(CONFIG_FS_DAX_PMD)) - goto out; - ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE); if (ret) return ret; -out: + return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } EXPORT_SYMBOL_GPL(thp_get_unmapped_area); From patchwork Thu Oct 29 19:34:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867343 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9744B92C for ; Thu, 29 Oct 2020 19:34:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 70552214DB for ; Thu, 29 Oct 2020 19:34:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="lxGnL+AQ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726564AbgJ2TeZ (ORCPT ); Thu, 29 Oct 2020 15:34:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726200AbgJ2TeO (ORCPT ); Thu, 29 Oct 2020 15:34:14 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD20AC0613D4 for ; Thu, 29 Oct 2020 12:34:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=WKx2KmhhcuRAZfmxd4Yn/rz7noB6w4aBxf/AcKdOaQA=; b=lxGnL+AQeRmL3QGNzMU1QfXWrx VgTsRmAuLPj/ZStrR1BVz9vkhPUyttBDL9f5RriNS3MjUqqEeYFpyCs557gYBVEyZaRCUP/4K/idO hdN1JcWlneeRa4n52NCb6c3Cz7t5da3eqIjXL1J79kkp/JyGjFCZhcz6JF4+pVvb/8ZElxGl1qDHW UPlkICcUDozLL2VX6Q2Pg0aNqQnyICSsvoI8D42q1+6E6RMYUX47bQWiXmIHCnlN/8zB8iGMtmDHn gVaPYOaPQplJkTk1+P4Rugm4l6go4IRjf6Jp7247mwUbug5kAtGqs2w8rindvY6jNsWR9iHj3pDBk 6V3T1dLA==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDga-0007cv-0u; Thu, 29 Oct 2020 19:34:12 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 17/19] mm/readahead: Switch to page_cache_ra_order Date: Thu, 29 Oct 2020 19:34:03 +0000 Message-Id: <20201029193405.29125-18-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org do_page_cache_ra() was being exposed for the benefit of do_sync_mmap_readahead(). Switch it over to page_cache_ra_order() partly because it's a better interface but mostly for the benefit of the next patch. 
Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 2 +- mm/internal.h | 4 ++-- mm/readahead.c | 6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 211a7c1fab3f..ee4a4990bad3 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2833,7 +2833,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) ra->size = ra->ra_pages; ra->async_size = ra->ra_pages / 4; ractl._index = ra->start; - do_page_cache_ra(&ractl, ra->size, ra->async_size); + page_cache_ra_order(&ractl, ra, 0); return fpin; } diff --git a/mm/internal.h b/mm/internal.h index 1391e3239547..3ea43642b99d 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -49,8 +49,8 @@ void unmap_page_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end, struct zap_details *details); -void do_page_cache_ra(struct readahead_control *, unsigned long nr_to_read, - unsigned long lookahead_size); +void page_cache_ra_order(struct readahead_control *, struct file_ra_state *, + unsigned int order); void force_page_cache_ra(struct readahead_control *, struct file_ra_state *, unsigned long nr); static inline void force_page_cache_readahead(struct address_space *mapping, diff --git a/mm/readahead.c b/mm/readahead.c index dc9876104ee8..d280e8f2e834 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -246,7 +246,7 @@ EXPORT_SYMBOL_GPL(page_cache_ra_unbounded); * behaviour which would occur if page allocations are causing VM writeback. * We really don't want to intermingle reads and writes like that. */ -void do_page_cache_ra(struct readahead_control *ractl, +static void do_page_cache_ra(struct readahead_control *ractl, unsigned long nr_to_read, unsigned long lookahead_size) { struct inode *inode = ractl->mapping->host; @@ -448,7 +448,7 @@ static inline int ra_alloc_page(struct readahead_control *ractl, pgoff_t index, return err; } -static void page_cache_ra_order(struct readahead_control *ractl, +void page_cache_ra_order(struct readahead_control *ractl, struct file_ra_state *ra, unsigned int new_order) { struct address_space *mapping = ractl->mapping; @@ -510,7 +510,7 @@ static void page_cache_ra_order(struct readahead_control *ractl, do_page_cache_ra(ractl, ra->size, ra->async_size); } #else -static void page_cache_ra_order(struct readahead_control *ractl, +void page_cache_ra_order(struct readahead_control *ractl, struct file_ra_state *ra, unsigned int order) { do_page_cache_ra(ractl, ra->size, ra->async_size); From patchwork Thu Oct 29 19:34:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867331 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 52F10697 for ; Thu, 29 Oct 2020 19:34:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 29FE320782 for ; Thu, 29 Oct 2020 19:34:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="jc65qq8S" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726474AbgJ2TeR (ORCPT ); Thu, 29 Oct 2020 15:34:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726282AbgJ2TeO (ORCPT ); Thu, 29 Oct 2020 15:34:14 -0400 Received: from casper.infradead.org 
(casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA0FEC0613CF for ; Thu, 29 Oct 2020 12:34:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=l3Uro+7ac/lRqi712xHiq8ZfTNOWcphqVa06+YHLQMc=; b=jc65qq8SyRwDz41gI2oZLmMKJt qT2hX/tLg5oVN3xhHJy3IKoyss20k4x97u4URjS2IlDEb/+2tCDm7E68Wzw6JLcIoswEapJ3N9MFE 5/OYTq60uCAa+pYXJnu27Nj8ChZsaPE8tJVqQdKLE4+/E1sq8/dAQ4mndKyDWZbO7udZNuQYWB2/T e7zSMCLUcigQnfzkz7vyBGPXr7M+wDML0fvMhnOy6aAkODphaO9kkYYFMlrDrtidZoqq+nMvPkQMm 2mRbDZbqniBoAm/e6v3c/ttgXz8fuLf+0iXx1ODRwW6m2Bxmb0AzzRP8JrUQiCjmffgh0a724BCbq VyUeD3Mg==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDga-0007d1-9I; Thu, 29 Oct 2020 19:34:12 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 18/19] mm/filemap: Support VM_HUGEPAGE for file mappings Date: Thu, 29 Oct 2020 19:34:04 +0000 Message-Id: <20201029193405.29125-19-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org If the VM_HUGEPAGE flag is set, attempt to allocate PMD-sized THPs during readahead, even if we have no history of readahead being successful. Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index ee4a4990bad3..79a2ac001948 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2801,6 +2801,20 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) struct file *fpin = NULL; unsigned int mmap_miss; +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + /* Use the readahead code for THP, even if readahead is disabled */ + if (vmf->vma->vm_flags & VM_HUGEPAGE) { + fpin = maybe_unlock_mmap_for_io(vmf, fpin); + ractl._index &= ~((unsigned long)HPAGE_PMD_NR - 1); + ra->size = HPAGE_PMD_NR; + if (vmf->vma->vm_flags & VM_RAND_READ) + ra->size *= 2; + ra->async_size = HPAGE_PMD_NR; + page_cache_ra_order(&ractl, ra, HPAGE_PMD_ORDER); + return fpin; + } +#endif + /* If we don't want any read-ahead, don't bother */ if (vmf->vma->vm_flags & VM_RAND_READ) return fpin; From patchwork Thu Oct 29 19:34:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 11867335 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84435697 for ; Thu, 29 Oct 2020 19:34:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5B16821531 for ; Thu, 29 Oct 2020 19:34:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="akYO2Qy2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726484AbgJ2TeS (ORCPT ); Thu, 29 Oct 2020 15:34:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1726441AbgJ2TeO (ORCPT ); Thu, 29 Oct 2020 15:34:14 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2CFE0C0613D2 for ; Thu, 29 Oct 2020 12:34:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=moBwCUezAcUFLo9WoDknD01nYVYEhE9p2ZKZoE3n6zM=; b=akYO2Qy2xJ3zVwULlpMcvG8eoi xzf6a72H7GVH2I2lViSoHAXbUOPuAFimkpllWBeTtqcCcnpWzGL2WavspUlfIyn1FDiaBppxkNvRb hutBxqvgnS3FNDsLAIBEapy6pGVLdKYQU2H/gyMbkJCTi1z9kLlnBxqD3wbBTePiWu1qfmyJeeoNp tW1QU42p/Wu8b7dZUWpenEAZ53JGGBRJkJhH26bchczLELRwE6DXhS0ZGjc7tUD5XLtYSxUkMG+pm LFp44VXgB8Sl5XbXWxa4t8Kv33PvevbyjMf/IimtXGnHkXpXXp8K5tULJ4au45swYPoTEcr88U3YJ OHrpXn/w==; Received: from willy by casper.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1kYDga-0007dA-Le; Thu, 29 Oct 2020 19:34:12 +0000 From: "Matthew Wilcox (Oracle)" To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 19/19] selftests/vm/transhuge-stress: Support file-backed THPs Date: Thu, 29 Oct 2020 19:34:05 +0000 Message-Id: <20201029193405.29125-20-willy@infradead.org> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20201029193405.29125-1-willy@infradead.org> References: <20201029193405.29125-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a -f option to test THPs on files Signed-off-by: Matthew Wilcox (Oracle) --- tools/testing/selftests/vm/transhuge-stress.c | 36 ++++++++++++------- 1 file changed, 24 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/vm/transhuge-stress.c b/tools/testing/selftests/vm/transhuge-stress.c index fd7f1b4a96f9..77b775700bf6 100644 --- a/tools/testing/selftests/vm/transhuge-stress.c +++ b/tools/testing/selftests/vm/transhuge-stress.c @@ -26,15 +26,17 @@ #define PAGEMAP_PFN(ent) ((ent) & ((1ull << 55) - 1)) int pagemap_fd; +int backing_fd = -1; +int mmap_flags = MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE; +#define PROT_RW (PROT_READ | PROT_WRITE) int64_t allocate_transhuge(void *ptr) { uint64_t ent[2]; /* drop pmd */ - if (mmap(ptr, HPAGE_SIZE, PROT_READ | PROT_WRITE, - MAP_FIXED | MAP_ANONYMOUS | - MAP_NORESERVE | MAP_PRIVATE, -1, 0) != ptr) + if (mmap(ptr, HPAGE_SIZE, PROT_RW, MAP_FIXED | mmap_flags, + backing_fd, 0) != ptr) errx(2, "mmap transhuge"); if (madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE)) @@ -60,6 +62,8 @@ int main(int argc, char **argv) size_t ram, len; void *ptr, *p; struct timespec a, b; + int i = 0; + char *name = NULL; double s; uint8_t *map; size_t map_len; @@ -69,14 +73,23 @@ int main(int argc, char **argv) ram = SIZE_MAX / 4; else ram *= sysconf(_SC_PAGESIZE); + len = ram; + + while (++i < argc) { + if (!strcmp(argv[i], "-h")) + errx(1, "usage: %s [size in MiB]", argv[0]); + else if (!strcmp(argv[i], "-f")) + name = argv[++i]; + else + len = atoll(argv[i]) << 20; + } - if (argc == 1) - len = ram; - else if (!strcmp(argv[1], "-h")) - errx(1, "usage: %s [size in MiB]", argv[0]); - else - len = atoll(argv[1]) << 20; - + if (name) { + backing_fd = open(name, O_RDWR); + if (backing_fd == -1) + err(2, "open %s", name); + mmap_flags = MAP_SHARED; + } warnx("allocate %zd transhuge pages, using %zd MiB virtual memory" " and %zd MiB of ram", len >> HPAGE_SHIFT, len >> 20, len >> (20 + HPAGE_SHIFT - 
PAGE_SHIFT - 1)); @@ -86,8 +99,7 @@ int main(int argc, char **argv) err(2, "open pagemap"); len -= len % HPAGE_SIZE; - ptr = mmap(NULL, len + HPAGE_SIZE, PROT_READ | PROT_WRITE, - MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); + ptr = mmap(NULL, len + HPAGE_SIZE, PROT_RW, mmap_flags, backing_fd, 0); if (ptr == MAP_FAILED) err(2, "initial mmap"); ptr += HPAGE_SIZE - (uintptr_t)ptr % HPAGE_SIZE;
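For reference, an illustrative way to run the updated test (the path and size below are examples, not taken from the patch): with no -f argument the test keeps its existing anonymous-memory behaviour, and with -f it exercises the file-backed THP path, e.g.

        ./transhuge-stress 256
        ./transhuge-stress -f /mnt/scratch/thpfile 256

The backing file is opened O_RDWR without O_CREAT, so it has to exist beforehand and live on a filesystem whose page cache supports THPs.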