From patchwork Mon Jun 29 15:19:32 2020
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11631109
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org, Andrew Morton
Cc: "Matthew Wilcox (Oracle)"
Subject: [PATCH 1/2] mm: Move PageDoubleMap bit
Date: Mon, 29 Jun 2020 16:19:32 +0100
Message-Id: <20200629151933.15671-2-willy@infradead.org>
In-Reply-To: <20200629151933.15671-1-willy@infradead.org>
References: <20200629151933.15671-1-willy@infradead.org>

PG_private_2 is defined as being PF_ANY (applicable to tail pages as
well as regular and head pages).  That means that the first tail page
of a double-map page will appear to have Private2 set.  Use the
Workingset bit instead, which is defined as PF_HEAD, so any attempt
to access the Workingset bit on a tail page will redirect to the head
page's Workingset bit.

Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Zi Yan
---
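[Editor's illustration, not part of the patch: a minimal stand-alone C
model of what the PF_ANY/PF_HEAD distinction means. The struct and
helpers below are invented stand-ins, not the kernel's real struct page
or page-flag machinery (the kernel encodes the head pointer in
page->compound_head). A PF_ANY test reads the bit of exactly the page
it is given; a PF_HEAD test first redirects a tail page to its head.]

#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for struct page, for illustration only */
struct page {
	unsigned long flags;
	struct page *head;	/* NULL unless this is a tail page */
};

/* PF_ANY policy: test the bit on exactly the page passed in */
static bool test_bit_any(const struct page *page, int bit)
{
	return page->flags & (1UL << bit);
}

/* PF_HEAD policy: redirect tail pages to their head page first */
static bool test_bit_head(const struct page *page, int bit)
{
	const struct page *head = page->head ? page->head : page;

	return head->flags & (1UL << bit);
}

int main(void)
{
	struct page head = { .flags = 1UL << 5 };	/* bit 5 set on head */
	struct page tail = { .head = &head };		/* tail's own bit clear */

	printf("PF_ANY  on tail: %d\n", test_bit_any(&tail, 5));  /* prints 0 */
	printf("PF_HEAD on tail: %d\n", test_bit_head(&tail, 5)); /* prints 1 */
	return 0;
}

[This is why a PF_ANY flag sharing a bit with PG_double_map (stored in
the first tail page) appears spuriously set, while a PF_HEAD flag does
not.]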
 include/linux/page-flags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6be1aa559b1e..b4e6051aa311 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -164,7 +164,7 @@ enum pageflags {
 	PG_slob_free = PG_private,
 
 	/* Compound pages. Stored in first tail page's flags */
-	PG_double_map = PG_private_2,
+	PG_double_map = PG_workingset,
 
 	/* non-lru isolated movable page */
 	PG_isolated = PG_reclaim,

From patchwork Mon Jun 29 15:20:33 2020
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 11631133
From: "Matthew Wilcox (Oracle)"
To: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton
Cc: "Matthew Wilcox (Oracle)", Hugh Dickins
Subject: [PATCH 2/2] mm: Use multi-index entries in the page cache
Date: Mon, 29 Jun 2020 16:20:33 +0100
Message-Id: <20200629152033.16175-3-willy@infradead.org>
In-Reply-To: <20200629152033.16175-1-willy@infradead.org>
References: <20200629152033.16175-1-willy@infradead.org>

We currently store order-N THPs as 2^N consecutive entries.  While
this consumes rather more memory than necessary, it also turns out to
be buggy for filesystems which track dirty pages: a writeback
operation which starts in the middle of a dirty THP will not notice
the page is dirty, because the dirty bit is only set on the head
index.  With multi-index entries, the whole THP is represented by a
single entry, so the dirty bit is found no matter which index the
lookup starts at.  This does end up simplifying the page cache
slightly, although not as much as I had hoped.

Signed-off-by: Matthew Wilcox (Oracle)
---
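[Editor's illustration, not part of the patch: a toy model of the
multi-index idea — this is not the XArray, and the names and layout
below are invented. One shared entry covers an aligned 2^order range
of indices, so a lookup at any index in the range lands on the same
entry, and a dirty mark set through a middle index is visible from
every other index. This is exactly the writeback case the commit
message describes.]

#include <stdio.h>

#define ORDER	2		/* one entry covers 1 << ORDER == 4 indices */
#define NSLOTS	16

struct entry {
	void *page;
	int dirty;
};

/* One shared slot per aligned 2^ORDER range of indices */
static struct entry slots[NSLOTS >> ORDER];

/* Every index in an aligned range resolves to the same entry */
static struct entry *lookup(unsigned long index)
{
	return &slots[index >> ORDER];
}

int main(void)
{
	int thp = 42;			/* stand-in for a huge page */

	lookup(4)->page = &thp;		/* one store covers indices 4..7 */
	lookup(5)->dirty = 1;		/* dirtied via a middle index... */

	/* ...and a lookup starting at index 6 still sees the dirty bit */
	printf("page=%p dirty=%d\n", lookup(6)->page, lookup(6)->dirty);
	return 0;
}

[With the old scheme of 2^N consecutive entries, index 6 would have had
its own entry with its own (clear) mark, which is the bug being fixed.]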
 mm/filemap.c     | 42 +++++++++++++++++++-----------------------
 mm/huge_memory.c | 21 +++++++++++++++++----
 mm/khugepaged.c  | 12 +++++++++++-
 mm/shmem.c       | 11 ++---------
 4 files changed, 49 insertions(+), 37 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 80ce3658b147..28859bc43a3a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -126,13 +126,12 @@ static void page_cache_delete(struct address_space *mapping,
 
 	/* hugetlb pages are represented by a single entry in the xarray */
 	if (!PageHuge(page)) {
-		xas_set_order(&xas, page->index, compound_order(page));
-		nr = compound_nr(page);
+		xas_set_order(&xas, page->index, thp_order(page));
+		nr = thp_nr_pages(page);
 	}
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(PageTail(page), page);
-	VM_BUG_ON_PAGE(nr != 1 && shadow, page);
 
 	xas_store(&xas, shadow);
 	xas_init_marks(&xas);
@@ -322,19 +321,12 @@ static void page_cache_delete_batch(struct address_space *mapping,
 
 		WARN_ON_ONCE(!PageLocked(page));
 
-		if (page->index == xas.xa_index)
-			page->mapping = NULL;
+		page->mapping = NULL;
 		/* Leave page->index set: truncation lookup relies on it */
 
-		/*
-		 * Move to the next page in the vector if this is a regular
-		 * page or the index is of the last sub-page of this compound
-		 * page.
-		 */
-		if (page->index + compound_nr(page) - 1 == xas.xa_index)
-			i++;
+		i++;
 		xas_store(&xas, NULL);
-		total_pages++;
+		total_pages += thp_nr_pages(page);
 	}
 	mapping->nrpages -= total_pages;
 }
@@ -851,20 +843,24 @@ static int __add_to_page_cache_locked(struct page *page,
 	}
 
 	do {
-		xas_lock_irq(&xas);
-		old = xas_load(&xas);
-		if (old && !xa_is_value(old))
-			xas_set_err(&xas, -EEXIST);
-		xas_store(&xas, page);
-		if (xas_error(&xas))
-			goto unlock;
+		unsigned long exceptional = 0;
 
-		if (xa_is_value(old)) {
-			mapping->nrexceptional--;
+		xas_lock_irq(&xas);
+		xas_for_each_conflict(&xas, old) {
+			if (!xa_is_value(old)) {
+				xas_set_err(&xas, -EEXIST);
+				goto unlock;
+			}
+			exceptional++;
 			if (shadowp)
 				*shadowp = old;
 		}
-		mapping->nrpages++;
+
+		xas_store(&xas, page);
+		if (xas_error(&xas))
+			goto unlock;
+		mapping->nrexceptional -= exceptional;
+		mapping->nrpages += nr;
 
 		/* hugetlb pages do not participate in page cache accounting */
 		if (!huge)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 78c84bee7e29..7e5ff05ceeaa 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2603,6 +2603,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	struct page *head = compound_head(page);
 	struct pglist_data *pgdata = NODE_DATA(page_to_nid(head));
 	struct deferred_split *ds_queue = get_deferred_split_queue(head);
+	XA_STATE_ORDER(xas, &head->mapping->i_pages, head->index,
+			compound_order(head));
 	struct anon_vma *anon_vma = NULL;
 	struct address_space *mapping = NULL;
 	int count, mapcount, extra_pins, ret;
@@ -2667,19 +2669,28 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	unmap_page(head);
 	VM_BUG_ON_PAGE(compound_mapcount(head), head);
 
+	if (mapping) {
+		/* XXX: Need better GFP flags here */
+		xas_split_alloc(&xas, head, 0, GFP_ATOMIC);
+		if (xas_error(&xas)) {
+			ret = xas_error(&xas);
+			goto out_unlock;
+		}
+	}
+
 	/* prevent PageLRU to go away from under us, and freeze lru stats */
 	spin_lock_irqsave(&pgdata->lru_lock, flags);
 
 	if (mapping) {
-		XA_STATE(xas, &mapping->i_pages, page_index(head));
-
 		/*
 		 * Check if the head page is present in page cache.
 		 * We assume all tail are present too, if head is there.
 		 */
-		xa_lock(&mapping->i_pages);
+		xas_lock(&xas);
+		xas_reset(&xas);
 		if (xas_load(&xas) != head)
 			goto fail;
+		xas_split(&xas, head, 0);
 	}
 
 	/* Prevent deferred_split_scan() touching ->_refcount */
@@ -2717,7 +2728,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		}
 		spin_unlock(&ds_queue->split_queue_lock);
 fail:		if (mapping)
-			xa_unlock(&mapping->i_pages);
+			xas_unlock(&xas);
 		spin_unlock_irqrestore(&pgdata->lru_lock, flags);
 		remap_page(head);
 		ret = -EBUSY;
@@ -2731,6 +2742,8 @@ fail:	if (mapping)
 	if (mapping)
 		i_mmap_unlock_read(mapping);
 out:
+	/* Free any memory we didn't use */
+	xas_nomem(&xas, 0);
 	count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
 	return ret;
 }
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b043c40a21d4..52dcec90e1c3 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1638,7 +1638,10 @@ static void collapse_file(struct mm_struct *mm,
 	}
 	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
 
-	/* This will be less messy when we use multi-index entries */
+	/*
+	 * Ensure we have slots for all the pages in the range.  This is
+	 * almost certainly a no-op because most of the pages must be present
+	 */
 	do {
 		xas_lock_irq(&xas);
 		xas_create_range(&xas);
@@ -1844,6 +1847,9 @@ static void collapse_file(struct mm_struct *mm,
 			__mod_lruvec_page_state(new_page, NR_SHMEM, nr_none);
 		}
 
+		/* Join all the small entries into a single multi-index entry */
+		xas_set_order(&xas, start, HPAGE_PMD_ORDER);
+		xas_store(&xas, new_page);
 xa_locked:
 		xas_unlock_irq(&xas);
 xa_unlocked:
@@ -1965,6 +1971,10 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			continue;
 		}
 
+		/*
+		 * XXX: khugepaged should compact smaller compound pages
+		 * into a PMD sized page
+		 */
 		if (PageTransCompound(page)) {
 			result = SCAN_PAGE_COMPOUND;
 			break;
diff --git a/mm/shmem.c b/mm/shmem.c
index a0dbe62f8042..030cc483dd3f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -608,7 +608,6 @@ static int shmem_add_to_page_cache(struct page *page,
 				  struct mm_struct *charge_mm)
 {
 	XA_STATE_ORDER(xas, &mapping->i_pages, index, compound_order(page));
-	unsigned long i = 0;
 	unsigned long nr = compound_nr(page);
 	int error;
 
@@ -638,17 +637,11 @@ static int shmem_add_to_page_cache(struct page *page,
 		void *entry;
 		xas_lock_irq(&xas);
 		entry = xas_find_conflict(&xas);
-		if (entry != expected)
+		if (entry != expected) {
 			xas_set_err(&xas, -EEXIST);
-		xas_create_range(&xas);
-		if (xas_error(&xas))
 			goto unlock;
-next:
-		xas_store(&xas, page);
-		if (++i < nr) {
-			xas_next(&xas);
-			goto next;
 		}
+		xas_store(&xas, page);
 		if (PageTransHuge(page)) {
 			count_vm_event(THP_FILE_ALLOC);
 			__inc_node_page_state(page, NR_SHMEM_THPS);
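[Editor's note on a pattern in the split_huge_page_to_list() hunks
above: memory that might be needed is allocated with xas_split_alloc()
before the spinlock is taken, the commit (xas_split()) happens under
the lock where it cannot fail for lack of memory, and xas_nomem(&xas, 0)
afterwards releases whatever went unused. A generic user-space sketch
of that allocate-outside-the-lock shape follows; all names in it are
invented for illustration, and it is not kernel code.]

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static void *shared_state;

int update_shared(size_t size)
{
	/* Phase 1: allocate before locking, where failure is recoverable */
	void *prealloc = malloc(size);

	if (!prealloc)
		return -1;

	pthread_mutex_lock(&lock);
	/* Phase 2: the commit under the lock cannot fail */
	if (!shared_state) {
		shared_state = prealloc;
		prealloc = NULL;
	}
	pthread_mutex_unlock(&lock);

	/* Phase 3: free whatever went unused (cf. xas_nomem()) */
	free(prealloc);
	return 0;
}

int main(void)
{
	update_shared(64);
	update_shared(64);	/* allocates, commits nothing, frees */
	printf("state %s\n", shared_state ? "installed" : "empty");
	return 0;
}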