From patchwork Fri Apr 14 15:37:36 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Ostrovsky
X-Patchwork-Id: 9681387
From: Boris Ostrovsky
To: xen-devel@lists.xen.org
Date: Fri, 14 Apr 2017 11:37:36 -0400
Message-Id: <1492184258-3277-8-git-send-email-boris.ostrovsky@oracle.com>
X-Mailer: git-send-email 1.7.1
In-Reply-To: <1492184258-3277-1-git-send-email-boris.ostrovsky@oracle.com>
References: <1492184258-3277-1-git-send-email-boris.ostrovsky@oracle.com>
Cc: sstabellini@kernel.org, wei.liu2@citrix.com, George.Dunlap@eu.citrix.com,
 andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, tim@xen.org,
 jbeulich@suse.com, Boris Ostrovsky
Subject: [Xen-devel] [PATCH v3 7/9] mm: Keep pages available for allocation
 while scrubbing

Instead of scrubbing pages while holding the heap lock, we can mark the
buddy's head as being scrubbed and drop the lock temporarily. If someone
(most likely alloc_heap_pages()) tries to access this chunk, it signals
the scrubber to abort the scrub by setting the head's PAGE_SCRUB_ABORT
bit. The scrubber checks this bit after processing each page and stops
its work as soon as it sees the bit set.

Signed-off-by: Boris Ostrovsky
---
Changes in v3:
* Adjusted page_info's scrub_state definitions but kept them as binary
  flags, since I think having both the PAGE_SCRUBBING and
  PAGE_SCRUB_ABORT bits set makes sense.

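As a reading aid for the handshake described in the commit message, here is a
rough single-threaded model of the two flag bits. It is only a sketch, not the
patch's code: the flag values match the patch, but struct model_buddy,
MODEL_PAGES and the model_*() helpers are hypothetical names, there is no heap
lock, and the allocator's spin-wait is not modeled. The waiter is released by
the scrubber clearing scrub_state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same flag values as the patch; everything else below is hypothetical. */
#define PAGE_SCRUBBING   (1 << 0)
#define PAGE_SCRUB_ABORT (1 << 1)

#define MODEL_PAGES 8

/* Hypothetical stand-in for a buddy; this is not Xen's struct page_info. */
struct model_buddy {
    unsigned char scrub_state;    /* kept on the buddy head in the patch */
    bool need_scrub[MODEL_PAGES]; /* models the per-page PGC_need_scrub bit */
};

/* Scrubber side: claim the buddy (the patch does this under heap_lock). */
static void model_scrub_begin(struct model_buddy *b)
{
    assert(!b->scrub_state);
    b->scrub_state = PAGE_SCRUBBING;
}

/*
 * Requester side: ask a running scrubber to give the buddy up.
 * Returns true if a scrub was in progress. In the patch the caller would
 * then spin until the scrubber clears PAGE_SCRUB_ABORT.
 */
static bool model_request_abort(struct model_buddy *b)
{
    if ( !(b->scrub_state & PAGE_SCRUBBING) )
        return false;

    b->scrub_state |= PAGE_SCRUB_ABORT;
    return true;
}

/*
 * Scrubber side: clean one page, then poll the abort bit.
 * Returns false when the scrubber has to stop and has released the buddy.
 */
static bool model_scrub_page(struct model_buddy *b, size_t idx)
{
    b->need_scrub[idx] = false;   /* stands in for scrub_one_page() */

    if ( b->scrub_state & PAGE_SCRUB_ABORT )
    {
        b->scrub_state = 0;       /* this is what releases the waiter */
        return false;
    }

    return true;
}
```

In this model an abort request made while page N is being processed is only
noticed once page N is done, which is the "checks this bit after processing
each page" behaviour from the commit message.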
 xen/common/page_alloc.c  | 92 ++++++++++++++++++++++++++++++++++++++++++---
 xen/include/asm-arm/mm.h |  4 ++
 xen/include/asm-x86/mm.h |  4 ++
 3 files changed, 93 insertions(+), 7 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 0b2dff1..514a4a1 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -694,6 +694,17 @@ static void page_list_add_scrub(struct page_info *pg, unsigned int node,
         page_list_add(pg, &heap(node, zone, order));
 }
 
+static void check_and_stop_scrub(struct page_info *head)
+{
+    if ( head->u.free.scrub_state & PAGE_SCRUBBING )
+    {
+        head->u.free.scrub_state |= PAGE_SCRUB_ABORT;
+        spin_lock_kick();
+        while ( ACCESS_ONCE(head->u.free.scrub_state) & PAGE_SCRUB_ABORT )
+            cpu_relax();
+    }
+}
+
 /* Allocate 2^@order contiguous pages. */
 static struct page_info *alloc_heap_pages(
     unsigned int zone_lo, unsigned int zone_hi,
@@ -780,10 +791,15 @@ static struct page_info *alloc_heap_pages(
             {
                 if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
                 {
-                    if ( (order == 0) || use_unscrubbed ||
-                         !pg->u.free.dirty_head )
+                    if ( !pg->u.free.dirty_head )
                         goto found;
 
+                    if ( (order == 0) || use_unscrubbed )
+                    {
+                        check_and_stop_scrub(pg);
+                        goto found;
+                    }
+
                     page_list_add_tail(pg, &heap(node, zone, j));
                 }
             }
@@ -921,6 +937,8 @@ static int reserve_offlined_page(struct page_info *head)
 
     head->u.free.dirty_head = false;
 
+    check_and_stop_scrub(head);
+
     page_list_del(head, &heap(node, zone, head_order));
 
     while ( cur_head < (head + (1 << head_order)) )
@@ -1027,6 +1045,9 @@ merge_and_free_buddy(struct page_info *pg, unsigned int node,
              (phys_to_nid(page_to_maddr(buddy)) != node) )
             break;
 
+        if ( buddy->u.free.scrub_state & PAGE_SCRUBBING )
+            break;
+
         page_list_del(buddy, &heap(node, zone, order));
         need_scrub |= buddy->u.free.dirty_head;
         buddy->u.free.dirty_head = false;
@@ -1098,14 +1119,35 @@ static unsigned int node_to_scrub(bool get_node)
     return closest;
 }
 
+struct scrub_wait_state {
+    struct page_info *pg;
+    bool drop;
+};
+
+static void scrub_continue(void *data)
+{
+    struct scrub_wait_state *st = data;
+
+    if ( st->drop )
+        return;
+
+    if ( st->pg->u.free.scrub_state & PAGE_SCRUB_ABORT )
+    {
+        /* There is a waiter for this buddy. Release it. */
+        st->drop = true;
+        st->pg->u.free.scrub_state = 0;
+    }
+}
+
 bool scrub_free_pages(void)
 {
     struct page_info *pg;
     unsigned int zone, order, scrub_order;
-    unsigned long i, num_processed, start, end;
+    unsigned long i, num_processed, start, end, dirty_cnt;
     unsigned int cpu = smp_processor_id();
     bool preempt = false, is_frag;
     nodeid_t node;
+    struct scrub_wait_state st;
 
     /* Scrubbing granularity. */
 #define SCRUB_CHUNK_ORDER  8
@@ -1134,8 +1176,13 @@ bool scrub_free_pages(void)
                 if ( !pg->u.free.dirty_head )
                     break;
 
+                ASSERT(!pg->u.free.scrub_state);
+                pg->u.free.scrub_state = PAGE_SCRUBBING;
+
+                spin_unlock(&heap_lock);
+
                 scrub_order = MIN(order, SCRUB_CHUNK_ORDER);
-                num_processed = 0;
+                num_processed = dirty_cnt = 0;
                 is_frag = false;
                 while ( num_processed < (1UL << order) )
                 {
@@ -1145,8 +1192,24 @@ bool scrub_free_pages(void)
                         if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
                         {
                             scrub_one_page(&pg[i]);
+                            /*
+                             * We can modify count_info without holding heap
+                             * lock since we effectively locked this buddy by
+                             * setting its scrub_state.
+                             */
                             pg[i].count_info &= ~PGC_need_scrub;
-                            node_need_scrub[node]--;
+                            dirty_cnt++;
+                        }
+
+                        if ( ACCESS_ONCE(pg->u.free.scrub_state) &
+                             PAGE_SCRUB_ABORT )
+                        {
+                            /* Someone wants this chunk. Drop everything. */
+                            pg->u.free.scrub_state = 0;
+                            spin_lock(&heap_lock);
+                            node_need_scrub[node] -= dirty_cnt;
+                            spin_unlock(&heap_lock);
+                            goto out_nolock;
                         }
                     }
 
@@ -1159,11 +1222,20 @@ bool scrub_free_pages(void)
                     }
                 }
 
-                start = 0;
-                end = num_processed;
+                st.pg = pg;
+                st.drop = false;
+                spin_lock_cb(&heap_lock, scrub_continue, &st);
+
+                node_need_scrub[node] -= dirty_cnt;
+
+                if ( st.drop )
+                    goto out;
 
                 page_list_del(pg, &heap(node, zone, order));
 
+                start = 0;
+                end = num_processed;
+
                 /* Merge clean pages */
                 while ( start < end )
                 {
@@ -1194,6 +1266,8 @@ bool scrub_free_pages(void)
                     end += (1UL << chunk_order);
                 }
 
+                pg->u.free.scrub_state = 0;
+
                 if ( preempt || (node_need_scrub[node] == 0) )
                     goto out;
             }
@@ -1202,6 +1276,8 @@ bool scrub_free_pages(void)
 
  out:
     spin_unlock(&heap_lock);
+
+ out_nolock:
     node_clear(node, node_scrubbing);
     return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE);
 }
@@ -1240,6 +1316,8 @@ static void free_heap_pages(
         if ( page_state_is(&pg[i], offlined) )
             tainted = 1;
 
+        pg[i].u.free.scrub_state = 0;
+
         /* If a page has no owner it will need no safety TLB flush. */
         pg[i].u.free.need_tlbflush = (page_get_owner(&pg[i]) != NULL);
         if ( pg[i].u.free.need_tlbflush )
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index abc3f6b..b333b16 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -43,6 +43,10 @@ struct page_info
         } inuse;
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
         struct {
+#define PAGE_SCRUBBING      (1<<0)
+#define PAGE_SCRUB_ABORT    (1<<1)
+            unsigned char scrub_state;
+
             /* Do TLBs need flushing for safety before next page use? */
             bool_t need_tlbflush;
             /* Set on a buddy head if the buddy has unscrubbed pages. */
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 5cf528a..d00c4a1 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -87,6 +87,10 @@ struct page_info
 
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
         struct {
+#define PAGE_SCRUBBING      (1<<0)
+#define PAGE_SCRUB_ABORT    (1<<1)
+            unsigned char scrub_state;
+
             /* Do TLBs need flushing for safety before next page use? */
             bool_t need_tlbflush;
             /* Set on a buddy head if the buddy has unscrubbed pages. */