From patchwork Thu Jun 22 18:57:08 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boris Ostrovsky
X-Patchwork-Id: 9805109
From: Boris Ostrovsky
To: xen-devel@lists.xen.org
Cc: sstabellini@kernel.org, wei.liu2@citrix.com, George.Dunlap@eu.citrix.com,
 andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, tim@xen.org,
 jbeulich@suse.com, Boris Ostrovsky
Date: Thu, 22 Jun 2017 14:57:08 -0400
Message-Id: <1498157830-21845-7-git-send-email-boris.ostrovsky@oracle.com>
In-Reply-To: <1498157830-21845-1-git-send-email-boris.ostrovsky@oracle.com>
References: <1498157830-21845-1-git-send-email-boris.ostrovsky@oracle.com>
Subject: [Xen-devel] [PATCH v5 6/8] mm: Keep heap accessible to others while
 scrubbing

Instead of scrubbing pages while holding heap lock we can mark
buddy's head as being scrubbed and drop the lock temporarily.
If someone (most likely alloc_heap_pages()) tries to access
this chunk it will signal the scrubber to abort scrub by setting
head's BUDDY_SCRUB_ABORT bit. The scrubber checks this bit after
processing each page and stops its work as soon as it sees it.

Signed-off-by: Boris Ostrovsky
---
Changes in v5:
* Fixed off-by-one error in setting first_dirty
* Changed struct page_info.u.free to a union to permit use of ACCESS_ONCE in
  check_and_stop_scrub()
* Renamed PAGE_SCRUBBING etc.
  macros to BUDDY_SCRUBBING etc

 xen/common/page_alloc.c  | 105 +++++++++++++++++++++++++++++++++++++++++++++--
 xen/include/asm-arm/mm.h |  28 ++++++++-----
 xen/include/asm-x86/mm.h |  29 ++++++++-----
 3 files changed, 138 insertions(+), 24 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 4e2775f..f0e5399 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -687,6 +687,7 @@ static void page_list_add_scrub(struct page_info *pg, unsigned int node,
 {
     PFN_ORDER(pg) = order;
     pg->u.free.first_dirty = first_dirty;
+    pg->u.free.scrub_state = BUDDY_NOT_SCRUBBING;
 
     if ( first_dirty != INVALID_DIRTY_IDX )
         page_list_add_tail(pg, &heap(node, zone, order));
@@ -694,6 +695,25 @@ static void page_list_add_scrub(struct page_info *pg, unsigned int node,
         page_list_add(pg, &heap(node, zone, order));
 }
 
+static void check_and_stop_scrub(struct page_info *head)
+{
+    if ( head->u.free.scrub_state == BUDDY_SCRUBBING )
+    {
+        struct page_info pg;
+
+        head->u.free.scrub_state = BUDDY_SCRUB_ABORT;
+        spin_lock_kick();
+        for ( ; ; )
+        {
+            /* Can't ACCESS_ONCE() a bitfield. */
+            pg.u.free.val = ACCESS_ONCE(head->u.free.val);
+            if ( pg.u.free.scrub_state != BUDDY_SCRUB_ABORT )
+                break;
+            cpu_relax();
+        }
+    }
+}
+
 static struct page_info *get_free_buddy(unsigned int zone_lo,
                                         unsigned int zone_hi,
                                         unsigned int order, unsigned int memflags,
@@ -738,14 +758,19 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
             {
                 if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
                 {
+                    if ( pg->u.free.first_dirty == INVALID_DIRTY_IDX )
+                        return pg;
                     /*
                      * We grab single pages (order=0) even if they are
                      * unscrubbed. Given that scrubbing one page is fairly quick
                      * it is not worth breaking higher orders.
                      */
-                    if ( (order == 0) || use_unscrubbed ||
-                         pg->u.free.first_dirty == INVALID_DIRTY_IDX)
+                    if ( (order == 0) || use_unscrubbed )
+                    {
+                        check_and_stop_scrub(pg);
                         return pg;
+                    }
+
                     page_list_add_tail(pg, &heap(node, zone, j));
                 }
             }
@@ -928,6 +953,7 @@ static int reserve_offlined_page(struct page_info *head)
 
     cur_head = head;
 
+    check_and_stop_scrub(head);
     /*
      * We may break the buddy so let's mark the head as clean. Then, when
      * merging chunks back into the heap, we will see whether the chunk has
@@ -1084,6 +1110,29 @@ static unsigned int node_to_scrub(bool get_node)
     return closest;
 }
 
+struct scrub_wait_state {
+    struct page_info *pg;
+    unsigned int first_dirty;
+    bool drop;
+};
+
+static void scrub_continue(void *data)
+{
+    struct scrub_wait_state *st = data;
+
+    if ( st->drop )
+        return;
+
+    if ( st->pg->u.free.scrub_state == BUDDY_SCRUB_ABORT )
+    {
+        /* There is a waiter for this buddy. Release it. */
+        st->drop = true;
+        st->pg->u.free.first_dirty = st->first_dirty;
+        smp_wmb();
+        st->pg->u.free.scrub_state = BUDDY_NOT_SCRUBBING;
+    }
+}
+
 bool scrub_free_pages(void)
 {
     struct page_info *pg;
@@ -1106,25 +1155,53 @@ bool scrub_free_pages(void)
         do {
             while ( !page_list_empty(&heap(node, zone, order)) )
             {
-                unsigned int i;
+                unsigned int i, dirty_cnt;
+                struct scrub_wait_state st;
 
                 /* Unscrubbed pages are always at the end of the list. */
                 pg = page_list_last(&heap(node, zone, order));
                 if ( pg->u.free.first_dirty == INVALID_DIRTY_IDX )
                     break;
 
+                ASSERT(!pg->u.free.scrub_state);
+                pg->u.free.scrub_state = BUDDY_SCRUBBING;
+
+                spin_unlock(&heap_lock);
+
+                dirty_cnt = 0;
+
                 for ( i = pg->u.free.first_dirty; i < (1U << order); i++)
                 {
                     if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
                     {
                         scrub_one_page(&pg[i]);
+                        /*
+                         * We can modify count_info without holding heap
+                         * lock since we effectively locked this buddy by
+                         * setting its scrub_state.
+                         */
                         pg[i].count_info &= ~PGC_need_scrub;
-                        node_need_scrub[node]--;
+                        dirty_cnt++;
                         cnt += 100; /* scrubbed pages add heavier weight. */
                     }
                     else
                         cnt++;
 
+                    if ( pg->u.free.scrub_state == BUDDY_SCRUB_ABORT )
+                    {
+                        /* Someone wants this chunk. Drop everything. */
+
+                        pg->u.free.first_dirty = (i == (1U << order) - 1) ?
+                            INVALID_DIRTY_IDX : i + 1;
+                        smp_wmb();
+                        pg->u.free.scrub_state = BUDDY_NOT_SCRUBBING;
+
+                        spin_lock(&heap_lock);
+                        node_need_scrub[node] -= dirty_cnt;
+                        spin_unlock(&heap_lock);
+                        goto out_nolock;
+                    }
+
                     /*
                      * Scrub a few (8) pages before becoming eligible for
                      * preemption. But also count non-scrubbing loop iterations
@@ -1138,6 +1215,17 @@ bool scrub_free_pages(void)
                     }
                 }
 
+                st.pg = pg;
+                st.first_dirty = (i >= (1UL << order) - 1) ?
+                    INVALID_DIRTY_IDX : i + 1;
+                st.drop = false;
+                spin_lock_cb(&heap_lock, scrub_continue, &st);
+
+                node_need_scrub[node] -= dirty_cnt;
+
+                if ( st.drop )
+                    goto out;
+
                 if ( i >= (1U << order) - 1 )
                 {
                     page_list_del(pg, &heap(node, zone, order));
@@ -1146,6 +1234,8 @@ bool scrub_free_pages(void)
                 else
                     pg->u.free.first_dirty = i + 1;
 
+                pg->u.free.scrub_state = BUDDY_NOT_SCRUBBING;
+
                 if ( preempt || (node_need_scrub[node] == 0) )
                     goto out;
             }
@@ -1154,6 +1244,8 @@ bool scrub_free_pages(void)
 
  out:
     spin_unlock(&heap_lock);
+
+ out_nolock:
     node_clear(node, node_scrubbing);
     return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE);
 }
@@ -1235,6 +1327,8 @@ static void free_heap_pages(
                  (phys_to_nid(page_to_maddr(predecessor)) != node) )
                 break;
 
+            check_and_stop_scrub(predecessor);
+
             page_list_del(predecessor, &heap(node, zone, order));
 
             if ( predecessor->u.free.first_dirty != INVALID_DIRTY_IDX )
@@ -1256,6 +1350,9 @@ static void free_heap_pages(
                  (PFN_ORDER(successor) != order) ||
                  (phys_to_nid(page_to_maddr(successor)) != node) )
                 break;
+
+            check_and_stop_scrub(successor);
+
             page_list_del(successor, &heap(node, zone, order));
 
             if ( successor->u.free.first_dirty != INVALID_DIRTY_IDX )
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 889a85e..625aa16 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -42,18 +42,26 @@ struct page_info
             unsigned long type_info;
         } inuse;
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
-        struct {
-            /* Do TLBs need flushing for safety before next page use? */
-            unsigned long need_tlbflush:1;
-
-            /*
-             * Index of the first *possibly* unscrubbed page in the buddy.
-             * One more than maximum possible order (MAX_ORDER+1) to
-             * accommodate INVALID_DIRTY_IDX.
-             */
+        union {
+            struct {
+                /* Do TLBs need flushing for safety before next page use? */
+                unsigned long need_tlbflush:1;
+
+                /*
+                 * Index of the first *possibly* unscrubbed page in the buddy.
+                 * One more than maximum possible order (MAX_ORDER+1) to
+                 * accommodate INVALID_DIRTY_IDX.
+                 */
 #define INVALID_DIRTY_IDX (-1UL & (((1UL<