Message ID: 1492184258-3277-6-git-send-email-boris.ostrovsky@oracle.com (mailing list archive)
State: New, archived
>>> On 14.04.17 at 17:37, <boris.ostrovsky@oracle.com> wrote:
> While scrubbing from idle loop, check for softirqs every 256 pages.
> If softirq is pending, don't scrub any further and merge the
> partially-scrubbed buddy back into heap by breaking the clean portion
> into smaller power-of-2 chunks. Then repeat the same process for the
> dirty part.

This is ugly, as it gets us back into the state where full merge
opportunities aren't being realized, just that the time window
may be smaller now. As hinted at before, is there no way to
flag the first page needing scrubbing alongside the head
indicating that _some_ page needs scrubbing? The pages are
all free, so if there's no other suitable storage, the head page
itself could serve as such. But instead of a flag in struct
page_info, perhaps you could store a (relatively small) integer?

Jan
On 05/04/2017 11:43 AM, Jan Beulich wrote:
>>>> On 14.04.17 at 17:37, <boris.ostrovsky@oracle.com> wrote:
>> While scrubbing from idle loop, check for softirqs every 256 pages.
>> If softirq is pending, don't scrub any further and merge the
>> partially-scrubbed buddy back into heap by breaking the clean portion
>> into smaller power-of-2 chunks. Then repeat the same process for the
>> dirty part.
> This is ugly, as it gets us back into the state where full merge
> opportunities aren't being realized, just that the time window
> may be smaller now. As hinted at before, is there no way to
> flag the first page needing scrubbing alongside the head
> indicating that _some_ page needs scrubbing? The pages are
> all free, so if there's no other suitable storage, the head page
> itself could serve as such. But instead of a flag in struct
> page_info, perhaps you could store a (relatively small) integer?

How will it help? Even if we know what the first dirty page is, we still
may have to drop scrubbing if an irq is pending. We simply will not have
to scan the buddy until the first dirty page is found.

Or perhaps I don't understand what you are suggesting.

-boris
>>> On 04.05.17 at 19:18, <boris.ostrovsky@oracle.com> wrote:
> On 05/04/2017 11:43 AM, Jan Beulich wrote:
>>>>> On 14.04.17 at 17:37, <boris.ostrovsky@oracle.com> wrote:
>>> While scrubbing from idle loop, check for softirqs every 256 pages.
>>> If softirq is pending, don't scrub any further and merge the
>>> partially-scrubbed buddy back into heap by breaking the clean portion
>>> into smaller power-of-2 chunks. Then repeat the same process for the
>>> dirty part.
>> This is ugly, as it gets us back into the state where full merge
>> opportunities aren't being realized, just that the time window
>> may be smaller now. As hinted at before, is there no way to
>> flag the first page needing scrubbing alongside the head
>> indicating that _some_ page needs scrubbing? The pages are
>> all free, so if there's no other suitable storage, the head page
>> itself could serve as such. But instead of a flag in struct
>> page_info, perhaps you could store a (relatively small) integer?
>
> How will it help? Even if we know what the fist dirty page is we still
> may have to drop scrubbing if irq is pending. We simply will not have to
> scan the buddy until first dirty page is found.
>
> Or perhaps I don't understand what you are suggesting.

If you get a request to stop scrubbing, you split the current
buddy into a clean and a (possibly) dirty part (if I understood
the code in this patch correctly). It's that splitting which I'd like
to see avoided, and by keeping a "first dirty" index (which
would be updated when you stop scrubbing) you could do so.

Jan
On 05/05/2017 06:27 AM, Jan Beulich wrote:
>>>> On 04.05.17 at 19:18, <boris.ostrovsky@oracle.com> wrote:
>> On 05/04/2017 11:43 AM, Jan Beulich wrote:
>>>>>> On 14.04.17 at 17:37, <boris.ostrovsky@oracle.com> wrote:
>>>> While scrubbing from idle loop, check for softirqs every 256 pages.
>>>> If softirq is pending, don't scrub any further and merge the
>>>> partially-scrubbed buddy back into heap by breaking the clean portion
>>>> into smaller power-of-2 chunks. Then repeat the same process for the
>>>> dirty part.
>>> This is ugly, as it gets us back into the state where full merge
>>> opportunities aren't being realized, just that the time window
>>> may be smaller now. As hinted at before, is there no way to
>>> flag the first page needing scrubbing alongside the head
>>> indicating that _some_ page needs scrubbing? The pages are
>>> all free, so if there's no other suitable storage, the head page
>>> itself could serve as such. But instead of a flag in struct
>>> page_info, perhaps you could store a (relatively small) integer?
>> How will it help? Even if we know what the fist dirty page is we still
>> may have to drop scrubbing if irq is pending. We simply will not have to
>> scan the buddy until first dirty page is found.
>>
>> Or perhaps I don't understand what you are suggesting.
> If you get a request to stop scrubbing, you split the current
> buddy into a clean and a (possibly) dirty part (if I understood
> the code in this patch correctly).

Right.

> It's that splitting which I'd like
> to see avoided, and by keeping a "first dirty" index (which
> would be updated when you stop scrubbing) you could do so.

If we are to avoid splitting then we might just keep the buddy on the
"dirty end" (i.e. tail) of the order if we are interrupted in the middle.

Having "first_dirty" (which is really "maybe_first_dirty") would only
help us start the next scan faster, but it won't really change the
algorithm.

(Or maybe that's exactly what you are proposing.)

-boris
>>> On 05.05.17 at 15:51, <boris.ostrovsky@oracle.com> wrote:
> On 05/05/2017 06:27 AM, Jan Beulich wrote:
>>>>> On 04.05.17 at 19:18, <boris.ostrovsky@oracle.com> wrote:
>>> On 05/04/2017 11:43 AM, Jan Beulich wrote:
>>>>>>> On 14.04.17 at 17:37, <boris.ostrovsky@oracle.com> wrote:
>>>>> While scrubbing from idle loop, check for softirqs every 256 pages.
>>>>> If softirq is pending, don't scrub any further and merge the
>>>>> partially-scrubbed buddy back into heap by breaking the clean portion
>>>>> into smaller power-of-2 chunks. Then repeat the same process for the
>>>>> dirty part.
>>>> This is ugly, as it gets us back into the state where full merge
>>>> opportunities aren't being realized, just that the time window
>>>> may be smaller now. As hinted at before, is there no way to
>>>> flag the first page needing scrubbing alongside the head
>>>> indicating that _some_ page needs scrubbing? The pages are
>>>> all free, so if there's no other suitable storage, the head page
>>>> itself could serve as such. But instead of a flag in struct
>>>> page_info, perhaps you could store a (relatively small) integer?
>>> How will it help? Even if we know what the fist dirty page is we still
>>> may have to drop scrubbing if irq is pending. We simply will not have to
>>> scan the buddy until first dirty page is found.
>>>
>>> Or perhaps I don't understand what you are suggesting.
>> If you get a request to stop scrubbing, you split the current
>> buddy into a clean and a (possibly) dirty part (if I understood
>> the code in this patch correctly).
>
> Right.
>
>> It's that splitting which I'd like
>> to see avoided, and by keeping a "first dirty" index (which
>> would be updated when you stop scrubbing) you could do so.
>
> If we are to avoid splitting then we might just keep the buddy on the
> "dirty end" (i.e. tail) of the order if we are interrupted in the middle.
>
> Having "first_dirty" (which is really "maybe_first_dirty") would only
> help us start next scan faster but it won't really change the algorithm.
>
> (Or maybe that's exactly what you are proposing.)

Yes - using such a hint means better chances of actually making overall
progress despite not splitting buddies.

Jan
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index fcd7308..0b2dff1 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -996,7 +996,7 @@ static int reserve_offlined_page(struct page_info *head)
 static struct page_info *
 merge_and_free_buddy(struct page_info *pg, unsigned int node,
                      unsigned int zone, unsigned int order,
-                     bool need_scrub)
+                     bool need_scrub, bool is_frag)
 {
     ASSERT(spin_is_locked(&heap_lock));
@@ -1011,7 +1011,15 @@ merge_and_free_buddy(struct page_info *pg, unsigned int node,
         if ( (page_to_mfn(pg) & mask) )
             buddy = pg - mask; /* Merge with predecessor block. */
         else
-            buddy = pg + mask; /* Merge with successor block. */
+        {
+            /*
+             * Merge with successor block.
+             * Only un-fragmented buddy can be merged forward.
+             */
+            if ( is_frag )
+                break;
+            buddy = pg + mask;
+        }
 
         if ( !mfn_valid(_mfn(page_to_mfn(buddy))) ||
              !page_state_is(buddy, free) ||
@@ -1093,12 +1101,15 @@ static unsigned int node_to_scrub(bool get_node)
 bool scrub_free_pages(void)
 {
     struct page_info *pg;
-    unsigned int zone, order;
-    unsigned long i;
+    unsigned int zone, order, scrub_order;
+    unsigned long i, num_processed, start, end;
     unsigned int cpu = smp_processor_id();
-    bool preempt = false;
+    bool preempt = false, is_frag;
     nodeid_t node;
 
+    /* Scrubbing granularity. */
+#define SCRUB_CHUNK_ORDER 8
+
     /*
      * Don't scrub while dom0 is being constructed since we may
      * fail trying to call map_domain_page() from scrub_one_page().
@@ -1123,25 +1134,64 @@ bool scrub_free_pages(void)
             if ( !pg->u.free.dirty_head )
                 break;
 
-            for ( i = 0; i < (1UL << order); i++)
+            scrub_order = MIN(order, SCRUB_CHUNK_ORDER);
+            num_processed = 0;
+            is_frag = false;
+            while ( num_processed < (1UL << order) )
             {
-                if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
+                for ( i = num_processed;
+                      i < num_processed + (1UL << scrub_order); i++ )
                 {
-                    scrub_one_page(&pg[i]);
-                    pg[i].count_info &= ~PGC_need_scrub;
-                    node_need_scrub[node]--;
+                    if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
+                    {
+                        scrub_one_page(&pg[i]);
+                        pg[i].count_info &= ~PGC_need_scrub;
+                        node_need_scrub[node]--;
+                    }
                 }
 
+                num_processed += (1UL << scrub_order);
+
                 if ( softirq_pending(cpu) )
                 {
                     preempt = true;
+                    is_frag = (num_processed < (1UL << order));
                     break;
                 }
             }
 
-            if ( i == (1UL << order) )
+            start = 0;
+            end = num_processed;
+
+            page_list_del(pg, &heap(node, zone, order));
+
+            /* Merge clean pages */
+            while ( start < end )
+            {
+                /*
+                 * Largest power-of-two chunk starting @start,
+                 * not greater than @end
+                 */
+                unsigned chunk_order = flsl(end - start) - 1;
+                struct page_info *ppg = &pg[start];
+
+                PFN_ORDER(ppg) = chunk_order;
+                merge_and_free_buddy(ppg, node, zone, chunk_order, false, is_frag);
+                start += (1UL << chunk_order);
+            }
+
+            /* Merge unscrubbed pages */
+            while ( end < (1UL << order) )
             {
-                page_list_del(pg, &heap(node, zone, order));
-                merge_and_free_buddy(pg, node, zone, order, false);
+                /*
+                 * Largest power-of-two chunk starting @end, not crossing
+                 * next power-of-two boundary
+                 */
+                unsigned chunk_order = ffsl(end) - 1;
+                struct page_info *ppg = &pg[end];
+
+                PFN_ORDER(ppg) = chunk_order;
+                merge_and_free_buddy(ppg, node, zone, chunk_order, true, true);
+                end += (1UL << chunk_order);
+            }
 
             if ( preempt || (node_need_scrub[node] == 0) )
@@ -1215,7 +1265,7 @@ static void free_heap_pages(
         node_need_scrub[node] += (1UL << order);
     }
 
-    pg = merge_and_free_buddy(pg, node, zone, order, need_scrub);
+    pg = merge_and_free_buddy(pg, node, zone, order, need_scrub, false);
 
     if ( tainted )
         reserve_offlined_page(pg);
While scrubbing from idle loop, check for softirqs every 256 pages.
If softirq is pending, don't scrub any further and merge the
partially-scrubbed buddy back into heap by breaking the clean portion
into smaller power-of-2 chunks. Then repeat the same process for the
dirty part.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 xen/common/page_alloc.c |   78 ++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 64 insertions(+), 14 deletions(-)