Message ID | 1495209040-11101-4-git-send-email-boris.ostrovsky@oracle.com (mailing list archive) |
---|---
State | New, archived |
>>> On 19.05.17 at 17:50, <boris.ostrovsky@oracle.com> wrote:
> @@ -734,8 +735,15 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
>
>              /* Find smallest order which can satisfy the request. */
>              for ( j = order; j <= MAX_ORDER; j++ )
> +            {
>                  if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
> -                    return pg;
> +                {
> +                    if ( (order == 0) || use_unscrubbed ||

Why is order 0 being special cased here? If this really is intended, a
comment should be added.

> @@ -821,9 +829,16 @@ static struct page_info *alloc_heap_pages(
>      pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
>      if ( !pg )
>      {
> -        /* No suitable memory blocks. Fail the request. */
> -        spin_unlock(&heap_lock);
> -        return NULL;
> +        /* Try now getting a dirty buddy. */
> +        if ( !(memflags & MEMF_no_scrub) )
> +            pg = get_free_buddy(zone_lo, zone_hi, order,
> +                                memflags | MEMF_no_scrub, d);
> +        if ( !pg )
> +        {
> +            /* No suitable memory blocks. Fail the request. */
> +            spin_unlock(&heap_lock);
> +            return NULL;
> +        }
>      }

I'd appreciate if you avoided the re-indentation by simply prefixing
another if() to the one that's already there.

> @@ -855,10 +870,24 @@ static struct page_info *alloc_heap_pages(
>      if ( d != NULL )
>          d->last_alloc_node = node;
>
> +    need_scrub &= !(memflags & MEMF_no_scrub);

Can't this be done right away when need_scrub is being set?

>      for ( i = 0; i < (1 << order); i++ )
>      {
>          /* Reference count must continuously be zero for free pages. */
> -        BUG_ON(pg[i].count_info != PGC_state_free);
> +        BUG_ON((pg[i].count_info & ~PGC_need_scrub ) != PGC_state_free);

Isn't this change needed in one of the earlier patches already? There
also is a stray blank ahead of the first closing paren here.

> +        if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
> +        {
> +            if ( need_scrub )
> +                scrub_one_page(&pg[i]);
> +            node_need_scrub[node]--;
> +            /*
> +             * Technically, we need to set first_dirty to INVALID_DIRTY_IDX
> +             * on buddy's head. However, since we assign pg[i].count_info
> +             * below, we can skip this.
> +             */

This comment is correct only with the current way struct page_info's
fields are unionized. In fact I think the comment is unneeded - the
buddy is being transitioned from free to allocated here, so the field
loses its meaning.

Jan
On 06/09/2017 11:22 AM, Jan Beulich wrote:
>>>> On 19.05.17 at 17:50, <boris.ostrovsky@oracle.com> wrote:
>> @@ -734,8 +735,15 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
>>
>>              /* Find smallest order which can satisfy the request. */
>>              for ( j = order; j <= MAX_ORDER; j++ )
>> +            {
>>                  if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
>> -                    return pg;
>> +                {
>> +                    if ( (order == 0) || use_unscrubbed ||
> Why is order 0 being special cased here? If this really is intended, a
> comment should be added.

That's because for a single page it's not worth skipping a dirty buddy.
(It is a pretty arbitrary cutoff; it could presumably be <=1 or even <=2.)

I'll add a comment.

>> @@ -855,10 +870,24 @@ static struct page_info *alloc_heap_pages(
>>      if ( d != NULL )
>>          d->last_alloc_node = node;
>>
>> +    need_scrub &= !(memflags & MEMF_no_scrub);
> Can't this be done right away when need_scrub is being set?

No, because we use the earlier assignment to decide how we put
"sub-buddies" back to the heap (dirty or not). Here we use need_scrub
to decide whether to scrub the buddy.

This may change though with the changes that you suggested in the
comments to the first patch.

>
>>      for ( i = 0; i < (1 << order); i++ )
>>      {
>>          /* Reference count must continuously be zero for free pages. */
>> -        BUG_ON(pg[i].count_info != PGC_state_free);
>> +        BUG_ON((pg[i].count_info & ~PGC_need_scrub ) != PGC_state_free);
> Isn't this change needed in one of the earlier patches already?

At this patch level we are still scrubbing in free_heap_pages(), so
there is never an unscrubbed page in the allocator. The next patch
will switch to scrubbing from the idle loop.

> There also is a stray blank ahead of the first closing paren here.
>
>> +        if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
>> +        {
>> +            if ( need_scrub )
>> +                scrub_one_page(&pg[i]);
>> +            node_need_scrub[node]--;
>> +            /*
>> +             * Technically, we need to set first_dirty to INVALID_DIRTY_IDX
>> +             * on buddy's head. However, since we assign pg[i].count_info
>> +             * below, we can skip this.
>> +             */
> This comment is correct only with the current way struct page_info's
> fields are unionized. In fact I think the comment is unneeded - the
> buddy is being transitioned from free to allocated here, so the field
> loses its meaning.

That, actually, is exactly what I was trying to say. I can drop the
comment if you feel it is obvious why we don't need to set first_dirty.

-boris
>>> On 09.06.17 at 22:55, <boris.ostrovsky@oracle.com> wrote:
> On 06/09/2017 11:22 AM, Jan Beulich wrote:
>>>>> On 19.05.17 at 17:50, <boris.ostrovsky@oracle.com> wrote:
>>> @@ -734,8 +735,15 @@ static struct page_info *get_free_buddy(unsigned int
>>> +        if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
>>> +        {
>>> +            if ( need_scrub )
>>> +                scrub_one_page(&pg[i]);
>>> +            node_need_scrub[node]--;
>>> +            /*
>>> +             * Technically, we need to set first_dirty to INVALID_DIRTY_IDX
>>> +             * on buddy's head. However, since we assign pg[i].count_info
>>> +             * below, we can skip this.
>>> +             */
>> This comment is correct only with the current way struct page_info's
>> fields are unionized. In fact I think the comment is unneeded - the
>> buddy is being transitioned from free to allocated here, so the field
>> loses its meaning.
>
> That, actually, is exactly what I was trying to say. I can drop the
> comment if you feel it is obvious why we don't need to set first_dirty.

Well, my personal order of preference would be to (a) drop the comment
or else (b) re-word it to express the free -> allocated transition as
the reason explicitly. Others may prefer a corrected comment over no
comment at all ...

Jan
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 1e57885..b7c7426 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -703,6 +703,7 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
     nodemask_t nodemask = d ? d->node_affinity : node_online_map;
     unsigned int j, zone, nodemask_retry = 0, request = 1UL << order;
     struct page_info *pg;
+    bool use_unscrubbed = (memflags & MEMF_no_scrub);
 
     if ( node == NUMA_NO_NODE )
     {
@@ -734,8 +735,15 @@ static struct page_info *get_free_buddy(unsigned int zone_lo,
 
             /* Find smallest order which can satisfy the request. */
             for ( j = order; j <= MAX_ORDER; j++ )
+            {
                 if ( (pg = page_list_remove_head(&heap(node, zone, j))) )
-                    return pg;
+                {
+                    if ( (order == 0) || use_unscrubbed ||
+                         pg->u.free.first_dirty == INVALID_DIRTY_IDX)
+                        return pg;
+                    page_list_add_tail(pg, &heap(node, zone, j));
+                }
+            }
         } while ( zone-- > zone_lo ); /* careful: unsigned zone may wrap */
 
         if ( (memflags & MEMF_exact_node) && req_node != NUMA_NO_NODE )
@@ -821,9 +829,16 @@ static struct page_info *alloc_heap_pages(
     pg = get_free_buddy(zone_lo, zone_hi, order, memflags, d);
     if ( !pg )
     {
-        /* No suitable memory blocks. Fail the request. */
-        spin_unlock(&heap_lock);
-        return NULL;
+        /* Try now getting a dirty buddy. */
+        if ( !(memflags & MEMF_no_scrub) )
+            pg = get_free_buddy(zone_lo, zone_hi, order,
+                                memflags | MEMF_no_scrub, d);
+        if ( !pg )
+        {
+            /* No suitable memory blocks. Fail the request. */
+            spin_unlock(&heap_lock);
+            return NULL;
+        }
     }
 
     node = phys_to_nid(page_to_maddr(pg));
@@ -855,10 +870,24 @@ static struct page_info *alloc_heap_pages(
     if ( d != NULL )
         d->last_alloc_node = node;
 
+    need_scrub &= !(memflags & MEMF_no_scrub);
     for ( i = 0; i < (1 << order); i++ )
     {
         /* Reference count must continuously be zero for free pages. */
-        BUG_ON(pg[i].count_info != PGC_state_free);
+        BUG_ON((pg[i].count_info & ~PGC_need_scrub ) != PGC_state_free);
+
+        if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
+        {
+            if ( need_scrub )
+                scrub_one_page(&pg[i]);
+            node_need_scrub[node]--;
+            /*
+             * Technically, we need to set first_dirty to INVALID_DIRTY_IDX
+             * on buddy's head. However, since we assign pg[i].count_info
+             * below, we can skip this.
+             */
+        }
+
         pg[i].count_info = PGC_state_inuse;
 
         if ( !(memflags & MEMF_no_tlbflush) )
@@ -1737,7 +1766,7 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
     ASSERT(!in_irq());
 
     pg = alloc_heap_pages(MEMZONE_XEN, MEMZONE_XEN,
-                          order, memflags, NULL);
+                          order, memflags | MEMF_no_scrub, NULL);
     if ( unlikely(pg == NULL) )
         return NULL;
 
@@ -1787,7 +1816,7 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
     if ( !(memflags >> _MEMF_bits) )
         memflags |= MEMF_bits(xenheap_bits);
 
-    pg = alloc_domheap_pages(NULL, order, memflags);
+    pg = alloc_domheap_pages(NULL, order, memflags | MEMF_no_scrub);
     if ( unlikely(pg == NULL) )
         return NULL;
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 88de3c1..0d4b7c2 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -224,6 +224,8 @@ struct npfec {
 #define  MEMF_no_owner    (1U<<_MEMF_no_owner)
 #define _MEMF_no_tlbflush 6
 #define  MEMF_no_tlbflush (1U<<_MEMF_no_tlbflush)
+#define _MEMF_no_scrub    7
+#define  MEMF_no_scrub    (1U<<_MEMF_no_scrub)
 #define _MEMF_node        8
 #define  MEMF_node_mask   ((1U << (8 * sizeof(nodeid_t))) - 1)
 #define  MEMF_node(n)     ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
When allocating pages in alloc_heap_pages(), first look for clean pages.
If none is found, then retry: take pages marked as unscrubbed and scrub
them.

Note that we shouldn't find unscrubbed pages in alloc_heap_pages() yet.
However, this will become possible when we stop scrubbing from
free_heap_pages() and instead do it from the idle loop.

Since not all allocations require clean pages (such as xenheap
allocations), introduce a MEMF_no_scrub flag that callers can set if
they are willing to consume unscrubbed pages.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
Changes in v4:
* Add MEMF_no_scrub flag

 xen/common/page_alloc.c | 43 ++++++++++++++++++++++++++++++++++++-------
 xen/include/xen/mm.h    |  2 ++
 2 files changed, 38 insertions(+), 7 deletions(-)