Message ID | 20200207155701.2781820-1-dwmw2@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] xen/mm: fold PGC_broken into PGC_state bits | expand |
Hi David, Could you please send the series in a separate thread? So we don't mix the two discussions (i.e merge and free on boot allocated page) together. On 07/02/2020 15:57, David Woodhouse wrote: > From: David Woodhouse <dwmw@amazon.co.uk> > > Only PGC_state_offlining and PGC_state_offlined are valid in conjunction > with PGC_broken. The other two states (free and inuse) were never valid > for a broken page. > > By folding PGC_broken in, we can have three bits for PGC_state which > allows up to 8 states, of which 6 are currently used and 2 are available > for new use cases. The idea looks good to me. I have a few mostly nitpicking comment below. > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > --- > xen/arch/x86/domctl.c | 2 +- > xen/common/page_alloc.c | 67 ++++++++++++++++++++++++---------------- > xen/include/asm-arm/mm.h | 26 +++++++++++----- > xen/include/asm-x86/mm.h | 33 +++++++++++++------- > 4 files changed, 82 insertions(+), 46 deletions(-) > > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c > index 4fa9c91140..17a318e16d 100644 > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -415,7 +415,7 @@ long arch_do_domctl( > if ( page->u.inuse.type_info & PGT_pinned ) > type |= XEN_DOMCTL_PFINFO_LPINTAB; > > - if ( page->count_info & PGC_broken ) > + if ( page_is_broken(page) ) > type = XEN_DOMCTL_PFINFO_BROKEN; > } > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > index 97902d42c1..4084503554 100644 > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -1093,7 +1093,7 @@ static int reserve_offlined_page(struct page_info *head) > struct page_info *pg; > int next_order; > > - if ( page_state_is(cur_head, offlined) ) > + if ( page_is_offlined(cur_head) ) > { > cur_head++; > if ( first_dirty != INVALID_DIRTY_IDX && first_dirty ) > @@ -1113,7 +1113,7 @@ static int reserve_offlined_page(struct page_info *head) > for ( i = (1 << cur_order), pg = cur_head + (1 << cur_order ); > i < (1 << next_order); > i++, pg++ ) > - if ( page_state_is(pg, offlined) ) > + if ( page_is_offlined(pg) ) > break; > if ( i == ( 1 << next_order) ) > { > @@ -1145,16 +1145,19 @@ static int reserve_offlined_page(struct page_info *head) > > for ( cur_head = head; cur_head < head + ( 1UL << head_order); cur_head++ ) > { > - if ( !page_state_is(cur_head, offlined) ) > + struct page_list_head *list; We tend to add a newline after a series of declaration. > + if ( page_state_is(cur_head, offlined) ) > + list = &page_offlined_list; > + else if (page_state_is(cur_head, broken) ) > + list = &page_broken_list; > + else > continue; > > avail[node][zone]--; > total_avail_pages--; > ASSERT(total_avail_pages >= 0); > > - page_list_add_tail(cur_head, > - test_bit(_PGC_broken, &cur_head->count_info) ? > - &page_broken_list : &page_offlined_list); > + page_list_add_tail(cur_head, list); > > count++; > } > @@ -1404,13 +1407,16 @@ static void free_heap_pages( > switch ( pg[i].count_info & PGC_state ) > { > case PGC_state_inuse: > - BUG_ON(pg[i].count_info & PGC_broken); > pg[i].count_info = PGC_state_free; > break; > > case PGC_state_offlining: > - pg[i].count_info = (pg[i].count_info & PGC_broken) | > - PGC_state_offlined; > + pg[i].count_info = PGC_state_offlined; > + tainted = 1; > + break; > + > + case PGC_state_broken_offlining: > + pg[i].count_info = PGC_state_broken; > tainted = 1; > break; > > @@ -1527,16 +1533,25 @@ static unsigned long mark_page_offline(struct page_info *pg, int broken) > do { > nx = x = y; > > - if ( ((x & PGC_state) != PGC_state_offlined) && > - ((x & PGC_state) != PGC_state_offlining) ) > + nx &= ~PGC_state; > + > + switch ( x & PGC_state ) > { > - nx &= ~PGC_state; > - nx |= (((x & PGC_state) == PGC_state_free) > - ? PGC_state_offlined : PGC_state_offlining); > - } > + case PGC_state_inuse: > + case PGC_state_offlining: > + nx |= broken ? PGC_state_offlining : PGC_state_broken_offlining; > + break; > + > + case PGC_state_free: > + nx |= broken ? PGC_state_broken : PGC_state_offlined; > > - if ( broken ) > - nx |= PGC_broken; > + case PGC_state_broken_offlining: > + nx |= PGC_state_broken_offlining; > + > + case PGC_state_offlined: > + case PGC_state_broken: > + nx |= PGC_state_broken; Shouldn't this be: case PGC_state_offlined: nx |= broken ? PGC_state_offlined : PGC_state_broken; case PGC_state_broken: nx |= PGC_state_broken; There are also quite a difference with the default case now. Without this patch, if you were to add a new state but not handling it here, you would transition to PGC_state_offlining. With this patch, we will transtion to 0 (i.e PGC_state_inuse for now). PGC_state_* are not an enum, the compiler can't help to catch new state that doesn't have a corresponding case. So I would suggest to add a default matching the current behavior and adding an ASSERT_UNREACHABLE(). Note that I am open to a different kind of handling here. > + } > > if ( x == nx ) > break; > @@ -1609,7 +1624,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status) > * need to prevent malicious guest access the broken page again. > * Under such case, hypervisor shutdown guest, preventing recursive mce. > */ > - if ( (pg->count_info & PGC_broken) && (owner = page_get_owner(pg)) ) > + if ( page_is_broken(pg) && (owner = page_get_owner(pg)) ) > { > *status = PG_OFFLINE_AGAIN; > domain_crash(owner); > @@ -1620,7 +1635,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status) > > old_info = mark_page_offline(pg, broken); > > - if ( page_state_is(pg, offlined) ) > + if ( page_is_offlined(pg) ) > { > reserve_heap_page(pg); > > @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, uint32_t *status) > do { > ret = *status = 0; > > - if ( y & PGC_broken ) > + if ( (y & PGC_state) == PGC_state_broken || > + (y & PGC_state) == PGC_state_broken_offlining ) This is a bit a shame we can't use page_is_broken. Would it make sense to introduce an helper that just take a count_info? > { > ret = -EINVAL; > *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; > break; > } > - > - if ( (y & PGC_state) == PGC_state_offlined ) > + else if ( (y & PGC_state) == PGC_state_offlined ) > { > page_list_del(pg, &page_offlined_list); > *status = PG_ONLINE_ONLINED; > @@ -1747,11 +1762,11 @@ int query_page_offline(mfn_t mfn, uint32_t *status) > > pg = mfn_to_page(mfn); > > - if ( page_state_is(pg, offlining) ) > + if ( page_is_offlining(pg) ) > *status |= PG_OFFLINE_STATUS_OFFLINE_PENDING; > - if ( pg->count_info & PGC_broken ) > + if ( page_is_broken(pg) ) > *status |= PG_OFFLINE_STATUS_BROKEN; > - if ( page_state_is(pg, offlined) ) > + if ( page_is_offlined(pg) ) > *status |= PG_OFFLINE_STATUS_OFFLINED; > > spin_unlock(&heap_lock); > @@ -2483,7 +2498,7 @@ __initcall(pagealloc_keyhandler_init); > > void scrub_one_page(struct page_info *pg) > { > - if ( unlikely(pg->count_info & PGC_broken) ) > + if ( unlikely(page_is_broken(pg)) ) > return; > > #ifndef NDEBUG > diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h > index 333efd3a60..c9466c8ca0 100644 > --- a/xen/include/asm-arm/mm.h > +++ b/xen/include/asm-arm/mm.h > @@ -112,13 +112,25 @@ struct page_info > /* Page is broken? */ > #define _PGC_broken PG_shift(7) > #define PGC_broken PG_mask(1, 7) Shouldn't this be dropped now? > - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */ > -#define PGC_state PG_mask(3, 9) > -#define PGC_state_inuse PG_mask(0, 9) > -#define PGC_state_offlining PG_mask(1, 9) > -#define PGC_state_offlined PG_mask(2, 9) > -#define PGC_state_free PG_mask(3, 9) > -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) > + /* > + * Mutually-exclusive page states: > + * { inuse, offlining, offlined, free, broken_offlining, broken } > + */ > +#define PGC_state PG_mask(7, 9) > +#define PGC_state_inuse PG_mask(0, 9) > +#define PGC_state_offlining PG_mask(1, 9) > +#define PGC_state_offlined PG_mask(2, 9) > +#define PGC_state_free PG_mask(3, 9) > +#define PGC_state_broken_offlining PG_mask(4, 9) > +#define PGC_state_broken PG_mask(5, 9) I agree that making all the value aligned make it nicer to read, but this is increasing number of "unrelated" changes and makes the review more difficult. I would prefer if we leave the indentation alone for the current #define. But I am not going to push for it :). > + > +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) > +#define page_is_broken(pg) (page_state_is((pg), broken_offlining) || \ > + page_state_is((pg), broken)) > +#define page_is_offlined(pg) (page_state_is((pg), broken) || \ > + page_state_is((pg), offlined)) > +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ > + page_state_is((pg), offlining)) > > /* Count of references to this frame. */ > #define PGC_count_width PG_shift(9) > diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h > index 2ca8882ad0..3edadf7a7c 100644 > --- a/xen/include/asm-x86/mm.h > +++ b/xen/include/asm-x86/mm.h > @@ -67,18 +67,27 @@ > /* 3-bit PAT/PCD/PWT cache-attribute hint. */ > #define PGC_cacheattr_base PG_shift(6) > #define PGC_cacheattr_mask PG_mask(7, 6) > - /* Page is broken? */ > -#define _PGC_broken PG_shift(7) > -#define PGC_broken PG_mask(1, 7) > - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */ > -#define PGC_state PG_mask(3, 9) > -#define PGC_state_inuse PG_mask(0, 9) > -#define PGC_state_offlining PG_mask(1, 9) > -#define PGC_state_offlined PG_mask(2, 9) > -#define PGC_state_free PG_mask(3, 9) > -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) > - > - /* Count of references to this frame. */ > + /* > + * Mutually-exclusive page states: > + * { inuse, offlining, offlined, free, broken_offlining, broken } > + */ > +#define PGC_state PG_mask(7, 9) > +#define PGC_state_inuse PG_mask(0, 9) > +#define PGC_state_offlining PG_mask(1, 9) > +#define PGC_state_offlined PG_mask(2, 9) > +#define PGC_state_free PG_mask(3, 9) > +#define PGC_state_broken_offlining PG_mask(4, 9) > +#define PGC_state_broken PG_mask(5, 9) > + > +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) > +#define page_is_broken(pg) (page_state_is((pg), broken_offlining) || \ > + page_state_is((pg), broken)) > +#define page_is_offlined(pg) (page_state_is((pg), broken) || \ > + page_state_is((pg), offlined)) > +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ > + page_state_is((pg), offlining)) > + > +/* Count of references to this frame. */ > #define PGC_count_width PG_shift(9) > #define PGC_count_mask ((1UL<<PGC_count_width)-1) > > Cheers,
On Fri, 2020-02-07 at 20:27 +0000, Julien Grall wrote: > Could you please send the series in a separate thread? So we don't mix > the two discussions (i.e merge and free on boot allocated page) together. Well, they're all part of the same mess, really. There are cases where pages end up in free_heap_pages() which were never vetted by init_heap_pages(). While that normally works just fine — to the extent that we've never really cared — the hack excluding MFN0 is one of the things that gets 'missed' for such pages. I was only looking at this because the early vmap support makes it a tiny bit more likely that some pages will be freed that way after being given out by the boot allocator, but there were plenty of reasons it might happen already. These patches basically legitimise that — we make it OK for free_heap_pages() to be given a range which was never in the heap in the first place. And in so doing, we fix the MFN0/zone merge trick even when (for example) MFN0 was part of the dom0 initramfs and assigned directly, but gets freed up later. But sure, having failed to elicit any screams of "don't do it like that", I'll fix up the things you pointed out and I'll repost it as part of the series containing the early vmap() support, since that's why I'm working on it. Although at this point I'm half tempted to declare that there are *so* many ways this can happen already, that the tiny little bit that it's made more likely by the early vmap() support is basically in the noise. In that case we can separate these patches out as an independent fix rather than a dependency of early vmap.
On 09/02/2020 13:22, David Woodhouse wrote: > On Fri, 2020-02-07 at 20:27 +0000, Julien Grall wrote: >> Could you please send the series in a separate thread? So we don't mix >> the two discussions (i.e merge and free on boot allocated page) together. > > Well, they're all part of the same mess, really. Sending a series in the middle of another series is always more difficult to track :). The more if they are handled by two different person... > > There are cases where pages end up in free_heap_pages() which were > never vetted by init_heap_pages(). While that normally works just fine > — to the extent that we've never really cared — the hack excluding MFN0 > is one of the things that gets 'missed' for such pages. > > I was only looking at this because the early vmap support makes it a > tiny bit more likely that some pages will be freed that way after being > given out by the boot allocator, but there were plenty of reasons it > might happen already. > > These patches basically legitimise that — we make it OK for > free_heap_pages() to be given a range which was never in the heap in > the first place. And in so doing, we fix the MFN0/zone merge trick even > when (for example) MFN0 was part of the dom0 initramfs and assigned > directly, but gets freed up later. > > But sure, having failed to elicit any screams of "don't do it like > that", I'll fix up the things you pointed out and I'll repost it as > part of the series containing the early vmap() support, since that's > why I'm working on it. > > Although at this point I'm half tempted to declare that there are *so* > many ways this can happen already, that the tiny little bit that it's > made more likely by the early vmap() support is basically in the noise. > > In that case we can separate these patches out as an independent fix > rather than a dependency of early vmap. I hadn't realize how messy it was because I had Arm in mind and wasn't expected x86 to abuse so much the interface. For x86, I agree that this is noise as they are abusing the interface pretty much everywhere. However, on Arm there is only one place that is abusing the interface. It is in the ACPI code, although I think it will be just a leak given the implementation of acpi_os_free_memory(). As we don't free page-tables yet on Arm, the introduction of the early vmap would not introduce any more abuse on Arm. It would obviously nice to fix it, but as you said this is noise on x86. So that's really up to the x86 folks (Andrew, George, Jan) to see whether yet another abuse is ok for them :). Cheers,
On 07.02.2020 16:57, David Woodhouse wrote: > @@ -1145,16 +1145,19 @@ static int reserve_offlined_page(struct page_info *head) > > for ( cur_head = head; cur_head < head + ( 1UL << head_order); cur_head++ ) > { > - if ( !page_state_is(cur_head, offlined) ) > + struct page_list_head *list; > + if ( page_state_is(cur_head, offlined) ) > + list = &page_offlined_list; > + else if (page_state_is(cur_head, broken) ) > + list = &page_broken_list; > + else > continue; > > avail[node][zone]--; > total_avail_pages--; > ASSERT(total_avail_pages >= 0); > > - page_list_add_tail(cur_head, > - test_bit(_PGC_broken, &cur_head->count_info) ? > - &page_broken_list : &page_offlined_list); > + page_list_add_tail(cur_head, list); While I realize it's fewer comparisons this way, I still wonder whether for the reader's sake it wouldn't better be page_is_offlined() first and then page_is_broken() down here. > @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, uint32_t *status) > do { > ret = *status = 0; > > - if ( y & PGC_broken ) > + if ( (y & PGC_state) == PGC_state_broken || > + (y & PGC_state) == PGC_state_broken_offlining ) > { > ret = -EINVAL; > *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; > break; > } > - > - if ( (y & PGC_state) == PGC_state_offlined ) > + else if ( (y & PGC_state) == PGC_state_offlined ) I don't see a need for adding "else" here. > --- a/xen/include/asm-x86/mm.h > +++ b/xen/include/asm-x86/mm.h > @@ -67,18 +67,27 @@ > /* 3-bit PAT/PCD/PWT cache-attribute hint. */ > #define PGC_cacheattr_base PG_shift(6) > #define PGC_cacheattr_mask PG_mask(7, 6) > - /* Page is broken? */ > -#define _PGC_broken PG_shift(7) > -#define PGC_broken PG_mask(1, 7) > - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */ > -#define PGC_state PG_mask(3, 9) > -#define PGC_state_inuse PG_mask(0, 9) > -#define PGC_state_offlining PG_mask(1, 9) > -#define PGC_state_offlined PG_mask(2, 9) > -#define PGC_state_free PG_mask(3, 9) > -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) > - > - /* Count of references to this frame. */ > + /* > + * Mutually-exclusive page states: > + * { inuse, offlining, offlined, free, broken_offlining, broken } > + */ > +#define PGC_state PG_mask(7, 9) > +#define PGC_state_inuse PG_mask(0, 9) > +#define PGC_state_offlining PG_mask(1, 9) > +#define PGC_state_offlined PG_mask(2, 9) > +#define PGC_state_free PG_mask(3, 9) > +#define PGC_state_broken_offlining PG_mask(4, 9) TBH I'd prefer PGC_state_offlining_broken, as it's not the offlining which is broken, but a broken page is being offlined. > +#define PGC_state_broken PG_mask(5, 9) > + > +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) Blanks around & please. > +#define page_is_broken(pg) (page_state_is((pg), broken_offlining) || \ > + page_state_is((pg), broken)) > +#define page_is_offlined(pg) (page_state_is((pg), broken) || \ > + page_state_is((pg), offlined)) The inclusion of "broken" here would seem to deserve a (brief) comment, either here or next to PGC_state_broken's #define. > +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ > + page_state_is((pg), offlining)) Overall I wonder whether the PGC_state_* ordering couldn't be adjusted such that at least some of these three won't need two comparisons (by masking off a bit before comparing). Also for all three - no need for extra parentheses around pg (or in general macro arguments which get handed on without being part of an expression). Jan
On Fri, 2020-02-07 at 20:27 +0000, Julien Grall wrote: > > + switch ( x & PGC_state ) > > { > > - nx &= ~PGC_state; > > - nx |= (((x & PGC_state) == PGC_state_free) > > - ? PGC_state_offlined : PGC_state_offlining); > > - } > > + case PGC_state_inuse: > > + case PGC_state_offlining: > > + nx |= broken ? PGC_state_offlining : PGC_state_broken_offlining; > > + break; > > + > > + case PGC_state_free: > > + nx |= broken ? PGC_state_broken : PGC_state_offlined; > > > > - if ( broken ) > > - nx |= PGC_broken; > > + case PGC_state_broken_offlining: > > + nx |= PGC_state_broken_offlining; > > + > > + case PGC_state_offlined: > > + case PGC_state_broken: > > + nx |= PGC_state_broken; > > Shouldn't this be: > > case PGC_state_offlined: > nx |= broken ? PGC_state_offlined : PGC_state_broken; > > case PGC_state_broken: > nx |= PGC_state_broken; > > There are also quite a difference with the default case now. Without > this patch, if you were to add a new state but not handling it here, you > would transition to PGC_state_offlining. With this patch, we will > transtion to 0 (i.e PGC_state_inuse for now). > > PGC_state_* are not an enum, the compiler can't help to catch new state > that doesn't have a corresponding case. So I would suggest to add a > default matching the current behavior and adding an > ASSERT_UNREACHABLE(). Note that I am open to a different kind of > handling here. I revamped this, taking into account your later suggestion of a helper that works on the count_info. Looks more like this: /* If it was already broken, it stays broken */ if ( pgc_is_broken(x) ) broken = 1; if ( pgc_is_offlined(x) || pgc_is(x, free) ) nx |= broken ? PGC_state_broken : PGC_state_offlined; else nx |= broken ? PGC_state_broken_offlining : PGC_state_offlining; > > - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */ > > -#define PGC_state PG_mask(3, 9) > > -#define PGC_state_inuse PG_mask(0, 9) > > -#define PGC_state_offlining PG_mask(1, 9) > > -#define PGC_state_offlined PG_mask(2, 9) > > -#define PGC_state_free PG_mask(3, 9) > > -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) > > + /* > > + * Mutually-exclusive page states: > > + * { inuse, offlining, offlined, free, broken_offlining, broken } > > + */ > > +#define PGC_state PG_mask(7, 9) > > +#define PGC_state_inuse PG_mask(0, 9) > > +#define PGC_state_offlining PG_mask(1, 9) > > +#define PGC_state_offlined PG_mask(2, 9) > > +#define PGC_state_free PG_mask(3, 9) > > +#define PGC_state_broken_offlining PG_mask(4, 9) > > +#define PGC_state_broken PG_mask(5, 9) > > I agree that making all the value aligned make it nicer to read, but > this is increasing number of "unrelated" changes and makes the review > more difficult. > > I would prefer if we leave the indentation alone for the current > #define. But I am not going to push for it :). I'm generally sympathetic to that point of view but at this point, all those page states are kind of being redefined and it makes sense to think about them all; having them all change doesn't hurt.
On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: > On 07.02.2020 16:57, David Woodhouse wrote: > > @@ -1145,16 +1145,19 @@ static int reserve_offlined_page(struct > > page_info *head) > > > > for ( cur_head = head; cur_head < head + ( 1UL << head_order); > > cur_head++ ) > > { > > - if ( !page_state_is(cur_head, offlined) ) > > + struct page_list_head *list; > > + if ( page_state_is(cur_head, offlined) ) > > + list = &page_offlined_list; > > + else if (page_state_is(cur_head, broken) ) > > + list = &page_broken_list; > > + else > > continue; > > > > avail[node][zone]--; > > total_avail_pages--; > > ASSERT(total_avail_pages >= 0); > > > > - page_list_add_tail(cur_head, > > - test_bit(_PGC_broken, &cur_head->count_info) ? > > - &page_broken_list : &page_offlined_list); > > + page_list_add_tail(cur_head, list); > > While I realize it's fewer comparisons this way, I still wonder > whether for the reader's sake it wouldn't better be > page_is_offlined() first and then page_is_broken() down here. Nah, that would be worse. This way there are two cases which are explicitly handled and the list to use for each of them is explicitly set. The 'if (a||b) … some_function(a ? thing_for_a : thing_for_b)' construct is much less comprehensible. > > @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, > > uint32_t *status) > > do { > > ret = *status = 0; > > > > - if ( y & PGC_broken ) > > + if ( (y & PGC_state) == PGC_state_broken || > > + (y & PGC_state) == PGC_state_broken_offlining ) > > { > > ret = -EINVAL; > > *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; > > break; > > } > > - > > - if ( (y & PGC_state) == PGC_state_offlined ) > > + else if ( (y & PGC_state) == PGC_state_offlined ) > > I don't see a need for adding "else" here. They are mutually exclusive cases. It makes things a whole lot clearer to the reader to put the 'else' there, and sometimes helps a naïve compiler along the way too. > > --- a/xen/include/asm-x86/mm.h > > +++ b/xen/include/asm-x86/mm.h > > @@ -67,18 +67,27 @@ > > /* 3-bit PAT/PCD/PWT cache-attribute hint. */ > > #define PGC_cacheattr_base PG_shift(6) > > #define PGC_cacheattr_mask PG_mask(7, 6) > > - /* Page is broken? */ > > -#define _PGC_broken PG_shift(7) > > -#define PGC_broken PG_mask(1, 7) > > - /* Mutually-exclusive page states: { inuse, offlining, offlined, > > free }. */ > > -#define PGC_state PG_mask(3, 9) > > -#define PGC_state_inuse PG_mask(0, 9) > > -#define PGC_state_offlining PG_mask(1, 9) > > -#define PGC_state_offlined PG_mask(2, 9) > > -#define PGC_state_free PG_mask(3, 9) > > -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == > > PGC_state_##st) > > - > > - /* Count of references to this frame. */ > > + /* > > + * Mutually-exclusive page states: > > + * { inuse, offlining, offlined, free, broken_offlining, broken } > > + */ > > +#define PGC_state PG_mask(7, 9) > > +#define PGC_state_inuse PG_mask(0, 9) > > +#define PGC_state_offlining PG_mask(1, 9) > > +#define PGC_state_offlined PG_mask(2, 9) > > +#define PGC_state_free PG_mask(3, 9) > > +#define PGC_state_broken_offlining PG_mask(4, 9) > > TBH I'd prefer PGC_state_offlining_broken, as it's not the > offlining which is broken, but a broken page is being > offlined. It is the page which is both broken and offlining. Or indeed it is the page which is both offlining and broken. > > +#define PGC_state_broken PG_mask(5, 9) > > + > > +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) > > == PGC_state_##st) > > Blanks around & please. That part I hadn't touched but sure, I'll add those while I'm touching it. I'd already ignored Julien's request *not* to make whitespace cleanups while I'm here, after all :) > > +#define page_is_broken(pg) (page_state_is((pg), > > broken_offlining) || \ > > + page_state_is((pg), broken)) > > +#define page_is_offlined(pg) (page_state_is((pg), broken) > > || \ > > + page_state_is((pg), offlined)) > > The inclusion of "broken" here would seem to deserve a (brief) > comment, either here or next to PGC_state_broken's #define. Done, in the version which will be sent shortly. > > +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ > > + page_state_is((pg), offlining)) > > Overall I wonder whether the PGC_state_* ordering couldn't be > adjusted such that at least some of these three won't need > two comparisons (by masking off a bit before comparing). The whole point in this exercise is that there isn't a whole bit for these; they are each *two* states out of the possible 8. > Also for all three - no need for extra parentheses around pg > (or in general macro arguments which get handed on without > being part of an expression). Yeah, I'll remove some of those.
On 17.03.2020 22:52, David Woodhouse wrote: > On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: >> On 07.02.2020 16:57, David Woodhouse wrote: >>> @@ -1145,16 +1145,19 @@ static int reserve_offlined_page(struct >>> page_info *head) >>> >>> for ( cur_head = head; cur_head < head + ( 1UL << head_order); >>> cur_head++ ) >>> { >>> - if ( !page_state_is(cur_head, offlined) ) >>> + struct page_list_head *list; >>> + if ( page_state_is(cur_head, offlined) ) >>> + list = &page_offlined_list; >>> + else if (page_state_is(cur_head, broken) ) >>> + list = &page_broken_list; >>> + else >>> continue; >>> >>> avail[node][zone]--; >>> total_avail_pages--; >>> ASSERT(total_avail_pages >= 0); >>> >>> - page_list_add_tail(cur_head, >>> - test_bit(_PGC_broken, &cur_head->count_info) ? >>> - &page_broken_list : &page_offlined_list); >>> + page_list_add_tail(cur_head, list); >> >> While I realize it's fewer comparisons this way, I still wonder >> whether for the reader's sake it wouldn't better be >> page_is_offlined() first and then page_is_broken() down here. > > Nah, that would be worse. This way there are two cases which are > explicitly handled and the list to use for each of them is explicitly > set. The 'if (a||b) … some_function(a ? thing_for_a : thing_for_b)' > construct is much less comprehensible. It's a matter of taste, I agree, and in such a case I generally advise to see about limiting code churn. For code you then still introduce anew, yes, taste decisions may typically be to the authors judgement (there are exceptions, though). >>> @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, >>> uint32_t *status) >>> do { >>> ret = *status = 0; >>> >>> - if ( y & PGC_broken ) >>> + if ( (y & PGC_state) == PGC_state_broken || >>> + (y & PGC_state) == PGC_state_broken_offlining ) >>> { >>> ret = -EINVAL; >>> *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; >>> break; >>> } >>> - >>> - if ( (y & PGC_state) == PGC_state_offlined ) >>> + else if ( (y & PGC_state) == PGC_state_offlined ) >> >> I don't see a need for adding "else" here. > > They are mutually exclusive cases. It makes things a whole lot clearer > to the reader to put the 'else' there, and sometimes helps a naïve > compiler along the way too. Well, I'm afraid I'm going to be pretty strict about this: It's again a matter of taste, yes, but we generally try to avoid pointless else. What you consider "a whole lot clearer to the reader" is the opposite from my pov. >>> --- a/xen/include/asm-x86/mm.h >>> +++ b/xen/include/asm-x86/mm.h >>> @@ -67,18 +67,27 @@ >>> /* 3-bit PAT/PCD/PWT cache-attribute hint. */ >>> #define PGC_cacheattr_base PG_shift(6) >>> #define PGC_cacheattr_mask PG_mask(7, 6) >>> - /* Page is broken? */ >>> -#define _PGC_broken PG_shift(7) >>> -#define PGC_broken PG_mask(1, 7) >>> - /* Mutually-exclusive page states: { inuse, offlining, offlined, >>> free }. */ >>> -#define PGC_state PG_mask(3, 9) >>> -#define PGC_state_inuse PG_mask(0, 9) >>> -#define PGC_state_offlining PG_mask(1, 9) >>> -#define PGC_state_offlined PG_mask(2, 9) >>> -#define PGC_state_free PG_mask(3, 9) >>> -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == >>> PGC_state_##st) >>> - >>> - /* Count of references to this frame. */ >>> + /* >>> + * Mutually-exclusive page states: >>> + * { inuse, offlining, offlined, free, broken_offlining, broken } >>> + */ >>> +#define PGC_state PG_mask(7, 9) >>> +#define PGC_state_inuse PG_mask(0, 9) >>> +#define PGC_state_offlining PG_mask(1, 9) >>> +#define PGC_state_offlined PG_mask(2, 9) >>> +#define PGC_state_free PG_mask(3, 9) >>> +#define PGC_state_broken_offlining PG_mask(4, 9) >> >> TBH I'd prefer PGC_state_offlining_broken, as it's not the >> offlining which is broken, but a broken page is being >> offlined. > > It is the page which is both broken and offlining. > Or indeed it is the page which is both offlining and broken. I.e. you agree with flipping the two parts around? >>> +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ >>> + page_state_is((pg), offlining)) >> >> Overall I wonder whether the PGC_state_* ordering couldn't be >> adjusted such that at least some of these three won't need >> two comparisons (by masking off a bit before comparing). > > The whole point in this exercise is that there isn't a whole bit for > these; they are each *two* states out of the possible 8. Sure. But just consider the more general case: Instead of writing if ( i == 6 || i == 7 ) you can as well write if ( (i | 1) == 7 ) Similar for multiple == vs a single <= or >=. Jan
Hi Jan, On 18/03/2020 09:56, Jan Beulich wrote: > On 17.03.2020 22:52, David Woodhouse wrote: >> On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: >>> On 07.02.2020 16:57, David Woodhouse wrote: >>>> @@ -1145,16 +1145,19 @@ static int reserve_offlined_page(struct >>>> page_info *head) >>>> for ( cur_head = head; cur_head < head + ( 1UL << head_order); >>>> cur_head++ ) >>>> { >>>> - if ( !page_state_is(cur_head, offlined) ) >>>> + struct page_list_head *list; >>>> + if ( page_state_is(cur_head, offlined) ) >>>> + list = &page_offlined_list; >>>> + else if (page_state_is(cur_head, broken) ) >>>> + list = &page_broken_list; >>>> + else >>>> continue; >>>> avail[node][zone]--; >>>> total_avail_pages--; >>>> ASSERT(total_avail_pages >= 0); >>>> - page_list_add_tail(cur_head, >>>> - test_bit(_PGC_broken, >>>> &cur_head->count_info) ? >>>> - &page_broken_list : &page_offlined_list); >>>> + page_list_add_tail(cur_head, list); >>> >>> While I realize it's fewer comparisons this way, I still wonder >>> whether for the reader's sake it wouldn't better be >>> page_is_offlined() first and then page_is_broken() down here. >> >> Nah, that would be worse. This way there are two cases which are >> explicitly handled and the list to use for each of them is explicitly >> set. The 'if (a||b) … some_function(a ? thing_for_a : thing_for_b)' >> construct is much less comprehensible. > > It's a matter of taste, I agree, and in such a case I generally advise > to see about limiting code churn. For code you then still introduce > anew, yes, taste decisions may typically be to the authors judgement > (there are exceptions, though). > >>>> @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, >>>> uint32_t *status) >>>> do { >>>> ret = *status = 0; >>>> - if ( y & PGC_broken ) >>>> + if ( (y & PGC_state) == PGC_state_broken || >>>> + (y & PGC_state) == PGC_state_broken_offlining ) >>>> { >>>> ret = -EINVAL; >>>> *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; >>>> break; >>>> } >>>> - >>>> - if ( (y & PGC_state) == PGC_state_offlined ) >>>> + else if ( (y & PGC_state) == PGC_state_offlined ) >>> >>> I don't see a need for adding "else" here. >> >> They are mutually exclusive cases. It makes things a whole lot clearer >> to the reader to put the 'else' there, and sometimes helps a naïve >> compiler along the way too. > > Well, I'm afraid I'm going to be pretty strict about this: It's again > a matter of taste, yes, but we generally try to avoid pointless else. > What you consider "a whole lot clearer to the reader" is the opposite > from my pov. While I agree the 'else' may be pointless, I don't think it is worth an argument. As the author of the patch, it is his choice to write the code like that. > >>>> --- a/xen/include/asm-x86/mm.h >>>> +++ b/xen/include/asm-x86/mm.h >>>> @@ -67,18 +67,27 @@ >>>> /* 3-bit PAT/PCD/PWT cache-attribute hint. */ >>>> #define PGC_cacheattr_base PG_shift(6) >>>> #define PGC_cacheattr_mask PG_mask(7, 6) >>>> - /* Page is broken? */ >>>> -#define _PGC_broken PG_shift(7) >>>> -#define PGC_broken PG_mask(1, 7) >>>> - /* Mutually-exclusive page states: { inuse, offlining, offlined, >>>> free }. */ >>>> -#define PGC_state PG_mask(3, 9) >>>> -#define PGC_state_inuse PG_mask(0, 9) >>>> -#define PGC_state_offlining PG_mask(1, 9) >>>> -#define PGC_state_offlined PG_mask(2, 9) >>>> -#define PGC_state_free PG_mask(3, 9) >>>> -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == >>>> PGC_state_##st) >>>> - >>>> - /* Count of references to this frame. */ >>>> + /* >>>> + * Mutually-exclusive page states: >>>> + * { inuse, offlining, offlined, free, broken_offlining, broken } >>>> + */ >>>> +#define PGC_state PG_mask(7, 9) >>>> +#define PGC_state_inuse PG_mask(0, 9) >>>> +#define PGC_state_offlining PG_mask(1, 9) >>>> +#define PGC_state_offlined PG_mask(2, 9) >>>> +#define PGC_state_free PG_mask(3, 9) >>>> +#define PGC_state_broken_offlining PG_mask(4, 9) >>> >>> TBH I'd prefer PGC_state_offlining_broken, as it's not the >>> offlining which is broken, but a broken page is being >>> offlined. >> >> It is the page which is both broken and offlining. >> Or indeed it is the page which is both offlining and broken. > > I.e. you agree with flipping the two parts around? > >>>> +#define page_is_offlining(pg) (page_state_is((pg), >>>> broken_offlining) || \ >>>> + page_state_is((pg), offlining)) >>> >>> Overall I wonder whether the PGC_state_* ordering couldn't be >>> adjusted such that at least some of these three won't need >>> two comparisons (by masking off a bit before comparing). >> >> The whole point in this exercise is that there isn't a whole bit for >> these; they are each *two* states out of the possible 8. > > Sure. But just consider the more general case: Instead of writing > > if ( i == 6 || i == 7 ) > > you can as well write > > if ( (i | 1) == 7 ) I stumbled accross a few of those recently and this is not the obvious things to read. Yes, your code may be faster. But is it really worth it compare to the cost of readability and futureproofness? Cheers,
On 18.03.2020 13:31, Julien Grall wrote: > On 18/03/2020 09:56, Jan Beulich wrote: >> On 17.03.2020 22:52, David Woodhouse wrote: >>> On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: >>>> On 07.02.2020 16:57, David Woodhouse wrote: >>>>> +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ >>>>> + page_state_is((pg), offlining)) >>>> >>>> Overall I wonder whether the PGC_state_* ordering couldn't be >>>> adjusted such that at least some of these three won't need >>>> two comparisons (by masking off a bit before comparing). >>> >>> The whole point in this exercise is that there isn't a whole bit for >>> these; they are each *two* states out of the possible 8. >> >> Sure. But just consider the more general case: Instead of writing >> >> if ( i == 6 || i == 7 ) >> >> you can as well write >> >> if ( (i | 1) == 7 ) > > I stumbled accross a few of those recently and this is not the > obvious things to read. Depends on the reader, I guess. > Yes, your code may be faster. But is it really worth it compare > to the cost of readability and futureproofness? Not sure how being future proof comes into play here. I'm not suggesting to use literal numbers. I'd also be happy to be proven wrong in assuming that the compiler still can't do such transformations itself; it couldn't when I check a while back. Reducing the number of conditional branches is, imo, worth at least some effort at the source level. Jan
On Wed, 2020-03-18 at 12:31 +0000, Julien Grall wrote: > On 18/03/2020 09:56, Jan Beulich wrote: > > On 17.03.2020 22:52, David Woodhouse wrote: > > > On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: > > > > > @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, > > > > > uint32_t *status) > > > > > do { > > > > > ret = *status = 0; > > > > > - if ( y & PGC_broken ) > > > > > + if ( (y & PGC_state) == PGC_state_broken || > > > > > + (y & PGC_state) == PGC_state_broken_offlining ) > > > > > { > > > > > ret = -EINVAL; > > > > > *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; > > > > > break; > > > > > } > > > > > - > > > > > - if ( (y & PGC_state) == PGC_state_offlined ) > > > > > + else if ( (y & PGC_state) == PGC_state_offlined ) > > > > > > > > I don't see a need for adding "else" here. > > > > > > They are mutually exclusive cases. It makes things a whole lot clearer > > > to the reader to put the 'else' there, and sometimes helps a naïve > > > compiler along the way too. > > > > Well, I'm afraid I'm going to be pretty strict about this: It's again > > a matter of taste, yes, but we generally try to avoid pointless else. > > What you consider "a whole lot clearer to the reader" is the opposite > > from my pov. > > While I agree the 'else' may be pointless, I don't think it is worth an > argument. As the author of the patch, it is his choice to write the code > like that. Indeed. While I appreciate your insight, Jan, and your detailed reviews are undoubtedly helpful — especially to me as I poke around the Xen code base without knowing where the bodies are buried — I do sometimes find that it degenerates into what appears to be gratuitous bikeshedding. Like *some* others, I'm perfectly capable of responding "I understand you would have done it differently, but I prefer it this way". But even for those like me who have the self-confidence (or arrogance?) to respond in such a way, the end result is often the same — a patch series which the maintainer doesn't apply because it has "unresolved issues". Perfect is the enemy of good. Especially when perfection is so subjective. This definitely isn't the kind of welcoming community that I enjoy trying to get my junior engineers to contribute to. And they aren't snowflakes; they cope with the Linux community just fine, for the most part. Earlier today, I found myself adjusting a patch in order to tweak the behaviour of a "can never happen" situation, when it was far from clear that the *existing* behaviour in that situation would have been correct anyway. There is a lot of value in your reviews, and they are appreciated. But the overall effect is seen by some as making the Xen community somewhat dysfunctional. The -MP makefile patch I posted yesterday... I almost didn't bother. And when I allowed myself to be talked into it, I was entirely unsurprised when a review came in basically asking me to prove a negative before the patch could proceed. So as far as I can tell, it'll fall by the wayside and the build will remain broken any time anyone removes or renames a header file. Because life's too short to invest the energy to make improvements like that. One of these days, I may attempt to revive my series cleaning up the 16-bit and 32-bit boot code. Which was a clear improvement and simplification, and again you gave extremely valid feedback which helped to improve it — but again it was interspersed with more subjective and less helpful comments which basically derailed it. Having carefully threaded my way through the existing byzantine code and made incremental bisectable changes, I ended up with feedback that basically would have required me to start again from scratch, in order to satisfy what appeared to be fairly arbitrary and subjective demands. As is often the case in creating a bisectable series out of complex changes, I had sometimes moved/refactored code, only to move/refactor it again in a subsequent patch. Sometimes the ordering of such inter- related changes can be fairly arbitrary, and I had made my choices as I'd progressed; the real focus being the end result. At one point you were picking on intermediate details of how I'd made my overall series bisectable, and seemed to be demanding that I go back and refactor (the intermediate stages for no better reason than because you would have done it differently. Again, your attention to detail and your expertise are massively appreciated. But please let's remember that "perfect is the enemy of good", and strike a balance which allows forward progress without blocking improvements. Sometimes I wonder if you truly realise how much you derail the progress of a patch series just by raising well-intentioned "queries" around it. > > > > > --- a/xen/include/asm-x86/mm.h > > > > > +++ b/xen/include/asm-x86/mm.h > > > > > @@ -67,18 +67,27 @@ > > > > > /* 3-bit PAT/PCD/PWT cache-attribute hint. */ > > > > > #define PGC_cacheattr_base PG_shift(6) > > > > > #define PGC_cacheattr_mask PG_mask(7, 6) > > > > > - /* Page is broken? */ > > > > > -#define _PGC_broken PG_shift(7) > > > > > -#define PGC_broken PG_mask(1, 7) > > > > > - /* Mutually-exclusive page states: { inuse, offlining, offlined, > > > > > free }. */ > > > > > -#define PGC_state PG_mask(3, 9) > > > > > -#define PGC_state_inuse PG_mask(0, 9) > > > > > -#define PGC_state_offlining PG_mask(1, 9) > > > > > -#define PGC_state_offlined PG_mask(2, 9) > > > > > -#define PGC_state_free PG_mask(3, 9) > > > > > -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == > > > > > PGC_state_##st) > > > > > - > > > > > - /* Count of references to this frame. */ > > > > > + /* > > > > > + * Mutually-exclusive page states: > > > > > + * { inuse, offlining, offlined, free, broken_offlining, broken } > > > > > + */ > > > > > +#define PGC_state PG_mask(7, 9) > > > > > +#define PGC_state_inuse PG_mask(0, 9) > > > > > +#define PGC_state_offlining PG_mask(1, 9) > > > > > +#define PGC_state_offlined PG_mask(2, 9) > > > > > +#define PGC_state_free PG_mask(3, 9) > > > > > +#define PGC_state_broken_offlining PG_mask(4, 9) > > > > > > > > TBH I'd prefer PGC_state_offlining_broken, as it's not the > > > > offlining which is broken, but a broken page is being > > > > offlined. > > > > > > It is the page which is both broken and offlining. > > > Or indeed it is the page which is both offlining and broken. > > > > I.e. you agree with flipping the two parts around? I hope I have respectfully made it clear that no, I'm really not happy with the very concept of such a request. Perhaps it would be easier for me to acquiesce, in the short term. But on the whole I think it's better to put my foot down and say 'no', and focus on real work and things that matter. > > > > > +#define page_is_offlining(pg) (page_state_is((pg), > > > > > broken_offlining) || \ > > > > > + page_state_is((pg), offlining)) > > > > > > > > Overall I wonder whether the PGC_state_* ordering couldn't be > > > > adjusted such that at least some of these three won't need > > > > two comparisons (by masking off a bit before comparing). > > > > > > The whole point in this exercise is that there isn't a whole bit for > > > these; they are each *two* states out of the possible 8. > > > > Sure. But just consider the more general case: Instead of writing > > > > if ( i == 6 || i == 7 ) > > > > you can as well write > > > > if ( (i | 1) == 7 ) > > I stumbled accross a few of those recently and this is not the obvious > things to read. Yes, your code may be faster. But is it really worth it > compare to the cost of readability and futureproofness? No. Just no. If that kind of change is really a worthwhile win, it'll depend on the CPU. File a GCC PR with a test case as a missed optimisation. Don't make the source code gratuitously harder to read. Honestly, this, right here, is the *epitome* of why I, and others, sometimes feel that submitting a patch to Xen can be more effort than it's worth. This email is not intended as a personal attack; I hope you don't feel that it is. For about the fifth time: your careful reviews and your attention to detail are *massively* appreciated. Just a little over the time sometimes, and I'd like to ask you to take care to be aware of the overall effect, and that you are not blocking progress. Thanks.
On 18.03.2020 18:13, David Woodhouse wrote: > On Wed, 2020-03-18 at 12:31 +0000, Julien Grall wrote: >> On 18/03/2020 09:56, Jan Beulich wrote: >>> On 17.03.2020 22:52, David Woodhouse wrote: >>>> On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: >>>>>> @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, >>>>>> uint32_t *status) >>>>>> do { >>>>>> ret = *status = 0; >>>>>> - if ( y & PGC_broken ) >>>>>> + if ( (y & PGC_state) == PGC_state_broken || >>>>>> + (y & PGC_state) == PGC_state_broken_offlining ) >>>>>> { >>>>>> ret = -EINVAL; >>>>>> *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; >>>>>> break; >>>>>> } >>>>>> - >>>>>> - if ( (y & PGC_state) == PGC_state_offlined ) >>>>>> + else if ( (y & PGC_state) == PGC_state_offlined ) >>>>> >>>>> I don't see a need for adding "else" here. >>>> >>>> They are mutually exclusive cases. It makes things a whole lot clearer >>>> to the reader to put the 'else' there, and sometimes helps a naïve >>>> compiler along the way too. >>> >>> Well, I'm afraid I'm going to be pretty strict about this: It's again >>> a matter of taste, yes, but we generally try to avoid pointless else. >>> What you consider "a whole lot clearer to the reader" is the opposite >>> from my pov. >> >> While I agree the 'else' may be pointless, I don't think it is worth an >> argument. As the author of the patch, it is his choice to write the code >> like that. > > Indeed. While I appreciate your insight, Jan, and your detailed reviews > are undoubtedly helpful — especially to me as I poke around the Xen > code base without knowing where the bodies are buried — I do sometimes > find that it degenerates into what appears to be gratuitous > bikeshedding. > > Like *some* others, I'm perfectly capable of responding "I understand > you would have done it differently, but I prefer it this way". > > But even for those like me who have the self-confidence (or arrogance?) > to respond in such a way, the end result is often the same — a patch > series which the maintainer doesn't apply because it has "unresolved > issues". > > Perfect is the enemy of good. Especially when perfection is so > subjective. > > This definitely isn't the kind of welcoming community that I enjoy > trying to get my junior engineers to contribute to. And they aren't > snowflakes; they cope with the Linux community just fine, for the most > part. I appreciate your open an honest feedback, and having had similar comments in the past I can assure you that I've already tried to adjust where I find this acceptable. I take it you realize that there are two limitations in this - trying doesn't mean succeeding, and the boundaries of what I'd consider acceptable to let go with no comments. Of course there are always two sides of the medal. As a maintainer of some piece of code, I view it as my responsibility to look after not only the technical correctness of that code, but also after its style (in the broadest sense of the word). Looking at some very bad examples in our tree, many of which I'm afraid have a Linux origin, I'm in particular of the opinion that consistent style is a significant aid in code readability and maintainability. And I hope you agree that _style_ adjustments are pretty easy to make, so I don't view asking for such as placing a significant burden on the submitter. The alternative of letting it go uncommented and then take the time myself to clean up seems quite a bit worse to me, not the least because of this scaling even less well than the amount of code review that needs doing. The mentioned Linux origin of some of the particularly bad examples in our tree is why I view your "they cope with the Linux community just fine" as not really applicable. This is despite our subsequent changes to those files often having made the situation worse rather than better. To some degree the same goes for bigger than necessary code churn, albeit I agree that in a number of cases it is far less objective to judge than the aim for consistent style. Extra code churn instead is often making review harder, irrespective of the often good intentions behind doing so. > There is a lot of value in your reviews, and they are appreciated. But > the overall effect is seen by some as making the Xen community somewhat > dysfunctional. In which case I ought to consider, of course after first checking with my management, to step back as a maintainer. I'd very much regret doing so, but if it's in the interest of the community ... (As an aside, likely being among those doing the largest part of code reviews, helping with that part of the overall workload the project generates would reduce the number of reviews I'd have to do, and hence the chances of me giving comments viewed as unhelpful or worse by submitters. Or, to put it in different, frank, but hopefully not offending words - I'd like to see you do a fair amount of code review, including looking after merely cosmetic aspects in the spirit of our written and unwritten rules, before you actually comment on me going too far with some of my feedback. And without me wanting to put too much emphasis on this: It is my opinion that maintainer views generally have somewhat higher weight than non-maintainer ones. I'm not going to claim though there aren't cases where I might go too far and hence abuse rather than use this, but as per above I can only try to avoid doing so, I can't promise to succeed. And of course I, like others, can be convinced to be wrong.) > The -MP makefile patch I posted yesterday... I almost didn't bother. > And when I allowed myself to be talked into it, I was entirely > unsurprised when a review came in basically asking me to prove a > negative before the patch could proceed. So as far as I can tell, it'll > fall by the wayside and the build will remain broken any time anyone > removes or renames a header file. Because life's too short to invest > the energy to make improvements like that. So are you saying that as a maintainer I should let go uncommented a change which I'm unconvinced doesn't have negative side effects, besides its positive intended behavioral change? The more that here the workaround is rather trivial? As you may imagine, I've run into the situation myself a number of times, without considering this a reason to make any adjustments to the build machinery. >>>>>> --- a/xen/include/asm-x86/mm.h >>>>>> +++ b/xen/include/asm-x86/mm.h >>>>>> @@ -67,18 +67,27 @@ >>>>>> /* 3-bit PAT/PCD/PWT cache-attribute hint. */ >>>>>> #define PGC_cacheattr_base PG_shift(6) >>>>>> #define PGC_cacheattr_mask PG_mask(7, 6) >>>>>> - /* Page is broken? */ >>>>>> -#define _PGC_broken PG_shift(7) >>>>>> -#define PGC_broken PG_mask(1, 7) >>>>>> - /* Mutually-exclusive page states: { inuse, offlining, offlined, >>>>>> free }. */ >>>>>> -#define PGC_state PG_mask(3, 9) >>>>>> -#define PGC_state_inuse PG_mask(0, 9) >>>>>> -#define PGC_state_offlining PG_mask(1, 9) >>>>>> -#define PGC_state_offlined PG_mask(2, 9) >>>>>> -#define PGC_state_free PG_mask(3, 9) >>>>>> -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == >>>>>> PGC_state_##st) >>>>>> - >>>>>> - /* Count of references to this frame. */ >>>>>> + /* >>>>>> + * Mutually-exclusive page states: >>>>>> + * { inuse, offlining, offlined, free, broken_offlining, broken } >>>>>> + */ >>>>>> +#define PGC_state PG_mask(7, 9) >>>>>> +#define PGC_state_inuse PG_mask(0, 9) >>>>>> +#define PGC_state_offlining PG_mask(1, 9) >>>>>> +#define PGC_state_offlined PG_mask(2, 9) >>>>>> +#define PGC_state_free PG_mask(3, 9) >>>>>> +#define PGC_state_broken_offlining PG_mask(4, 9) >>>>> >>>>> TBH I'd prefer PGC_state_offlining_broken, as it's not the >>>>> offlining which is broken, but a broken page is being >>>>> offlined. >>>> >>>> It is the page which is both broken and offlining. >>>> Or indeed it is the page which is both offlining and broken. >>> >>> I.e. you agree with flipping the two parts around? > > I hope I have respectfully made it clear that no, I'm really not happy > with the very concept of such a request. > > Perhaps it would be easier for me to acquiesce, in the short term. > > But on the whole I think it's better to put my foot down and say 'no', > and focus on real work and things that matter. Well, in the specific case here I've meanwhile realized that my alternative naming suggested in in no way less ambiguous. So stick to what you've chosen, albeit I continue to dislike the name ambiguously also suggesting that the offlining operation might be broken (e.g. as in "can't be offlined"), rather than the page itself. I'm not going to exclude though that this is just because of not being a native English speaker. >>>>>> +#define page_is_offlining(pg) (page_state_is((pg), >>>>>> broken_offlining) || \ >>>>>> + page_state_is((pg), offlining)) >>>>> >>>>> Overall I wonder whether the PGC_state_* ordering couldn't be >>>>> adjusted such that at least some of these three won't need >>>>> two comparisons (by masking off a bit before comparing). >>>> >>>> The whole point in this exercise is that there isn't a whole bit for >>>> these; they are each *two* states out of the possible 8. >>> >>> Sure. But just consider the more general case: Instead of writing >>> >>> if ( i == 6 || i == 7 ) >>> >>> you can as well write >>> >>> if ( (i | 1) == 7 ) >> >> I stumbled accross a few of those recently and this is not the obvious >> things to read. Yes, your code may be faster. But is it really worth it >> compare to the cost of readability and futureproofness? > > No. Just no. > > If that kind of change is really a worthwhile win, it'll depend on the > CPU. File a GCC PR with a test case as a missed optimisation. Your experience may be different, but I hardly ever see any effect from reporting bugs (not just against gcc) unless they're really bad or really easy to address. That's why I generally prefer to fix such issues myself, provided of course that I can find the time. > Don't make the source code gratuitously harder to read. That's a very subjective aspect again. Personally I find two comparisons of the same variable against different constants harder to read. > Honestly, this, right here, is the *epitome* of why I, and others, > sometimes feel that submitting a patch to Xen can be more effort than > it's worth. Note how I said "I wonder", not "please make". Jan
On Thu, 2020-03-19 at 09:49 +0100, Jan Beulich wrote: > On 18.03.2020 18:13, David Woodhouse wrote: > > On Wed, 2020-03-18 at 12:31 +0000, Julien Grall wrote: > > > On 18/03/2020 09:56, Jan Beulich wrote: > > > > On 17.03.2020 22:52, David Woodhouse wrote: > > > > > On Thu, 2020-02-20 at 12:10 +0100, Jan Beulich wrote: > > > > > > > @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, > > > > > > > uint32_t *status) > > > > > > > do { > > > > > > > ret = *status = 0; > > > > > > > - if ( y & PGC_broken ) > > > > > > > + if ( (y & PGC_state) == PGC_state_broken || > > > > > > > + (y & PGC_state) == PGC_state_broken_offlining ) > > > > > > > { > > > > > > > ret = -EINVAL; > > > > > > > *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; > > > > > > > break; > > > > > > > } > > > > > > > - > > > > > > > - if ( (y & PGC_state) == PGC_state_offlined ) > > > > > > > + else if ( (y & PGC_state) == PGC_state_offlined ) > > > > > > > > > > > > I don't see a need for adding "else" here. > > > > > > > > > > They are mutually exclusive cases. It makes things a whole lot clearer > > > > > to the reader to put the 'else' there, and sometimes helps a naïve > > > > > compiler along the way too. > > > > > > > > Well, I'm afraid I'm going to be pretty strict about this: It's again > > > > a matter of taste, yes, but we generally try to avoid pointless else. > > > > What you consider "a whole lot clearer to the reader" is the opposite > > > > from my pov. > > > > > > While I agree the 'else' may be pointless, I don't think it is worth an > > > argument. As the author of the patch, it is his choice to write the code > > > like that. > > > > Indeed. While I appreciate your insight, Jan, and your detailed reviews > > are undoubtedly helpful — especially to me as I poke around the Xen > > code base without knowing where the bodies are buried — I do sometimes > > find that it degenerates into what appears to be gratuitous > > bikeshedding. > > > > Like *some* others, I'm perfectly capable of responding "I understand > > you would have done it differently, but I prefer it this way". > > > > But even for those like me who have the self-confidence (or arrogance?) > > to respond in such a way, the end result is often the same — a patch > > series which the maintainer doesn't apply because it has "unresolved > > issues". > > > > Perfect is the enemy of good. Especially when perfection is so > > subjective. > > > > This definitely isn't the kind of welcoming community that I enjoy > > trying to get my junior engineers to contribute to. And they aren't > > snowflakes; they cope with the Linux community just fine, for the most > > part. > > I appreciate your open an honest feedback, and having had similar > comments in the past I can assure you that I've already tried to > adjust where I find this acceptable. I take it you realize that > there are two limitations in this - trying doesn't mean succeeding, > and the boundaries of what I'd consider acceptable to let go with > no comments. > > Of course there are always two sides of the medal. > > As a maintainer of some piece of code, I view it as my > responsibility to look after not only the technical correctness of > that code, but also after its style (in the broadest sense of the > word). Looking at some very bad examples in our tree, many of > which I'm afraid have a Linux origin, I'm in particular of the > opinion that consistent style is a significant aid in code > readability and maintainability. And I hope you agree that _style_ > adjustments are pretty easy to make, so I don't view asking for > such as placing a significant burden on the submitter. The > alternative of letting it go uncommented and then take the time > myself to clean up seems quite a bit worse to me, not the least > because of this scaling even less well than the amount of code > review that needs doing. Yes, 100% agreed. And I'll even concede that for the cases of moving code around that happens to not conform to the current style, and asking contributors to fix it up as they go. I was agreeing with you on that point, while simultaneously telling Julien "nah, I'll fix it while I'm here" when he suggested that I *not* realign the PGC_state bit definitions. > The mentioned Linux origin of some of the particularly bad > examples in our tree is why I view your "they cope with the Linux > community just fine" as not really applicable. This is despite > our subsequent changes to those files often having made the > situation worse rather than better. Was more about the community effect than technical matters, but let's not rathole on that. > To some degree the same goes for bigger than necessary code churn, > albeit I agree that in a number of cases it is far less objective > to judge than the aim for consistent style. Extra code churn > instead is often making review harder, irrespective of the often > good intentions behind doing so. Completely agreed. > > There is a lot of value in your reviews, and they are appreciated. But > > the overall effect is seen by some as making the Xen community somewhat > > dysfunctional. > > In which case I ought to consider, of course after first checking > with my management, to step back as a maintainer. I'd very much > regret doing so, but if it's in the interest of the community ... I definitely don't think that would be in the interest of the community. As I think I may have mentioned once or twice in my previous message, your detailed reviews are massively appreciated and useful. > (As an aside, likely being among those doing the largest part of > code reviews, helping with that part of the overall workload the > project generates would reduce the number of reviews I'd have to > do, and hence the chances of me giving comments viewed as > unhelpful or worse by submitters. Or, to put it in different, > frank, but hopefully not offending words - I'd like to see you do > a fair amount of code review, including looking after merely > cosmetic aspects in the spirit of our written and unwritten rules, > before you actually comment on me going too far with some of my > feedback. And without me wanting to put too much emphasis on this: > It is my opinion that maintainer views generally have somewhat > higher weight than non-maintainer ones. I'm not going to claim > though there aren't cases where I might go too far and hence abuse > rather than use this, but as per above I can only try to avoid > doing so, I can't promise to succeed. And of course I, like others, > can be convinced to be wrong.) Understood. > > The -MP makefile patch I posted yesterday... I almost didn't bother. > > And when I allowed myself to be talked into it, I was entirely > > unsurprised when a review came in basically asking me to prove a > > negative before the patch could proceed. So as far as I can tell, it'll > > fall by the wayside and the build will remain broken any time anyone > > removes or renames a header file. Because life's too short to invest > > the energy to make improvements like that. > > So are you saying that as a maintainer I should let go uncommented a > change which I'm unconvinced doesn't have negative side effects, > besides its positive intended behavioral change? The more that here > the workaround is rather trivial? As you may imagine, I've run into > the situation myself a number of times, without considering this a > reason to make any adjustments to the build machinery. Jan, I would respectfully request that you take another look at your initial response, but put yourself in the shoes of a patch submitter: https://lists.xenproject.org/archives/html/xen-devel/2020-03/msg01171.html You mention a "simple" workaround... but the workaround I've been using is to manually remove the offending .o.d files, one at a time (or at least one directory at a time), until the broken build starts working again. Is that what you meant? And you really didn't ever consider that it should be fixed? And the substance of the response is basically saying "this is voodoo and we can't touch it or unspecified things might break, but I have no idea where to tell you to look." Looking back I realise that the concern about phony rules overriding pattern rules didn't even come from you; your concern was more nebulous and unaddressable. It looks like I came up with a straw man and shot *that* down in my later analysis (although that wasn't my intent; I think the concern about pattern rules really did come from somewhere). You asked a question about "why isn't this default behaviour", which is kind of a silly question when asking about an option (-MP) that was added to GCC almost a decade after the initial -MD behaviour was established. Of *course* they didn't retroactively change the default. Read that message again from the point of view of a contributor. Pretend it isn't even me; pretend it's someone attempting to make their first, trivial, improvement to the Xen ecosystem. I hope you'll understand why my initial reaction was just a monosyllabic 'no'. > > > > > > > +#define PGC_state_broken_offlining PG_mask(4, 9) > > > > > > > > > > > > TBH I'd prefer PGC_state_offlining_broken, as it's not the > > > > > > offlining which is broken, but a broken page is being > > > > > > offlined. > > > > > > > > > > It is the page which is both broken and offlining. > > > > > Or indeed it is the page which is both offlining and broken. > > > > > > > > I.e. you agree with flipping the two parts around? > > > > I hope I have respectfully made it clear that no, I'm really not happy > > with the very concept of such a request. > > > > Perhaps it would be easier for me to acquiesce, in the short term. > > > > But on the whole I think it's better to put my foot down and say 'no', > > and focus on real work and things that matter. > > Well, in the specific case here I've meanwhile realized that my > alternative naming suggested in in no way less ambiguous. So > stick to what you've chosen, albeit I continue to dislike the > name ambiguously also suggesting that the offlining operation > might be broken (e.g. as in "can't be offlined"), rather than the > page itself. I'm not going to exclude though that this is just > because of not being a native English speaker. As a native English speaker, the naming of these bothered me too. They're too long and redundant. But subsuming the PGC_broken but into a 3-bit PGC_state makes sense, and we can't abandon that idea purely because we can't come up with a *name* that fills us with joy. There wasn't a *good* answer. I vacillated for a while, and picked the one that offended me least. And then ended up in a debate about it when it really wasn't important. > > > > > > > +#define page_is_offlining(pg) (page_state_is((pg), > > > > > > > broken_offlining) || \ > > > > > > > + page_state_is((pg), offlining)) > > > > > > > > > > > > Overall I wonder whether the PGC_state_* ordering couldn't be > > > > > > adjusted such that at least some of these three won't need > > > > > > two comparisons (by masking off a bit before comparing). > > > > > > > > > > The whole point in this exercise is that there isn't a whole bit for > > > > > these; they are each *two* states out of the possible 8. > > > > > > > > Sure. But just consider the more general case: Instead of writing > > > > > > > > if ( i == 6 || i == 7 ) > > > > > > > > you can as well write > > > > > > > > if ( (i | 1) == 7 ) > > > > > > I stumbled accross a few of those recently and this is not the obvious > > > things to read. Yes, your code may be faster. But is it really worth it > > > compare to the cost of readability and futureproofness? > > > > No. Just no. > > > > If that kind of change is really a worthwhile win, it'll depend on the > > CPU. File a GCC PR with a test case as a missed optimisation. > > Your experience may be different, but I hardly ever see any effect from > reporting bugs (not just against gcc) unless they're really bad or really > easy to address. That's why I generally prefer to fix such issues myself, > provided of course that I can find the time. Perhaps so. But if I *don't* file it, it *certainly* doesn't get fixed. And I've learned over the years *not* to second-guess the optimisations that today's compiler might make, with the wind blowing in this particular direction. FWIW 'return (x == 6 || x == 7)' ends up being emitted by GCC on x86 as subl $6, %edi xorl %eax, %eax cmpl $1, %edi setbe %al ret And 'return (x == 5 || x == 7)' gives: andl $-3, %edi xorl %eax, %eax cmpl $5, %edi sete %al ret So it does look like GCC is actually doing its job, on this occasion. But that's entirely beside the point, which is that I'm having some pointless discussion about compiler optimisation minutiæ when fixing PGC_broken was *already* deep into yak-shaving for the improvement I was *actually* trying to make. It's distracting from real work, raising barriers to getting fixes merged. > > Don't make the source code gratuitously harder to read. > > That's a very subjective aspect again. Personally I find two comparisons > of the same variable against different constants harder to read. > > > Honestly, this, right here, is the *epitome* of why I, and others, > > sometimes feel that submitting a patch to Xen can be more effort than > > it's worth. > > Note how I said "I wonder", not "please make". Perspective again. That distinction really doesn't matter. Perhaps you underestimate the weight your words carry, as a well-respected maintainer. You can't negate that effect purely by word tricks like saying 'I wonder'. Because understatement is a very common tool in the English language, especially in British English — and we've all seen people write "I wonder if you should..." when what was really meant was "I will set fire to you if you don't...". Understatement like that doesn't work. It still derails the patch review. It just didn't need to be said, in that context. Let me repeat — because I've only said it once today, I think, that your reviews are incredibly useful. I'm only asking that you recognise the weight that your 'wondering' can have, and recognise when something you are asking for is *subjective*. A review is not about "is this code precisely how it would look if I had written it myself", but it is about "is this code correct and maintainable". Sometimes, as in the example with the PGC_state_ naming above, there isn't a "nice" answer. We pick the solution that offends us least. And I completely understand as a maintainer, what it's like to be on the receiving end of such a choice. You think "that could be nicer"... and have to work through the alternatives yourself before you realise that actually, it was the best of the choices available. Each of the responses I've identified from you as 'excessive' has some merit, we can focus on each of them and you can justify them, to a certain extent. But as a whole, the effect is of a barrage of nitpicks of questionable utility which really does hinder forward progress. Let's try to focus on comments which will genuinely make the code better. It's not that we should deliberately stop paying attention to detail, or deliberately allow buggy and broken code into the tree. It's that we should be aware that "perfect is the enemy of good". For my part, I'll stop whining at you now. If I end up giving responses to parts of your code review which seem to be along the lines of "that's nice, dear, but I didn't think so and I did the typing", please hark back to this conversation. I'll try to phrase them more appropriately than that, but no promises :) Thanks.
On 19.03.2020 11:26, David Woodhouse wrote: > On Thu, 2020-03-19 at 09:49 +0100, Jan Beulich wrote: >> On 18.03.2020 18:13, David Woodhouse wrote: >>> The -MP makefile patch I posted yesterday... I almost didn't bother. >>> And when I allowed myself to be talked into it, I was entirely >>> unsurprised when a review came in basically asking me to prove a >>> negative before the patch could proceed. So as far as I can tell, it'll >>> fall by the wayside and the build will remain broken any time anyone >>> removes or renames a header file. Because life's too short to invest >>> the energy to make improvements like that. >> >> So are you saying that as a maintainer I should let go uncommented a >> change which I'm unconvinced doesn't have negative side effects, >> besides its positive intended behavioral change? The more that here >> the workaround is rather trivial? As you may imagine, I've run into >> the situation myself a number of times, without considering this a >> reason to make any adjustments to the build machinery. > > Jan, I would respectfully request that you take another look at your > initial response, but put yourself in the shoes of a patch submitter: > https://lists.xenproject.org/archives/html/xen-devel/2020-03/msg01171.html > > You mention a "simple" workaround... but the workaround I've been using > is to manually remove the offending .o.d files, one at a time (or at > least one directory at a time), until the broken build starts working > again. Is that what you meant? And you really didn't ever consider that > it should be fixed? No, the workaround I've been using (after initially doing the expensive one you describe) was to simply put in an empty file (or perhaps one with an #error directive) in the place of the deleted one, rebuild, and all .*.o.d files would have been updated. I might do so even before fully deleting the file. > And the substance of the response is basically saying "this is voodoo > and we can't touch it or unspecified things might break, but I have no > idea where to tell you to look." > > Looking back I realise that the concern about phony rules overriding > pattern rules didn't even come from you; your concern was more nebulous > and unaddressable. It looks like I came up with a straw man and shot > *that* down in my later analysis (although that wasn't my intent; I > think the concern about pattern rules really did come from somewhere). > > You asked a question about "why isn't this default behaviour", which is > kind of a silly question when asking about an option (-MP) that was > added to GCC almost a decade after the initial -MD behaviour was > established. Of *course* they didn't retroactively change the default. I don't see at all why this would be "of course" - if there really was no undue side effect, why couldn't they? > Read that message again from the point of view of a contributor. > Pretend it isn't even me; pretend it's someone attempting to make their > first, trivial, improvement to the Xen ecosystem. > > I hope you'll understand why my initial reaction was just a > monosyllabic 'no'. To be honest- no, I don't. I didn't say "no way". Instead I asked back to see whether there's more background to this. It is a useful piece of information to know that -MP post-dates -MD by 10 or more years. It's still speculation of why a new option was added rather than making this default behavior, but I feel less afraid of the change this way than by an implied "this not going to do any harm" without really being certain why there is a separate option in the first place (and gcc doc also not saying anything to this effect). I can certainly follow your sentiment, not the least because especially in my early days I also frequently got back replies I didn't like, in various projects. Yet in a case like this one I'm afraid it is not the reviewer's job to point out the unsafety of a change, but it's the submitter who has to (sufficiently) prove that a change won't break anything. Yes, in the typical case, when there's a recognizable bug, the reviewer would point this out. But there are cases where there's no obvious bug, but experience (and, as so often, insufficient documentation) tells one to be wary of changes of, possibly, any kind. Jan
On Thu, 2020-03-19 at 12:59 +0100, Jan Beulich wrote: > > Read that message again from the point of view of a contributor. > > Pretend it isn't even me; pretend it's someone attempting to make > > their first, trivial, improvement to the Xen ecosystem. > > > > I hope you'll understand why my initial reaction was just a > > monosyllabic 'no'. > > To be honest- no, I don't. I didn't say "no way". Then you have completely missed my point about how subtly understating your 'objections' makes no difference at all to the outcome. But OK, I'll come back to that at the end. You have made your intent clear, more than once now, and we should take it on board. > Instead I asked back to see whether there's more background to this. > It is a useful piece of information to know that -MP post-dates -MD > by 10 or more years. It's still speculation of why a new option was > added rather than making this default behavior, but I feel less > afraid of the change this way than by an implied "this not going to > do any harm" without really being certain why there is a separate > option in the first place (and gcc doc also not saying anything to > this effect). It is not my job to make you feel less afraid of change. > I can certainly follow your sentiment, not the least because > especially in my early days I also frequently got back replies I > didn't like, in various projects. Yet in a case like this one I'm > afraid it is not the reviewer's job to point out the unsafety of > a change, but it's the submitter who has to (sufficiently) prove > that a change won't break anything. I'm sure you didn't mean it as such, Jan, but FYI that response could be construed as being fairly patronising. If you were to direct it towards someone who's even remotely inclined to feeling patronised, that is. :) > Yes, in the typical case, when there's a recognizable bug, the > reviewer would point this out. But there are cases where there's no > obvious bug, but experience (and, as so often, insufficient > documentation) tells one to be wary of changes of, possibly, any > kind. I find this response to be purely obstructive and unhelpful. Your response to my patch was basically asking me to prove a negative, and I find myself surprised and disappointed that you cannot acknowledge that. I didn't think our viewpoints were really that far apart; perhaps I was wrong. If there was an actual bug — or even the suspicion of a bug — I could understand it. But this is just voodoo "we're too scared to change things because we don't understand". We are better than that. You can be better than that. But I will take on board your comments about understatement and the fact that you hadn't actually said "no". In future I shall consider merely ignoring such interjections unless you explicitly state that you are blocking the acceptance of a patch. Or, I suppose, resorting to the style of monosyllabic answer that I had originally given in this case. I trust that maintainers will take that on board too, and that open "questions" from you in a thread will not be considered sufficient reason not to merge a patch. That seems to be what you're saying is your intent, yes?
On 19.03.2020 14:54, David Woodhouse wrote: > On Thu, 2020-03-19 at 12:59 +0100, Jan Beulich wrote: >>> Read that message again from the point of view of a contributor. >>> Pretend it isn't even me; pretend it's someone attempting to make >>> their first, trivial, improvement to the Xen ecosystem. >>> >>> I hope you'll understand why my initial reaction was just a >>> monosyllabic 'no'. >> >> To be honest- no, I don't. I didn't say "no way". > > Then you have completely missed my point about how subtly understating > your 'objections' makes no difference at all to the outcome. > > But OK, I'll come back to that at the end. You have made your intent > clear, more than once now, and we should take it on board. > >> Instead I asked back to see whether there's more background to this. >> It is a useful piece of information to know that -MP post-dates -MD >> by 10 or more years. It's still speculation of why a new option was >> added rather than making this default behavior, but I feel less >> afraid of the change this way than by an implied "this not going to >> do any harm" without really being certain why there is a separate >> option in the first place (and gcc doc also not saying anything to >> this effect). > > It is not my job to make you feel less afraid of change. > >> I can certainly follow your sentiment, not the least because >> especially in my early days I also frequently got back replies I >> didn't like, in various projects. Yet in a case like this one I'm >> afraid it is not the reviewer's job to point out the unsafety of >> a change, but it's the submitter who has to (sufficiently) prove >> that a change won't break anything. > > I'm sure you didn't mean it as such, Jan, but FYI that response could > be construed as being fairly patronising. If you were to direct it > towards someone who's even remotely inclined to feeling patronised, > that is. :) I certainly didn't mean to, I apologize. (My dictionary gives me several very different meanings of "patronize", so I'm somewhat guessing which meaning you infer here.) >> Yes, in the typical case, when there's a recognizable bug, the >> reviewer would point this out. But there are cases where there's no >> obvious bug, but experience (and, as so often, insufficient >> documentation) tells one to be wary of changes of, possibly, any >> kind. > > I find this response to be purely obstructive and unhelpful. I'm sorry if it feels like this to you. > Your > response to my patch was basically asking me to prove a negative, and I > find myself surprised and disappointed that you cannot acknowledge > that. I didn't think our viewpoints were really that far apart; perhaps > I was wrong. I'm certainly willing to acknowledge that I've asked a question that may be difficult if possible at all to answer in a way that we'd be fully certain in the end. Yet even after all of the discussion we've had here I still think the question was appropriate to ask. It continues to be unobvious to me that non- default behavior of a tool would imply using this behavior is going to be free of side effects. The historical aspect you've dug out afterwards is at least a partial explanation which, seeing that you've got an unconditional and a conditional ack, is good enough for me to let the change go in, despite still not being finally convinced of it being free of side effects. > If there was an actual bug — or even the suspicion of a bug — I could > understand it. But this is just voodoo "we're too scared to change > things because we don't understand". Not just this, but also because things had been broken in subtle ways in the past. Until we get a better one, we have to live with the build system being fragile here and there. > We are better than that. You can be better than that. > > But I will take on board your comments about understatement and the > fact that you hadn't actually said "no". In future I shall consider > merely ignoring such interjections unless you explicitly state that you > are blocking the acceptance of a patch. Or, I suppose, resorting to the > style of monosyllabic answer that I had originally given in this case. > > I trust that maintainers will take that on board too, and that open > "questions" from you in a thread will not be considered sufficient > reason not to merge a patch. > > That seems to be what you're saying is your intent, yes? My intent was to get clarification before the patch would go in. I didn't mean to block it, but I also didn't see it go in without such clarification. I'm struggling to see what's bad in asking whether you/we are certain enough that a change won't have bad side effects; if there were, we might treat an easy to work around situation by one hard to recognize and address. Seeing you reply just "no" seemed a fair answer to me (while I sensed a certain level of annoyance), albeit not one that would resolve the question. In anticipation I did include anyone else who might know right away. Had I known the answer myself, I of course wouldn't have asked. Bottom line - when I say "no", I mean "no". When I ask a question I expect it to be resolved, at least to a reasonable degree. When I say "I wonder" I indeed mean just that; to me "may I suggest to consider as an alternative" is simply more words, which may again be an effect of English not being my native language. And when I say "ack", I mean "ack". (I also didn't think I made any comments about understatement; it was you who brought up that [cultural] aspect.) I'm afraid as a result of this discussion I'm now more confused as to finding common grounds than I was before. Jan
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c index 4fa9c91140..17a318e16d 100644 --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -415,7 +415,7 @@ long arch_do_domctl( if ( page->u.inuse.type_info & PGT_pinned ) type |= XEN_DOMCTL_PFINFO_LPINTAB; - if ( page->count_info & PGC_broken ) + if ( page_is_broken(page) ) type = XEN_DOMCTL_PFINFO_BROKEN; } diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 97902d42c1..4084503554 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -1093,7 +1093,7 @@ static int reserve_offlined_page(struct page_info *head) struct page_info *pg; int next_order; - if ( page_state_is(cur_head, offlined) ) + if ( page_is_offlined(cur_head) ) { cur_head++; if ( first_dirty != INVALID_DIRTY_IDX && first_dirty ) @@ -1113,7 +1113,7 @@ static int reserve_offlined_page(struct page_info *head) for ( i = (1 << cur_order), pg = cur_head + (1 << cur_order ); i < (1 << next_order); i++, pg++ ) - if ( page_state_is(pg, offlined) ) + if ( page_is_offlined(pg) ) break; if ( i == ( 1 << next_order) ) { @@ -1145,16 +1145,19 @@ static int reserve_offlined_page(struct page_info *head) for ( cur_head = head; cur_head < head + ( 1UL << head_order); cur_head++ ) { - if ( !page_state_is(cur_head, offlined) ) + struct page_list_head *list; + if ( page_state_is(cur_head, offlined) ) + list = &page_offlined_list; + else if (page_state_is(cur_head, broken) ) + list = &page_broken_list; + else continue; avail[node][zone]--; total_avail_pages--; ASSERT(total_avail_pages >= 0); - page_list_add_tail(cur_head, - test_bit(_PGC_broken, &cur_head->count_info) ? - &page_broken_list : &page_offlined_list); + page_list_add_tail(cur_head, list); count++; } @@ -1404,13 +1407,16 @@ static void free_heap_pages( switch ( pg[i].count_info & PGC_state ) { case PGC_state_inuse: - BUG_ON(pg[i].count_info & PGC_broken); pg[i].count_info = PGC_state_free; break; case PGC_state_offlining: - pg[i].count_info = (pg[i].count_info & PGC_broken) | - PGC_state_offlined; + pg[i].count_info = PGC_state_offlined; + tainted = 1; + break; + + case PGC_state_broken_offlining: + pg[i].count_info = PGC_state_broken; tainted = 1; break; @@ -1527,16 +1533,25 @@ static unsigned long mark_page_offline(struct page_info *pg, int broken) do { nx = x = y; - if ( ((x & PGC_state) != PGC_state_offlined) && - ((x & PGC_state) != PGC_state_offlining) ) + nx &= ~PGC_state; + + switch ( x & PGC_state ) { - nx &= ~PGC_state; - nx |= (((x & PGC_state) == PGC_state_free) - ? PGC_state_offlined : PGC_state_offlining); - } + case PGC_state_inuse: + case PGC_state_offlining: + nx |= broken ? PGC_state_offlining : PGC_state_broken_offlining; + break; + + case PGC_state_free: + nx |= broken ? PGC_state_broken : PGC_state_offlined; - if ( broken ) - nx |= PGC_broken; + case PGC_state_broken_offlining: + nx |= PGC_state_broken_offlining; + + case PGC_state_offlined: + case PGC_state_broken: + nx |= PGC_state_broken; + } if ( x == nx ) break; @@ -1609,7 +1624,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status) * need to prevent malicious guest access the broken page again. * Under such case, hypervisor shutdown guest, preventing recursive mce. */ - if ( (pg->count_info & PGC_broken) && (owner = page_get_owner(pg)) ) + if ( page_is_broken(pg) && (owner = page_get_owner(pg)) ) { *status = PG_OFFLINE_AGAIN; domain_crash(owner); @@ -1620,7 +1635,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status) old_info = mark_page_offline(pg, broken); - if ( page_state_is(pg, offlined) ) + if ( page_is_offlined(pg) ) { reserve_heap_page(pg); @@ -1699,14 +1714,14 @@ unsigned int online_page(mfn_t mfn, uint32_t *status) do { ret = *status = 0; - if ( y & PGC_broken ) + if ( (y & PGC_state) == PGC_state_broken || + (y & PGC_state) == PGC_state_broken_offlining ) { ret = -EINVAL; *status = PG_ONLINE_FAILED |PG_ONLINE_BROKEN; break; } - - if ( (y & PGC_state) == PGC_state_offlined ) + else if ( (y & PGC_state) == PGC_state_offlined ) { page_list_del(pg, &page_offlined_list); *status = PG_ONLINE_ONLINED; @@ -1747,11 +1762,11 @@ int query_page_offline(mfn_t mfn, uint32_t *status) pg = mfn_to_page(mfn); - if ( page_state_is(pg, offlining) ) + if ( page_is_offlining(pg) ) *status |= PG_OFFLINE_STATUS_OFFLINE_PENDING; - if ( pg->count_info & PGC_broken ) + if ( page_is_broken(pg) ) *status |= PG_OFFLINE_STATUS_BROKEN; - if ( page_state_is(pg, offlined) ) + if ( page_is_offlined(pg) ) *status |= PG_OFFLINE_STATUS_OFFLINED; spin_unlock(&heap_lock); @@ -2483,7 +2498,7 @@ __initcall(pagealloc_keyhandler_init); void scrub_one_page(struct page_info *pg) { - if ( unlikely(pg->count_info & PGC_broken) ) + if ( unlikely(page_is_broken(pg)) ) return; #ifndef NDEBUG diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h index 333efd3a60..c9466c8ca0 100644 --- a/xen/include/asm-arm/mm.h +++ b/xen/include/asm-arm/mm.h @@ -112,13 +112,25 @@ struct page_info /* Page is broken? */ #define _PGC_broken PG_shift(7) #define PGC_broken PG_mask(1, 7) - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */ -#define PGC_state PG_mask(3, 9) -#define PGC_state_inuse PG_mask(0, 9) -#define PGC_state_offlining PG_mask(1, 9) -#define PGC_state_offlined PG_mask(2, 9) -#define PGC_state_free PG_mask(3, 9) -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) + /* + * Mutually-exclusive page states: + * { inuse, offlining, offlined, free, broken_offlining, broken } + */ +#define PGC_state PG_mask(7, 9) +#define PGC_state_inuse PG_mask(0, 9) +#define PGC_state_offlining PG_mask(1, 9) +#define PGC_state_offlined PG_mask(2, 9) +#define PGC_state_free PG_mask(3, 9) +#define PGC_state_broken_offlining PG_mask(4, 9) +#define PGC_state_broken PG_mask(5, 9) + +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) +#define page_is_broken(pg) (page_state_is((pg), broken_offlining) || \ + page_state_is((pg), broken)) +#define page_is_offlined(pg) (page_state_is((pg), broken) || \ + page_state_is((pg), offlined)) +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ + page_state_is((pg), offlining)) /* Count of references to this frame. */ #define PGC_count_width PG_shift(9) diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h index 2ca8882ad0..3edadf7a7c 100644 --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -67,18 +67,27 @@ /* 3-bit PAT/PCD/PWT cache-attribute hint. */ #define PGC_cacheattr_base PG_shift(6) #define PGC_cacheattr_mask PG_mask(7, 6) - /* Page is broken? */ -#define _PGC_broken PG_shift(7) -#define PGC_broken PG_mask(1, 7) - /* Mutually-exclusive page states: { inuse, offlining, offlined, free }. */ -#define PGC_state PG_mask(3, 9) -#define PGC_state_inuse PG_mask(0, 9) -#define PGC_state_offlining PG_mask(1, 9) -#define PGC_state_offlined PG_mask(2, 9) -#define PGC_state_free PG_mask(3, 9) -#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) - - /* Count of references to this frame. */ + /* + * Mutually-exclusive page states: + * { inuse, offlining, offlined, free, broken_offlining, broken } + */ +#define PGC_state PG_mask(7, 9) +#define PGC_state_inuse PG_mask(0, 9) +#define PGC_state_offlining PG_mask(1, 9) +#define PGC_state_offlined PG_mask(2, 9) +#define PGC_state_free PG_mask(3, 9) +#define PGC_state_broken_offlining PG_mask(4, 9) +#define PGC_state_broken PG_mask(5, 9) + +#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st) +#define page_is_broken(pg) (page_state_is((pg), broken_offlining) || \ + page_state_is((pg), broken)) +#define page_is_offlined(pg) (page_state_is((pg), broken) || \ + page_state_is((pg), offlined)) +#define page_is_offlining(pg) (page_state_is((pg), broken_offlining) || \ + page_state_is((pg), offlining)) + +/* Count of references to this frame. */ #define PGC_count_width PG_shift(9) #define PGC_count_mask ((1UL<<PGC_count_width)-1)