Message ID | alpine.DEB.2.22.394.2111081430090.3317@ubuntu-linux-20-04-desktop (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ASSERT in rangeset_remove_range | expand |
On Tue, Nov 9, 2021 at 12:45 AM Stefano Stabellini <sstabellini@kernel.org> wrote: > Hi Oleksandr, Julien, > Hi Stefano, Julien. [Sorry for the possible format issues] > > I discovered a bug caused by the recent changes to introduce extended > regions in make_hypervisor_node (more logs appended): > > > (XEN) d1 BANK[0] 0x00000040000000-0x0000007e800000 (1000MB) > (XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB) > (XEN) DEBUG find_unallocated_memory 994 s=40000000 e=7e7fffff > (XEN) DEBUG find_unallocated_memory 994 s=200000000 e=1ffffffff > (XEN) Assertion 's <= e' failed at rangeset.c:189 > > > When a bank of memory is zero in size, then rangeset_remove_range is > called with end < start, triggering an ASSERT in rangeset_remove_range. > > One solution is to avoid creating 0 size banks. The following change > does that: > > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c > index 49b4eb2b13..3efe542d0f 100644 > --- a/xen/arch/arm/domain_build.c > +++ b/xen/arch/arm/domain_build.c > @@ -459,9 +459,12 @@ static void __init allocate_memory(struct domain *d, > struct kernel_info *kinfo) > goto fail; > > bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem); > - if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), > + if ( bank_size > 0 ) > + { > + if ( !allocate_bank_memory(d, kinfo, > gaddr_to_gfn(GUEST_RAM1_BASE), > bank_size) ) > - goto fail; > + goto fail; > + } > > if ( kinfo->unassigned_mem ) > goto fail; > > > > We have a couple of other options too: > > - remove the ASSERT in rangeset_remove_range > There is an argument that we should simply return error > fromrangeset_remove_range, rather than a full assert. > > - add a if (end > start) check before calling rangeset_remove_range > We could check that end > start before calling rangeset_remove_range at > all the call sites in domain_build.c. There are 5 call sites at the > moment. > > Any other ideas or suggestions? > Personally I would avoid creating zero-sized banks (your first solution) as I don't see any point in taking them into the account and exposing them to the guest. What I don't understand is how this assert is triggered by upstream Xen (how the make_hypervisor_node() gets called for DomU)? Do you have some changes on top? > > Cheers, > > Stefano > > > > (XEN) DEBUG find_memory_holes 1126 s=ff990000 e=ff990fff > (XEN) DEBUG find_memory_holes 1126 s=ff990000 e=ff990fff > (XEN) DEBUG find_memory_holes 1126 s=ff990000 e=ff990fff > (XEN) DEBUG find_memory_holes 1126 s=ff990000 e=ff990fff > (XEN) DEBUG find_memory_holes 1126 s=f9010000 e=f901ffff > (XEN) DEBUG find_memory_holes 1126 s=f9020000 e=f903ffff > (XEN) DEBUG find_memory_holes 1126 s=f9040000 e=f905ffff > (XEN) DEBUG find_memory_holes 1126 s=f9060000 e=f907ffff > (XEN) DEBUG find_memory_holes 1126 s=fd800000 e=fd81ffff > (XEN) DEBUG find_memory_holes 1126 s=ff060000 e=ff060fff > (XEN) DEBUG find_memory_holes 1126 s=ff070000 e=ff070fff > (XEN) DEBUG find_memory_holes 1126 s=fd6e0000 e=fd6e8fff > (XEN) DEBUG find_memory_holes 1126 s=fd6e9000 e=fd6edfff > (XEN) DEBUG find_memory_holes 1126 s=fd500000 e=fd500fff > (XEN) DEBUG find_memory_holes 1126 s=fd510000 e=fd510fff > (XEN) DEBUG find_memory_holes 1126 s=fd520000 e=fd520fff > (XEN) DEBUG find_memory_holes 1126 s=fd530000 e=fd530fff > (XEN) DEBUG find_memory_holes 1126 s=fd540000 e=fd540fff > (XEN) DEBUG find_memory_holes 1126 s=fd550000 e=fd550fff > (XEN) DEBUG find_memory_holes 1126 s=fd560000 e=fd560fff > (XEN) DEBUG find_memory_holes 1126 s=fd570000 e=fd570fff > (XEN) DEBUG find_memory_holes 1126 s=fd4b0000 e=fd4bffff > (XEN) DEBUG find_memory_holes 1126 s=ffa80000 e=ffa80fff > (XEN) DEBUG find_memory_holes 1126 s=ffa90000 e=ffa90fff > (XEN) DEBUG find_memory_holes 1126 s=ffaa0000 e=ffaa0fff > (XEN) DEBUG find_memory_holes 1126 s=ffab0000 e=ffab0fff > (XEN) DEBUG find_memory_holes 1126 s=ffac0000 e=ffac0fff > (XEN) DEBUG find_memory_holes 1126 s=ffad0000 e=ffad0fff > (XEN) DEBUG find_memory_holes 1126 s=ffae0000 e=ffae0fff > (XEN) DEBUG find_memory_holes 1126 s=ffaf0000 e=ffaf0fff > (XEN) DEBUG find_memory_holes 1126 s=fd070000 e=fd09ffff > (XEN) DEBUG find_memory_holes 1126 s=ff100000 e=ff100fff > (XEN) DEBUG find_memory_holes 1126 s=ff0b0000 e=ff0b0fff > (XEN) DEBUG find_memory_holes 1126 s=ff0c0000 e=ff0c0fff > (XEN) DEBUG find_memory_holes 1126 s=ff0d0000 e=ff0d0fff > (XEN) DEBUG find_memory_holes 1126 s=ff0e0000 e=ff0e0fff > (XEN) DEBUG find_memory_holes 1126 s=ff0a0000 e=ff0a0fff > (XEN) DEBUG find_memory_holes 1126 s=ff020000 e=ff020fff > (XEN) DEBUG find_memory_holes 1126 s=ff030000 e=ff030fff > (XEN) DEBUG find_memory_holes 1126 s=ff960000 e=ff960fff > (XEN) DEBUG find_memory_holes 1126 s=ffa00000 e=ffa0ffff > (XEN) DEBUG find_memory_holes 1126 s=fd0b0000 e=fd0bffff > (XEN) DEBUG find_memory_holes 1126 s=fd490000 e=fd49ffff > (XEN) DEBUG find_memory_holes 1126 s=ffa10000 e=ffa1ffff > (XEN) DEBUG find_memory_holes 1126 s=fd0e0000 e=fd0e0fff > (XEN) DEBUG find_memory_holes 1126 s=fd480000 e=fd480fff > (XEN) DEBUG find_memory_holes 1126 s=8000000000 e=8000ffffff > (XEN) DEBUG handle_pci_range 1056 s=e0000000 e=efffffff > (XEN) DEBUG handle_pci_range 1056 s=600000000 e=7ffffffff > (XEN) DEBUG find_memory_holes 1126 s=ff0f0000 e=ff0f0fff > (XEN) DEBUG find_memory_holes 1126 s=c0000000 e=c7ffffff > (XEN) DEBUG find_memory_holes 1126 s=ffa60000 e=ffa60fff > (XEN) DEBUG find_memory_holes 1126 s=fd400000 e=fd43ffff > (XEN) DEBUG find_memory_holes 1126 s=fd3d0000 e=fd3d0fff > (XEN) DEBUG find_memory_holes 1126 s=fd0c0000 e=fd0c1fff > (XEN) DEBUG find_memory_holes 1126 s=ff160000 e=ff160fff > (XEN) DEBUG find_memory_holes 1126 s=ff170000 e=ff170fff > (XEN) DEBUG find_memory_holes 1126 s=ff040000 e=ff040fff > (XEN) DEBUG find_memory_holes 1126 s=ff050000 e=ff050fff > (XEN) DEBUG find_memory_holes 1126 s=ff110000 e=ff110fff > (XEN) DEBUG find_memory_holes 1126 s=ff120000 e=ff120fff > (XEN) DEBUG find_memory_holes 1126 s=ff130000 e=ff130fff > (XEN) DEBUG find_memory_holes 1126 s=ff140000 e=ff140fff > (XEN) DEBUG find_memory_holes 1126 s=ff000000 e=ff000fff > (XEN) DEBUG find_memory_holes 1126 s=ff010000 e=ff010fff > (XEN) DEBUG find_memory_holes 1126 s=ff9d0000 e=ff9d0fff > (XEN) DEBUG find_memory_holes 1126 s=fe200000 e=fe23ffff > (XEN) DEBUG find_memory_holes 1126 s=ff9e0000 e=ff9e0fff > (XEN) DEBUG find_memory_holes 1126 s=fe300000 e=fe33ffff > (XEN) DEBUG find_memory_holes 1126 s=fd4d0000 e=fd4d0fff > (XEN) DEBUG find_memory_holes 1126 s=ff150000 e=ff150fff > (XEN) DEBUG find_memory_holes 1126 s=ffa50000 e=ffa50fff > (XEN) DEBUG find_memory_holes 1126 s=ffa50000 e=ffa50fff > (XEN) DEBUG find_memory_holes 1126 s=ffa50000 e=ffa50fff > (XEN) DEBUG find_memory_holes 1126 s=fd4c0000 e=fd4c0fff > (XEN) DEBUG find_memory_holes 1126 s=fd4a0000 e=fd4a0fff > (XEN) DEBUG find_memory_holes 1126 s=fd4aa000 e=fd4aafff > (XEN) DEBUG find_memory_holes 1126 s=fd4ab000 e=fd4abfff > (XEN) DEBUG find_memory_holes 1126 s=fd4ac000 e=fd4acfff > (XEN) DEBUG find_memory_holes 1126 s=0 e=7fefffff > (XEN) DEBUG find_memory_holes 1126 s=800000000 e=87fffffff > (XEN) Extended region 0: 0x80000000->0xc0000000 > (XEN) Extended region 1: 0xc8000000->0xe0000000 > (XEN) Extended region 2: 0xf0000000->0xf9000000 > (XEN) Extended region 3: 0xffc00000->0x600000000 > (XEN) Extended region 4: 0x880000000->0x8000000000 > (XEN) Extended region 5: 0x8001000000->0x10000000000 > (XEN) Loading zImage from 0000000000e00000 to > 0000000020000000-0000000021367a00 > (XEN) Loading d0 initrd from 0000000002200000 to > 0x0000000028200000-0x00000000293f936d > (XEN) Loading d0 DTB to 0x0000000028000000-0x0000000028009604 > (XEN) *** LOADING DOMU cpus=1 memory=fa000KB *** > (XEN) Loading d1 kernel from boot module @ 0000000003400000 > (XEN) Loading ramdisk from boot module @ 0000000004800000 > (XEN) Allocating mappings totalling 1000MB for d1: > (XEN) d1 BANK[0] 0x00000040000000-0x0000007e800000 (1000MB) > (XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB) > (XEN) DEBUG find_unallocated_memory 994 s=40000000 e=7e7fffff > (XEN) DEBUG find_unallocated_memory 994 s=200000000 e=1ffffffff > (XEN) Assertion 's <= e' failed at rangeset.c:189 > (XEN) ----[ Xen-4.16-rc arm64 debug=y Not tainted ]---- > (XEN) CPU: 0 > (XEN) PC: 0000000000220e6c rangeset_remove_range+0xbc/0x2bc > (XEN) LR: 00000000002cd508 > (XEN) SP: 0000000000306f60 > (XEN) CPSR: 0000000020000249 MODE:64-bit EL2h (Hypervisor, handler) > (XEN) X0: 00008000fbf61e70 X1: 0000000200000000 X2: 00000001ffffffff > (XEN) X3: 0000000000000005 X4: 0000000000000000 X5: 0000000000000028 > (XEN) X6: 0000000000000080 X7: fefefefefefeff09 X8: 7f7f7f7f7f7f7f7f > (XEN) X9: 717164616f726051 X10: 7f7f7f7f7f7f7f7f X11: 0101010101010101 > (XEN) X12: 0000000000000008 X13: 0000000000000009 X14: 0000000000306cb8 > (XEN) X15: 0000000000000020 X16: 000000000028c5a8 X17: 0000000000000000 > (XEN) X18: 0180000000000000 X19: 00000001ffffffff X20: 0000000000000001 > (XEN) X21: 0000000200000000 X22: 0000000200000000 X23: 0000000000000002 > (XEN) X24: 0000000000000002 X25: 00000000003070e0 X26: 00000000002d9e68 > (XEN) X27: 00000000002d8d18 X28: 00008000fbf40000 FP: 0000000000306f60 > (XEN) > (XEN) VTCR_EL2: 0000000080023558 > (XEN) VTTBR_EL2: 0000000000000000 > (XEN) > (XEN) SCTLR_EL2: 0000000030cd183d > (XEN) HCR_EL2: 0000000000000038 > (XEN) TTBR0_EL2: 0000000004b45000 > (XEN) > (XEN) ESR_EL2: 00000000f2000001 > (XEN) HPFAR_EL2: 0000000000000000 > (XEN) FAR_EL2: 0000000000000000 > (XEN) > (XEN) Xen stack trace from sp=0000000000306f60: > (XEN) 0000000000307040 00000000002cf2a8 00008000fbf5a000 > 0000000000000000 > (XEN) 00008000fbf40000 00000000003070a8 0000000000307de4 > 00000000002aa078 > (XEN) 0000000880000000 0000000000000002 00000000002e8d08 > 00000000000fff00 > (XEN) 00008000fbf61e70 00008000fbf5a000 00008000fbf61220 > 0000000000000000 > (XEN) 0000000000307040 00000000002cf288 00008000fbf5a000 > 0000000000000000 > (XEN) 00008000fbf40000 00000000003070a8 0000000000307de4 > 65782c32766e6578 > (XEN) 000000000030006e 2d6e65782c6e6578 6e65780036312e34 > 000000006e65782c > (XEN) 0000000000307d80 00000000002d0440 00008000fbfd95a0 > 0000000000307dc8 > (XEN) 00000000002d99b8 00000000002da338 0000000000307de4 > 00000000002aa078 > (XEN) 000000000000000f 0000000000307110 0000000000000001 > 00c2010000000002 > (XEN) 00000000003070c8 0000000000000000 6d933f29040f0000 > 0000002200000000 > (XEN) 0010000000000000 0020000300000000 0020000000000000 > 00000000000fa000 > (XEN) 0000000000000001 00008000fbf5a000 00008000fbf40000 > 0000000000000000 > (XEN) 0000000000000002 0000000040000000 000000003e800000 > 0000000000000000 > (XEN) 0000000200000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > (XEN) Xen call trace: > (XEN) [<0000000000220e6c>] rangeset_remove_range+0xbc/0x2bc (PC) > (XEN) [<00000000002cd508>] > domain_build.c#make_hypervisor_node+0x258/0x7f4 (LR) > (XEN) [<00000000002cf2a8>] domain_build.c#construct_domU+0x9cc/0xa8c > (XEN) [<00000000002d0440>] create_domUs+0xe8/0x224 > (XEN) [<00000000002d4988>] start_xen+0xafc/0xbf0 > (XEN) [<00000000002001a0>] arm64/head.o#primary_switched+0xc/0x1c > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) Assertion 's <= e' failed at rangeset.c:189 > (XEN) **************************************** > >
(+ Jan, Andrew, Wei for the common code) On 08/11/2021 22:45, Stefano Stabellini wrote: > Hi Oleksandr, Julien, Hi, > I discovered a bug caused by the recent changes to introduce extended > regions in make_hypervisor_node (more logs appended): > > > (XEN) d1 BANK[0] 0x00000040000000-0x0000007e800000 (1000MB) > (XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB) > (XEN) DEBUG find_unallocated_memory 994 s=40000000 e=7e7fffff > (XEN) DEBUG find_unallocated_memory 994 s=200000000 e=1ffffffff > (XEN) Assertion 's <= e' failed at rangeset.c:189 > > > When a bank of memory is zero in size, then rangeset_remove_range is > called with end < start, triggering an ASSERT in rangeset_remove_range. > > One solution is to avoid creating 0 size banks. The following change > does that: > > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c > index 49b4eb2b13..3efe542d0f 100644 > --- a/xen/arch/arm/domain_build.c > +++ b/xen/arch/arm/domain_build.c > @@ -459,9 +459,12 @@ static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo) > goto fail; > > bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem); > - if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), > + if ( bank_size > 0 ) > + { > + if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), > bank_size) ) > - goto fail; > + goto fail; > + } I would move the size check in allocate_bank_memory(). > > if ( kinfo->unassigned_mem ) > goto fail; > > > > We have a couple of other options too: > > - remove the ASSERT in rangeset_remove_range > There is an argument that we should simply return error > fromrangeset_remove_range, rather than a full assert. To be honest, this is a developper mistake to call with end < start. If we were going to return an error then we would completely hide (even in developper) it because we would fallback to not exposing extended regions. So I am not sure switch from ASSERT() to a plain check is a good idea. Jan, Andrew, Wei, what do you think? That said, this option would not be sufficient to fix your problem as extended regions will not work. > > - add a if (end > start) check before calling rangeset_remove_range > We could check that end > start before calling rangeset_remove_range at > all the call sites in domain_build.c. There are 5 call sites at the > moment. I think we only want to add (end > start) where we expect the region size to be 0. AFAICT, the only other potential place where this can happens is ``find_memory_holes()`` (I vaguely recall a discussion in the past where some of the "reg" property would have size == 0). > > Any other ideas or suggestions? My preference goes with your initial sugestion (so long the check is moved to allocate_bank_memory()). [...] > (XEN) Assertion 's <= e' failed at rangeset.c:189 > (XEN) ----[ Xen-4.16-rc arm64 debug=y Not tainted ]---- > (XEN) Xen call trace: > (XEN) [<0000000000220e6c>] rangeset_remove_range+0xbc/0x2bc (PC) > (XEN) [<00000000002cd508>] domain_build.c#make_hypervisor_node+0x258/0x7f4 (LR) > (XEN) [<00000000002cf2a8>] domain_build.c#construct_domU+0x9cc/0xa8c Vanilla staging doesn't call make_hypervisor_node() from construct_domU. So what are you using? > (XEN) [<00000000002d0440>] create_domUs+0xe8/0x224 > (XEN) [<00000000002d4988>] start_xen+0xafc/0xbf0 > (XEN) [<00000000002001a0>] arm64/head.o#primary_switched+0xc/0x1c > (XEN) > (XEN) > (XEN) **************************************** > (XEN) Panic on CPU 0: > (XEN) Assertion 's <= e' failed at rangeset.c:189 > (XEN) **************************************** > Cheers,
On 09.11.2021 12:58, Julien Grall wrote: > On 08/11/2021 22:45, Stefano Stabellini wrote: >> I discovered a bug caused by the recent changes to introduce extended >> regions in make_hypervisor_node (more logs appended): >> >> >> (XEN) d1 BANK[0] 0x00000040000000-0x0000007e800000 (1000MB) >> (XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB) >> (XEN) DEBUG find_unallocated_memory 994 s=40000000 e=7e7fffff >> (XEN) DEBUG find_unallocated_memory 994 s=200000000 e=1ffffffff >> (XEN) Assertion 's <= e' failed at rangeset.c:189 >> >> >> When a bank of memory is zero in size, then rangeset_remove_range is >> called with end < start, triggering an ASSERT in rangeset_remove_range. >> >> One solution is to avoid creating 0 size banks. The following change >> does that: >> >> >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index 49b4eb2b13..3efe542d0f 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -459,9 +459,12 @@ static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo) >> goto fail; >> >> bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem); >> - if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), >> + if ( bank_size > 0 ) >> + { >> + if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), >> bank_size) ) >> - goto fail; >> + goto fail; >> + } > > I would move the size check in allocate_bank_memory(). > >> >> if ( kinfo->unassigned_mem ) >> goto fail; >> >> >> >> We have a couple of other options too: >> >> - remove the ASSERT in rangeset_remove_range >> There is an argument that we should simply return error >> fromrangeset_remove_range, rather than a full assert. > > To be honest, this is a developper mistake to call with end < start. If > we were going to return an error then we would completely hide (even in > developper) it because we would fallback to not exposing extended regions. > > So I am not sure switch from ASSERT() to a plain check is a good idea. > Jan, Andrew, Wei, what do you think? There might be reasons to convert, but I don't think the situation here warrants doing so. Jan
On Tue, 9 Nov 2021, Julien Grall wrote: > (+ Jan, Andrew, Wei for the common code) > > On 08/11/2021 22:45, Stefano Stabellini wrote: > > Hi Oleksandr, Julien, > > Hi, > > > I discovered a bug caused by the recent changes to introduce extended > > regions in make_hypervisor_node (more logs appended): > > > > > > (XEN) d1 BANK[0] 0x00000040000000-0x0000007e800000 (1000MB) > > (XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB) > > (XEN) DEBUG find_unallocated_memory 994 s=40000000 e=7e7fffff > > (XEN) DEBUG find_unallocated_memory 994 s=200000000 e=1ffffffff > > (XEN) Assertion 's <= e' failed at rangeset.c:189 > > > > > > When a bank of memory is zero in size, then rangeset_remove_range is > > called with end < start, triggering an ASSERT in rangeset_remove_range. > > > > One solution is to avoid creating 0 size banks. The following change > > does that: > > > > > > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c > > index 49b4eb2b13..3efe542d0f 100644 > > --- a/xen/arch/arm/domain_build.c > > +++ b/xen/arch/arm/domain_build.c > > @@ -459,9 +459,12 @@ static void __init allocate_memory(struct domain *d, > > struct kernel_info *kinfo) > > goto fail; > > bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem); > > - if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), > > + if ( bank_size > 0 ) > > + { > > + if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), > > bank_size) ) > > - goto fail; > > + goto fail; > > + } > > I would move the size check in allocate_bank_memory(). Sure, I can do that > > if ( kinfo->unassigned_mem ) > > goto fail; > > > > > > > > We have a couple of other options too: > > > > - remove the ASSERT in rangeset_remove_range > > There is an argument that we should simply return error > > fromrangeset_remove_range, rather than a full assert. > > To be honest, this is a developper mistake to call with end < start. If we > were going to return an error then we would completely hide (even in > developper) it because we would fallback to not exposing extended regions. > > So I am not sure switch from ASSERT() to a plain check is a good idea. Jan, > Andrew, Wei, what do you think? > > That said, this option would not be sufficient to fix your problem as extended > regions will not work. > > > > > - add a if (end > start) check before calling rangeset_remove_range > > We could check that end > start before calling rangeset_remove_range at > > all the call sites in domain_build.c. There are 5 call sites at the > > moment. > > I think we only want to add (end > start) where we expect the region size to > be 0. AFAICT, the only other potential place where this can happens is > ``find_memory_holes()`` (I vaguely recall a discussion in the past where some > of the "reg" property would have size == 0). > > > > > Any other ideas or suggestions? > > My preference goes with your initial sugestion (so long the check is moved to > allocate_bank_memory()). > > [...] > > > (XEN) Assertion 's <= e' failed at rangeset.c:189 > > (XEN) ----[ Xen-4.16-rc arm64 debug=y Not tainted ]---- > > (XEN) Xen call trace: > > (XEN) [<0000000000220e6c>] rangeset_remove_range+0xbc/0x2bc (PC) > > (XEN) [<00000000002cd508>] > > domain_build.c#make_hypervisor_node+0x258/0x7f4 (LR) > > (XEN) [<00000000002cf2a8>] domain_build.c#construct_domU+0x9cc/0xa8c > > Vanilla staging doesn't call make_hypervisor_node() from construct_domU. So > what are you using? Well spotted. This is my WIP branch with PV drivers support for Dom0less guests (soon to be sent to xen-devel). The underlying bug could affect vanilla Xen too as it only takes a zero-size bank to trigger it, but it is certainly harder to reproduce because make_hypervisor_node is only called for Dom0 and allocate_bank_memory (the function that today always adds a zero size bank) is called for DomUs. > > (XEN) [<00000000002d0440>] create_domUs+0xe8/0x224 > > (XEN) [<00000000002d4988>] start_xen+0xafc/0xbf0 > > (XEN) [<00000000002001a0>] arm64/head.o#primary_switched+0xc/0x1c > > (XEN) > > (XEN) > > (XEN) **************************************** > > (XEN) Panic on CPU 0: > > (XEN) Assertion 's <= e' failed at rangeset.c:189 > > (XEN) ****************************************
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 49b4eb2b13..3efe542d0f 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -459,9 +459,12 @@ static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo) goto fail; bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem); - if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), + if ( bank_size > 0 ) + { + if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE), bank_size) ) - goto fail; + goto fail; + } if ( kinfo->unassigned_mem ) goto fail;