Message ID | 20211227211642.994461-4-maz@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | target/arm: Reduced-IPA space and highmem=off fixes |
Hi Marc,

On 12/27/21 10:16 PM, Marc Zyngier wrote:
> Even when the VM is configured with highmem=off, the highest_gpa
> field includes devices that are above the 4GiB limit.
> Similarly, nothing seems to check that the memory is within
> the limit set by the highmem=off option.
>
> This leads to failures in virt_kvm_type() on systems that have
> a crippled IPA range, as the reported IPA space is larger than
> what it should be.
>
> Instead, honor the user-specified limit to only use the devices
> at the lowest end of the spectrum, and fail if we have memory
> crossing the 4GiB limit.
>
> Reviewed-by: Andrew Jones <drjones@redhat.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  hw/arm/virt.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 8b600d82c1..84dd3b36fb 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
>          exit(EXIT_FAILURE);
>      }
>
> +    if (!vms->highmem &&
> +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
> +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
> +        exit(EXIT_FAILURE);

The memory is composed of initial memory and device memory. Device memory
is placed after the initial memory, but with a 1GiB alignment, and on top
of that there is 1GiB of alignment slack per device memory slot.

So the highest memory address can end up larger than
vms->memmap[VIRT_MEM].base + ms->maxram_size. I would rather do the check
on device_memory_base + device_memory_size.

> +    }
>      /*
>       * We compute the base of the high IO region depending on the
>       * amount of initial and device memory. The device memory start/size
> @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
>          vms->memmap[i].size = size;
>          base += size;
>      }
> -    vms->highest_gpa = base - 1;
> +    vms->highest_gpa = (vms->highmem ?
> +                        base :
> +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;

As per the previous comment this looks wrong to me if !highmem.

If !highmem, and RAM requirements are low, we could still benefit from
REDIST2 and HIGH ECAM, which could fit within the 4GB limit. But maybe we
simply don't care? If we don't, why don't we simply skip the
extended_memmap overlay as suggested in v2? I did not get your reply, sorry.

Thanks

Eric

>      if (device_memory_size > 0) {
>          ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
>          ms->device_memory->base = device_memory_base;
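For readers following along, here is a rough sketch of the check Eric is suggesting. It assumes the test is moved after the point where virt_set_memmap() computes device_memory_base and device_memory_size (they are derived from ms->ram_size, ms->maxram_size and ms->ram_slots just below the quoted "high IO region" comment); it is illustrative only, not code from any posted revision of the series:

```c
/*
 * Sketch only: bound the top of device memory rather than
 * VIRT_MEM base + maxram_size. device_memory_base is rounded up
 * to 1GiB, and device_memory_size already includes 1GiB of
 * alignment slack per hotpluggable slot, so this is the highest
 * RAM address the guest can actually see.
 */
if (!vms->highmem &&
    device_memory_base + device_memory_size > 4 * GiB) {
    error_report("highmem=off, but memory crosses the 4GiB limit");
    exit(EXIT_FAILURE);
}
```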
Hi Marc,

On 1/5/22 10:22 AM, Eric Auger wrote:
> Hi Marc,
>
> On 12/27/21 10:16 PM, Marc Zyngier wrote:
>> Even when the VM is configured with highmem=off, the highest_gpa
>> field includes devices that are above the 4GiB limit.

[...]

>> @@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
>>          vms->memmap[i].size = size;
>>          base += size;
>>      }
>> -    vms->highest_gpa = base - 1;
>> +    vms->highest_gpa = (vms->highmem ?
>> +                        base :
>> +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
> As per the previous comment this looks wrong to me if !highmem.
>
> If !highmem, and RAM requirements are low, we could still benefit from
> REDIST2 and HIGH ECAM, which could fit within the 4GB limit. But maybe we

The above assertion is wrong, sorry: we keep the legacy memory map in that
situation (the HIGH regions are always put above 256 GiB). So anyway we can
skip the extended memmap if !highmem.

Eric

> simply don't care? If we don't, why don't we simply skip the
> extended_memmap overlay as suggested in v2? I did not get your reply, sorry.
>
> Thanks
>
> Eric

[...]
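For context, the extended_memmap overlay being discussed is roughly the following in the hw/arm/virt.c of that period (region sizes quoted from memory, so treat them as approximate). The base addresses are left at zero and filled in by virt_set_memmap(), which never places these regions below 256 GiB, which is why they cannot help a highmem=off guest:

```c
/* Approximate shape of the overlay; bases are computed at runtime */
static MemMapEntry extended_memmap[] = {
    /* Additional 64 MiB redistributor region (for more vCPUs) */
    [VIRT_HIGH_GIC_REDIST2] =   { 0x0, 64 * MiB },
    [VIRT_HIGH_PCIE_ECAM] =     { 0x0, 256 * MiB },
    /* Second PCIe window */
    [VIRT_HIGH_PCIE_MMIO] =     { 0x0, 512 * GiB },
};
```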
On Wed, 05 Jan 2022 09:22:39 +0000,
Eric Auger <eric.auger@redhat.com> wrote:
>
> Hi Marc,
>
> On 12/27/21 10:16 PM, Marc Zyngier wrote:
> > Even when the VM is configured with highmem=off, the highest_gpa
> > field includes devices that are above the 4GiB limit.

[...]

> > +    if (!vms->highmem &&
> > +        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
> > +        error_report("highmem=off, but memory crosses the 4GiB limit\n");
> > +        exit(EXIT_FAILURE);
>
> The memory is composed of initial memory and device memory. Device memory
> is placed after the initial memory, but with a 1GiB alignment, and on top
> of that there is 1GiB of alignment slack per device memory slot.
>
> So the highest memory address can end up larger than
> vms->memmap[VIRT_MEM].base + ms->maxram_size. I would rather do the check
> on device_memory_base + device_memory_size.

Yup, that's a good point.

There is also a corner case in one of the later patches where I check
this limit against the PA using the rounded-up device_memory_size.
This could result in returning an error if the last memory slot would
still fit in the PA space, but the rounded-up quantity wouldn't. I
don't think it matters much, but I'll fix it anyway.

[...]

> > -    vms->highest_gpa = base - 1;
> > +    vms->highest_gpa = (vms->highmem ?
> > +                        base :
> > +                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
>
> As per the previous comment this looks wrong to me if !highmem.

Agreed.

> If !highmem, and RAM requirements are low, we could still benefit from
> REDIST2 and HIGH ECAM, which could fit within the 4GB limit. But maybe we
> simply don't care?

I don't see how. These devices live at a minimum of 256GB, which
contradicts the very meaning of !highmem being a 4GB limit.

> If we don't, why don't we simply skip the extended_memmap overlay as
> suggested in v2? I did not get your reply, sorry.

Because although this makes sense if you only care about a 32bit
limit, we eventually want to check against an arbitrary PA limit and
enable the individual devices that do fit in that space.

In order to do that, we need to compute the base addresses for these
extra devices. Also, computing 3 base addresses isn't going to be
massively expensive.

Thanks,

M.
Hi Marc,

On 1/6/22 10:26 PM, Marc Zyngier wrote:
> On Wed, 05 Jan 2022 09:22:39 +0000,
> Eric Auger <eric.auger@redhat.com> wrote:
>> Hi Marc,
>>
>> On 12/27/21 10:16 PM, Marc Zyngier wrote:
>>> Even when the VM is configured with highmem=off, the highest_gpa
>>> field includes devices that are above the 4GiB limit.

[...]

>> If !highmem, and RAM requirements are low, we could still benefit from
>> REDIST2 and HIGH ECAM, which could fit within the 4GB limit. But maybe we
>> simply don't care?
> I don't see how. These devices live at a minimum of 256GB, which
> contradicts the very meaning of !highmem being a 4GB limit.

Yes, I corrected the above statement afterwards, sorry for the noise.

>> If we don't, why don't we simply skip the extended_memmap overlay as
>> suggested in v2? I did not get your reply, sorry.
> Because although this makes sense if you only care about a 32bit
> limit, we eventually want to check against an arbitrary PA limit and
> enable the individual devices that do fit in that space.

In my understanding that is what virt_kvm_type() was supposed to do, by
testing the result of kvm_arm_get_max_vm_ipa_size() against the
requested_pa_size (which accounted for the high regions) and exiting if
they were incompatible. But I must be missing something.

> In order to do that, we need to compute the base addresses for these
> extra devices. Also, computing 3 base addresses isn't going to be
> massively expensive.
>
> Thanks,
>
> M.

Eric
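For reference, the virt_kvm_type() flow Eric is describing is roughly the following; this is a paraphrase of the pre-series hw/arm/virt.c with the error message reworded, so details such as the exact kvm_arm_get_max_vm_ipa_size() signature may differ from any given tree:

```c
static int virt_kvm_type(MachineState *ms, const char *type_str)
{
    VirtMachineState *vms = VIRT_MACHINE(ms);
    int max_vm_pa_size = kvm_arm_get_max_vm_ipa_size(ms); /* host IPA limit */
    int requested_pa_size;

    /* Freezing the memory map is what fills in vms->highest_gpa */
    virt_set_memmap(vms);

    requested_pa_size = 64 - clz64(vms->highest_gpa);

    if (requested_pa_size > max_vm_pa_size) {
        error_report("requested IPA range (%d bits) exceeds the host limit "
                     "(%d bits)", requested_pa_size, max_vm_pa_size);
        exit(EXIT_FAILURE);
    }

    /* 0 means "use KVM's default 40-bit IPA space" */
    return requested_pa_size > 40 ? requested_pa_size : 0;
}
```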
Hi Eric,

On Fri, 07 Jan 2022 17:15:19 +0000,
Eric Auger <eric.auger@redhat.com> wrote:
>
> Hi Marc,
>
> On 1/6/22 10:26 PM, Marc Zyngier wrote:

[...]

> >> If we don't, why don't we simply skip the extended_memmap overlay as
> >> suggested in v2? I did not get your reply, sorry.
> > Because although this makes sense if you only care about a 32bit
> > limit, we eventually want to check against an arbitrary PA limit and
> > enable the individual devices that do fit in that space.
>
> In my understanding that is what virt_kvm_type() was supposed to do, by
> testing the result of kvm_arm_get_max_vm_ipa_size() against the
> requested_pa_size (which accounted for the high regions) and exiting if
> they were incompatible. But I must be missing something.

This is a chicken and egg problem: you need the IPA size to compute
the memory map, and you need the memory map to compute the IPA
size. Fun, isn't it?

At the moment, virt_set_memmap() doesn't know about the IPA space,
generates a highest_gpa that may not work, and we end up failing
because the resulting VM type is out of bounds.

My solution to that is to feed the *maximum* IPA size to
virt_set_memmap(), compute the memory map there, and then use
highest_gpa to compute the actual IPA size that is used to create the
VM. By knowing the IPA limit in virt_set_memmap(), I'm able to keep it
in check and avoid generating an unusable memory map.

I've tried to make that clearer in my v4. Hopefully I succeeded.

Thanks,

M.
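A condensed sketch of the ordering Marc describes; the helper names and the extra virt_set_memmap() parameter are invented for illustration, and the actual v4 series structures this differently:

```c
/* 1. Ask KVM for the *maximum* IPA size the host can support. */
int max_ipa_bits = kvm_arm_get_max_vm_ipa_size(ms);

/* 2. Build the memory map within that limit: virt_set_memmap() can now
 *    clamp highest_gpa and drop high devices that would not fit.
 *    (Passing the limit as a parameter is an assumed API, see above.) */
virt_set_memmap(vms, max_ipa_bits);

/* 3. Derive the IPA size the VM actually needs from what was mapped... */
int requested_pa_size = 64 - clz64(vms->highest_gpa);

/* 4. ...and create the VM with that (possibly smaller) IPA size;
 *    0 keeps KVM's default 40-bit setting. */
return requested_pa_size > 40 ? requested_pa_size : 0;
```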
On Fri, 7 Jan 2022 at 18:18, Marc Zyngier <maz@kernel.org> wrote:
> This is a chicken and egg problem: you need the IPA size to compute
> the memory map, and you need the memory map to compute the IPA
> size. Fun, isn't it?
>
> At the moment, virt_set_memmap() doesn't know about the IPA space,
> generates a highest_gpa that may not work, and we end up failing
> because the resulting VM type is out of bounds.
>
> My solution to that is to feed the *maximum* IPA size to
> virt_set_memmap(), compute the memory map there, and then use
> highest_gpa to compute the actual IPA size that is used to create the
> VM. By knowing the IPA limit in virt_set_memmap(), I'm able to keep it
> in check and avoid generating an unusable memory map.

Is there any reason not to just always create the VM with the
maximum supported IPA size, rather than trying to create it
with the smallest IPA size that will work? (i.e. skip the last
step of computing the IPA size to create the VM with)

-- PMM
On Fri, 07 Jan 2022 18:48:16 +0000,
Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 7 Jan 2022 at 18:18, Marc Zyngier <maz@kernel.org> wrote:
> > This is a chicken and egg problem: you need the IPA size to compute
> > the memory map, and you need the memory map to compute the IPA
> > size. Fun, isn't it?

[...]

> Is there any reason not to just always create the VM with the
> maximum supported IPA size, rather than trying to create it
> with the smallest IPA size that will work? (i.e. skip the last
> step of computing the IPA size to create the VM with)

That gives KVM the opportunity to reduce the depth of the S2 page
tables. On HW that supports a large PA space, there is a real
advantage in keeping these shallow if at all possible.

Thanks,

M.
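To make the connection concrete: the value returned by virt_kvm_type() ends up as the "type" argument of the KVM_CREATE_VM ioctl, and on arm64 the low byte of that type encodes the requested IPA size. The macros below are the standard Linux uapi definitions, reproduced only for orientation; the stage-2 level counts in the comment are illustrative:

```c
/* From include/uapi/linux/kvm.h: bits [7:0] of the VM type carry
 * the requested IPA size in bits; 0 means the default 40 bits.   */
#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK   0xffULL
#define KVM_VM_TYPE_ARM_IPA_SIZE(x) \
        ((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)

/*
 * So a guest whose highest_gpa fits in, say, 36 bits can be created with a
 * 36-bit IPA space, letting the host use shallower stage-2 page tables
 * (e.g. 3 levels with 4K pages) than the hardware maximum would require.
 */
```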
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 8b600d82c1..84dd3b36fb 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1678,6 +1678,11 @@ static void virt_set_memmap(VirtMachineState *vms)
         exit(EXIT_FAILURE);
     }
 
+    if (!vms->highmem &&
+        vms->memmap[VIRT_MEM].base + ms->maxram_size > 4 * GiB) {
+        error_report("highmem=off, but memory crosses the 4GiB limit\n");
+        exit(EXIT_FAILURE);
+    }
     /*
      * We compute the base of the high IO region depending on the
      * amount of initial and device memory. The device memory start/size
@@ -1707,7 +1712,9 @@ static void virt_set_memmap(VirtMachineState *vms)
         vms->memmap[i].size = size;
         base += size;
     }
-    vms->highest_gpa = base - 1;
+    vms->highest_gpa = (vms->highmem ?
+                        base :
+                        vms->memmap[VIRT_MEM].base + ms->maxram_size) - 1;
     if (device_memory_size > 0) {
         ms->device_memory = g_malloc0(sizeof(*ms->device_memory));
         ms->device_memory->base = device_memory_base;