diff mbox series

memory: Do not allow subregion out of the parent region range

Message ID 20191214160223.20012-1-philmd@redhat.com (mailing list archive)
State New, archived
Headers show
Series memory: Do not allow subregion out of the parent region range | expand

Commit Message

Philippe Mathieu-Daudé Dec. 14, 2019, 4:02 p.m. UTC
If a subregion is mapped out of the parent region range, it
will never get accessed. Since this is a bug, abort to help
the developer notice the mistake.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 memory.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Paolo Bonzini Dec. 16, 2019, 1:08 p.m. UTC | #1
On 14/12/19 17:02, Philippe Mathieu-Daudé wrote:
> If a subregion is mapped out of the parent region range, it
> will never get accessed. Since this is a bug, abort to help
> the developer notice the mistake.
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  memory.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/memory.c b/memory.c
> index 06484c2bff..61f355dcd5 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -2390,6 +2390,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
>  {
>      assert(!subregion->container);
>      subregion->container = mr;
> +    assert(offset + memory_region_size(subregion) <= memory_region_size(mr));
>      subregion->addr = offset;
>      memory_region_update_container_subregions(subregion);
>  }
> 

I think in some cases this could be intentional, for example if you have
different models with different BAR sizes and you organize this with the
same tree of MemoryRegion and different sizes for the parent.  I'm not
saying this happens in the current devices we support, I'm just
wondering if it should be a reason not to apply the patch.  I suppose
you did spend some time debugging something where the patch would have
been useful; what was that something?

Paolo
Philippe Mathieu-Daudé Dec. 16, 2019, 5:46 p.m. UTC | #2
On 12/16/19 2:08 PM, Paolo Bonzini wrote:
> On 14/12/19 17:02, Philippe Mathieu-Daudé wrote:
>> If a subregion is mapped out of the parent region range, it
>> will never get accessed. Since this is a bug, abort to help
>> the developer notice the mistake.
>>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>>   memory.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/memory.c b/memory.c
>> index 06484c2bff..61f355dcd5 100644
>> --- a/memory.c
>> +++ b/memory.c
>> @@ -2390,6 +2390,7 @@ static void memory_region_add_subregion_common(MemoryRegion *mr,
>>   {
>>       assert(!subregion->container);
>>       subregion->container = mr;
>> +    assert(offset + memory_region_size(subregion) <= memory_region_size(mr));
>>       subregion->addr = offset;
>>       memory_region_update_container_subregions(subregion);
>>   }
>>
> 
> I think in some cases this could be intentional, for example if you have
> different models with different BAR sizes and you organize this with the
> same tree of MemoryRegion and different sizes for the parent.

But if a child is outside of the parent range, it can't be reached, 
right? I'm confused, maybe AddressSpace can do that, but MemoryRegion 
shouldn't?

In the PCI case, you will simply alias a subregion with 
memory_region_init_alias(..., size), and size has to be <= parent size.
But you won't add the PCI region, you'll add the alias, so the assert 
won't fire.

> I'm not
> saying this happens in the current devices we support, I'm just
> wondering if it should be a reason not to apply the patch.  I suppose
> you did spend some time debugging something where the patch would have
> been useful; what was that something?

I'm updating some devices to use relative base address, instead of 
absolute one. This is useful when a subdevice is reused in another 
device but mapped at a different location.

One case is the Raspberry Pi:

$ git grep BCM2835_VC_PERI_BASE
hw/arm/bcm2835_peripherals.c:20:#define BCM2835_VC_PERI_BASE 0x7e000000
hw/arm/bcm2835_peripherals.c:156: 
memory_region_add_subregion_overlap(&s->gpu_bus_mr, BCM2835_VC_PERI_BASE,

The GPU physical address space is 1GiB, and virtual is 4GiB. Currently 
the device is mapped in virtual space, not respecting the GPU cache 
mappings. If we move the chipset base address (and correct the cache 
mappings) this device ends out of the GPU physical address space.

To have it working I had to correct the physical address to 0x3e000000.

Maybe 'info mtree' is more explicit:

before:

address-space: bcm2835-dma-memory
0000000000000000-00000000ffffffff (prio 0, i/o): bcm2835-gpu
   0000000000000000-000000003fffffff (prio 0, i/o): alias 
bcm2835-gpu-ram-alias[*] @ram 0000000000000000-000000003fffffff
   0000000040000000-000000007fffffff (prio 0, i/o): alias 
bcm2835-gpu-ram-alias[*] @ram 0000000000000000-000000003fffffff
   000000007e000000-000000007effffff (prio 1, i/o): alias 
bcm2835-peripherals @bcm2835-peripherals 0000000000000000-0000000000ffffff
   0000000080000000-00000000bfffffff (prio 0, i/o): alias 
bcm2835-gpu-ram-alias[*] @ram 0000000000000000-000000003fffffff
   00000000c0000000-00000000ffffffff (prio 0, i/o): alias 
bcm2835-gpu-ram-alias[*] @ram 0000000000000000-000000003fffffff

after:

address-space: bcm2835-dma-memory
0000000000000000-00000000ffffffff (prio 0, i/o): gpu-bus
   0000000000000000-000000003fffffff (prio 0, i/o): alias l1-l2-cached 
@videocore 0000000000000000-000000003fffffff
   0000000040000000-000000007fffffff (prio 0, i/o): alias 
l2-cached-coherent @videocore 0000000000000000-000000003fffffff
   0000000080000000-00000000bfffffff (prio 0, i/o): alias l2-cached 
@videocore 0000000000000000-000000003fffffff
   00000000c0000000-00000000ffffffff (prio 0, i/o): alias 
direct-uncached @videocore 0000000000000000-000000003fffffff

memory-region: videocore
0000000000000000-000000003fffffff (prio 0, i/o): videocore
   0000000000000000-000000003fffffff (prio 0, i/o): alias vc-ram @ram 
0000000000000000-000000003fffffff
   000000003e000000-000000003effffff (prio 1, i/o): alias vc-peripherals 
@bcm2835-peripherals 0000000000000000-0000000000ffffff

Even clearer, a diff of 'info mtree -f':

  (qemu) info mtree -f
  FlatView #0
   AS "bcm2835-fb-memory", root: gpu-bus
   AS "bcm2835-property-memory", root: gpu-bus
   AS "bcm2835-dma-memory", root: gpu-bus
   Root memory region: gpu-bus
-  0000000000000000-000000003fffffff (prio 0, ram): ram
+  0000000000000000-000000003e002fff (prio 0, ram): ram
+  000000003e003000-000000003e00301f (prio 0, i/o): bcm2835-sys-timer
+  000000003e003020-000000003e006fff (prio 0, ram): ram @000000003e003020
+  000000003e007000-000000003e007fff (prio 0, i/o): bcm2835-dma
+  000000003e008000-000000003e00b1ff (prio 0, ram): ram @000000003e008000
+  000000003e00b200-000000003e00b3ff (prio 0, i/o): bcm2835-ic
+  000000003e00b400-000000003e00b43f (prio -1000, i/o): bcm2835-sp804
+  000000003e00b440-000000003e00b7ff (prio 0, ram): ram @000000003e00b440
+  000000003e00b800-000000003e00bbff (prio 0, i/o): bcm2835-mbox
+  000000003e00bc00-000000003e0fffff (prio 0, ram): ram @000000003e00bc00
+  000000003e100000-000000003e100fff (prio -1000, i/o): bcm2835-cprman
+  000000003e101000-000000003e101fff (prio 0, ram): ram @000000003e101000
+  000000003e102000-000000003e102fff (prio -1000, i/o): bcm2835-a2w
+  000000003e103000-000000003e103fff (prio 0, ram): ram @000000003e103000
+  000000003e104000-000000003e10400f (prio 0, i/o): bcm2835-rng
+  000000003e104010-000000003e1fffff (prio 0, ram): ram @000000003e104010
+  000000003e200000-000000003e200fff (prio 0, i/o): bcm2835_gpio
+  000000003e201000-000000003e201fff (prio 0, i/o): pl011
+  000000003e202000-000000003e202fff (prio 0, i/o): bcm2835-sdhost
+  000000003e203000-000000003e2030ff (prio -1000, i/o): bcm2835-i2s
+  000000003e203100-000000003e203fff (prio 0, ram): ram @000000003e203100
+  000000003e204000-000000003e20401f (prio -1000, i/o): bcm2835-spi0
+  000000003e204020-000000003e204fff (prio 0, ram): ram @000000003e204020
+  000000003e205000-000000003e20501f (prio -1000, i/o): bcm2835-i2c0
+  000000003e205020-000000003e20efff (prio 0, ram): ram @000000003e205020
+  000000003e20f000-000000003e20f07f (prio -1000, i/o): bcm2835-otp
+  000000003e20f080-000000003e211fff (prio 0, ram): ram @000000003e20f080
+  000000003e212000-000000003e212007 (prio 0, i/o): bcm2835-thermal
+  000000003e212008-000000003e213fff (prio 0, ram): ram @000000003e212008
+  000000003e214000-000000003e2140ff (prio -1000, i/o): bcm2835-spis
+  000000003e214100-000000003e214fff (prio 0, ram): ram @000000003e214100
+  000000003e215000-000000003e2150ff (prio 0, i/o): bcm2835-aux
+  000000003e215100-000000003e2fffff (prio 0, ram): ram @000000003e215100
+  000000003e300000-000000003e3000ff (prio 0, i/o): sdhci
+  000000003e300100-000000003e5fffff (prio 0, ram): ram @000000003e300100
+  000000003e600000-000000003e6000ff (prio -1000, i/o): bcm2835-smi
+  000000003e600100-000000003e803fff (prio 0, ram): ram @000000003e600100
+  000000003e804000-000000003e80401f (prio -1000, i/o): bcm2835-i2c1
+  000000003e804020-000000003e804fff (prio 0, ram): ram @000000003e804020
+  000000003e805000-000000003e80501f (prio -1000, i/o): bcm2835-i2c2
+  000000003e805020-000000003e8fffff (prio 0, ram): ram @000000003e805020
+  000000003e900000-000000003e907fff (prio -1000, i/o): bcm2835-dbus
+  000000003e908000-000000003e90ffff (prio 0, ram): ram @000000003e908000
+  000000003e910000-000000003e917fff (prio -1000, i/o): bcm2835-ave0
+  000000003e918000-000000003e97ffff (prio 0, ram): ram @000000003e918000
+  000000003e980000-000000003e980fff (prio -1000, i/o): dwc-usb2
+  000000003e981000-000000003edfffff (prio 0, ram): ram @000000003e981000
+  000000003ee00000-000000003ee000ff (prio -1000, i/o): bcm2835-sdramc
+  000000003ee00100-000000003ee04fff (prio 0, ram): ram @000000003ee00100
+  000000003ee05000-000000003ee050ff (prio 0, i/o): bcm2835-dma-chan15
+  000000003ee05100-000000003fffffff (prio 0, ram): ram @000000003ee05100
    0000000040000000-000000007e002fff (prio 0, ram): ram
    000000007e003000-000000007e00301f (prio 0, i/o): bcm2835-sys-timer
    000000007e003020-000000007e006fff (prio 0, ram): ram @000000003e003020
@@ -56,5 +106,105 @@
    000000007ee00100-000000007ee04fff (prio 0, ram): ram @000000003ee00100
    000000007ee05000-000000007ee050ff (prio 0, i/o): bcm2835-dma-chan15
    000000007ee05100-000000007fffffff (prio 0, ram): ram @000000003ee05100
-  0000000080000000-00000000bfffffff (prio 0, ram): ram
-  00000000c0000000-00000000ffffffff (prio 0, ram): ram
+  0000000080000000-00000000be002fff (prio 0, ram): ram
+  00000000be003000-00000000be00301f (prio 0, i/o): bcm2835-sys-timer
+  00000000be003020-00000000be006fff (prio 0, ram): ram @000000003e003020
+  00000000be007000-00000000be007fff (prio 0, i/o): bcm2835-dma
+  00000000be008000-00000000be00b1ff (prio 0, ram): ram @000000003e008000
+  00000000be00b200-00000000be00b3ff (prio 0, i/o): bcm2835-ic
+  00000000be00b400-00000000be00b43f (prio -1000, i/o): bcm2835-sp804
+  00000000be00b440-00000000be00b7ff (prio 0, ram): ram @000000003e00b440
+  00000000be00b800-00000000be00bbff (prio 0, i/o): bcm2835-mbox
+  00000000be00bc00-00000000be0fffff (prio 0, ram): ram @000000003e00bc00
+  00000000be100000-00000000be100fff (prio -1000, i/o): bcm2835-cprman
+  00000000be101000-00000000be101fff (prio 0, ram): ram @000000003e101000
+  00000000be102000-00000000be102fff (prio -1000, i/o): bcm2835-a2w
+  00000000be103000-00000000be103fff (prio 0, ram): ram @000000003e103000
+  00000000be104000-00000000be10400f (prio 0, i/o): bcm2835-rng
+  00000000be104010-00000000be1fffff (prio 0, ram): ram @000000003e104010
+  00000000be200000-00000000be200fff (prio 0, i/o): bcm2835_gpio
+  00000000be201000-00000000be201fff (prio 0, i/o): pl011
+  00000000be202000-00000000be202fff (prio 0, i/o): bcm2835-sdhost
+  00000000be203000-00000000be2030ff (prio -1000, i/o): bcm2835-i2s
+  00000000be203100-00000000be203fff (prio 0, ram): ram @000000003e203100
+  00000000be204000-00000000be20401f (prio -1000, i/o): bcm2835-spi0
+  00000000be204020-00000000be204fff (prio 0, ram): ram @000000003e204020
+  00000000be205000-00000000be20501f (prio -1000, i/o): bcm2835-i2c0
+  00000000be205020-00000000be20efff (prio 0, ram): ram @000000003e205020
+  00000000be20f000-00000000be20f07f (prio -1000, i/o): bcm2835-otp
+  00000000be20f080-00000000be211fff (prio 0, ram): ram @000000003e20f080
+  00000000be212000-00000000be212007 (prio 0, i/o): bcm2835-thermal
+  00000000be212008-00000000be213fff (prio 0, ram): ram @000000003e212008
+  00000000be214000-00000000be2140ff (prio -1000, i/o): bcm2835-spis
+  00000000be214100-00000000be214fff (prio 0, ram): ram @000000003e214100
+  00000000be215000-00000000be2150ff (prio 0, i/o): bcm2835-aux
+  00000000be215100-00000000be2fffff (prio 0, ram): ram @000000003e215100
+  00000000be300000-00000000be3000ff (prio 0, i/o): sdhci
+  00000000be300100-00000000be5fffff (prio 0, ram): ram @000000003e300100
+  00000000be600000-00000000be6000ff (prio -1000, i/o): bcm2835-smi
+  00000000be600100-00000000be803fff (prio 0, ram): ram @000000003e600100
+  00000000be804000-00000000be80401f (prio -1000, i/o): bcm2835-i2c1
+  00000000be804020-00000000be804fff (prio 0, ram): ram @000000003e804020
+  00000000be805000-00000000be80501f (prio -1000, i/o): bcm2835-i2c2
+  00000000be805020-00000000be8fffff (prio 0, ram): ram @000000003e805020
+  00000000be900000-00000000be907fff (prio -1000, i/o): bcm2835-dbus
+  00000000be908000-00000000be90ffff (prio 0, ram): ram @000000003e908000
+  00000000be910000-00000000be917fff (prio -1000, i/o): bcm2835-ave0
+  00000000be918000-00000000be97ffff (prio 0, ram): ram @000000003e918000
+  00000000be980000-00000000be980fff (prio -1000, i/o): dwc-usb2
+  00000000be981000-00000000bedfffff (prio 0, ram): ram @000000003e981000
+  00000000bee00000-00000000bee000ff (prio -1000, i/o): bcm2835-sdramc
+  00000000bee00100-00000000bee04fff (prio 0, ram): ram @000000003ee00100
+  00000000bee05000-00000000bee050ff (prio 0, i/o): bcm2835-dma-chan15
+  00000000bee05100-00000000bfffffff (prio 0, ram): ram @000000003ee05100
+  00000000c0000000-00000000fe002fff (prio 0, ram): ram
+  00000000fe003000-00000000fe00301f (prio 0, i/o): bcm2835-sys-timer
+  00000000fe003020-00000000fe006fff (prio 0, ram): ram @000000003e003020
+  00000000fe007000-00000000fe007fff (prio 0, i/o): bcm2835-dma
+  00000000fe008000-00000000fe00b1ff (prio 0, ram): ram @000000003e008000
+  00000000fe00b200-00000000fe00b3ff (prio 0, i/o): bcm2835-ic
+  00000000fe00b400-00000000fe00b43f (prio -1000, i/o): bcm2835-sp804
+  00000000fe00b440-00000000fe00b7ff (prio 0, ram): ram @000000003e00b440
+  00000000fe00b800-00000000fe00bbff (prio 0, i/o): bcm2835-mbox
+  00000000fe00bc00-00000000fe0fffff (prio 0, ram): ram @000000003e00bc00
+  00000000fe100000-00000000fe100fff (prio -1000, i/o): bcm2835-cprman
+  00000000fe101000-00000000fe101fff (prio 0, ram): ram @000000003e101000
+  00000000fe102000-00000000fe102fff (prio -1000, i/o): bcm2835-a2w
+  00000000fe103000-00000000fe103fff (prio 0, ram): ram @000000003e103000
+  00000000fe104000-00000000fe10400f (prio 0, i/o): bcm2835-rng
+  00000000fe104010-00000000fe1fffff (prio 0, ram): ram @000000003e104010
+  00000000fe200000-00000000fe200fff (prio 0, i/o): bcm2835_gpio
+  00000000fe201000-00000000fe201fff (prio 0, i/o): pl011
+  00000000fe202000-00000000fe202fff (prio 0, i/o): bcm2835-sdhost
+  00000000fe203000-00000000fe2030ff (prio -1000, i/o): bcm2835-i2s
+  00000000fe203100-00000000fe203fff (prio 0, ram): ram @000000003e203100
+  00000000fe204000-00000000fe20401f (prio -1000, i/o): bcm2835-spi0
+  00000000fe204020-00000000fe204fff (prio 0, ram): ram @000000003e204020
+  00000000fe205000-00000000fe20501f (prio -1000, i/o): bcm2835-i2c0
+  00000000fe205020-00000000fe20efff (prio 0, ram): ram @000000003e205020
+  00000000fe20f000-00000000fe20f07f (prio -1000, i/o): bcm2835-otp
+  00000000fe20f080-00000000fe211fff (prio 0, ram): ram @000000003e20f080
+  00000000fe212000-00000000fe212007 (prio 0, i/o): bcm2835-thermal
+  00000000fe212008-00000000fe213fff (prio 0, ram): ram @000000003e212008
+  00000000fe214000-00000000fe2140ff (prio -1000, i/o): bcm2835-spis
+  00000000fe214100-00000000fe214fff (prio 0, ram): ram @000000003e214100
+  00000000fe215000-00000000fe2150ff (prio 0, i/o): bcm2835-aux
+  00000000fe215100-00000000fe2fffff (prio 0, ram): ram @000000003e215100
+  00000000fe300000-00000000fe3000ff (prio 0, i/o): sdhci
+  00000000fe300100-00000000fe5fffff (prio 0, ram): ram @000000003e300100
+  00000000fe600000-00000000fe6000ff (prio -1000, i/o): bcm2835-smi
+  00000000fe600100-00000000fe803fff (prio 0, ram): ram @000000003e600100
+  00000000fe804000-00000000fe80401f (prio -1000, i/o): bcm2835-i2c1
+  00000000fe804020-00000000fe804fff (prio 0, ram): ram @000000003e804020
+  00000000fe805000-00000000fe80501f (prio -1000, i/o): bcm2835-i2c2
+  00000000fe805020-00000000fe8fffff (prio 0, ram): ram @000000003e805020
+  00000000fe900000-00000000fe907fff (prio -1000, i/o): bcm2835-dbus
+  00000000fe908000-00000000fe90ffff (prio 0, ram): ram @000000003e908000
+  00000000fe910000-00000000fe917fff (prio -1000, i/o): bcm2835-ave0
+  00000000fe918000-00000000fe97ffff (prio 0, ram): ram @000000003e918000
+  00000000fe980000-00000000fe980fff (prio -1000, i/o): dwc-usb2
+  00000000fe981000-00000000fedfffff (prio 0, ram): ram @000000003e981000
+  00000000fee00000-00000000fee000ff (prio -1000, i/o): bcm2835-sdramc
+  00000000fee00100-00000000fee04fff (prio 0, ram): ram @000000003ee00100
+  00000000fee05000-00000000fee050ff (prio 0, i/o): bcm2835-dma-chan15
+  00000000fee05100-00000000ffffffff (prio 0, ram): ram @000000003ee05100

This is for the raspi2, but there is a mmap schema for the raspi1:
https://www.cnx-software.com/wp-content/uploads/2012/02/BCM2835-Memory-Map-Large.png


I found another discrepancy in the Bonito north bridge, which maps in 
KSEG1 instead of physical:

$ git grep 'sysbus_mmio_map(.*, 0xb'
hw/pci-host/bonito.c:645:    sysbus_mmio_map(sysbus, 3, 0xbfe00200);
hw/pci-host/bonito.c:650:    sysbus_mmio_map(sysbus, 4, 0xbfe00300);


We can put this patch on hold for now, I might came back to this thread 
later with more use cases.

Regards,

Phil.
Paolo Bonzini Dec. 17, 2019, 10:51 a.m. UTC | #3
On 16/12/19 18:46, Philippe Mathieu-Daudé wrote:
>>>
>>
>> I think in some cases this could be intentional, for example if you have
>> different models with different BAR sizes and you organize this with the
>> same tree of MemoryRegion and different sizes for the parent.
> 
> But if a child is outside of the parent range, it can't be reached,
> right? I'm confused, maybe AddressSpace can do that, but MemoryRegion
> shouldn't?

Yes, the idea is that you could have for one version of the device

   parent 0x000-0x7ff
     stuff 0x000-0x3ff
     morestuff 0x400-0x7ff

and for another

   parent 0x000-0x3ff
     stuff 0x000-0x3ff
     morestuff 0x400-0x7ff

where parent is the BAR, and you can share the code to generate the tree
underneath parent.

> In the PCI case, you will simply alias a subregion with
> memory_region_init_alias(..., size), and size has to be <= parent size.
> But you won't add the PCI region, you'll add the alias, so the assert
> won't fire.

Yes, this is a workaround though.  You shouldn't need the alias.

I can see a case for your patch but I can also see one for the current
behavior...

Paolo
Christophe de Dinechin Dec. 17, 2019, 11:58 a.m. UTC | #4
> On 17 Dec 2019, at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 16/12/19 18:46, Philippe Mathieu-Daudé wrote:
>>>> 
>>> 
>>> I think in some cases this could be intentional, for example if you have
>>> different models with different BAR sizes and you organize this with the
>>> same tree of MemoryRegion and different sizes for the parent.
>> 
>> But if a child is outside of the parent range, it can't be reached,
>> right? I'm confused, maybe AddressSpace can do that, but MemoryRegion
>> shouldn't?
> 
> Yes, the idea is that you could have for one version of the device
> 
>   parent 0x000-0x7ff
>     stuff 0x000-0x3ff
>     morestuff 0x400-0x7ff
> 
> and for another
> 
>   parent 0x000-0x3ff
>     stuff 0x000-0x3ff
>     morestuff 0x400-0x7ff
> 
> where parent is the BAR, and you can share the code to generate the tree
> underneath parent.

I can see why you would have code reuse reasons to do that,
but frankly it looks buggy and confusing. In the rare cases
where this is indented, maybe add a flag making it explicit?

> 
>> In the PCI case, you will simply alias a subregion with
>> memory_region_init_alias(..., size), and size has to be <= parent size.
>> But you won't add the PCI region, you'll add the alias, so the assert
>> won't fire.
> 
> Yes, this is a workaround though.  You shouldn't need the alias.
> 
> I can see a case for your patch but I can also see one for the current
> behavior...
> 
> Paolo
> 
>
Richard Henderson Dec. 17, 2019, 4:57 p.m. UTC | #5
On 12/17/19 1:58 AM, Christophe de Dinechin wrote:
> 
> 
>> On 17 Dec 2019, at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 16/12/19 18:46, Philippe Mathieu-Daudé wrote:
>>>>>
>>>>
>>>> I think in some cases this could be intentional, for example if you have
>>>> different models with different BAR sizes and you organize this with the
>>>> same tree of MemoryRegion and different sizes for the parent.
>>>
>>> But if a child is outside of the parent range, it can't be reached,
>>> right? I'm confused, maybe AddressSpace can do that, but MemoryRegion
>>> shouldn't?
>>
>> Yes, the idea is that you could have for one version of the device
>>
>>   parent 0x000-0x7ff
>>     stuff 0x000-0x3ff
>>     morestuff 0x400-0x7ff
>>
>> and for another
>>
>>   parent 0x000-0x3ff
>>     stuff 0x000-0x3ff
>>     morestuff 0x400-0x7ff
>>
>> where parent is the BAR, and you can share the code to generate the tree
>> underneath parent.
> 
> I can see why you would have code reuse reasons to do that,
> but frankly it looks buggy and confusing. In the rare cases
> where this is indented, maybe add a flag making it explicit?

The guest OS is programming the BAR, producing a configuration that, while it
doesn't make sense, is also legal per PCI.  QEMU cannot abort for this
configuration.


r~
Peter Maydell Dec. 17, 2019, 6:17 p.m. UTC | #6
On Tue, 17 Dec 2019 at 16:57, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 12/17/19 1:58 AM, Christophe de Dinechin wrote:
> >
> >
> >> On 17 Dec 2019, at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >> Yes, the idea is that you could have for one version of the device
> >>
> >>   parent 0x000-0x7ff
> >>     stuff 0x000-0x3ff
> >>     morestuff 0x400-0x7ff
> >>
> >> and for another
> >>
> >>   parent 0x000-0x3ff
> >>     stuff 0x000-0x3ff
> >>     morestuff 0x400-0x7ff
> >>
> >> where parent is the BAR, and you can share the code to generate the tree
> >> underneath parent.
> >
> > I can see why you would have code reuse reasons to do that,
> > but frankly it looks buggy and confusing. In the rare cases
> > where this is indented, maybe add a flag making it explicit?
>
> The guest OS is programming the BAR, producing a configuration that, while it
> doesn't make sense, is also legal per PCI.  QEMU cannot abort for this
> configuration.

Does guest programming of the PCI BAR size actually change the size
of the 'parent' region, or does it just result in the creation
of an appropriately sized alias into 'parent' ?

thanks
-- PMM
Paolo Bonzini Dec. 17, 2019, 6:31 p.m. UTC | #7
On 17/12/19 19:17, Peter Maydell wrote:
> On Tue, 17 Dec 2019 at 16:57, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> On 12/17/19 1:58 AM, Christophe de Dinechin wrote:
>>>
>>>
>>>> On 17 Dec 2019, at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> Yes, the idea is that you could have for one version of the device
>>>>
>>>>   parent 0x000-0x7ff
>>>>     stuff 0x000-0x3ff
>>>>     morestuff 0x400-0x7ff
>>>>
>>>> and for another
>>>>
>>>>   parent 0x000-0x3ff
>>>>     stuff 0x000-0x3ff
>>>>     morestuff 0x400-0x7ff
>>>>
>>>> where parent is the BAR, and you can share the code to generate the tree
>>>> underneath parent.
>>>
>>> I can see why you would have code reuse reasons to do that,
>>> but frankly it looks buggy and confusing. In the rare cases
>>> where this is indented, maybe add a flag making it explicit?
>>
>> The guest OS is programming the BAR, producing a configuration that, while it
>> doesn't make sense, is also legal per PCI.  QEMU cannot abort for this
>> configuration.
> 
> Does guest programming of the PCI BAR size actually change the size
> of the 'parent' region, or does it just result in the creation
> of an appropriately sized alias into 'parent' ?

Resizable BARs are not handled by the PCI host bridge but rather from
the device itself, so the device is free to handle them either way.

Paolo
Alex Williamson Dec. 17, 2019, 6:52 p.m. UTC | #8
On Tue, 17 Dec 2019 19:31:41 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 17/12/19 19:17, Peter Maydell wrote:
> > On Tue, 17 Dec 2019 at 16:57, Richard Henderson
> > <richard.henderson@linaro.org> wrote:  
> >>
> >> On 12/17/19 1:58 AM, Christophe de Dinechin wrote:  
> >>>
> >>>  
> >>>> On 17 Dec 2019, at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>> Yes, the idea is that you could have for one version of the device
> >>>>
> >>>>   parent 0x000-0x7ff
> >>>>     stuff 0x000-0x3ff
> >>>>     morestuff 0x400-0x7ff
> >>>>
> >>>> and for another
> >>>>
> >>>>   parent 0x000-0x3ff
> >>>>     stuff 0x000-0x3ff
> >>>>     morestuff 0x400-0x7ff
> >>>>
> >>>> where parent is the BAR, and you can share the code to generate the tree
> >>>> underneath parent.  
> >>>
> >>> I can see why you would have code reuse reasons to do that,
> >>> but frankly it looks buggy and confusing. In the rare cases
> >>> where this is indented, maybe add a flag making it explicit?  
> >>
> >> The guest OS is programming the BAR, producing a configuration that, while it
> >> doesn't make sense, is also legal per PCI.  QEMU cannot abort for this
> >> configuration.  
> > 
> > Does guest programming of the PCI BAR size actually change the size
> > of the 'parent' region, or does it just result in the creation
> > of an appropriately sized alias into 'parent' ?  
> 
> Resizable BARs are not handled by the PCI host bridge but rather from
> the device itself, so the device is free to handle them either way.

More specifically, it's the responsibility of drivers within the guest
to resize the parent bridge aperture to make the extent of the BAR
accessible.  This does seem like an interesting way to implement a
resizable BAR in QEMU though.  Thanks,

Alex
Philippe Mathieu-Daudé Dec. 17, 2019, 7:17 p.m. UTC | #9
On 12/17/19 7:52 PM, Alex Williamson wrote:
> On Tue, 17 Dec 2019 19:31:41 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>> On 17/12/19 19:17, Peter Maydell wrote:
>>> On Tue, 17 Dec 2019 at 16:57, Richard Henderson
>>> <richard.henderson@linaro.org> wrote:
>>>>
>>>> On 12/17/19 1:58 AM, Christophe de Dinechin wrote:
>>>>>
>>>>>   
>>>>>> On 17 Dec 2019, at 11:51, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>>>> Yes, the idea is that you could have for one version of the device
>>>>>>
>>>>>>    parent 0x000-0x7ff
>>>>>>      stuff 0x000-0x3ff
>>>>>>      morestuff 0x400-0x7ff
>>>>>>
>>>>>> and for another
>>>>>>
>>>>>>    parent 0x000-0x3ff
>>>>>>      stuff 0x000-0x3ff
>>>>>>      morestuff 0x400-0x7ff
>>>>>>
>>>>>> where parent is the BAR, and you can share the code to generate the tree
>>>>>> underneath parent.
>>>>>
>>>>> I can see why you would have code reuse reasons to do that,
>>>>> but frankly it looks buggy and confusing. In the rare cases
>>>>> where this is indented, maybe add a flag making it explicit?
>>>>
>>>> The guest OS is programming the BAR, producing a configuration that, while it
>>>> doesn't make sense, is also legal per PCI.  QEMU cannot abort for this
>>>> configuration.
>>>
>>> Does guest programming of the PCI BAR size actually change the size
>>> of the 'parent' region, or does it just result in the creation
>>> of an appropriately sized alias into 'parent' ?
>>
>> Resizable BARs are not handled by the PCI host bridge but rather from
>> the device itself, so the device is free to handle them either way.
> 
> More specifically, it's the responsibility of drivers within the guest
> to resize the parent bridge aperture to make the extent of the BAR
> accessible.  This does seem like an interesting way to implement a
> resizable BAR in QEMU though.  Thanks,

This is something I'm thinking about since some time, as I observed this 
behavior in 3 different MIPS boards with different northbridge chipset 
(Malta with the GT64120, Fuloong2E with the Bonito).
The firmware sets one layout, Linux (or other) reinit & reorder all the 
memory layout. I guess Mark hit the same issue with his sparc64 based 
boards.
diff mbox series

Patch

diff --git a/memory.c b/memory.c
index 06484c2bff..61f355dcd5 100644
--- a/memory.c
+++ b/memory.c
@@ -2390,6 +2390,7 @@  static void memory_region_add_subregion_common(MemoryRegion *mr,
 {
     assert(!subregion->container);
     subregion->container = mr;
+    assert(offset + memory_region_size(subregion) <= memory_region_size(mr));
     subregion->addr = offset;
     memory_region_update_container_subregions(subregion);
 }