diff mbox series

[v5,27/36] sysbus: add a sysbus_mmio_unmap() helper

Message ID 20181116105729.23240-28-clg@kaod.org (mailing list archive)
State New, archived
Headers show
Series ppc: support for the XIVE interrupt controller (POWER9) | expand

Commit Message

Cédric Le Goater Nov. 16, 2018, 10:57 a.m. UTC
This will be used to remove the MMIO regions of the POWER9 XIVE
interrupt controller when the sPAPR machine is reseted.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 include/hw/sysbus.h |  1 +
 hw/core/sysbus.c    | 10 ++++++++++
 2 files changed, 11 insertions(+)

Comments

David Gibson Nov. 29, 2018, 4:09 a.m. UTC | #1
On Fri, Nov 16, 2018 at 11:57:20AM +0100, Cédric Le Goater wrote:
> This will be used to remove the MMIO regions of the POWER9 XIVE
> interrupt controller when the sPAPR machine is reseted.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

Since the code looks sane.

Hoever, I think using memory_region_set_enabled() would be a better
idea for our purposes than actually adding/deleting the subregion.

> ---
>  include/hw/sysbus.h |  1 +
>  hw/core/sysbus.c    | 10 ++++++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
> index 0b59a3b8d605..bc641984b5da 100644
> --- a/include/hw/sysbus.h
> +++ b/include/hw/sysbus.h
> @@ -92,6 +92,7 @@ qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n);
>  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
>  void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
>                               int priority);
> +void sysbus_mmio_unmap(SysBusDevice *dev, int n);
>  void sysbus_add_io(SysBusDevice *dev, hwaddr addr,
>                     MemoryRegion *mem);
>  MemoryRegion *sysbus_address_space(SysBusDevice *dev);
> diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
> index 7ac36ad3e707..09f202167dcb 100644
> --- a/hw/core/sysbus.c
> +++ b/hw/core/sysbus.c
> @@ -153,6 +153,16 @@ static void sysbus_mmio_map_common(SysBusDevice *dev, int n, hwaddr addr,
>      }
>  }
>  
> +void sysbus_mmio_unmap(SysBusDevice *dev, int n)
> +{
> +    assert(n >= 0 && n < dev->num_mmio);
> +
> +    if (dev->mmio[n].addr != (hwaddr)-1) {
> +        memory_region_del_subregion(get_system_memory(), dev->mmio[n].memory);
> +        dev->mmio[n].addr = (hwaddr)-1;
> +    }
> +}
> +
>  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr)
>  {
>      sysbus_mmio_map_common(dev, n, addr, false, 0);
Cédric Le Goater Nov. 29, 2018, 4:36 p.m. UTC | #2
On 11/29/18 5:09 AM, David Gibson wrote:
> On Fri, Nov 16, 2018 at 11:57:20AM +0100, Cédric Le Goater wrote:
>> This will be used to remove the MMIO regions of the POWER9 XIVE
>> interrupt controller when the sPAPR machine is reseted.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> 
> Since the code looks sane.
> 
> Hoever, I think using memory_region_set_enabled() would be a better
> idea for our purposes than actually adding/deleting the subregion.

Yes and we might not need this one anymore. 

Thanks,

C.

>> ---
>>  include/hw/sysbus.h |  1 +
>>  hw/core/sysbus.c    | 10 ++++++++++
>>  2 files changed, 11 insertions(+)
>>
>> diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
>> index 0b59a3b8d605..bc641984b5da 100644
>> --- a/include/hw/sysbus.h
>> +++ b/include/hw/sysbus.h
>> @@ -92,6 +92,7 @@ qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n);
>>  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
>>  void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
>>                               int priority);
>> +void sysbus_mmio_unmap(SysBusDevice *dev, int n);
>>  void sysbus_add_io(SysBusDevice *dev, hwaddr addr,
>>                     MemoryRegion *mem);
>>  MemoryRegion *sysbus_address_space(SysBusDevice *dev);
>> diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
>> index 7ac36ad3e707..09f202167dcb 100644
>> --- a/hw/core/sysbus.c
>> +++ b/hw/core/sysbus.c
>> @@ -153,6 +153,16 @@ static void sysbus_mmio_map_common(SysBusDevice *dev, int n, hwaddr addr,
>>      }
>>  }
>>  
>> +void sysbus_mmio_unmap(SysBusDevice *dev, int n)
>> +{
>> +    assert(n >= 0 && n < dev->num_mmio);
>> +
>> +    if (dev->mmio[n].addr != (hwaddr)-1) {
>> +        memory_region_del_subregion(get_system_memory(), dev->mmio[n].memory);
>> +        dev->mmio[n].addr = (hwaddr)-1;
>> +    }
>> +}
>> +
>>  void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr)
>>  {
>>      sysbus_mmio_map_common(dev, n, addr, false, 0);
>
Cédric Le Goater Dec. 3, 2018, 3:52 p.m. UTC | #3
On 11/29/18 5:36 PM, Cédric Le Goater wrote:
> On 11/29/18 5:09 AM, David Gibson wrote:
>> On Fri, Nov 16, 2018 at 11:57:20AM +0100, Cédric Le Goater wrote:
>>> This will be used to remove the MMIO regions of the POWER9 XIVE
>>> interrupt controller when the sPAPR machine is reseted.
>>>
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>
>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>>
>> Since the code looks sane.
>>
>> Hoever, I think using memory_region_set_enabled() would be a better
>> idea for our purposes than actually adding/deleting the subregion.
> 
> Yes and we might not need this one anymore. 

As we are destroying the KVM device, we also need to remove the mmap 
in QEMU, else we will have a VMA with a page fault handler pointing
on a bogus KVM device.  which means destroying the memory region, so
we can not use  memory_region_set_enabled(). 

Anyhow mapping/unmapping works well.

Thanks,

C.
Peter Maydell Dec. 3, 2018, 5:48 p.m. UTC | #4
On Thu, 29 Nov 2018 at 04:55, David Gibson <david@gibson.dropbear.id.au> wrote:
>
> On Fri, Nov 16, 2018 at 11:57:20AM +0100, Cédric Le Goater wrote:
> > This will be used to remove the MMIO regions of the POWER9 XIVE
> > interrupt controller when the sPAPR machine is reseted.
> >
> > Signed-off-by: Cédric Le Goater <clg@kaod.org>
>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>
> Since the code looks sane.
>
> Hoever, I think using memory_region_set_enabled() would be a better
> idea for our purposes than actually adding/deleting the subregion.

The other approach I've used in the past is to use
sysbus_mmio_get_region() and then just map and unmap
that directly, rather than using the sysbus_mmio_map()
convenience function. (Often the kind of device that's
doing complicated things like this will be working in
a setup where it doesn't necessarily want to be mapping
directly into system memory rather than an SoC or similar
container MemoryRegion anyway.)

thanks
-- PMM
David Gibson Dec. 4, 2018, 1:59 a.m. UTC | #5
On Mon, Dec 03, 2018 at 04:52:46PM +0100, Cédric Le Goater wrote:
> On 11/29/18 5:36 PM, Cédric Le Goater wrote:
> > On 11/29/18 5:09 AM, David Gibson wrote:
> >> On Fri, Nov 16, 2018 at 11:57:20AM +0100, Cédric Le Goater wrote:
> >>> This will be used to remove the MMIO regions of the POWER9 XIVE
> >>> interrupt controller when the sPAPR machine is reseted.
> >>>
> >>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>
> >> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> >>
> >> Since the code looks sane.
> >>
> >> Hoever, I think using memory_region_set_enabled() would be a better
> >> idea for our purposes than actually adding/deleting the subregion.
> > 
> > Yes and we might not need this one anymore. 
> 
> As we are destroying the KVM device, we also need to remove the mmap 
> in QEMU, else we will have a VMA with a page fault handler pointing
> on a bogus KVM device.  which means destroying the memory region, so
> we can not use  memory_region_set_enabled(). 
> 
> Anyhow mapping/unmapping works well.

Ah, ok.
Cédric Le Goater Dec. 4, 2018, 12:33 p.m. UTC | #6
Hello Peter,

On 12/3/18 6:48 PM, Peter Maydell wrote:
> On Thu, 29 Nov 2018 at 04:55, David Gibson <david@gibson.dropbear.id.au> wrote:
>>
>> On Fri, Nov 16, 2018 at 11:57:20AM +0100, Cédric Le Goater wrote:
>>> This will be used to remove the MMIO regions of the POWER9 XIVE
>>> interrupt controller when the sPAPR machine is reseted.
>>>
>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>>
>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>>
>> Since the code looks sane.
>>
>> Hoever, I think using memory_region_set_enabled() would be a better
>> idea for our purposes than actually adding/deleting the subregion.
> 
> The other approach I've used in the past is to use
> sysbus_mmio_get_region() and then just map and unmap
> that directly, rather than using the sysbus_mmio_map()
> convenience function. (Often the kind of device that's
> doing complicated things like this will be working in
> a setup where it doesn't necessarily want to be mapping
> directly into system memory rather than an SoC or similar
> container MemoryRegion anyway.)

Thanks for chiming in on that patch. Here is some background on 
what we are trying to model. May be you have some suggestions.

A completely new interrupt controller was introduced on the POWER9 
processor and it uses MMIO regions for interrupt management. These
regions are backed by simple MRs in QEMU, when using TCG, and backed
by ram_device_ptr MRs under KVM.

Difficulties arise with the fact that POWER9 pseries guests need
to support the old mode (XICS, no MMIOs) and the new mode XIVE.
The interrupt mode is negotiated at boot between the hypervisor
and the guest and a reset is generated to take into account 
the changes. Which means that, at reset, we may need to disconnect 
from a KVM IC device and reconnect to another. 

When switching from XICS to XIVE mode : 

  if kvm
    - destroy KVM XICS device
    - create KVM XIVE device 
    - get fd, mmap, init ram_device_ptr MRs
    - map mmio
  - enable MMIOs

When switching from XIVE to XICS  : 
  
  - disable MMIOs 
  if kvm
    - delete MRs
    - munmap
    - destroy KVM XIVE device
    - create KVM XICS device 


Thanks,

C.
Peter Maydell Dec. 4, 2018, 1:04 p.m. UTC | #7
On Tue, 4 Dec 2018 at 12:33, Cédric Le Goater <clg@kaod.org> wrote:
> A completely new interrupt controller was introduced on the POWER9
> processor and it uses MMIO regions for interrupt management. These
> regions are backed by simple MRs in QEMU, when using TCG, and backed
> by ram_device_ptr MRs under KVM.
>
> Difficulties arise with the fact that POWER9 pseries guests need
> to support the old mode (XICS, no MMIOs) and the new mode XIVE.
> The interrupt mode is negotiated at boot between the hypervisor
> and the guest and a reset is generated to take into account
> the changes. Which means that, at reset, we may need to disconnect
> from a KVM IC device and reconnect to another.

This is a painful API for QEMU to implement, incidentally,
because we don't have any concept really of a warm reset. In
theory reset should get you back to exactly the same state
as if you'd just started QEMU. You can probably bodge something
together, though.

> When switching from XICS to XIVE mode :
>
>   if kvm
>     - destroy KVM XICS device
>     - create KVM XIVE device
>     - get fd, mmap, init ram_device_ptr MRs
>     - map mmio
>   - enable MMIOs
>
> When switching from XIVE to XICS  :
>
>   - disable MMIOs
>   if kvm
>     - delete MRs
>     - munmap
>     - destroy KVM XIVE device
>     - create KVM XICS device

This seems basically OK, I think.

thanks
-- PMM
diff mbox series

Patch

diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
index 0b59a3b8d605..bc641984b5da 100644
--- a/include/hw/sysbus.h
+++ b/include/hw/sysbus.h
@@ -92,6 +92,7 @@  qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n);
 void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr);
 void sysbus_mmio_map_overlap(SysBusDevice *dev, int n, hwaddr addr,
                              int priority);
+void sysbus_mmio_unmap(SysBusDevice *dev, int n);
 void sysbus_add_io(SysBusDevice *dev, hwaddr addr,
                    MemoryRegion *mem);
 MemoryRegion *sysbus_address_space(SysBusDevice *dev);
diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index 7ac36ad3e707..09f202167dcb 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -153,6 +153,16 @@  static void sysbus_mmio_map_common(SysBusDevice *dev, int n, hwaddr addr,
     }
 }
 
+void sysbus_mmio_unmap(SysBusDevice *dev, int n)
+{
+    assert(n >= 0 && n < dev->num_mmio);
+
+    if (dev->mmio[n].addr != (hwaddr)-1) {
+        memory_region_del_subregion(get_system_memory(), dev->mmio[n].memory);
+        dev->mmio[n].addr = (hwaddr)-1;
+    }
+}
+
 void sysbus_mmio_map(SysBusDevice *dev, int n, hwaddr addr)
 {
     sysbus_mmio_map_common(dev, n, addr, false, 0);