Message ID | 20180920103243.28474-7-david@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | memory-device: complete refactoring + virtio-pmem | expand |
On Thu, Sep 20, 2018 at 12:32:27PM +0200, David Hildenbrand wrote: > Document the functions and when to not expect errors. > > Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > include/hw/mem/memory-device.h | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h > index f02b229837..d6853156ff 100644 > --- a/include/hw/mem/memory-device.h > +++ b/include/hw/mem/memory-device.h > @@ -29,9 +29,22 @@ typedef struct MemoryDeviceState { > Object parent_obj; > } MemoryDeviceState; > > +/** > + * MemoryDeviceClass: > + * @get_addr: The address of the @md in guest physical memory. "0" means that > + * no address has been specified by the user and that no address has been > + * assigned yet. > + * @get_plugged_size: The amount of memory provided by this @md currently > + * usable ("plugged") by the guest. > + * @get_region_size: The size of the memory region of the @md that's mapped > + * in guest physical memory at @get_addr. > + * @fill_device_info: Translate current @md state into #MemoryDeviceInfo. > + */ > typedef struct MemoryDeviceClass { > + /* private */ > InterfaceClass parent_class; > > + /* public */ > uint64_t (*get_addr)(const MemoryDeviceState *md); > uint64_t (*get_plugged_size)(const MemoryDeviceState *md, Error **errp); > uint64_t (*get_region_size)(const MemoryDeviceState *md, Error **errp);
On Thu, 20 Sep 2018 12:32:27 +0200 David Hildenbrand <david@redhat.com> wrote: > Document the functions and when to not expect errors. > > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > include/hw/mem/memory-device.h | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h [...] > + * @get_plugged_size: The amount of memory provided by this @md currently > + * usable ("plugged") by the guest. > + * @get_region_size: The size of the memory region of the @md that's mapped > + * in guest physical memory at @get_addr Tried to read it several times, but still description of hooks makes them look like they are doing the same thing, so what's the difference? [...]
On 24/09/2018 14:22, Igor Mammedov wrote: > On Thu, 20 Sep 2018 12:32:27 +0200 > David Hildenbrand <david@redhat.com> wrote: > >> Document the functions and when to not expect errors. >> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> include/hw/mem/memory-device.h | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h > [...] > >> + * @get_plugged_size: The amount of memory provided by this @md currently >> + * usable ("plugged") by the guest. >> + * @get_region_size: The size of the memory region of the @md that's mapped >> + * in guest physical memory at @get_addr > Tried to read it several times, > but still description of hooks makes them look like they are doing the same thing, > so what's the difference? That's a preparation mainly for virtio-mem, where we manage the amount of memory that is actually accessible and visible by the guest (plugged) inside a memory region dynamically. The difference exists purely for stats. (could have been introduced later, but it made sense to split this off right when factoring this stuff out into memory-device code).
On Mon, 24 Sep 2018 14:26:23 +0200 David Hildenbrand <david@redhat.com> wrote: > On 24/09/2018 14:22, Igor Mammedov wrote: > > On Thu, 20 Sep 2018 12:32:27 +0200 > > David Hildenbrand <david@redhat.com> wrote: > > > >> Document the functions and when to not expect errors. > >> > >> Signed-off-by: David Hildenbrand <david@redhat.com> > >> --- > >> include/hw/mem/memory-device.h | 13 +++++++++++++ > >> 1 file changed, 13 insertions(+) > >> > >> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h > > [...] > > > >> + * @get_plugged_size: The amount of memory provided by this @md currently > >> + * usable ("plugged") by the guest. > >> + * @get_region_size: The size of the memory region of the @md that's mapped > >> + * in guest physical memory at @get_addr > > Tried to read it several times, > > but still description of hooks makes them look like they are doing the same thing, > > so what's the difference? > > That's a preparation mainly for virtio-mem, where we manage the amount > of memory that is actually accessible and visible by the guest (plugged) > inside a memory region dynamically. The difference exists purely for > stats. (could have been introduced later, but it made sense to split > this off right when factoring this stuff out into memory-device code). Purpose might be obvious (for you) right now, but later when reader would be reading history it's confusing. I'd postpone it until there is an actual user for it.
On 24/09/2018 14:39, Igor Mammedov wrote: > On Mon, 24 Sep 2018 14:26:23 +0200 > David Hildenbrand <david@redhat.com> wrote: > >> On 24/09/2018 14:22, Igor Mammedov wrote: >>> On Thu, 20 Sep 2018 12:32:27 +0200 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> Document the functions and when to not expect errors. >>>> >>>> Signed-off-by: David Hildenbrand <david@redhat.com> >>>> --- >>>> include/hw/mem/memory-device.h | 13 +++++++++++++ >>>> 1 file changed, 13 insertions(+) >>>> >>>> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h >>> [...] >>> >>>> + * @get_plugged_size: The amount of memory provided by this @md currently >>>> + * usable ("plugged") by the guest. >>>> + * @get_region_size: The size of the memory region of the @md that's mapped >>>> + * in guest physical memory at @get_addr >>> Tried to read it several times, >>> but still description of hooks makes them look like they are doing the same thing, >>> so what's the difference? >> >> That's a preparation mainly for virtio-mem, where we manage the amount >> of memory that is actually accessible and visible by the guest (plugged) >> inside a memory region dynamically. The difference exists purely for >> stats. (could have been introduced later, but it made sense to split >> this off right when factoring this stuff out into memory-device code). > Purpose might be obvious (for you) right now, but later when reader > would be reading history it's confusing. > I'd postpone it until there is an actual user for it. > What to postpone? The doc update? get_plugged_size is already upstream.
On Mon, 24 Sep 2018 14:40:24 +0200 David Hildenbrand <david@redhat.com> wrote: > On 24/09/2018 14:39, Igor Mammedov wrote: > > On Mon, 24 Sep 2018 14:26:23 +0200 > > David Hildenbrand <david@redhat.com> wrote: > > > >> On 24/09/2018 14:22, Igor Mammedov wrote: > >>> On Thu, 20 Sep 2018 12:32:27 +0200 > >>> David Hildenbrand <david@redhat.com> wrote: > >>> > >>>> Document the functions and when to not expect errors. > >>>> > >>>> Signed-off-by: David Hildenbrand <david@redhat.com> > >>>> --- > >>>> include/hw/mem/memory-device.h | 13 +++++++++++++ > >>>> 1 file changed, 13 insertions(+) > >>>> > >>>> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h > >>> [...] > >>> > >>>> + * @get_plugged_size: The amount of memory provided by this @md currently > >>>> + * usable ("plugged") by the guest. > >>>> + * @get_region_size: The size of the memory region of the @md that's mapped > >>>> + * in guest physical memory at @get_addr > >>> Tried to read it several times, > >>> but still description of hooks makes them look like they are doing the same thing, > >>> so what's the difference? > >> > >> That's a preparation mainly for virtio-mem, where we manage the amount > >> of memory that is actually accessible and visible by the guest (plugged) > >> inside a memory region dynamically. The difference exists purely for > >> stats. (could have been introduced later, but it made sense to split > >> this off right when factoring this stuff out into memory-device code). > > Purpose might be obvious (for you) right now, but later when reader > > would be reading history it's confusing. > > I'd postpone it until there is an actual user for it. > > > > What to postpone? The doc update? get_plugged_size is already upstream. We can't do anything to merged stuff (unless one would remove it :/), but doc update as is, just adding more confusion. Alternatively clarify distinction referring to future devices that might return different values for both hooks.
On 24/09/2018 15:19, Igor Mammedov wrote: > On Mon, 24 Sep 2018 14:40:24 +0200 > David Hildenbrand <david@redhat.com> wrote: > >> On 24/09/2018 14:39, Igor Mammedov wrote: >>> On Mon, 24 Sep 2018 14:26:23 +0200 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> On 24/09/2018 14:22, Igor Mammedov wrote: >>>>> On Thu, 20 Sep 2018 12:32:27 +0200 >>>>> David Hildenbrand <david@redhat.com> wrote: >>>>> >>>>>> Document the functions and when to not expect errors. >>>>>> >>>>>> Signed-off-by: David Hildenbrand <david@redhat.com> >>>>>> --- >>>>>> include/hw/mem/memory-device.h | 13 +++++++++++++ >>>>>> 1 file changed, 13 insertions(+) >>>>>> >>>>>> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h >>>>> [...] >>>>> >>>>>> + * @get_plugged_size: The amount of memory provided by this @md currently >>>>>> + * usable ("plugged") by the guest. >>>>>> + * @get_region_size: The size of the memory region of the @md that's mapped >>>>>> + * in guest physical memory at @get_addr >>>>> Tried to read it several times, >>>>> but still description of hooks makes them look like they are doing the same thing, >>>>> so what's the difference? >>>> >>>> That's a preparation mainly for virtio-mem, where we manage the amount >>>> of memory that is actually accessible and visible by the guest (plugged) >>>> inside a memory region dynamically. The difference exists purely for >>>> stats. (could have been introduced later, but it made sense to split >>>> this off right when factoring this stuff out into memory-device code). >>> Purpose might be obvious (for you) right now, but later when reader >>> would be reading history it's confusing. >>> I'd postpone it until there is an actual user for it. >>> >> >> What to postpone? The doc update? get_plugged_size is already upstream. > We can't do anything to merged stuff (unless one would remove it :/), > but doc update as is, just adding more confusion. > > Alternatively clarify distinction referring to future devices that > might return different values for both hooks. Yes, I will add more details then, thanks!
On Thu, 20 Sep 2018 12:32:27 +0200 David Hildenbrand <david@redhat.com> wrote: > Document the functions and when to not expect errors. > > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > include/hw/mem/memory-device.h | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h > index f02b229837..d6853156ff 100644 > --- a/include/hw/mem/memory-device.h > +++ b/include/hw/mem/memory-device.h > @@ -29,9 +29,22 @@ typedef struct MemoryDeviceState { > Object parent_obj; > } MemoryDeviceState; > > +/** > + * MemoryDeviceClass: > + * @get_addr: The address of the @md in guest physical memory. "0" means that > + * no address has been specified by the user and that no address has been > + * assigned yet. While looking at attempts to implement '[Qemu-devel] [PATCH v12 6/6] tpm: add ACPI memory clear interface' on QEMU side, where we are trying to clear RAM going via ramblocks list. Maybe we are doing it backwards and instead of trying to deal with host abstraction RAMBlocks/MemoryRegion and follow up efforts to tag it with device model attributes '[PATCH] RFC: mark non-volatile memory region' we should do it other way around and approach it from guest point of view like firmware would do, i.e. make TPM enumerate RAM devices and tell them to clear related memory. Concrete device would know how to do it properly and/or how to ask backend to do it without need to tag RAMBlocks with device model specific info. That would allow to enumerate all RAM without trying to filter out not RAM RAMBlocks and not pollute host specific RAMBlock/MemoryRegion with device specific flags. However we can't do it right now, since built-in memory doesn't have a corresponding device model and we use MemoryRegions directly. Maybe we should use memory-devices and create derivative 'ramchip' device to represent builtin RAM and use it instead. So if we were to do that, we would need to make get_addr() return value 0 a valid one (suggest (uint64_t)-1 as unset value) and/or make built in memory work outside of device_memory region as short cut impl.
On 25/09/2018 16:19, Igor Mammedov wrote: > On Thu, 20 Sep 2018 12:32:27 +0200 > David Hildenbrand <david@redhat.com> wrote: > >> Document the functions and when to not expect errors. >> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> include/hw/mem/memory-device.h | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h >> index f02b229837..d6853156ff 100644 >> --- a/include/hw/mem/memory-device.h >> +++ b/include/hw/mem/memory-device.h >> @@ -29,9 +29,22 @@ typedef struct MemoryDeviceState { >> Object parent_obj; >> } MemoryDeviceState; >> >> +/** >> + * MemoryDeviceClass: >> + * @get_addr: The address of the @md in guest physical memory. "0" means that >> + * no address has been specified by the user and that no address has been >> + * assigned yet. > While looking at attempts to implement > '[Qemu-devel] [PATCH v12 6/6] tpm: add ACPI memory clear interface' > on QEMU side, where we are trying to clear RAM going via ramblocks list. > > Maybe we are doing it backwards and instead of trying to deal with host > abstraction RAMBlocks/MemoryRegion and follow up efforts to tag it with > device model attributes > '[PATCH] RFC: mark non-volatile memory region' > we should do it other way around and approach it from guest point of view > like firmware would do, i.e. > make TPM enumerate RAM devices and tell them to clear related memory. > Concrete device would know how to do it properly and/or how to ask backend > to do it without need to tag RAMBlocks with device model specific info. > That would allow to enumerate all RAM without trying to filter out not RAM > RAMBlocks and not pollute host specific RAMBlock/MemoryRegion with device > specific flags. I agree, I consider it also as a problem that many different mechanism in the code assume that any RAMBlock can e.g. be read/written etc. Or, as you said, try to resolve certain properties from RAMBlocks. If we could let the devices handle details (e.g. memory-device class callbacks or similar), we would move the logic to a place where we actually know what is happenig. > > However we can't do it right now, since built-in memory doesn't have > a corresponding device model and we use MemoryRegions directly. > Maybe we should use memory-devices and create derivative 'ramchip' device > to represent builtin RAM and use it instead. Yes, we would need some sort of internal DIMM (however, as you correctly state, DIMM is the wrong term, it really is *some* kind of ram device). > > So if we were to do that, we would need to make get_addr() return value 0 > a valid one (suggest (uint64_t)-1 as unset value) and/or make built in > memory work outside of device_memory region as short cut impl. > Something like that will certainly work, but require many other changes. E.g. the device memory region has to be sized and setup completely different. But then, we can change the default (have to whatch out about compatibility handling when e.g. addr=0 is passed to e.g. dimms on the cmdline).
On Tue, 25 Sep 2018 17:10:43 +0200 David Hildenbrand <david@redhat.com> wrote: > On 25/09/2018 16:19, Igor Mammedov wrote: > > On Thu, 20 Sep 2018 12:32:27 +0200 > > David Hildenbrand <david@redhat.com> wrote: > > > >> Document the functions and when to not expect errors. > >> > >> Signed-off-by: David Hildenbrand <david@redhat.com> > >> --- > >> include/hw/mem/memory-device.h | 13 +++++++++++++ > >> 1 file changed, 13 insertions(+) > >> > >> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h > >> index f02b229837..d6853156ff 100644 > >> --- a/include/hw/mem/memory-device.h > >> +++ b/include/hw/mem/memory-device.h > >> @@ -29,9 +29,22 @@ typedef struct MemoryDeviceState { > >> Object parent_obj; > >> } MemoryDeviceState; > >> > >> +/** > >> + * MemoryDeviceClass: > >> + * @get_addr: The address of the @md in guest physical memory. "0" means that > >> + * no address has been specified by the user and that no address has been > >> + * assigned yet. > > While looking at attempts to implement > > '[Qemu-devel] [PATCH v12 6/6] tpm: add ACPI memory clear interface' > > on QEMU side, where we are trying to clear RAM going via ramblocks list. > > > > Maybe we are doing it backwards and instead of trying to deal with host > > abstraction RAMBlocks/MemoryRegion and follow up efforts to tag it with > > device model attributes > > '[PATCH] RFC: mark non-volatile memory region' > > we should do it other way around and approach it from guest point of view > > like firmware would do, i.e. > > make TPM enumerate RAM devices and tell them to clear related memory. > > Concrete device would know how to do it properly and/or how to ask backend > > to do it without need to tag RAMBlocks with device model specific info. > > That would allow to enumerate all RAM without trying to filter out not RAM > > RAMBlocks and not pollute host specific RAMBlock/MemoryRegion with device > > specific flags. > > I agree, I consider it also as a problem that many different mechanism > in the code assume that any RAMBlock can e.g. be read/written etc. Or, > as you said, try to resolve certain properties from RAMBlocks. If we > could let the devices handle details (e.g. memory-device class callbacks > or similar), we would move the logic to a place where we actually know > what is happenig. > > > > > However we can't do it right now, since built-in memory doesn't have > > a corresponding device model and we use MemoryRegions directly. > > Maybe we should use memory-devices and create derivative 'ramchip' device > > to represent builtin RAM and use it instead. > > Yes, we would need some sort of internal DIMM (however, as you correctly > state, DIMM is the wrong term, it really is *some* kind of ram device). > > > > > So if we were to do that, we would need to make get_addr() return value 0 > > a valid one (suggest (uint64_t)-1 as unset value) and/or make built in > > memory work outside of device_memory region as short cut impl. > > > > Something like that will certainly work, but require many other changes. > E.g. the device memory region has to be sized and setup completely > different. But then, we can change the default (have to whatch out about > compatibility handling when e.g. addr=0 is passed to e.g. dimms on the > cmdline). if we create a builtin ram device (i.e. no CLI available for it) and gate logic by type in memory-device handlers then we probably won't have any conflicts with dimm device. It's a bit hackish but it will be internal to memory-device helpers, then we would be able call 'device_add' internally to replace creating MemoryRegions manually. That way we would keep compatibility with old layout but will add device abstraction layer over builtin RAM. I've contemplated switching to pc-dimm and it would be ideal but that's a lot more refactoring most likely with guest ABI breaking and introducing multiple device_memory zones to pc-dimm. so we probably would end up keeping old way to instantiate initial memory for old machine types and a new way for new ones. I'm not sure it's worth effort considering complexity it would bring. But I haven't actually tried to implement it to know issues in detail, so it's just my expectation of problems we would face if we go this route.
On 26/09/2018 10:29, Igor Mammedov wrote: > On Tue, 25 Sep 2018 17:10:43 +0200 > David Hildenbrand <david@redhat.com> wrote: > >> On 25/09/2018 16:19, Igor Mammedov wrote: >>> On Thu, 20 Sep 2018 12:32:27 +0200 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> Document the functions and when to not expect errors. >>>> >>>> Signed-off-by: David Hildenbrand <david@redhat.com> >>>> --- >>>> include/hw/mem/memory-device.h | 13 +++++++++++++ >>>> 1 file changed, 13 insertions(+) >>>> >>>> diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h >>>> index f02b229837..d6853156ff 100644 >>>> --- a/include/hw/mem/memory-device.h >>>> +++ b/include/hw/mem/memory-device.h >>>> @@ -29,9 +29,22 @@ typedef struct MemoryDeviceState { >>>> Object parent_obj; >>>> } MemoryDeviceState; >>>> >>>> +/** >>>> + * MemoryDeviceClass: >>>> + * @get_addr: The address of the @md in guest physical memory. "0" means that >>>> + * no address has been specified by the user and that no address has been >>>> + * assigned yet. >>> While looking at attempts to implement >>> '[Qemu-devel] [PATCH v12 6/6] tpm: add ACPI memory clear interface' >>> on QEMU side, where we are trying to clear RAM going via ramblocks list. >>> >>> Maybe we are doing it backwards and instead of trying to deal with host >>> abstraction RAMBlocks/MemoryRegion and follow up efforts to tag it with >>> device model attributes >>> '[PATCH] RFC: mark non-volatile memory region' >>> we should do it other way around and approach it from guest point of view >>> like firmware would do, i.e. >>> make TPM enumerate RAM devices and tell them to clear related memory. >>> Concrete device would know how to do it properly and/or how to ask backend >>> to do it without need to tag RAMBlocks with device model specific info. >>> That would allow to enumerate all RAM without trying to filter out not RAM >>> RAMBlocks and not pollute host specific RAMBlock/MemoryRegion with device >>> specific flags. >> >> I agree, I consider it also as a problem that many different mechanism >> in the code assume that any RAMBlock can e.g. be read/written etc. Or, >> as you said, try to resolve certain properties from RAMBlocks. If we >> could let the devices handle details (e.g. memory-device class callbacks >> or similar), we would move the logic to a place where we actually know >> what is happenig. >> >>> >>> However we can't do it right now, since built-in memory doesn't have >>> a corresponding device model and we use MemoryRegions directly. >>> Maybe we should use memory-devices and create derivative 'ramchip' device >>> to represent builtin RAM and use it instead. >> >> Yes, we would need some sort of internal DIMM (however, as you correctly >> state, DIMM is the wrong term, it really is *some* kind of ram device). >> >>> >>> So if we were to do that, we would need to make get_addr() return value 0 >>> a valid one (suggest (uint64_t)-1 as unset value) and/or make built in >>> memory work outside of device_memory region as short cut impl. >>> >> >> Something like that will certainly work, but require many other changes. >> E.g. the device memory region has to be sized and setup completely >> different. But then, we can change the default (have to whatch out about >> compatibility handling when e.g. addr=0 is passed to e.g. dimms on the >> cmdline). > if we create a builtin ram device (i.e. no CLI available for it) and > gate logic by type in memory-device handlers then we probably won't have > any conflicts with dimm device. > It's a bit hackish but it will be internal to memory-device helpers, > then we would be able call 'device_add' internally to replace creating > MemoryRegions manually. That way we would keep compatibility with old > layout but will add device abstraction layer over builtin RAM. > > I've contemplated switching to pc-dimm and it would be ideal but > that's a lot more refactoring most likely with guest ABI breaking > and introducing multiple device_memory zones to pc-dimm. > so we probably would end up keeping old way to instantiate initial > memory for old machine types and a new way for new ones. > I'm not sure it's worth effort considering complexity it would bring. > But I haven't actually tried to implement it to know issues in detail, > so it's just my expectation of problems we would face if we go this route. I consider this way the right thing to do. But as you mention, we have to watch out for compatibility things. Especially numa nodes might be tricky - we want one device per node most probably. Also haven't looked into the details. Converting one machine at a time would be possible.
diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h index f02b229837..d6853156ff 100644 --- a/include/hw/mem/memory-device.h +++ b/include/hw/mem/memory-device.h @@ -29,9 +29,22 @@ typedef struct MemoryDeviceState { Object parent_obj; } MemoryDeviceState; +/** + * MemoryDeviceClass: + * @get_addr: The address of the @md in guest physical memory. "0" means that + * no address has been specified by the user and that no address has been + * assigned yet. + * @get_plugged_size: The amount of memory provided by this @md currently + * usable ("plugged") by the guest. + * @get_region_size: The size of the memory region of the @md that's mapped + * in guest physical memory at @get_addr. + * @fill_device_info: Translate current @md state into #MemoryDeviceInfo. + */ typedef struct MemoryDeviceClass { + /* private */ InterfaceClass parent_class; + /* public */ uint64_t (*get_addr)(const MemoryDeviceState *md); uint64_t (*get_plugged_size)(const MemoryDeviceState *md, Error **errp); uint64_t (*get_region_size)(const MemoryDeviceState *md, Error **errp);
Document the functions and when to not expect errors. Signed-off-by: David Hildenbrand <david@redhat.com> --- include/hw/mem/memory-device.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)