Message ID: 20180917084016.12750-1-damien.hedde@greensocs.com (mailing list archive)
Series: Clock framework API.
On 17 September 2018 at 01:40, <damien.hedde@greensocs.com> wrote:
> Regarding the migration strategy, clocks do not hold the clock state
> internally, so there is nothing to migrate there. The consequence is that
> a device must update its output clocks in its post_load to propagate the
> migrated clock state. This allows migration from old-qemu-with-no-clock
> to new-qemu-with-clock-support: newly added clocks will be correctly
> initialized during migration.
> But it is more complex for input clocks handling: there is no order
> guarantee between a device state migration and the update of its inputs clocks
> which will occur during other device state migrations.
> I think that, for most the cases, this does not rise problems, although there
> might be some jitter/glitch during migration before hitting the right value
> (with consequences such as the baudrate of a character device changing several
> times during migration, I don't think it is a problem but may well be wrong
> here).

This doesn't seem like a good idea to me, since as you say there is
no guarantee on migration order. It breaks a general principle that
devices should migrate their own state and not do anything that
disturbs other devices.

There are several possible approaches here I think:

(1) the "clock" object holds no internal state; if a device on the
destination end of a clock connection cares about clock state then
it keeps and updates a copy of that state when the callback is called,
and it is responsible for migrating that copy along with all its other
state. This is how qemu_irq/gpio lines work.
(2) the "clock" object does hold internal state, and it is owned
by the source-end device, which is responsible for migrating that
state. This is how ptimer objects work -- hw/core/ptimer.c defines
a vmstate struct, but it is the devices that use a ptimer that
put a VMSTATE_PTIMER entry in their vmstate structs to migrate the data.
(3) the "clock" object can be a fully fledged device (ie a subclass
of TYPE_DEVICE) which migrates its state entirely by itself.

I don't have a firm view currently on which would be best here,
but I guess I lean towards 2. 1 has the advantage of "just like
qemu_irq" but the disadvantage that the destination end has no
way to query the current clock value so has to manually track it
itself. 3 is probably overkill here (and also makes it hard to
retain migration backward compatibility when adding clock tree
support to an existing machine model).

> Concerning this frequency-reset port, we can obviously go back to the simple
> frequency-only one if you think it is not a good idea.

I don't really understand why reset is related here. Clock trees and
reset domains don't sit in a 1-to-1 relationship, generally. Reset
is a complicated and painful area and I think I would prefer to see
a patchset which aimed to solve the clocktree modelling problem
without dragging in the complexities of reset modelling.

thanks
-- PMM
On 09/19/2018 11:30 PM, Peter Maydell wrote:
> On 17 September 2018 at 01:40, <damien.hedde@greensocs.com> wrote:
>> Regarding the migration strategy, clocks do not hold the clock state
>> internally, so there is nothing to migrate there. The consequence is that
>> a device must update its output clocks in its post_load to propagate the
>> migrated clock state. This allows migration from old-qemu-with-no-clock
>> to new-qemu-with-clock-support: newly added clocks will be correctly
>> initialized during migration.
>> But it is more complex for input clocks handling: there is no order
>> guarantee between a device state migration and the update of its inputs clocks
>> which will occur during other device state migrations.
>> I think that, for most the cases, this does not rise problems, although there
>> might be some jitter/glitch during migration before hitting the right value
>> (with consequences such as the baudrate of a character device changing several
>> times during migration, I don't think it is a problem but may well be wrong
>> here).
>
> This doesn't seem like a good idea to me, since as you say there is
> no guarantee on migration order. It breaks a general principle that
> devices should migrate their own state and not do anything that
> disturbs other devices.
>
> There are several possible approaches here I think:
>
> (1) the "clock" object holds no internal state; if a device on the
> destination end of a clock connection cares about clock state then
> it keeps and updates a copy of that state when the callback is called,
> and it is responsible for migrating that copy along with all its other
> state. This is how qemu_irq/gpio lines work.
> (2) the "clock" object does hold internal state, and it is owned
> by the source-end device, which is responsible for migrating that
> state. This is how ptimer objects work -- hw/core/ptimer.c defines
> a vmstate struct, but it is the devices that use a ptimer that
> put a VMSTATE_PTIMER entry in their vmstate structs to migrate the data.
> (3) the "clock" object can be a fully fledged device (ie a subclass
> of TYPE_DEVICE) which migrates its state entirely by itself.
>
> I don't have a firm view currently on which would be best here,
> but I guess I lean towards 2. 1 has the advantage of "just like
> qemu_irq" but the disadvantage that the destination end has no
> way to query the current clock value so has to manually track it
> itself. 3 is probably overkill here (and also makes it hard to
> retain migration backward compatibility when adding clock tree
> support to an existing machine model).

I agree with you on doing approach 2. If the clock state needs to be at
the end, it seems best to put it inside the clock object. It will save
code lines in devices. Thanks for the tips about ptimer.

I don't see how approach 3 solves the problem since the clock state will
still be migrated by another object (instead of being the device which
generates the clock, it is now the clock input object). So a device
(with an input clock) has no guarantee on the clock value being correct
when it handles its own migration. I think the clock vmstate entry needs
to be present in the device's vmsd (or am I missing something ?).

Regarding backward compatibility on migration, I think we have 2 options:
(A) keep updating output clocks in post_load.
(B) rely on devices with an input clock to set up a "good" default value
for unmigrated input clocks.
(A) has the advantage of ensuring an unmigrated input clock has the
right value just at the end of a backward migration. But there's still
the migration order jitter. (B) does not have the jitter, but unmigrated
input clocks will be at their default values after the migration and may
be updated later on.
Given your statement about disturbing other devices, (B) seems the way
to go.

>> Concerning this frequency-reset port, we can obviously go back to the simple
>> frequency-only one if you think it is not a good idea.
>
> I don't really understand why reset is related here. Clock trees and
> reset domains don't sit in a 1-to-1 relationship, generally. Reset
> is a complicated and painful area and I think I would prefer to see
> a patchset which aimed to solve the clocktree modelling problem
> without dragging in the complexities of reset modelling.

OK.

Do you think I should do a reroll right now with these 2 modifications
without waiting for further review ?

Thanks,
-- Damien
On 21 September 2018 at 06:39, Damien Hedde <damien.hedde@greensocs.com> wrote:
> On 09/19/2018 11:30 PM, Peter Maydell wrote:
>> There are several possible approaches here I think:
>>
>> (1) the "clock" object holds no internal state; if a device on the
>> destination end of a clock connection cares about clock state then
>> it keeps and updates a copy of that state when the callback is called,
>> and it is responsible for migrating that copy along with all its other
>> state. This is how qemu_irq/gpio lines work.
>> (2) the "clock" object does hold internal state, and it is owned
>> by the source-end device, which is responsible for migrating that
>> state. This is how ptimer objects work -- hw/core/ptimer.c defines
>> a vmstate struct, but it is the devices that use a ptimer that
>> put a VMSTATE_PTIMER entry in their vmstate structs to migrate the data.
>> (3) the "clock" object can be a fully fledged device (ie a subclass
>> of TYPE_DEVICE) which migrates its state entirely by itself.
>>
>> I don't have a firm view currently on which would be best here,
>> but I guess I lean towards 2. 1 has the advantage of "just like
>> qemu_irq" but the disadvantage that the destination end has no
>> way to query the current clock value so has to manually track it
>> itself. 3 is probably overkill here (and also makes it hard to
>> retain migration backward compatibility when adding clock tree
>> support to an existing machine model).
>
> I agree with you on doing approach 2. If the clock state needs to be at
> the end, it seems best to put in inside the clock object. It will save
> codelines in devices. Thanks for the tips about ptimer.
>
> I don't see how approach 3 solves the problem since the clock state will
> still be migrated by another object (instead of begin the device which
> generate the clock, it is now the clock input object). So a device (with
> an input clock) has no guarantee on the clock value being correct when
> it will handle its own migration. I think the clock vmstate entry needs
> to be present in the device's vmsd (or am I missing something ?).

The point about (3) is that every TYPE_DEVICE object manages migration
of its own state, so the device which has the clock output does not
need to.

No device should ever care about whether other devices in the system
have had their state loaded on a migration or not yet: their migration
load must affect only their own internal state. If you find yourself in
a position where you need to care then you've probably got some part of
the design wrong.

(The difference between 2 and 3 is that in 2 the clock-object is not
a full device, so it's just a part of the output-end device and the
output-end device does its state migration. In 3 it is a full device
and does its own migration.)

> Regarding backward compatibility on migration, I think we have 2 options:
> (A) keep updating outputs clocks in post_load.
> (B) rely on device with an input clock to setup a "good" default value
> to unmigrated input clocks.

I think what you need (assuming a type 2 design) is for there to be a
function on the clock object which says "here's your state, but don't
tell the output end" (or just directly set the clock struct fields).
That way the output end device can in its post-load function use that
if it is doing a migration from an old version that didn't include
the clock device.

The input end device can't help you because it is not in a position
to change the state of the clock object, which belongs to the output
end. (Consider also the case where one clock connects to multiple
inputs -- the input end can't set the value, because the different
inputs might have different ideas of the right thing.)

>> I don't really understand why reset is related here. Clock trees and
>> reset domains don't sit in a 1-to-1 relationship, generally. Reset
>> is a complicated and painful area and I think I would prefer to see
>> a patchset which aimed to solve the clocktree modelling problem
>> without dragging in the complexities of reset modelling.
>
> OK.
>
> Do you think I should do a reroll right now with this 2 modifications
> without waiting further review ?

I think that's probably a good idea, yes.

(I have been thinking a bit about the reset problem this week
and will see if I can write up my thoughts on it next week.)

thanks
-- PMM
On 9/21/18 5:37 PM, Peter Maydell wrote:
> On 21 September 2018 at 06:39, Damien Hedde <damien.hedde@greensocs.com> wrote:
>> On 09/19/2018 11:30 PM, Peter Maydell wrote:
>>> There are several possible approaches here I think:
>>>
>>> (1) the "clock" object holds no internal state; if a device on the
>>> destination end of a clock connection cares about clock state then
>>> it keeps and updates a copy of that state when the callback is called,
>>> and it is responsible for migrating that copy along with all its other
>>> state. This is how qemu_irq/gpio lines work.
>>> (2) the "clock" object does hold internal state, and it is owned
>>> by the source-end device, which is responsible for migrating that
>>> state. This is how ptimer objects work -- hw/core/ptimer.c defines
>>> a vmstate struct, but it is the devices that use a ptimer that
>>> put a VMSTATE_PTIMER entry in their vmstate structs to migrate the data.
>>> (3) the "clock" object can be a fully fledged device (ie a subclass
>>> of TYPE_DEVICE) which migrates its state entirely by itself.
>>>
>>> I don't have a firm view currently on which would be best here,
>>> but I guess I lean towards 2. 1 has the advantage of "just like
>>> qemu_irq" but the disadvantage that the destination end has no
>>> way to query the current clock value so has to manually track it
>>> itself. 3 is probably overkill here (and also makes it hard to
>>> retain migration backward compatibility when adding clock tree
>>> support to an existing machine model).
>>
>> I agree with you on doing approach 2. If the clock state needs to be at
>> the end, it seems best to put in inside the clock object. It will save
>> codelines in devices. Thanks for the tips about ptimer.
>>
>> I don't see how approach 3 solves the problem since the clock state will
>> still be migrated by another object (instead of begin the device which
>> generate the clock, it is now the clock input object). So a device (with
>> an input clock) has no guarantee on the clock value being correct when
>> it will handle its own migration. I think the clock vmstate entry needs
>> to be present in the device's vmsd (or am I missing something ?).
>
> The point about (3) is that every TYPE_DEVICE object manages migration
> of its own state, so the device which has the clock output does not
> need to.
>
> No device should ever care about whether other devices in the system
> have had their state loaded on a migration or not yet: their migration
> load must affect only their own internal state. If you find yourself in
> a position where you need to care then you've probably got some part of
> the design wrong.
>
> (The difference between 2 and 3 is that in 2 the clock-object is not
> a full device, so it's just a part of the output-end device and the
> output-end device does its state migration. In 3 it is a full device
> and does its own migration.)
>
>> Regarding backward compatibility on migration, I think we have 2 options:
>> (A) keep updating outputs clocks in post_load.
>> (B) rely on device with an input clock to setup a "good" default value
>> to unmigrated input clocks.
>
> I think what you need (assuming a type 2 design) is for there to be a
> function on the clock object which says "here's your state, but don't
> tell the output end" (or just directly set the clock struct fields).
> That way the output end device can in its post-load function use that
> if it is doing a migration from an old version that didn't include
> the clock device.
>
> The input end device can't help you because it is not in a position
> to change the state of the clock object, which belongs to the output
> end. (Consider also the case where one clock connects to multiple
> inputs -- the input end can't set the value, because the different
> inputs might have different ideas of the right thing.)

I was thinking of putting a state in the input clock so that it belongs
to the input end device. This would be some kind of cache of the value
and it will be loaded by the input device during migration. If there are
multiple inputs, each input will migrate its own copy. Every copy should
be identical and no action needs to be performed on the output side on
migration apart from updating its own clock state (but I think it
doesn't need one if every input has one).

During a migration, if the state was not in the source vm (migration
from an old qemu), this local copy can be initialized to a default value
by the input device. This value would eventually be updated later on if
the clock is changed at the output end. In case there are several
inputs, different values may exist in the different input end devices
until the output end does an update.

For example, in the zynq, the cadence_uart today considers it has a
50MHz reference clock. This could be its default value on migration,
whatever the frequency set by the output. In that case, after migration,
the uart will continue working like before migration until the software
does a clock configuration (which may never happen).

It has the advantage of being simple (everything stays in the input end
perimeter) and independent of migration order. We can still try to fetch
the right value from the output instead of putting a default value, but
it's a fifty-fifty chance of having the value before migration, which
means the default value is imposed by the output. Or the output can
silently put a value like you said, but I think the result depends on
the migration order as well.

A problem I see with having the input do the migration is if the output
end "changed" because of a migration. Consider a bug-fix in clock
computation or a static clock whose frequency has changed between the 2
versions of qemu. What are the acceptable results after the migration ?

My initial point of view was to consider:
(1) the clock frequency belongs to the clock controller device (ie: the
output end device) and is migrated by it.
(2) a clocked device (ie: an input end device) has a state which does
not contain the clock frequency.
(3) if the clocked device needs the clock frequency to compute some
things (eg: backend config in a uart, or visibility of mmio), it will do
the computation as many times as required until its inputs and its state
are up-to-date. In fact, some kind of clock propagation side-effect.

I can try to find a way to delay this computation until everything
(inputs + state) is up-to-date, to be independent of migration order.
But I think there will be corner cases where the computation will never
occur. If we had some kind of post_load_delayed_until_end_of_migration,
it would be easy, but I don't think we have that.

>>> I don't really understand why reset is related here. Clock trees and
>>> reset domains don't sit in a 1-to-1 relationship, generally. Reset
>>> is a complicated and painful area and I think I would prefer to see
>>> a patchset which aimed to solve the clocktree modelling problem
>>> without dragging in the complexities of reset modelling.
>>
>> OK.
>>
>> Do you think I should do a reroll right now with this 2 modifications
>> without waiting further review ?
>
> I think that's probably a good idea, yes.
>
> (I have been thinking a bit about the reset problem this week
> and will see if I can write up my thoughts on it next week.)

Regarding the reset, the functionality I added can be implemented using
gpio. What I did is just a factorization to avoid having to connect a
clock and a gpio for the reset.

> thanks
> -- PMM
From: Damien Hedde <damien.hedde@greensocs.com>

This series corresponds to the v4 of the "clock framework api" patches
which were discussed in 2017, here:
https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg07218.html
It is a big refactoring trying to respond to comments and to integrate
the clock gating cases I sent recently here:
https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg05363.html
Note that, for now, the power cases are not considered in this patchset.

For the user, the framework is now very similar to the device's gpio
API. Clock inputs and outputs can be added to devices during the
initialization phase. Then an input can be connected to an output: it
means every time the output clock changes, a callback in the input is
triggered, allowing any action to be taken. A difference with gpios is
that several inputs can be connected to a single output without any
glue.

Compared to v3, the following notable changes happened:
- Bindings are now fixed during machine realisation,
- Input clock objects have been removed from the point of view of the
  user, i.e. nothing needs to be added in the device state. A device can
  now declare an input clock by calling qdev_init_clock_in() (similarly
  to what is done for GPIOs),
- Callbacks called on clock change now return void. The input owner is
  responsible for making the necessary updates accordingly (e.g. update
  another clock output, or modify its internal state) (again, mostly
  like GPIOs).
- For now, internal state has been removed. It could be re-added as a
  cached value to print it in info qtree, or to avoid unnecessary
  updates.

Behind the scenes, there are 2 objects: a clock input, which is a
placeholder for a callback, and a clock output, which is a list of
inputs. The value transferred between an output and an input is a
ClockState which contains 2 fields:
- an integer to store the frequency
- a boolean to indicate whether the clock domain reset is asserted or not
The reset flag has been added because it seems both signals are closely
related and very often controlled by the same device.

Regarding the migration strategy, clocks do not hold the clock state
internally, so there is nothing to migrate there. The consequence is
that a device must update its output clocks in its post_load to
propagate the migrated clock state. This allows migration from
old-qemu-with-no-clock to new-qemu-with-clock-support: newly added
clocks will be correctly initialized during migration.
But it is more complex for input clocks handling: there is no order
guarantee between a device state migration and the update of its input
clocks, which will occur during other device state migrations. I think
that, for most cases, this does not raise problems, although there might
be some jitter/glitch during migration before hitting the right value
(with consequences such as the baudrate of a character device changing
several times during migration; I don't think it is a problem but may
well be wrong here).

For example, if we have 2 devices A and B with a clock between them:
| dev A clk|>----->| dev B |
There are 2 migration scenarios (this can happen in our example
implementation between the slcr and cadence_uart devices):
1. A before B:
 - A is migrated:
   + A's state is loaded
   + A's post_load restores the clock which is propagated to B
   + B reacts to the clock change (eg: update serial baudrate with
     migrated clock value but old B state)
 - B is migrated:
   + B's state is loaded
   + B's post_load reacts to new B's state (eg: update serial baudrate
     to the final right value)
2. B before A:
 - B is migrated:
   + B's state is loaded
   + B's post_load reacts to new B's state (eg: update serial baudrate
     with migrated B state but old clock value)
 - A is migrated:
   + A's state is loaded
   + A's post_load restores the clock which is propagated to B
   + B reacts to the clock change (eg: update serial baudrate to the
     final right value)

Regarding clock gating: the 0 frequency value means the clock is gated.
If need be, a gate device can be built taking an input gpio and clock
and generating an output clock.

We are considering switching to a generic payload evolution of this API,
for example by specifying the qom-carried type when adding an
input/output to a device. The current implementation depends little on
what the data type is, since only pointers are transferred between
output and inputs. Changes would probably be minor. This would allow us,
for example, to add a power input port to handle power gating. We could
also have the basic clock port (frequency-only) and the one we have here
(frequency-reset).

Concerning this frequency-reset port, we can obviously go back to the
simple frequency-only one if you think it is not a good idea.

I've tested this patchset running Xilinx's Linux on the xilinx-zynq-a9
machine. Clocks are correctly updated and we end up with a configured
baudrate of 115601 on the console uart (for a theoretical 115200), which
is nice. "cadence_uart*" and "clock*" traces can be enabled to see
what's going on in this platform.

Any comments and suggestions are welcome.

The patches are organised as follows:
+ Patches 1 to 4 add the clock support in qemu.
+ Patch 5 adds some documentation in docs/devel
+ Patch 6 adds support for a default clock in sysbus devices which
  controls the mmios visibility.
+ Patches 7 to 10 add the uart's clocks to the xilinx_zynq platform as
  an example for this framework. It updates the zynq's slcr clock
  controller, the cadence_uart device, and the zynq toplevel platform.
Thanks to the Xilinx QEMU team who sponsored this development.

Damien Hedde (10):
  hw/core/clock-port: introduce clock port objects
  qdev: add clock input&output support to devices.
  qdev-monitor: print the device's clock with info qtree
  qdev-clock: introduce an init array to ease the device construction
  docs/clocks: add device's clock documentation
  sysbus: add bus_interface_clock feature to sysbus devices
  hw/misc/zynq_slcr: use standard register definition
  hw/misc/zynq_slcr: add clock generation for uarts
  hw/char/cadence_uart: add clock support
  hw/arm/xilinx_zynq: connect uart clocks to slcr

 docs/devel/clock.txt           | 144 ++++++++
 Makefile.objs                  |   1 +
 include/hw/char/cadence_uart.h |   2 +
 include/hw/clock-port.h        | 153 ++++++++
 include/hw/qdev-clock.h        | 129 +++++++
 include/hw/qdev-core.h         |  14 +
 include/hw/qdev.h              |   1 +
 include/hw/sysbus.h            |  22 ++
 hw/arm/xilinx_zynq.c           |  17 +-
 hw/char/cadence_uart.c         |  73 +++-
 hw/core/clock-port.c           | 145 ++++++++
 hw/core/qdev-clock.c           | 166 +++++++++
 hw/core/qdev.c                 |  29 ++
 hw/core/sysbus.c               |  25 ++
 hw/misc/zynq_slcr.c            | 637 ++++++++++++++++++++-------------
 qdev-monitor.c                 |   6 +
 hw/char/trace-events           |   3 +
 hw/core/Makefile.objs          |   3 +-
 hw/core/trace-events           |   6 +
 19 files changed, 1324 insertions(+), 252 deletions(-)
 create mode 100644 docs/devel/clock.txt
 create mode 100644 include/hw/clock-port.h
 create mode 100644 include/hw/qdev-clock.h
 create mode 100644 hw/core/clock-port.c
 create mode 100644 hw/core/qdev-clock.c
 create mode 100644 hw/core/trace-events