Message ID | 1377150793-27864-6-git-send-email-mturquette@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 22 August 2013 13:53, Mike Turquette <mturquette@linaro.org> wrote: > Device Tree binding for the basic clock gate, plus the setup function to > register the clock. Based on the existing fixed-clock binding. > > A different approach to this was proposed in 2012[1] and a similar > binding was proposed more recently[2] if anyone wants some extra > reading. > > [1] http://article.gmane.org/gmane.linux.documentation/5679 > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2012-December/137878.html > > Tero Kristo contributed helpful bug fixes to this patch. > > Signed-off-by: Mike Turquette <mturquette@linaro.org> > Tested-by: Heiko Stuebner <heiko@sntech.de> > Reviewed-by: Heiko Stuebner <heiko@sntech.de> > --- > Changes since v3: > * replaced underscores with dashes in DT property names > * bail from of clock setup function early if of_iomap fails > * removed unecessary explict cast > > .../devicetree/bindings/clock/gate-clock.txt | 36 +++++++++++++++++ > drivers/clk/clk-gate.c | 47 ++++++++++++++++++++++ > include/linux/clk-provider.h | 2 + > 3 files changed, 85 insertions(+) > create mode 100644 Documentation/devicetree/bindings/clock/gate-clock.txt > > diff --git a/Documentation/devicetree/bindings/clock/gate-clock.txt b/Documentation/devicetree/bindings/clock/gate-clock.txt > new file mode 100644 > index 0000000..1c0d4d5 > --- /dev/null > +++ b/Documentation/devicetree/bindings/clock/gate-clock.txt > @@ -0,0 +1,36 @@ > +Binding for simple gate clock. > + > +This binding uses the common clock binding[1]. It assumes a > +register-mapped clock gate controlled by a single bit that has only one > +input clock or parent. By default setting the specified bit gates the > +clock signal and clearing the bit ungates it. > + > +The binding must provide the register to control the gate and the bit > +shift for the corresponding gate control bit. Some clocks set the bit to > +gate the clock signal, and clear it to ungate the clock signal. The > +optional "set-bit-to-disable" specifies this behavior. > + > +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt > + > +Required properties: > +- compatible : shall be "gate-clock". > +- #clock-cells : from common clock binding; shall be set to 0. > +- clocks : link to phandle of parent clock > +- reg : base address for register controlling adjustable gate > +- bit-shift : bit shift for programming the clock gate > + > +Optional properties: > +- clock-output-names : from common clock binding. > +- set-bit-to-disable : inverts default gate programming. Setting the bit > + gates the clock and clearing the bit ungates the clock. > +- hiword-mask : lower half of the register controls the gate, upper half > + of the register indicates bits that were updated in the lower half > + > +Examples: > + clock_foo: clock_foo@4a008100 { > + compatible = "gate-clock"; > + #clock-cells = <0>; > + clocks = <&clock_bar>; > + reg = <0x4a008100 0x4> > + bit-shift = <3> There's some argument on my clock binding patch set of Hi3620. I defined each clock as one clock node and some of them are sharing one register. Stephen attacked on this since it means multiple clock node sharing one register. Now the same thing also exists in Mike's patch. Mike's patch could also support this behavior. And it's very common that one register is sharing among multiple clocks in every SoC. Which one should I follow? > + }; > diff --git a/drivers/clk/clk-gate.c b/drivers/clk/clk-gate.c > index 2b28a00..63641c2 100644 > --- a/drivers/clk/clk-gate.c > +++ b/drivers/clk/clk-gate.c > @@ -15,6 +15,8 @@ > #include <linux/io.h> > #include <linux/err.h> > #include <linux/string.h> > +#include <linux/of.h> > +#include <linux/of_address.h> > > /** > * DOC: basic gatable clock which can gate and ungate it's ouput > @@ -162,3 +164,48 @@ struct clk *clk_register_gate(struct device *dev, const char *name, > return clk; > } > EXPORT_SYMBOL_GPL(clk_register_gate); > + > +#ifdef CONFIG_OF > +/** > + * of_gate_clk_setup() - Setup function for simple gate rate clock > + */ > +void of_gate_clk_setup(struct device_node *node) > +{ > + struct clk *clk; > + const char *clk_name = node->name; > + void __iomem *reg; > + const char *parent_name; > + u8 clk_gate_flags = 0; > + u32 bit_idx = 0; > + > + of_property_read_string(node, "clock-output-names", &clk_name); > + > + parent_name = of_clk_get_parent_name(node, 0); > + > + reg = of_iomap(node, 0); I suggest not using of_iomap for each clock node. If each clock is one node, it means hundreds of clock node existing in device tree. Hundreds of mapping page only cost unnecessary mapping. Maybe we can resolve it by this way. DTS file: clock register bank { reg = <{start} {size}>; #address-cells = <1>; #size-cells = <0>; /* each clock only exists in one register */ clock node { compatible = "xxx"; #clock-cells = <0>; clock-output-names = yyy"; reg = <{offset}>; ... other properties ... }; }; clock driver: void __iomem *base, *reg; const __be32 *prop; /* base comes from the iomapping of the parent node */ prop = of_get_address(np, 0, NULL, NULL); if (!prop) goto err; reg = base + be32_to_cpu(*prop); Then we can get the register virtual address at here. It could avoid always calling ioremap(). Maybe it's not the best solution, it's just for your reference. > + if (!reg) { > + pr_err("%s: no memory mapped for property reg\n", __func__); > + return; > + } > + > + if (of_property_read_u32(node, "bit-shift", &bit_idx)) { > + pr_err("%s: missing bit-shift property for %s\n", > + __func__, node->name); > + return; > + } > + > + if (of_property_read_bool(node, "set-bit-to-disable")) > + clk_gate_flags |= CLK_GATE_SET_TO_DISABLE; > + > + if (of_property_read_bool(node, "hiword-mask")) > + clk_gate_flags |= CLK_GATE_HIWORD_MASK; > + > + clk = clk_register_gate(NULL, clk_name, parent_name, 0, reg, bit_idx, > + clk_gate_flags, NULL); > + > + if (!IS_ERR(clk)) > + of_clk_add_provider(node, of_clk_src_simple_get, clk); > +} > +EXPORT_SYMBOL_GPL(of_gate_clk_setup); > +CLK_OF_DECLARE(gate_clk, "gate-clock", of_gate_clk_setup); > +#endif > diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h > index 218d923..b471e37 100644 > --- a/include/linux/clk-provider.h > +++ b/include/linux/clk-provider.h > @@ -240,6 +240,8 @@ struct clk *clk_register_gate(struct device *dev, const char *name, > void __iomem *reg, u8 bit_idx, > u8 clk_gate_flags, spinlock_t *lock); > > +void of_gate_clk_setup(struct device_node *node); > + > struct clk_div_table { > unsigned int val; > unsigned int div; > -- > 1.8.1.2 >
On 08/29/2013 07:45 PM, Haojian Zhuang wrote: > On 22 August 2013 13:53, Mike Turquette <mturquette@linaro.org> wrote: >> Device Tree binding for the basic clock gate, plus the setup function to >> register the clock. Based on the existing fixed-clock binding. >> >> A different approach to this was proposed in 2012[1] and a similar >> binding was proposed more recently[2] if anyone wants some extra >> reading. >> diff --git a/Documentation/devicetree/bindings/clock/gate-clock.txt b/Documentation/devicetree/bindings/clock/gate-clock.txt >> +Binding for simple gate clock. ... >> + clock_foo: clock_foo@4a008100 { >> + compatible = "gate-clock"; >> + #clock-cells = <0>; >> + clocks = <&clock_bar>; >> + reg = <0x4a008100 0x4> >> + bit-shift = <3> > > There's some argument on my clock binding patch set of Hi3620. > > I defined each clock as one clock node and some of them are sharing one > register. Stephen attacked on this since it means multiple clock node sharing > one register. s/attacked/disagreed with/ I think:-) > Now the same thing also exists in Mike's patch. Mike's patch could also > support this behavior. And it's very common that one register is sharing among > multiple clocks in every SoC. Which one should I follow? I believe it's a matter of how the HW is structured. If the HW truly has segregated register regions for each individual clock, and is documented in a way that implies each individual clock is a separate HW module, then it makes sense to describe each clock as a separate DT node. However, if the HW simply has a "clock module" that provides many clocks, with inter-mingled registers all over the place, and is documented as a single module that generates lots of clocks, then it makes sense to describe that one module as a single DT node. In other words, the DT representation should match the HW design and documentation. This is exactly why we have the #clock-cells property in DT; so that a clock provider can provide multiple clocks if that's the way the HW is designed. >> diff --git a/drivers/clk/clk-gate.c b/drivers/clk/clk-gate.c >> +void of_gate_clk_setup(struct device_node *node) >> +{ >> + struct clk *clk; >> + const char *clk_name = node->name; >> + void __iomem *reg; >> + const char *parent_name; >> + u8 clk_gate_flags = 0; >> + u32 bit_idx = 0; >> + >> + of_property_read_string(node, "clock-output-names", &clk_name); >> + >> + parent_name = of_clk_get_parent_name(node, 0); >> + >> + reg = of_iomap(node, 0); > > I suggest not using of_iomap for each clock node. > > If each clock is one node, it means hundreds of clock node existing in > device tree. Hundreds of mapping page only cost unnecessary mapping. The page table entries will get re-used. I'm not familiar with the mm code, but multiple of_iomap() for the exact same range probably just map down to incrementing a refcount on some kernel data structure, so actually has zero overhead? > Maybe we can resolve it by this way. > > DTS file: > clock register bank { You need a compatible value here, in order to instantiate the top-level driver for the "clock generator" HW block. > reg = <{start} {size}>; > #address-cells = <1>; > #size-cells = <0>; /* each clock only > exists in one register */ You would need a ranges property here, to map the child reg properties into the parent node's address space. > clock node { > compatible = "xxx"; > #clock-cells = <0>; > clock-output-names = yyy"; > reg = <{offset}>; > ... other properties ... > }; > }; That could be a reasonable solution; the existence of a single "clock generator" HW block is clearly called out by the existing of the top-level DT node, yet the internals of that node are free to be whatever you want, since this is purely defined by the binding definition for that top-level "clock generator" node. That all said, I see almost zero value in having all these child nodes, since the top-level compatible value implies every single detail about the HW, so the list of clocks and their parameters could just as easily be a table in the driver for the HW, in order to avoid having to parse that data every boot just to end up with the same table... The only exception would be if the SoC designer truly had composed the top-level "clock generator" HW block out of many completely independent re-usable clock IP blocks, and many SoCs existed that used those individual clock blocks completely unchanged, without any SoC-specific differences such as additional SoC-specific clock block types, so that one could get greater re-use by parametrizing everything in DT. In my (perhaps limited) experience of SoCS, this seems like an /extremely/ unlikely situation.
On 31 August 2013 04:06, Stephen Warren <swarren@wwwdotorg.org> wrote: > On 08/29/2013 07:45 PM, Haojian Zhuang wrote: >> On 22 August 2013 13:53, Mike Turquette <mturquette@linaro.org> wrote: >>> Device Tree binding for the basic clock gate, plus the setup function to >>> register the clock. Based on the existing fixed-clock binding. >>> >>> A different approach to this was proposed in 2012[1] and a similar >>> binding was proposed more recently[2] if anyone wants some extra >>> reading. > >>> diff --git a/Documentation/devicetree/bindings/clock/gate-clock.txt b/Documentation/devicetree/bindings/clock/gate-clock.txt > >>> +Binding for simple gate clock. > ... >>> + clock_foo: clock_foo@4a008100 { >>> + compatible = "gate-clock"; >>> + #clock-cells = <0>; >>> + clocks = <&clock_bar>; >>> + reg = <0x4a008100 0x4> >>> + bit-shift = <3> >> >> There's some argument on my clock binding patch set of Hi3620. >> >> I defined each clock as one clock node and some of them are sharing one >> register. Stephen attacked on this since it means multiple clock node sharing >> one register. > > s/attacked/disagreed with/ I think:-) > >> Now the same thing also exists in Mike's patch. Mike's patch could also >> support this behavior. And it's very common that one register is sharing among >> multiple clocks in every SoC. Which one should I follow? > > I believe it's a matter of how the HW is structured. > > If the HW truly has segregated register regions for each individual > clock, and is documented in a way that implies each individual clock is > a separate HW module, then it makes sense to describe each clock as a > separate DT node. > > However, if the HW simply has a "clock module" that provides many > clocks, with inter-mingled registers all over the place, and is > documented as a single module that generates lots of clocks, then it > makes sense to describe that one module as a single DT node. > > In other words, the DT representation should match the HW design and > documentation. > > This is exactly why we have the #clock-cells property in DT; so that a > clock provider can provide multiple clocks if that's the way the HW is > designed. > >>> diff --git a/drivers/clk/clk-gate.c b/drivers/clk/clk-gate.c > >>> +void of_gate_clk_setup(struct device_node *node) >>> +{ >>> + struct clk *clk; >>> + const char *clk_name = node->name; >>> + void __iomem *reg; >>> + const char *parent_name; >>> + u8 clk_gate_flags = 0; >>> + u32 bit_idx = 0; >>> + >>> + of_property_read_string(node, "clock-output-names", &clk_name); >>> + >>> + parent_name = of_clk_get_parent_name(node, 0); >>> + >>> + reg = of_iomap(node, 0); >> >> I suggest not using of_iomap for each clock node. >> >> If each clock is one node, it means hundreds of clock node existing in >> device tree. Hundreds of mapping page only cost unnecessary mapping. > > The page table entries will get re-used. I'm not familiar with the mm > code, but multiple of_iomap() for the exact same range probably just map > down to incrementing a refcount on some kernel data structure, so > actually has zero overhead? I think it's not right. Let check the implemenation in ioremap(). __arm_ioremap_pfn_caller(): /* try to reuse one of the static mapping whenever possible. */ svm = find_static_vm_paddr(); if (svm) { addr = svm->vm.addr; addr += paddr - svm->vm.phys_addr; return (void __iomem *)(offset + addr); } ... area = get_vm_area_caller(size, VM_IOREMAP, caller); addr = area->addr; ioremap_page_range(addr, addr+size, paddr, ..); We can see that it'll try to find static mapping. What's the static mapping? If we define iotable in machine driver, we have the static mapping, just like debug_ll. If we parse everything from DTS file, it'll always get a new virtual address from vm area. So it always create a new page mapping even for one register. > > >> Maybe we can resolve it by this way. >> >> DTS file: >> clock register bank { > > You need a compatible value here, in order to instantiate the top-level > driver for the "clock generator" HW block. > >> reg = <{start} {size}>; >> #address-cells = <1>; >> #size-cells = <0>; /* each clock only >> exists in one register */ > > You would need a ranges property here, to map the child reg properties > into the parent node's address space. > >> clock node { >> compatible = "xxx"; >> #clock-cells = <0>; >> clock-output-names = yyy"; >> reg = <{offset}>; >> ... other properties ... >> }; >> }; > > That could be a reasonable solution; the existence of a single "clock > generator" HW block is clearly called out by the existing of the > top-level DT node, yet the internals of that node are free to be > whatever you want, since this is purely defined by the binding > definition for that top-level "clock generator" node. > > That all said, I see almost zero value in having all these child nodes, > since the top-level compatible value implies every single detail about > the HW, so the list of clocks and their parameters could just as easily > be a table in the driver for the HW, in order to avoid having to parse > that data every boot just to end up with the same table... > > The only exception would be if the SoC designer truly had composed the > top-level "clock generator" HW block out of many completely independent > re-usable clock IP blocks, and many SoCs existed that used those > individual clock blocks completely unchanged, without any SoC-specific > differences such as additional SoC-specific clock block types, so that > one could get greater re-use by parametrizing everything in DT. In my > (perhaps limited) experience of SoCS, this seems like an /extremely/ > unlikely situation. >
* Haojian Zhuang <haojian.zhuang@linaro.org> [130903 20:11]: > > We can see that it'll try to find static mapping. What's the static mapping? > If we define iotable in machine driver, we have the static mapping, just like > debug_ll. If we parse everything from DTS file, it'll always get a new virtual > address from vm area. So it always create a new page mapping even for one > register. I may not follow you here.. But it seems that you've missing something with the static mapping: It's found based on the physical address. So if you create static mappings for your SoC with iotable_init(), those mappings will be available everywhere including drivers when you do ioremap(). Regards, Tony
Hi Tony, On Wednesday 04 of September 2013 10:59:09 Tony Lindgren wrote: > * Haojian Zhuang <haojian.zhuang@linaro.org> [130903 20:11]: > > We can see that it'll try to find static mapping. What's the static > > mapping? If we define iotable in machine driver, we have the static > > mapping, just like debug_ll. If we parse everything from DTS file, > > it'll always get a new virtual address from vm area. So it always > > create a new page mapping even for one register. > > I may not follow you here.. But it seems that you've missing something > with the static mapping: It's found based on the physical address. So > if you create static mappings for your SoC with iotable_init(), those > mappings will be available everywhere including drivers when you do > ioremap(). The thing is that today we are moving in favour of fully dynamic mapping, based on data from device tree, with as little as possible (or even no) static mapping based on hardcoded values. So, back to the original problem, we end up doing multiple dynamic mappings of the same physical page, because there is no refcounting in ioremap and, if it doesn't find a static mapping containing the region we're interested in, it simply creates a new mapping. Best regards, Tomasz
diff --git a/Documentation/devicetree/bindings/clock/gate-clock.txt b/Documentation/devicetree/bindings/clock/gate-clock.txt new file mode 100644 index 0000000..1c0d4d5 --- /dev/null +++ b/Documentation/devicetree/bindings/clock/gate-clock.txt @@ -0,0 +1,36 @@ +Binding for simple gate clock. + +This binding uses the common clock binding[1]. It assumes a +register-mapped clock gate controlled by a single bit that has only one +input clock or parent. By default setting the specified bit gates the +clock signal and clearing the bit ungates it. + +The binding must provide the register to control the gate and the bit +shift for the corresponding gate control bit. Some clocks set the bit to +gate the clock signal, and clear it to ungate the clock signal. The +optional "set-bit-to-disable" specifies this behavior. + +[1] Documentation/devicetree/bindings/clock/clock-bindings.txt + +Required properties: +- compatible : shall be "gate-clock". +- #clock-cells : from common clock binding; shall be set to 0. +- clocks : link to phandle of parent clock +- reg : base address for register controlling adjustable gate +- bit-shift : bit shift for programming the clock gate + +Optional properties: +- clock-output-names : from common clock binding. +- set-bit-to-disable : inverts default gate programming. Setting the bit + gates the clock and clearing the bit ungates the clock. +- hiword-mask : lower half of the register controls the gate, upper half + of the register indicates bits that were updated in the lower half + +Examples: + clock_foo: clock_foo@4a008100 { + compatible = "gate-clock"; + #clock-cells = <0>; + clocks = <&clock_bar>; + reg = <0x4a008100 0x4> + bit-shift = <3> + }; diff --git a/drivers/clk/clk-gate.c b/drivers/clk/clk-gate.c index 2b28a00..63641c2 100644 --- a/drivers/clk/clk-gate.c +++ b/drivers/clk/clk-gate.c @@ -15,6 +15,8 @@ #include <linux/io.h> #include <linux/err.h> #include <linux/string.h> +#include <linux/of.h> +#include <linux/of_address.h> /** * DOC: basic gatable clock which can gate and ungate it's ouput @@ -162,3 +164,48 @@ struct clk *clk_register_gate(struct device *dev, const char *name, return clk; } EXPORT_SYMBOL_GPL(clk_register_gate); + +#ifdef CONFIG_OF +/** + * of_gate_clk_setup() - Setup function for simple gate rate clock + */ +void of_gate_clk_setup(struct device_node *node) +{ + struct clk *clk; + const char *clk_name = node->name; + void __iomem *reg; + const char *parent_name; + u8 clk_gate_flags = 0; + u32 bit_idx = 0; + + of_property_read_string(node, "clock-output-names", &clk_name); + + parent_name = of_clk_get_parent_name(node, 0); + + reg = of_iomap(node, 0); + if (!reg) { + pr_err("%s: no memory mapped for property reg\n", __func__); + return; + } + + if (of_property_read_u32(node, "bit-shift", &bit_idx)) { + pr_err("%s: missing bit-shift property for %s\n", + __func__, node->name); + return; + } + + if (of_property_read_bool(node, "set-bit-to-disable")) + clk_gate_flags |= CLK_GATE_SET_TO_DISABLE; + + if (of_property_read_bool(node, "hiword-mask")) + clk_gate_flags |= CLK_GATE_HIWORD_MASK; + + clk = clk_register_gate(NULL, clk_name, parent_name, 0, reg, bit_idx, + clk_gate_flags, NULL); + + if (!IS_ERR(clk)) + of_clk_add_provider(node, of_clk_src_simple_get, clk); +} +EXPORT_SYMBOL_GPL(of_gate_clk_setup); +CLK_OF_DECLARE(gate_clk, "gate-clock", of_gate_clk_setup); +#endif diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h index 218d923..b471e37 100644 --- a/include/linux/clk-provider.h +++ b/include/linux/clk-provider.h @@ -240,6 +240,8 @@ struct clk *clk_register_gate(struct device *dev, const char *name, void __iomem *reg, u8 bit_idx, u8 clk_gate_flags, spinlock_t *lock); +void of_gate_clk_setup(struct device_node *node); + struct clk_div_table { unsigned int val; unsigned int div;