Message ID | 20210524182745.22923-1-sven@svenpeter.dev (mailing list archive) |
---|---|
Headers | show |
Series | Apple M1 clock gate driver | expand |
On Mon, May 24, 2021 at 1:28 PM Sven Peter <sven@svenpeter.dev> wrote: > > > Hi, > > This series adds support for the clock gates present in Apple's SoCs such as > the M1. > > These SoCs have various clock gates for their peripherals usually located in > one or two MMIO regions. Each clock gate is a single 32 bit register which > contains bits for the requested and the actual mode. The mode is represented by > four bits. We however only care about "everything enabled" (0b1111) and > "everything disabled" (0b000) for now. The other modes are probably different > low-power states which don't even seem to be used by MacOS. The actual power > management is located in a different region (and probably handled by a separate > processor on the M1). > > Additionally, these clocks have a topology that is usually specified in Apple's > Device Tree. As far as I can tell most Linux drivers seem to encode this topology > directly in the source code though. Despite this, we would still like to encode > the topology in the device tree itself: We only define leaf clocks primarily. There's some exceptions such as if PLLs are a separate h/w block. The reason for this is because typical SoCs have 100s of just leaf clocks. If we tried to model everything, it would never be complete nor correct. Actually, we did try that at first. > Apple seems to only change HW blocks when they have a very good reason and even > then they seem to keep the changes minimal. This specific clock gate MMIO block > has already been used in the Apple A7 released in 2013. The only differences > since then are the available clocks (which can be identified by a simple offset) > and their topology. Clock gates are easy. What about muxes, dividers, etc.? > It's likely that Apple will still use this block in future SoCs as well. By > encoding the clock gate topology inside the device tree as well we can use a > single driver and only need updates to the device tree once they are released. > This might allow end users to already run their favorite distribution by just > updating the bootloader with a new device tree instead of having to wait until > the new topology is integrated into their distro kernel. > > Additionally, the driver itself can be kept very simple and not much code needs > to be duplicated and then updated for each new SoC between various consumers of > these device tree nodes (Linux, u-boot, and OpenBSD for now). > > > This series therefore creates a single device tree node for each clock gate. > This unfortunately maps a whole page out of which only a single register will > be used for each node. > > The other alternative I considered was to create a syscon/simple-mfd node > and have the clock nodes as children. The gates would then use a regmap which > would only require a single entry in the pagetable for all clocks. I decided > against this option since the clock gate MMIO region actually isn't a > multi-function device. I would do a single node per mmio region with the register offset (or offset / 4) being the clock id. This can still support new SoCs easily if you have a fallback compatible. If you want/need to get all the clocks, just walk the DT 'clocks' properties and extract all the IDs. Rob
Hi, * Rob Herring <robh+dt@kernel.org> [210525 18:09]: > I would do a single node per mmio region with the register offset (or > offset / 4) being the clock id. This can still support new SoCs easily > if you have a fallback compatible. If you want/need to get all the > clocks, just walk the DT 'clocks' properties and extract all the IDs. I mostly agree.. Except I'd also leave out the artificial clock ID and just use real register offsets from the clock controller base instead. So a single clock controller node for each MMIO range, then set #clock=cells = <1>. Then the binding follows what we have for the interrupts-extended binding for example. If the clock controller optionally needs some data in the dts, that can be added to the clock controller node. Or it can be driver internal built-in data. If the data for dts can be described in a generic way, even better :) This would make the consumer interface look like below with a clock controller node and register offset from it: clocks = <&clock_controller1 0x1234>; Regards, Tony
On Tue, May 25, 2021, at 19:41, Rob Herring wrote: > On Mon, May 24, 2021 at 1:28 PM Sven Peter <sven@svenpeter.dev> wrote: > > > > Additionally, these clocks have a topology that is usually specified in Apple's > > Device Tree. As far as I can tell most Linux drivers seem to encode this topology > > directly in the source code though. Despite this, we would still like to encode > > the topology in the device tree itself: > > We only define leaf clocks primarily. There's some exceptions such as > if PLLs are a separate h/w block. The reason for this is because > typical SoCs have 100s of just leaf clocks. If we tried to model > everything, it would never be complete nor correct. Actually, we did > try that at first. Makes sense. For Apple SoCs specifically most topology seems to documented (with names!) in the arm-io/pmgr device node of their ADT. > > > Apple seems to only change HW blocks when they have a very good reason and even > > then they seem to keep the changes minimal. This specific clock gate MMIO block > > has already been used in the Apple A7 released in 2013. The only differences > > since then are the available clocks (which can be identified by a simple offset) > > and their topology. > > Clock gates are easy. What about muxes, dividers, etc.? These are preconfigured by iBoot long before we get control and I'm not sure we will be able to figure out how they work in detail. All we know are their frequencies and we will probably have to just treat them as fixed frequency clocks. Some details: In Apple's device tree, each node can have a "clock-gates" or "power-gates" property which refer to clock gates required to enable the peripheral. These are implemented with this small driver and required to make most of the hardware work. Most of these are turned off by the time we get control. Additionally, there is a "clocks" property that either refers to one of eight or so static clocks (starting at index 0) which are not backed by any configurable PLL, mux or divider. On top of that there is a list of what I believe to be mux/divider clocks (index >0x100) specified by the clock-frequencies and clock-frequencies-regs properties of the arm-io node. These registers are in a separate MMIO region and XNU only seems to use clock-frequencies to find the clock rate. It never even reads clock-frequencies-regs. While I can guess some of the bits of these registers (I believe I know the enable bit and the divisor field and possibly the input select but I do not know the inputs and believe they are actually different for each clock) I don't see a need to implement them right now. If we need them in the future they'd probably be a separate node anyway since they are all also located in an entirely different MMIO region and not related to these gates at all. > > > It's likely that Apple will still use this block in future SoCs as well. By > > encoding the clock gate topology inside the device tree as well we can use a > > single driver and only need updates to the device tree once they are released. > > This might allow end users to already run their favorite distribution by just > > updating the bootloader with a new device tree instead of having to wait until > > the new topology is integrated into their distro kernel. > > > > Additionally, the driver itself can be kept very simple and not much code needs > > to be duplicated and then updated for each new SoC between various consumers of > > these device tree nodes (Linux, u-boot, and OpenBSD for now). > > > > > > This series therefore creates a single device tree node for each clock gate. > > This unfortunately maps a whole page out of which only a single register will > > be used for each node. > > > > The other alternative I considered was to create a syscon/simple-mfd node > > and have the clock nodes as children. The gates would then use a regmap which > > would only require a single entry in the pagetable for all clocks. I decided > > against this option since the clock gate MMIO region actually isn't a > > multi-function device. > > I would do a single node per mmio region with the register offset (or > offset / 4) being the clock id. This can still support new SoCs easily > if you have a fallback compatible. If you want/need to get all the > clocks, just walk the DT 'clocks' properties and extract all the IDs. > The problem with that approach is that to enable e.g. UART_0 we actually need to enable its parents as well, e.g. the Apple Device Tree for the M1 has the following clock topology: UART0 (0x23b700270), parent: UART_P UART_P (0x23b700220), parent: SIO SIO (0x23b7001c0), parent: n/a The offsets and the parent/child relationship for all of these three clocks change between SoCs. If I now use the offset as the clock id I still need to specify that if e.g. UART uses <&clk_controller 0x270> I first need to enable 0x1c0 and then 0x220 and only then 0x270. I can hardcode this in the driver itself by just selecting the appropriate table depending on the compatible but I think we're then back to requiring a code change for new SoCs. Just using the fallback compatible won't be enough unless I introduce dummy nodes that enable UART_P and SIO as well. If I do it like this I'll also need two compatibles for the two MMIO regions since they have different child/parent clocks in there. Any suggestions for how to call those two then? Just t8103-gate-clock-0/1? I'd still prefer to somehow put this information into the device tree since then only a single compatible that will likely work across many SoC generations is required. I'm not sure how to do this though. Best, Sven
Hi, On Wed, May 26, 2021, at 09:18, Tony Lindgren wrote: > Hi, > > * Rob Herring <robh+dt@kernel.org> [210525 18:09]: > > I would do a single node per mmio region with the register offset (or > > offset / 4) being the clock id. This can still support new SoCs easily > > if you have a fallback compatible. If you want/need to get all the > > clocks, just walk the DT 'clocks' properties and extract all the IDs. > > I mostly agree.. Except I'd also leave out the artificial clock ID and > just use real register offsets from the clock controller base instead. Sure, I'll do that. > > So a single clock controller node for each MMIO range, then set > #clock=cells = <1>. Then the binding follows what we have for the > interrupts-extended binding for example. > > If the clock controller optionally needs some data in the dts, > that can be added to the clock controller node. Or it can be driver > internal built-in data. If the data for dts can be described in a > generic way, even better :) Now the big question is *how* to describe this additional data in the dts. Essentially I need to specify that e.g. to enable clock 0x270 I first need to enable the (internal) clocks 0x1c0 and then 0x220. Are you aware of any generic way to describe this? I'm not even sure how a sane non-generic way would look like when I just have a single clock controller node. Best, Sven
* Sven Peter <sven@svenpeter.dev> [210530 11:09]: > Hi, > > On Wed, May 26, 2021, at 09:18, Tony Lindgren wrote: > > Hi, > > > > * Rob Herring <robh+dt@kernel.org> [210525 18:09]: > > > I would do a single node per mmio region with the register offset (or > > > offset / 4) being the clock id. This can still support new SoCs easily > > > if you have a fallback compatible. If you want/need to get all the > > > clocks, just walk the DT 'clocks' properties and extract all the IDs. > > > > I mostly agree.. Except I'd also leave out the artificial clock ID and > > just use real register offsets from the clock controller base instead. > > Sure, I'll do that. > > > > > So a single clock controller node for each MMIO range, then set > > #clock=cells = <1>. Then the binding follows what we have for the > > interrupts-extended binding for example. > > > > If the clock controller optionally needs some data in the dts, > > that can be added to the clock controller node. Or it can be driver > > internal built-in data. If the data for dts can be described in a > > generic way, even better :) > > Now the big question is *how* to describe this additional data in the > dts. Essentially I need to specify that e.g. to enable clock 0x270 > I first need to enable the (internal) clocks 0x1c0 and then 0x220. > Are you aware of any generic way to describe this? I'm not even sure > how a sane non-generic way would look like when I just have a single > clock controller node. To me it seems you might be able to recycle the assigned-clocks and assigned-clock-parents etc properties in the clock controller node. Sure the assigned-clocks property will point to clocks in the clock controller itself, and will have tens of entries, but should work :) And sounds like you can generate that list with some script from the Apple dtb. Regards, Tony
* Sven Peter <sven@svenpeter.dev> [210530 11:11]: > The problem with that approach is that to enable e.g. UART_0 we actually need > to enable its parents as well, e.g. the Apple Device Tree for the M1 has the > following clock topology: > > UART0 (0x23b700270), parent: UART_P > UART_P (0x23b700220), parent: SIO > SIO (0x23b7001c0), parent: n/a > > The offsets and the parent/child relationship for all of these three clocks > change between SoCs. If I now use the offset as the clock id I still need > to specify that if e.g. UART uses <&clk_controller 0x270> I first need > to enable 0x1c0 and then 0x220 and only then 0x270. Maybe take a look what I suggested on using assigned-clocks and related properties in the clock controller node. That might solve the issue in a generic way for other SoCs too. Regards, Tony
Hi Tony, On Wed, Jun 2, 2021, at 11:26, Tony Lindgren wrote: > * Sven Peter <sven@svenpeter.dev> [210530 11:09]: > > > > Now the big question is *how* to describe this additional data in the > > dts. Essentially I need to specify that e.g. to enable clock 0x270 > > I first need to enable the (internal) clocks 0x1c0 and then 0x220. > > Are you aware of any generic way to describe this? I'm not even sure > > how a sane non-generic way would look like when I just have a single > > clock controller node. > > To me it seems you might be able to recycle the assigned-clocks and > assigned-clock-parents etc properties in the clock controller node. > > Sure the assigned-clocks property will point to clocks in the > clock controller itself, and will have tens of entries, but should > work :) Thanks for the suggestion, I really like that idea! If I understand the assigned-clocks-parent property correctly it's meant to select one of the possible parents of e.g. a mux clock at startup. Internally it just uses clk_set_parent which, unless I'm mistaken, would require to assign each clock as a possible parent of every other clock. Now the good news is that Apple seems to have sorted the clocks topologically on the bus, i.e. there never is a clock at address X that requires a parent with an address bigger than X. The bad news is that I'd probably still have to setup clocks at 0x0, 0x4, 0x8, ..., X-0x4 as possible parents of the clock at X. Another possibility this made me think of is to instead just use the clocks property the way it's usually used and simply refer to the controller itself, e.g. #define APPLE_CLK_UART0 0x270 #define APPLE_CLK_UART_P 0x220 #define APPLE_CLK_SIO 0x1c0 pmgr0: clock-controller@23b700000 { compatible = "apple,t8103-gate-clock"; #clock-cells = <1>; reg = <0x2 0x3b700000 0x0 0x4000>; clock-indices = <APPLE_CLK_SIO>, <APPLE_CLK_UART_P>, <APPLE_CLK_UART0>; clock-output-names = "clock-sio", "clock-uart-", "clock-uart0"; clocks = <&some_dummy_root_clock>, <&pmgr0 APPLE_CLK_SIO>, <&pmgr0 APPLE_CLK_UART_P>; }; to represent the UART clocks UART0 (0x23b700270), parent: UART_P UART_P (0x23b700220), parent: SIO SIO (0x23b7001c0), parent: n/a and then have the consumer as serial0: serial@235200000 { // ... clocks = <&pmgr0 APPLE_CLK_UART0>, <&clk24>; clock-names = "uart", "clk_uart_baud0"; // ... }; I'd have to see if it's possible to implement this sanely though if this binding was acceptable. The self-reference to a node that hasn't been fully initialized yet may turn out to be ugly. > > And sounds like you can generate that list with some script from the > Apple dtb. Yup, absolutely. I don't want to write this all out by hand if I can avoid that :-) > > Regards, > > Tony > Thanks, Sven
Hi, * Sven Peter <sven@svenpeter.dev> [210603 12:56]: > Another possibility this made me think of is to instead just use the clocks > property the way it's usually used and simply refer to the controller itself, e.g. > > #define APPLE_CLK_UART0 0x270 > #define APPLE_CLK_UART_P 0x220 > #define APPLE_CLK_SIO 0x1c0 > > pmgr0: clock-controller@23b700000 { > compatible = "apple,t8103-gate-clock"; > #clock-cells = <1>; > reg = <0x2 0x3b700000 0x0 0x4000>; > clock-indices = <APPLE_CLK_SIO>, <APPLE_CLK_UART_P>, <APPLE_CLK_UART0>; > clock-output-names = "clock-sio", "clock-uart-", "clock-uart0"; > clocks = <&some_dummy_root_clock>, <&pmgr0 APPLE_CLK_SIO>, > <&pmgr0 APPLE_CLK_UART_P>; > }; How about the following where you set up the gate clocks as separate child nodes: pmgr0: clock-controller@23b700000 { compatible = "apple,foo-clock-controller"; #clock-cells = <1>; reg = <0x2 0x3b700000 0x0 0x4000>; clk_uart0: clock@270 { compatible = "apple,t8103-gate-clock"; #clock-cells = <0>; assigned-clock-parents = <&pmgr0 APPLE_CLK_SIO>, <&pmgr0 APPLE_CLK_UART_P>; // ... }; }; Keep the clock controller still addressable by offset from base as discussed, and additionally have the driver parse and set up the child node clocks. Then I think the consumer driver can just do: serial0: serial@235200000 { // ... clocks = <&clk_uart0>, <&clk24>; clock-names = "uart", "clk_uart_baud0"; // ... }; Regards, Tony
Hi Tony, On Fri, Jun 4, 2021, at 09:43, Tony Lindgren wrote: > Hi, > > How about the following where you set up the gate clocks as separate > child nodes: > > pmgr0: clock-controller@23b700000 { > compatible = "apple,foo-clock-controller"; > #clock-cells = <1>; > reg = <0x2 0x3b700000 0x0 0x4000>; > > clk_uart0: clock@270 { > compatible = "apple,t8103-gate-clock"; > #clock-cells = <0>; > assigned-clock-parents = <&pmgr0 APPLE_CLK_SIO>, > <&pmgr0 APPLE_CLK_UART_P>; > // ... > }; > > }; > > Keep the clock controller still addressable by offset from base as discussed, > and additionally have the driver parse and set up the child node clocks. Nice, I like that one even better! We can keep the internal clocks "hidden" inside the parent node and only need to model the actual consumer clocks as separate nodes. Are you aware of any clock driver that implements something similar? I'd like to avoid reinventing the wheel if it's already been done before. > > Then I think the consumer driver can just do: > > serial0: serial@235200000 { > // ... > clocks = <&clk_uart0>, <&clk24>; > clock-names = "uart", "clk_uart_baud0"; > // ... > }; > > Regards, > > Tony > Best, Sven
* Sven Peter <sven@svenpeter.dev> [210605 12:13]: > Hi Tony, > > On Fri, Jun 4, 2021, at 09:43, Tony Lindgren wrote: > > Hi, > > > > How about the following where you set up the gate clocks as separate > > child nodes: > > > > pmgr0: clock-controller@23b700000 { > > compatible = "apple,foo-clock-controller"; > > #clock-cells = <1>; > > reg = <0x2 0x3b700000 0x0 0x4000>; > > > > clk_uart0: clock@270 { > > compatible = "apple,t8103-gate-clock"; > > #clock-cells = <0>; > > assigned-clock-parents = <&pmgr0 APPLE_CLK_SIO>, > > <&pmgr0 APPLE_CLK_UART_P>; > > // ... > > }; > > > > }; > > > > Keep the clock controller still addressable by offset from base as discussed, > > and additionally have the driver parse and set up the child node clocks. > > Nice, I like that one even better! We can keep the internal clocks "hidden" > inside the parent node and only need to model the actual consumer clocks > as separate nodes. I guess the child nodes could also use just a clocks property above instead of assigned-clock related properties if there are no configurable source clock mux registers. > Are you aware of any clock driver that implements something similar? > I'd like to avoid reinventing the wheel if it's already been done before. I'm only aware of a partial implementation so far :) The "offset from clock controller base" approach has worked well for the TI clkctrl clocks. The clkctrl gate clocks still have the SoC specific source clock data build into the clock driver(s). That's the Documentation/devicetree/bindings/clock/ti-clkctrl.txt binding. For the clkctrl clocks, the SoC specific source clock data is in drivers/clk/ti/clk-*.c files. With the Apple dtb describing the gate clock parents, you might be able to leave out most of the SoC specific built-in driver data sounds like. Regards, Tony