[2/4] of: DT quirks infrastructure

Message ID	1424271576-1952-3-git-send-email-pantelis.antoniou@konsulko.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> From: Pantelis Antoniou <pantelis.antoniou@konsulko.com> To: Grant Likely <grant.likely@secretlab.ca> Subject: [PATCH 2/4] of: DT quirks infrastructure Date: Wed, 18 Feb 2015 16:59:34 +0200 Message-Id: <1424271576-1952-3-git-send-email-pantelis.antoniou@konsulko.com> In-Reply-To: <1424271576-1952-1-git-send-email-pantelis.antoniou@konsulko.com> References: <1424271576-1952-1-git-send-email-pantelis.antoniou@konsulko.com> Cc: devicetree@vger.kernel.org, Tony Lindgren <tony@atomide.com>, Pantelis Antoniou <pantelis.antoniou@konsulko.com>, Koen Kooi <koen@dominion.thruhere.net>, Nicolas Ferre <nicolas.ferre@atmel.com>, linux-kernel@vger.kernel.org, Ludovic Desroches <ludovic.desroches@atmel.com>, linux-arm-kernel@lists.infradead.org, Pantelis Antoniou <panto@antoniou-consulting.com>, Matt Porter <matt.porter@linaro.org>, Guenter Roeck <linux@roeck-us.net> Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Pantelis Antoniou Feb. 18, 2015, 2:59 p.m. UTC

Implement a method of applying DT quirks early in the boot sequence.

A DT quirk is a subtree of the boot DT that can be applied to
a target in the base DT resulting in a modification of the live
tree. The format of the quirk nodes is that of a device tree overlay.

For details please refer to Documentation/devicetree/quirks.txt

Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
---
 Documentation/devicetree/quirks.txt | 101 ++++++++++
 drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
 include/linux/of.h                  |  16 ++
 3 files changed, 475 insertions(+)
 create mode 100644 Documentation/devicetree/quirks.txt

Mark Rutland Feb. 18, 2015, 3:41 p.m. UTC | #1

Hi Pantelis,

On Wed, Feb 18, 2015 at 02:59:34PM +0000, Pantelis Antoniou wrote:
> Implement a method of applying DT quirks early in the boot sequence.
>
> A DT quirk is a subtree of the boot DT that can be applied to
> a target in the base DT resulting in a modification of the live
> tree. The format of the quirk nodes is that of a device tree overlay.
>
> For details please refer to Documentation/devicetree/quirks.txt
>
> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
> ---
>  Documentation/devicetree/quirks.txt | 101 ++++++++++
>  drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
>  include/linux/of.h                  |  16 ++
>  3 files changed, 475 insertions(+)
>  create mode 100644 Documentation/devicetree/quirks.txt
>
> diff --git a/Documentation/devicetree/quirks.txt b/Documentation/devicetree/quirks.txt
> new file mode 100644
> index 0000000..789319a
> --- /dev/null
> +++ b/Documentation/devicetree/quirks.txt
> @@ -0,0 +1,101 @@
> +A Device Tree quirk is the way which allows modification of the
> +boot device tree under the control of a per-platform specific method.
> +
> +Take for instance the case of a board family that comprises of a
> +number of different board revisions, all being incremental changes
> +after an initial release.
> +
> +Since all board revisions must be supported via a single software image
> +the only way to support this scheme is by having a different DTB for each
> +revision with the bootloader selecting which one to use at boot time.

Not necessarily at boot time. The boards don't have to run the exact
same FW/bootloader binary, so the relevant DTB could be programmed onto
each board.

> +While this may in theory work, in practice it is very cumbersome
> +for the following reasons:
> +
> +1. The act of selecting a different boot device tree blob requires
> +a reasonably advanced bootloader with some kind of configuration or
> +scripting capabilities. Sadly this is not the case many times, the
> +bootloader is extremely dumb and can only use a single dt blob.

You can have several bootloader builds, or even a single build with
something like appended DTB to get an appropriate DTB if the same binary
will otherwise work across all variants of a board.

So it's not necessarily true that you need a complex bootloader.

> +2. On many instances boot time is extremely critical; in some cases
> +there are hard requirements like having working video feeds in under
> +2 seconds from power-up. This leaves an extremely small time budget for
> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> +is by removing the standard bootloader from the normal boot sequence
> +altogether by having a very small boot shim that loads the kernel and
> +immediately jumps to kernel, like falcon-boot mode in u-boot does.

Given my previous comments above I don't see why this is relevant.
You're already passing _some_ DTB here, so if you can organise for the
board to statically provide a sane DTB that's fine, or you can resort to
appended DTB if it's not possible to update the board configuration.

> +3. Having different DTBs/DTSs for different board revisions easily leads to
> +drift between versions. Since no developer is expected to have every single
> +board revision at hand, things are easy to get out of sync, with board versions
> +failing to boot even though the kernel is up to date.

I'm not sure I follow. Surely if you don't have every board revision at
hand you can't test quirks exhaustively either?

Additionally you face the problem that two boards of the same variant
could have different base DTBs that you would need to test that each
board's quirks worked for a range of base DTBs.

> +4. One final problem is the static way that device tree works.
> +For example it might be desirable for various boards to have a way to
> +selectively configure the boot device tree, possibly by the use of command
> +line options.  For instance a device might be disabled if a given command line
> +option is present, different configuration to various devices for debugging
> +purposes can be selected and so on. Currently the only way to do so is by
> +recompiling the DTS and installing, which is an chore for developers and
> +a completely unreasonable expectation from end-users.

I'm not sure I follow here.

Which devices do you envisage this being the case for?

Outside of debug scenarios when would you envisage we do this?

For the debug case it seems reasonable to have command line parameters
to get the kernel to do what we want. Just because there's a device in
the DTB that's useful in a debug scenario doesn't mean we have to use it
by default.

> +Device Tree quirks solve all those problems by having an in-kernel interface
> +which per-board/platform method can use to selectively modify the device tree
> +right after unflattening.
> +
> +A DT quirk is a subtree of the boot DT that can be applied to
> +a target in the base DT resulting in a modification of the live
> +tree. The format of the quirk nodes is that of a device tree overlay.
> +
> +As an example the following DTS contains a quirk.
> +
> +/ {
> +       foo: foo-node {
> +               bar = <10>;
> +       };
> +
> +       select-quirk = <&quirk>;
> +
> +       quirk: quirk {
> +               fragment@0 {
> +                       target = <&foo>;
> +                       __overlay {
> +                               bar = <0xf00>;
> +                               baz = <11>;
> +                       };
> +               };
> +       };
> +};
> +
> +The quirk when applied would result at the following tree:
> +
> +/ {
> +       foo: foo-node {
> +               bar = <0xf00>;
> +               baz = <11>;
> +       };
> +
> +       select-quirk = <&quirk>;
> +
> +       quirk: quirk {
> +               fragment@0 {
> +                       target = <&foo>;
> +                       __overlay {
> +                               bar = <0xf00>;
> +                               baz = <11>;
> +                       };
> +               };
> +       };
> +
> +};
> +
> +The two public method used to accomplish this are of_quirk_apply_by_node()
> +and of_quirk_apply_by_phandle();
> +
> +To apply the quirk, a per-platform method can retrieve the phandle from the
> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
> +
> +The method which the per-platform method is using to select the quirk to apply
> +is out of the scope of the DT quirk definition, but possible methods include
> +and are not limited to: revision encoding in a GPIO input range, board id
> +located in external or internal EEPROM or flash, DMI board ids, etc.

It seems rather unfortunate that to get a useful device tree we have to
resort to board-specific mechanisms. That means yet more platform code,
which is rather unfortunate. This would also require any other DT users
to understand and potentially have to sanitize any quirks (e.g. in the
case of Xen).

Mark.

Pantelis Antoniou Feb. 18, 2015, 3:53 p.m. UTC | #2

Hi Mark,

> On Feb 18, 2015, at 17:41 , Mark Rutland <mark.rutland@arm.com> wrote:
> 
> Hi Pantelis,
> 
> On Wed, Feb 18, 2015 at 02:59:34PM +0000, Pantelis Antoniou wrote:
>> Implement a method of applying DT quirks early in the boot sequence.
>> 
>> A DT quirk is a subtree of the boot DT that can be applied to
>> a target in the base DT resulting in a modification of the live
>> tree. The format of the quirk nodes is that of a device tree overlay.
>> 
>> For details please refer to Documentation/devicetree/quirks.txt
>> 
>> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
>> ---
>> Documentation/devicetree/quirks.txt | 101 ++++++++++
>> drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
>> include/linux/of.h                  |  16 ++
>> 3 files changed, 475 insertions(+)
>> create mode 100644 Documentation/devicetree/quirks.txt
>> 
>> diff --git a/Documentation/devicetree/quirks.txt b/Documentation/devicetree/quirks.txt
>> new file mode 100644
>> index 0000000..789319a
>> --- /dev/null
>> +++ b/Documentation/devicetree/quirks.txt
>> @@ -0,0 +1,101 @@
>> +A Device Tree quirk is the way which allows modification of the
>> +boot device tree under the control of a per-platform specific method.
>> +
>> +Take for instance the case of a board family that comprises of a
>> +number of different board revisions, all being incremental changes
>> +after an initial release.
>> +
>> +Since all board revisions must be supported via a single software image
>> +the only way to support this scheme is by having a different DTB for each
>> +revision with the bootloader selecting which one to use at boot time.
> 
> Not necessarily at boot time. The boards don't have to run the exact
> same FW/bootloader binary, so the relevant DTB could be programmed onto
> each board.
> 

That has not been the case in any kind of board I’ve worked with.

A special firmware image that requires a different programming step at
the factory to select the correct DTB for each is always one more thing
that can go wrong.

>> +While this may in theory work, in practice it is very cumbersome
>> +for the following reasons:
>> +
>> +1. The act of selecting a different boot device tree blob requires
>> +a reasonably advanced bootloader with some kind of configuration or
>> +scripting capabilities. Sadly this is not the case many times, the
>> +bootloader is extremely dumb and can only use a single dt blob.
> 
> You can have several bootloader builds, or even a single build with
> something like appended DTB to get an appropriate DTB if the same binary
> will otherwise work across all variants of a board.
> 

No, the same DTB will not work across all the variants of a board.

> So it's not necessarily true that you need a complex bootloader.
> 

>> +2. On many instances boot time is extremely critical; in some cases
>> +there are hard requirements like having working video feeds in under
>> +2 seconds from power-up. This leaves an extremely small time budget for
>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>> +is by removing the standard bootloader from the normal boot sequence
>> +altogether by having a very small boot shim that loads the kernel and
>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> 
> Given my previous comments above I don't see why this is relevant.
> You're already passing _some_ DTB here, so if you can organise for the
> board to statically provide a sane DTB that's fine, or you can resort to
> appended DTB if it's not possible to update the board configuration.
> 

You’re missing the point. I can’t use the same DTB for each revision of the
board. Each board is similar but it’s not identical.

>> +3. Having different DTBs/DTSs for different board revisions easily leads to
>> +drift between versions. Since no developer is expected to have every single
>> +board revision at hand, things are easy to get out of sync, with board versions
>> +failing to boot even though the kernel is up to date.
> 
> I'm not sure I follow. Surely if you don't have every board revision at
> hand you can't test quirks exhaustively either?
> 

It’s one less thing to worry about. For example in the current mainline kernel
already there is a drift between the beaglebone white and the beaglebone black.

Having the same DTS is just easier to keep things in sync.

> Additionally you face the problem that two boards of the same variant
> could have different base DTBs that you would need to test that each
> board's quirks worked for a range of base DTBs.
> 

This is not a valid case. This patch is about boards that have the same base DTB.

>> +4. One final problem is the static way that device tree works.
>> +For example it might be desirable for various boards to have a way to
>> +selectively configure the boot device tree, possibly by the use of command
>> +line options.  For instance a device might be disabled if a given command line
>> +option is present, different configuration to various devices for debugging
>> +purposes can be selected and so on. Currently the only way to do so is by
>> +recompiling the DTS and installing, which is an chore for developers and
>> +a completely unreasonable expectation from end-users.
> 
> I'm not sure I follow here.
> 
> Which devices do you envisage this being the case for?
> 
> Outside of debug scenarios when would you envisage we do this?
> 

We already have to do this on the beaglebone black. The onboard EMMC and HDMI
devices conflict with any capes that use the same pins. So you have to
have a way to disable them so that the attached cape will work.
 
> For the debug case it seems reasonable to have command line parameters
> to get the kernel to do what we want. Just because there's a device in
> the DTB that's useful in a debug scenario doesn't mean we have to use it
> by default.

I don’t follow. Users need this functionality to work. I.e. pass a command
line option to use different OPPs etc. Real world usage is messy.

> 
>> +Device Tree quirks solve all those problems by having an in-kernel interface
>> +which per-board/platform method can use to selectively modify the device tree
>> +right after unflattening.
>> +
>> +A DT quirk is a subtree of the boot DT that can be applied to
>> +a target in the base DT resulting in a modification of the live
>> +tree. The format of the quirk nodes is that of a device tree overlay.
>> +
>> +As an example the following DTS contains a quirk.
>> +
>> +/ {
>> +       foo: foo-node {
>> +               bar = <10>;
>> +       };
>> +
>> +       select-quirk = <&quirk>;
>> +
>> +       quirk: quirk {
>> +               fragment@0 {
>> +                       target = <&foo>;
>> +                       __overlay {
>> +                               bar = <0xf00>;
>> +                               baz = <11>;
>> +                       };
>> +               };
>> +       };
>> +};
>> +
>> +The quirk when applied would result at the following tree:
>> +
>> +/ {
>> +       foo: foo-node {
>> +               bar = <0xf00>;
>> +               baz = <11>;
>> +       };
>> +
>> +       select-quirk = <&quirk>;
>> +
>> +       quirk: quirk {
>> +               fragment@0 {
>> +                       target = <&foo>;
>> +                       __overlay {
>> +                               bar = <0xf00>;
>> +                               baz = <11>;
>> +                       };
>> +               };
>> +       };
>> +
>> +};
>> +
>> +The two public method used to accomplish this are of_quirk_apply_by_node()
>> +and of_quirk_apply_by_phandle();
>> +
>> +To apply the quirk, a per-platform method can retrieve the phandle from the
>> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
>> +
>> +The method which the per-platform method is using to select the quirk to apply
>> +is out of the scope of the DT quirk definition, but possible methods include
>> +and are not limited to: revision encoding in a GPIO input range, board id
>> +located in external or internal EEPROM or flash, DMI board ids, etc.
> 
> It seems rather unfortunate that to get a useful device tree we have to
> resort to board-specific mechanisms. That means yet more platform code,
> which is rather unfortunate. This would also require any other DT users
> to understand and potentially have to sanitize any quirks (e.g. in the
> case of Xen).

The original internal version of the patches used early platform devices and
a generic firmware quirk mechanism, but I was directed to the per-platform
method instead. It is perfectly doable to go back at the original implementation
but I need to get the ball rolling with a discussion about the internals.
 
> 
> Mark.

Regards

— Pantelis

Ludovic Desroches Feb. 18, 2015, 4:32 p.m. UTC | #3

Hi,

Great something we are waiting for a long time!

On Wed, Feb 18, 2015 at 05:53:50PM +0200, Pantelis Antoniou wrote:
> Hi Mark,
> 
> > On Feb 18, 2015, at 17:41 , Mark Rutland <mark.rutland@arm.com> wrote:
> > 
> > Hi Pantelis,
> > 
> > On Wed, Feb 18, 2015 at 02:59:34PM +0000, Pantelis Antoniou wrote:
> >> Implement a method of applying DT quirks early in the boot sequence.
> >> 
> >> A DT quirk is a subtree of the boot DT that can be applied to
> >> a target in the base DT resulting in a modification of the live
> >> tree. The format of the quirk nodes is that of a device tree overlay.
> >> 
> >> For details please refer to Documentation/devicetree/quirks.txt
> >> 
> >> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
> >> ---
> >> Documentation/devicetree/quirks.txt | 101 ++++++++++
> >> drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
> >> include/linux/of.h                  |  16 ++
> >> 3 files changed, 475 insertions(+)
> >> create mode 100644 Documentation/devicetree/quirks.txt
> >> 
> >> diff --git a/Documentation/devicetree/quirks.txt b/Documentation/devicetree/quirks.txt
> >> new file mode 100644
> >> index 0000000..789319a
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/quirks.txt
> >> @@ -0,0 +1,101 @@
> >> +A Device Tree quirk is the way which allows modification of the
> >> +boot device tree under the control of a per-platform specific method.
> >> +
> >> +Take for instance the case of a board family that comprises of a
> >> +number of different board revisions, all being incremental changes
> >> +after an initial release.
> >> +
> >> +Since all board revisions must be supported via a single software image
> >> +the only way to support this scheme is by having a different DTB for each
> >> +revision with the bootloader selecting which one to use at boot time.
> > 
> > Not necessarily at boot time. The boards don't have to run the exact
> > same FW/bootloader binary, so the relevant DTB could be programmed onto
> > each board.
> > 
> 
> That has not been the case in any kind of board I’ve worked with.
> 
> A special firmware image that requires a different programming step at
> the factory to select the correct DTB for each is always one more thing
> that can go wrong.
> 

I agree. We have boards with several display modules, even if it seems
quite easy to know which dtb has to be loaded since we use a prefix
describing the display module (_pda4, _pda7, etc.) it is a pain for
customers. Moreover you can add the revision of the board, we have a
mother board and a cpu module so it can quickly lead to something like
this:
at91-sama5d31ek_mb-revc_cm-revd_pda7.

It is only an example, at the moment it is a bit less complicated but I
am not so far from the reality: sama5d31ek_revc_pda7.dts,
sama5d33ek_revc_pda4.dts, etc. For a SoC family, we have 27 DTS files...

As for the single zImage, we should find a way to have a single DTB.

> >> +While this may in theory work, in practice it is very cumbersome
> >> +for the following reasons:
> >> +
> >> +1. The act of selecting a different boot device tree blob requires
> >> +a reasonably advanced bootloader with some kind of configuration or
> >> +scripting capabilities. Sadly this is not the case many times, the
> >> +bootloader is extremely dumb and can only use a single dt blob.
> > 
> > You can have several bootloader builds, or even a single build with
> > something like appended DTB to get an appropriate DTB if the same binary
> > will otherwise work across all variants of a board.
> > 
> 
> No, the same DTB will not work across all the variants of a board.
> 
> > So it's not necessarily true that you need a complex bootloader.
> > 
> 
> >> +2. On many instances boot time is extremely critical; in some cases
> >> +there are hard requirements like having working video feeds in under
> >> +2 seconds from power-up. This leaves an extremely small time budget for
> >> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >> +is by removing the standard bootloader from the normal boot sequence
> >> +altogether by having a very small boot shim that loads the kernel and
> >> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> > 
> > Given my previous comments above I don't see why this is relevant.
> > You're already passing _some_ DTB here, so if you can organise for the
> > board to statically provide a sane DTB that's fine, or you can resort to
> > appended DTB if it's not possible to update the board configuration.
> > 
> 
> You’re missing the point. I can’t use the same DTB for each revision of the
> board. Each board is similar but it’s not identical.
> 
> >> +3. Having different DTBs/DTSs for different board revisions easily leads to
> >> +drift between versions. Since no developer is expected to have every single
> >> +board revision at hand, things are easy to get out of sync, with board versions
> >> +failing to boot even though the kernel is up to date.
> > 
> > I'm not sure I follow. Surely if you don't have every board revision at
> > hand you can't test quirks exhaustively either?
> > 
> 
> It’s one less thing to worry about. For example in the current mainline kernel
> already there is a drift between the beaglebone white and the beaglebone black.
> 
> Having the same DTS is just easier to keep things in sync.
> 
> > Additionally you face the problem that two boards of the same variant
> > could have different base DTBs that you would need to test that each
> > board's quirks worked for a range of base DTBs.
> > 
> 
> This is not a valid case. This patch is about boards that have the same base DTB.
> 
> >> +4. One final problem is the static way that device tree works.
> >> +For example it might be desirable for various boards to have a way to
> >> +selectively configure the boot device tree, possibly by the use of command
> >> +line options.  For instance a device might be disabled if a given command line
> >> +option is present, different configuration to various devices for debugging
> >> +purposes can be selected and so on. Currently the only way to do so is by
> >> +recompiling the DTS and installing, which is an chore for developers and
> >> +a completely unreasonable expectation from end-users.
> > 
> > I'm not sure I follow here.
> > 
> > Which devices do you envisage this being the case for?
> > 
> > Outside of debug scenarios when would you envisage we do this?
> > 
> 
> We already have to do this on the beaglebone black. The onboard EMMC and HDMI
> devices conflict with any capes that use the same pins. So you have to
> have a way to disable them so that the attached cape will work.
>  
> > For the debug case it seems reasonable to have command line parameters
> > to get the kernel to do what we want. Just because there's a device in
> > the DTB that's useful in a debug scenario doesn't mean we have to use it
> > by default.
> 
> I don’t follow. Users need this functionality to work. I.e. pass a command
> line option to use different OPPs etc. Real world usage is messy.
> 
> > 
> >> +Device Tree quirks solve all those problems by having an in-kernel interface
> >> +which per-board/platform method can use to selectively modify the device tree
> >> +right after unflattening.
> >> +
> >> +A DT quirk is a subtree of the boot DT that can be applied to
> >> +a target in the base DT resulting in a modification of the live
> >> +tree. The format of the quirk nodes is that of a device tree overlay.
> >> +
> >> +As an example the following DTS contains a quirk.
> >> +
> >> +/ {
> >> +       foo: foo-node {
> >> +               bar = <10>;
> >> +       };
> >> +
> >> +       select-quirk = <&quirk>;
> >> +
> >> +       quirk: quirk {
> >> +               fragment@0 {
> >> +                       target = <&foo>;
> >> +                       __overlay {
> >> +                               bar = <0xf00>;
> >> +                               baz = <11>;
> >> +                       };
> >> +               };
> >> +       };
> >> +};
> >> +
> >> +The quirk when applied would result at the following tree:
> >> +
> >> +/ {
> >> +       foo: foo-node {
> >> +               bar = <0xf00>;
> >> +               baz = <11>;
> >> +       };
> >> +
> >> +       select-quirk = <&quirk>;
> >> +
> >> +       quirk: quirk {
> >> +               fragment@0 {
> >> +                       target = <&foo>;
> >> +                       __overlay {
> >> +                               bar = <0xf00>;
> >> +                               baz = <11>;
> >> +                       };
> >> +               };
> >> +       };
> >> +
> >> +};
> >> +
> >> +The two public method used to accomplish this are of_quirk_apply_by_node()
> >> +and of_quirk_apply_by_phandle();
> >> +
> >> +To apply the quirk, a per-platform method can retrieve the phandle from the
> >> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
> >> +
> >> +The method which the per-platform method is using to select the quirk to apply
> >> +is out of the scope of the DT quirk definition, but possible methods include
> >> +and are not limited to: revision encoding in a GPIO input range, board id
> >> +located in external or internal EEPROM or flash, DMI board ids, etc.
> > 
> > It seems rather unfortunate that to get a useful device tree we have to
> > resort to board-specific mechanisms. That means yet more platform code,
> > which is rather unfortunate. This would also require any other DT users
> > to understand and potentially have to sanitize any quirks (e.g. in the
> > case of Xen).
> 
> The original internal version of the patches used early platform devices and
> a generic firmware quirk mechanism, but I was directed to the per-platform
> method instead. It is perfectly doable to go back at the original implementation
> but I need to get the ball rolling with a discussion about the internals.

I also think we should used early platform devices to not add platform
specific code. What were the cons to swith to per-platform method?

>  
> > 
> > Mark.
> 
> Regards
> 
> — Pantelis
> 

Regards

Ludovic

Pantelis Antoniou Feb. 18, 2015, 4:39 p.m. UTC | #4

Hi Ludovic,

> On Feb 18, 2015, at 18:32 , Ludovic Desroches <ludovic.desroches@atmel.com> wrote:
> 
> Hi,
> 
> Great something we are waiting for a long time!
> 
> On Wed, Feb 18, 2015 at 05:53:50PM +0200, Pantelis Antoniou wrote:
>> Hi Mark,
>> 
>>> On Feb 18, 2015, at 17:41 , Mark Rutland <mark.rutland@arm.com> wrote:
>>> 
>>> Hi Pantelis,
>>> 
>>> On Wed, Feb 18, 2015 at 02:59:34PM +0000, Pantelis Antoniou wrote:
>>>> Implement a method of applying DT quirks early in the boot sequence.
>>>> 
>>>> A DT quirk is a subtree of the boot DT that can be applied to
>>>> a target in the base DT resulting in a modification of the live
>>>> tree. The format of the quirk nodes is that of a device tree overlay.
>>>> 
>>>> For details please refer to Documentation/devicetree/quirks.txt
>>>> 
>>>> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
>>>> ---
>>>> Documentation/devicetree/quirks.txt | 101 ++++++++++
>>>> drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
>>>> include/linux/of.h                  |  16 ++
>>>> 3 files changed, 475 insertions(+)
>>>> create mode 100644 Documentation/devicetree/quirks.txt
>>>> 
>>>> diff --git a/Documentation/devicetree/quirks.txt b/Documentation/devicetree/quirks.txt
>>>> new file mode 100644
>>>> index 0000000..789319a
>>>> --- /dev/null
>>>> +++ b/Documentation/devicetree/quirks.txt
>>>> @@ -0,0 +1,101 @@
>>>> +A Device Tree quirk is the way which allows modification of the
>>>> +boot device tree under the control of a per-platform specific method.
>>>> +
>>>> +Take for instance the case of a board family that comprises of a
>>>> +number of different board revisions, all being incremental changes
>>>> +after an initial release.
>>>> +
>>>> +Since all board revisions must be supported via a single software image
>>>> +the only way to support this scheme is by having a different DTB for each
>>>> +revision with the bootloader selecting which one to use at boot time.
>>> 
>>> Not necessarily at boot time. The boards don't have to run the exact
>>> same FW/bootloader binary, so the relevant DTB could be programmed onto
>>> each board.
>>> 
>> 
>> That has not been the case in any kind of board I’ve worked with.
>> 
>> A special firmware image that requires a different programming step at
>> the factory to select the correct DTB for each is always one more thing
>> that can go wrong.
>> 
> 
> I agree. We have boards with several display modules, even if it seems
> quite easy to know which dtb has to be loaded since we use a prefix
> describing the display module (_pda4, _pda7, etc.) it is a pain for
> customers. Moreover you can add the revision of the board, we have a
> mother board and a cpu module so it can quickly lead to something like
> this:
> at91-sama5d31ek_mb-revc_cm-revd_pda7.
> 
> It is only an example, at the moment it is a bit less complicated but I
> am not so far from the reality: sama5d31ek_revc_pda7.dts,
> sama5d33ek_revc_pda4.dts, etc. For a SoC family, we have 27 DTS files…
> 

I bet it’s easy to screw up no? Any horror stories from the factory floor? :)

> As for the single zImage, we should find a way to have a single DTB.
> 
>>>> +While this may in theory work, in practice it is very cumbersome
>>>> +for the following reasons:
>>>> +
>>>> +1. The act of selecting a different boot device tree blob requires
>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>> 
>>> You can have several bootloader builds, or even a single build with
>>> something like appended DTB to get an appropriate DTB if the same binary
>>> will otherwise work across all variants of a board.
>>> 
>> 
>> No, the same DTB will not work across all the variants of a board.
>> 
>>> So it's not necessarily true that you need a complex bootloader.
>>> 
>> 
>>>> +2. On many instances boot time is extremely critical; in some cases
>>>> +there are hard requirements like having working video feeds in under
>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>> +is by removing the standard bootloader from the normal boot sequence
>>>> +altogether by having a very small boot shim that loads the kernel and
>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>> 
>>> Given my previous comments above I don't see why this is relevant.
>>> You're already passing _some_ DTB here, so if you can organise for the
>>> board to statically provide a sane DTB that's fine, or you can resort to
>>> appended DTB if it's not possible to update the board configuration.
>>> 
>> 
>> You’re missing the point. I can’t use the same DTB for each revision of the
>> board. Each board is similar but it’s not identical.
>> 
>>>> +3. Having different DTBs/DTSs for different board revisions easily leads to
>>>> +drift between versions. Since no developer is expected to have every single
>>>> +board revision at hand, things are easy to get out of sync, with board versions
>>>> +failing to boot even though the kernel is up to date.
>>> 
>>> I'm not sure I follow. Surely if you don't have every board revision at
>>> hand you can't test quirks exhaustively either?
>>> 
>> 
>> It’s one less thing to worry about. For example in the current mainline kernel
>> already there is a drift between the beaglebone white and the beaglebone black.
>> 
>> Having the same DTS is just easier to keep things in sync.
>> 
>>> Additionally you face the problem that two boards of the same variant
>>> could have different base DTBs that you would need to test that each
>>> board's quirks worked for a range of base DTBs.
>>> 
>> 
>> This is not a valid case. This patch is about boards that have the same base DTB.
>> 
>>>> +4. One final problem is the static way that device tree works.
>>>> +For example it might be desirable for various boards to have a way to
>>>> +selectively configure the boot device tree, possibly by the use of command
>>>> +line options.  For instance a device might be disabled if a given command line
>>>> +option is present, different configuration to various devices for debugging
>>>> +purposes can be selected and so on. Currently the only way to do so is by
>>>> +recompiling the DTS and installing, which is an chore for developers and
>>>> +a completely unreasonable expectation from end-users.
>>> 
>>> I'm not sure I follow here.
>>> 
>>> Which devices do you envisage this being the case for?
>>> 
>>> Outside of debug scenarios when would you envisage we do this?
>>> 
>> 
>> We already have to do this on the beaglebone black. The onboard EMMC and HDMI
>> devices conflict with any capes that use the same pins. So you have to
>> have a way to disable them so that the attached cape will work.
>> 
>>> For the debug case it seems reasonable to have command line parameters
>>> to get the kernel to do what we want. Just because there's a device in
>>> the DTB that's useful in a debug scenario doesn't mean we have to use it
>>> by default.
>> 
>> I don’t follow. Users need this functionality to work. I.e. pass a command
>> line option to use different OPPs etc. Real world usage is messy.
>> 
>>> 
>>>> +Device Tree quirks solve all those problems by having an in-kernel interface
>>>> +which per-board/platform method can use to selectively modify the device tree
>>>> +right after unflattening.
>>>> +
>>>> +A DT quirk is a subtree of the boot DT that can be applied to
>>>> +a target in the base DT resulting in a modification of the live
>>>> +tree. The format of the quirk nodes is that of a device tree overlay.
>>>> +
>>>> +As an example the following DTS contains a quirk.
>>>> +
>>>> +/ {
>>>> +       foo: foo-node {
>>>> +               bar = <10>;
>>>> +       };
>>>> +
>>>> +       select-quirk = <&quirk>;
>>>> +
>>>> +       quirk: quirk {
>>>> +               fragment@0 {
>>>> +                       target = <&foo>;
>>>> +                       __overlay {
>>>> +                               bar = <0xf00>;
>>>> +                               baz = <11>;
>>>> +                       };
>>>> +               };
>>>> +       };
>>>> +};
>>>> +
>>>> +The quirk when applied would result at the following tree:
>>>> +
>>>> +/ {
>>>> +       foo: foo-node {
>>>> +               bar = <0xf00>;
>>>> +               baz = <11>;
>>>> +       };
>>>> +
>>>> +       select-quirk = <&quirk>;
>>>> +
>>>> +       quirk: quirk {
>>>> +               fragment@0 {
>>>> +                       target = <&foo>;
>>>> +                       __overlay {
>>>> +                               bar = <0xf00>;
>>>> +                               baz = <11>;
>>>> +                       };
>>>> +               };
>>>> +       };
>>>> +
>>>> +};
>>>> +
>>>> +The two public method used to accomplish this are of_quirk_apply_by_node()
>>>> +and of_quirk_apply_by_phandle();
>>>> +
>>>> +To apply the quirk, a per-platform method can retrieve the phandle from the
>>>> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
>>>> +
>>>> +The method which the per-platform method is using to select the quirk to apply
>>>> +is out of the scope of the DT quirk definition, but possible methods include
>>>> +and are not limited to: revision encoding in a GPIO input range, board id
>>>> +located in external or internal EEPROM or flash, DMI board ids, etc.
>>> 
>>> It seems rather unfortunate that to get a useful device tree we have to
>>> resort to board-specific mechanisms. That means yet more platform code,
>>> which is rather unfortunate. This would also require any other DT users
>>> to understand and potentially have to sanitize any quirks (e.g. in the
>>> case of Xen).
>> 
>> The original internal version of the patches used early platform devices and
>> a generic firmware quirk mechanism, but I was directed to the per-platform
>> method instead. It is perfectly doable to go back at the original implementation
>> but I need to get the ball rolling with a discussion about the internals.
> 
> I also think we should used early platform devices to not add platform
> specific code. What were the cons to swith to per-platform method?
> 

Supposedly easier to accept. /me shrugs

>> 
>>> 
>>> Mark.
>> 
>> Regards
>> 
>> — Pantelis
>> 
> 
> Regards
> 
> Ludovic

Regards

— Pantelis

Matt Porter Feb. 18, 2015, 4:46 p.m. UTC | #5

On Wed, Feb 18, 2015 at 05:53:50PM +0200, Pantelis Antoniou wrote:
> Hi Mark,
> 
> > On Feb 18, 2015, at 17:41 , Mark Rutland <mark.rutland@arm.com> wrote:
> > 
> > Hi Pantelis,
> > 
> > On Wed, Feb 18, 2015 at 02:59:34PM +0000, Pantelis Antoniou wrote:
> >> Implement a method of applying DT quirks early in the boot sequence.
> >> 
> >> A DT quirk is a subtree of the boot DT that can be applied to
> >> a target in the base DT resulting in a modification of the live
> >> tree. The format of the quirk nodes is that of a device tree overlay.
> >> 
> >> For details please refer to Documentation/devicetree/quirks.txt
> >> 
> >> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
> >> ---
> >> Documentation/devicetree/quirks.txt | 101 ++++++++++
> >> drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
> >> include/linux/of.h                  |  16 ++
> >> 3 files changed, 475 insertions(+)
> >> create mode 100644 Documentation/devicetree/quirks.txt
> >> 
> >> diff --git a/Documentation/devicetree/quirks.txt b/Documentation/devicetree/quirks.txt
> >> new file mode 100644
> >> index 0000000..789319a
> >> --- /dev/null
> >> +++ b/Documentation/devicetree/quirks.txt
> >> @@ -0,0 +1,101 @@
> >> +A Device Tree quirk is the way which allows modification of the
> >> +boot device tree under the control of a per-platform specific method.
> >> +
> >> +Take for instance the case of a board family that comprises of a
> >> +number of different board revisions, all being incremental changes
> >> +after an initial release.
> >> +
> >> +Since all board revisions must be supported via a single software image
> >> +the only way to support this scheme is by having a different DTB for each
> >> +revision with the bootloader selecting which one to use at boot time.
> > 
> > Not necessarily at boot time. The boards don't have to run the exact
> > same FW/bootloader binary, so the relevant DTB could be programmed onto
> > each board.
> > 
> 
> That has not been the case in any kind of board I’ve worked with.
> 
> A special firmware image that requires a different programming step at
> the factory to select the correct DTB for each is always one more thing
> that can go wrong.
> 
> >> +While this may in theory work, in practice it is very cumbersome
> >> +for the following reasons:
> >> +
> >> +1. The act of selecting a different boot device tree blob requires
> >> +a reasonably advanced bootloader with some kind of configuration or
> >> +scripting capabilities. Sadly this is not the case many times, the
> >> +bootloader is extremely dumb and can only use a single dt blob.
> > 
> > You can have several bootloader builds, or even a single build with
> > something like appended DTB to get an appropriate DTB if the same binary
> > will otherwise work across all variants of a board.
> > 
> 
> No, the same DTB will not work across all the variants of a board.
> 
> > So it's not necessarily true that you need a complex bootloader.
> > 
> 
> >> +2. On many instances boot time is extremely critical; in some cases
> >> +there are hard requirements like having working video feeds in under
> >> +2 seconds from power-up. This leaves an extremely small time budget for
> >> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >> +is by removing the standard bootloader from the normal boot sequence
> >> +altogether by having a very small boot shim that loads the kernel and
> >> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> > 
> > Given my previous comments above I don't see why this is relevant.
> > You're already passing _some_ DTB here, so if you can organise for the
> > board to statically provide a sane DTB that's fine, or you can resort to
> > appended DTB if it's not possible to update the board configuration.
> > 
> 
> You’re missing the point. I can’t use the same DTB for each revision of the
> board. Each board is similar but it’s not identical.
> 
> >> +3. Having different DTBs/DTSs for different board revisions easily leads to
> >> +drift between versions. Since no developer is expected to have every single
> >> +board revision at hand, things are easy to get out of sync, with board versions
> >> +failing to boot even though the kernel is up to date.
> > 
> > I'm not sure I follow. Surely if you don't have every board revision at
> > hand you can't test quirks exhaustively either?
> > 
> 
> It’s one less thing to worry about. For example in the current mainline kernel
> already there is a drift between the beaglebone white and the beaglebone black.
> 
> Having the same DTS is just easier to keep things in sync.

Absolutely, we have the problem in the dts files today where we have
lots of duplicated bits. I know one case that I think you are alluding
to is how BBW has the aes and sham enabled but BBB does not. That's with
just two variants. :( This would definitely drive better organization
of dtses and cure a lot of bitrot if they could share a common file.
OTOH, some might argue that the bone common dtsi should just enable those
nodes staticly for this case.

-Matt

Ludovic Desroches Feb. 18, 2015, 4:47 p.m. UTC | #6

On Wed, Feb 18, 2015 at 06:39:01PM +0200, Pantelis Antoniou wrote:
> Hi Ludovic,
> 
> > On Feb 18, 2015, at 18:32 , Ludovic Desroches <ludovic.desroches@atmel.com> wrote:
> > 
> > Hi,
> > 
> > Great something we are waiting for a long time!
> > 
> > On Wed, Feb 18, 2015 at 05:53:50PM +0200, Pantelis Antoniou wrote:
> >> Hi Mark,
> >> 
> >>> On Feb 18, 2015, at 17:41 , Mark Rutland <mark.rutland@arm.com> wrote:
> >>> 
> >>> Hi Pantelis,
> >>> 
> >>> On Wed, Feb 18, 2015 at 02:59:34PM +0000, Pantelis Antoniou wrote:
> >>>> Implement a method of applying DT quirks early in the boot sequence.
> >>>> 
> >>>> A DT quirk is a subtree of the boot DT that can be applied to
> >>>> a target in the base DT resulting in a modification of the live
> >>>> tree. The format of the quirk nodes is that of a device tree overlay.
> >>>> 
> >>>> For details please refer to Documentation/devicetree/quirks.txt
> >>>> 
> >>>> Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
> >>>> ---
> >>>> Documentation/devicetree/quirks.txt | 101 ++++++++++
> >>>> drivers/of/dynamic.c                | 358 ++++++++++++++++++++++++++++++++++++
> >>>> include/linux/of.h                  |  16 ++
> >>>> 3 files changed, 475 insertions(+)
> >>>> create mode 100644 Documentation/devicetree/quirks.txt
> >>>> 
> >>>> diff --git a/Documentation/devicetree/quirks.txt b/Documentation/devicetree/quirks.txt
> >>>> new file mode 100644
> >>>> index 0000000..789319a
> >>>> --- /dev/null
> >>>> +++ b/Documentation/devicetree/quirks.txt
> >>>> @@ -0,0 +1,101 @@
> >>>> +A Device Tree quirk is the way which allows modification of the
> >>>> +boot device tree under the control of a per-platform specific method.
> >>>> +
> >>>> +Take for instance the case of a board family that comprises of a
> >>>> +number of different board revisions, all being incremental changes
> >>>> +after an initial release.
> >>>> +
> >>>> +Since all board revisions must be supported via a single software image
> >>>> +the only way to support this scheme is by having a different DTB for each
> >>>> +revision with the bootloader selecting which one to use at boot time.
> >>> 
> >>> Not necessarily at boot time. The boards don't have to run the exact
> >>> same FW/bootloader binary, so the relevant DTB could be programmed onto
> >>> each board.
> >>> 
> >> 
> >> That has not been the case in any kind of board I’ve worked with.
> >> 
> >> A special firmware image that requires a different programming step at
> >> the factory to select the correct DTB for each is always one more thing
> >> that can go wrong.
> >> 
> > 
> > I agree. We have boards with several display modules, even if it seems
> > quite easy to know which dtb has to be loaded since we use a prefix
> > describing the display module (_pda4, _pda7, etc.) it is a pain for
> > customers. Moreover you can add the revision of the board, we have a
> > mother board and a cpu module so it can quickly lead to something like
> > this:
> > at91-sama5d31ek_mb-revc_cm-revd_pda7.
> > 
> > It is only an example, at the moment it is a bit less complicated but I
> > am not so far from the reality: sama5d31ek_revc_pda7.dts,
> > sama5d33ek_revc_pda4.dts, etc. For a SoC family, we have 27 DTS files…
> > 
> 
> I bet it’s easy to screw up no? Any horror stories from the factory floor? :)

Yes it is.

I keep these horror stories for Halloween! At the moment no incident because
we are trying hard to find workarounds with the help of our
In-system programmer or U-boot but the goal is to not rely on a particular
component to do this job.

> 
> > As for the single zImage, we should find a way to have a single DTB.
> > 
> >>>> +While this may in theory work, in practice it is very cumbersome
> >>>> +for the following reasons:
> >>>> +
> >>>> +1. The act of selecting a different boot device tree blob requires
> >>>> +a reasonably advanced bootloader with some kind of configuration or
> >>>> +scripting capabilities. Sadly this is not the case many times, the
> >>>> +bootloader is extremely dumb and can only use a single dt blob.
> >>> 
> >>> You can have several bootloader builds, or even a single build with
> >>> something like appended DTB to get an appropriate DTB if the same binary
> >>> will otherwise work across all variants of a board.
> >>> 
> >> 
> >> No, the same DTB will not work across all the variants of a board.
> >> 
> >>> So it's not necessarily true that you need a complex bootloader.
> >>> 
> >> 
> >>>> +2. On many instances boot time is extremely critical; in some cases
> >>>> +there are hard requirements like having working video feeds in under
> >>>> +2 seconds from power-up. This leaves an extremely small time budget for
> >>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >>>> +is by removing the standard bootloader from the normal boot sequence
> >>>> +altogether by having a very small boot shim that loads the kernel and
> >>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> >>> 
> >>> Given my previous comments above I don't see why this is relevant.
> >>> You're already passing _some_ DTB here, so if you can organise for the
> >>> board to statically provide a sane DTB that's fine, or you can resort to
> >>> appended DTB if it's not possible to update the board configuration.
> >>> 
> >> 
> >> You’re missing the point. I can’t use the same DTB for each revision of the
> >> board. Each board is similar but it’s not identical.
> >> 
> >>>> +3. Having different DTBs/DTSs for different board revisions easily leads to
> >>>> +drift between versions. Since no developer is expected to have every single
> >>>> +board revision at hand, things are easy to get out of sync, with board versions
> >>>> +failing to boot even though the kernel is up to date.
> >>> 
> >>> I'm not sure I follow. Surely if you don't have every board revision at
> >>> hand you can't test quirks exhaustively either?
> >>> 
> >> 
> >> It’s one less thing to worry about. For example in the current mainline kernel
> >> already there is a drift between the beaglebone white and the beaglebone black.
> >> 
> >> Having the same DTS is just easier to keep things in sync.
> >> 
> >>> Additionally you face the problem that two boards of the same variant
> >>> could have different base DTBs that you would need to test that each
> >>> board's quirks worked for a range of base DTBs.
> >>> 
> >> 
> >> This is not a valid case. This patch is about boards that have the same base DTB.
> >> 
> >>>> +4. One final problem is the static way that device tree works.
> >>>> +For example it might be desirable for various boards to have a way to
> >>>> +selectively configure the boot device tree, possibly by the use of command
> >>>> +line options.  For instance a device might be disabled if a given command line
> >>>> +option is present, different configuration to various devices for debugging
> >>>> +purposes can be selected and so on. Currently the only way to do so is by
> >>>> +recompiling the DTS and installing, which is an chore for developers and
> >>>> +a completely unreasonable expectation from end-users.
> >>> 
> >>> I'm not sure I follow here.
> >>> 
> >>> Which devices do you envisage this being the case for?
> >>> 
> >>> Outside of debug scenarios when would you envisage we do this?
> >>> 
> >> 
> >> We already have to do this on the beaglebone black. The onboard EMMC and HDMI
> >> devices conflict with any capes that use the same pins. So you have to
> >> have a way to disable them so that the attached cape will work.
> >> 
> >>> For the debug case it seems reasonable to have command line parameters
> >>> to get the kernel to do what we want. Just because there's a device in
> >>> the DTB that's useful in a debug scenario doesn't mean we have to use it
> >>> by default.
> >> 
> >> I don’t follow. Users need this functionality to work. I.e. pass a command
> >> line option to use different OPPs etc. Real world usage is messy.
> >> 
> >>> 
> >>>> +Device Tree quirks solve all those problems by having an in-kernel interface
> >>>> +which per-board/platform method can use to selectively modify the device tree
> >>>> +right after unflattening.
> >>>> +
> >>>> +A DT quirk is a subtree of the boot DT that can be applied to
> >>>> +a target in the base DT resulting in a modification of the live
> >>>> +tree. The format of the quirk nodes is that of a device tree overlay.
> >>>> +
> >>>> +As an example the following DTS contains a quirk.
> >>>> +
> >>>> +/ {
> >>>> +       foo: foo-node {
> >>>> +               bar = <10>;
> >>>> +       };
> >>>> +
> >>>> +       select-quirk = <&quirk>;
> >>>> +
> >>>> +       quirk: quirk {
> >>>> +               fragment@0 {
> >>>> +                       target = <&foo>;
> >>>> +                       __overlay {
> >>>> +                               bar = <0xf00>;
> >>>> +                               baz = <11>;
> >>>> +                       };
> >>>> +               };
> >>>> +       };
> >>>> +};
> >>>> +
> >>>> +The quirk when applied would result at the following tree:
> >>>> +
> >>>> +/ {
> >>>> +       foo: foo-node {
> >>>> +               bar = <0xf00>;
> >>>> +               baz = <11>;
> >>>> +       };
> >>>> +
> >>>> +       select-quirk = <&quirk>;
> >>>> +
> >>>> +       quirk: quirk {
> >>>> +               fragment@0 {
> >>>> +                       target = <&foo>;
> >>>> +                       __overlay {
> >>>> +                               bar = <0xf00>;
> >>>> +                               baz = <11>;
> >>>> +                       };
> >>>> +               };
> >>>> +       };
> >>>> +
> >>>> +};
> >>>> +
> >>>> +The two public method used to accomplish this are of_quirk_apply_by_node()
> >>>> +and of_quirk_apply_by_phandle();
> >>>> +
> >>>> +To apply the quirk, a per-platform method can retrieve the phandle from the
> >>>> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
> >>>> +
> >>>> +The method which the per-platform method is using to select the quirk to apply
> >>>> +is out of the scope of the DT quirk definition, but possible methods include
> >>>> +and are not limited to: revision encoding in a GPIO input range, board id
> >>>> +located in external or internal EEPROM or flash, DMI board ids, etc.
> >>> 
> >>> It seems rather unfortunate that to get a useful device tree we have to
> >>> resort to board-specific mechanisms. That means yet more platform code,
> >>> which is rather unfortunate. This would also require any other DT users
> >>> to understand and potentially have to sanitize any quirks (e.g. in the
> >>> case of Xen).
> >> 
> >> The original internal version of the patches used early platform devices and
> >> a generic firmware quirk mechanism, but I was directed to the per-platform
> >> method instead. It is perfectly doable to go back at the original implementation
> >> but I need to get the ball rolling with a discussion about the internals.
> > 
> > I also think we should used early platform devices to not add platform
> > specific code. What were the cons to swith to per-platform method?
> > 
> 
> Supposedly easier to accept. /me shrugs
> 
> >> 
> >>> 
> >>> Mark.
> >> 
> >> Regards
> >> 
> >> — Pantelis
> >> 
> > 
> > Regards
> > 
> > Ludovic
> 
> Regards
> 
> — Pantelis
>

Mark Rutland Feb. 18, 2015, 5:31 p.m. UTC | #7

> >> +While this may in theory work, in practice it is very cumbersome
> >> +for the following reasons:
> >> +
> >> +1. The act of selecting a different boot device tree blob requires
> >> +a reasonably advanced bootloader with some kind of configuration or
> >> +scripting capabilities. Sadly this is not the case many times, the
> >> +bootloader is extremely dumb and can only use a single dt blob.
> > 
> > You can have several bootloader builds, or even a single build with
> > something like appended DTB to get an appropriate DTB if the same binary
> > will otherwise work across all variants of a board.
> > 
> 
> No, the same DTB will not work across all the variants of a board.

I wasn't on about the DTB. I was on about the loader binary, in the case
the FW/bootloader could be common even if the DTB couldn't.

To some extent there must be a DTB that will work across all variants
(albeit with limited utility) or the quirk approach wouldn't work...

> > So it's not necessarily true that you need a complex bootloader.
> > 
> 
> >> +2. On many instances boot time is extremely critical; in some cases
> >> +there are hard requirements like having working video feeds in under
> >> +2 seconds from power-up. This leaves an extremely small time budget for
> >> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >> +is by removing the standard bootloader from the normal boot sequence
> >> +altogether by having a very small boot shim that loads the kernel and
> >> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> > 
> > Given my previous comments above I don't see why this is relevant.
> > You're already passing _some_ DTB here, so if you can organise for the
> > board to statically provide a sane DTB that's fine, or you can resort to
> > appended DTB if it's not possible to update the board configuration.
> > 
> 
> You’re missing the point. I can’t use the same DTB for each revision of the
> board. Each board is similar but it’s not identical.

I think you've misunderstood my point. If you program the board with the
relevant DTB, or use appended DTB, then you will pass the correct DTB to
the kernel without need for quirks.

I understand that each variant is somewhat incompatible (and hence needs
its own DTB).

> >> +3. Having different DTBs/DTSs for different board revisions easily leads to
> >> +drift between versions. Since no developer is expected to have every single
> >> +board revision at hand, things are easy to get out of sync, with board versions
> >> +failing to boot even though the kernel is up to date.
> > 
> > I'm not sure I follow. Surely if you don't have every board revision at
> > hand you can't test quirks exhaustively either?
> > 
> 
> It’s one less thing to worry about. For example in the current mainline kernel
> already there is a drift between the beaglebone white and the beaglebone black.

What's currently out of sync?

> Having the same DTS is just easier to keep things in sync.

We have dtsi files to keep this common stuff in sync currently. It
sounds like we need to factor something out to bring beaglebone black
and beaglebone white back in sync.

> > Additionally you face the problem that two boards of the same variant
> > could have different base DTBs that you would need to test that each
> > board's quirks worked for a range of base DTBs.
> > 
> 
> This is not a valid case. This patch is about boards that have the same base DTB.

Hmm. I mangled my words here.

Say you have a DTB for some set of boards, relying on quirks. A new
board comes out requiring both updates to the DT and the quirk detection
mechanism. Now you need to test that the new detection code works with
the old boards with the old DTB, in addition to all the boards with the
new DTB.

If anything is updated, testing can't be avoided, regardless of this
quirks mechanism.

> >> +4. One final problem is the static way that device tree works.
> >> +For example it might be desirable for various boards to have a way to
> >> +selectively configure the boot device tree, possibly by the use of command
> >> +line options.  For instance a device might be disabled if a given command line
> >> +option is present, different configuration to various devices for debugging
> >> +purposes can be selected and so on. Currently the only way to do so is by
> >> +recompiling the DTS and installing, which is an chore for developers and
> >> +a completely unreasonable expectation from end-users.
> > 
> > I'm not sure I follow here.
> > 
> > Which devices do you envisage this being the case for?
> > 
> > Outside of debug scenarios when would you envisage we do this?
> > 
> 
> We already have to do this on the beaglebone black. The onboard EMMC and HDMI
> devices conflict with any capes that use the same pins. So you have to
> have a way to disable them so that the attached cape will work.

Surely the boot-time quirks are independent of the cape/overlay cases?

If you pass a sane DTB to begin with for the board then things should
work, and if you override with an overlay then that should work
regardless of whatever the boot-time configuration was, no?

> > For the debug case it seems reasonable to have command line parameters
> > to get the kernel to do what we want. Just because there's a device in
> > the DTB that's useful in a debug scenario doesn't mean we have to use it
> > by default.
> 
> I don’t follow. Users need this functionality to work. I.e. pass a command
> line option to use different OPPs etc. Real world usage is messy.

This was all about the debug case; for anything that's required for
normal operation I agree that a command line argument doesn't work.

> >> +Device Tree quirks solve all those problems by having an in-kernel interface
> >> +which per-board/platform method can use to selectively modify the device tree
> >> +right after unflattening.
> >> +
> >> +A DT quirk is a subtree of the boot DT that can be applied to
> >> +a target in the base DT resulting in a modification of the live
> >> +tree. The format of the quirk nodes is that of a device tree overlay.
> >> +
> >> +As an example the following DTS contains a quirk.
> >> +
> >> +/ {
> >> +       foo: foo-node {
> >> +               bar = <10>;
> >> +       };
> >> +
> >> +       select-quirk = <&quirk>;
> >> +
> >> +       quirk: quirk {
> >> +               fragment@0 {
> >> +                       target = <&foo>;
> >> +                       __overlay {
> >> +                               bar = <0xf00>;
> >> +                               baz = <11>;
> >> +                       };
> >> +               };
> >> +       };
> >> +};
> >> +
> >> +The quirk when applied would result at the following tree:
> >> +
> >> +/ {
> >> +       foo: foo-node {
> >> +               bar = <0xf00>;
> >> +               baz = <11>;
> >> +       };
> >> +
> >> +       select-quirk = <&quirk>;
> >> +
> >> +       quirk: quirk {
> >> +               fragment@0 {
> >> +                       target = <&foo>;
> >> +                       __overlay {
> >> +                               bar = <0xf00>;
> >> +                               baz = <11>;
> >> +                       };
> >> +               };
> >> +       };
> >> +
> >> +};
> >> +
> >> +The two public method used to accomplish this are of_quirk_apply_by_node()
> >> +and of_quirk_apply_by_phandle();
> >> +
> >> +To apply the quirk, a per-platform method can retrieve the phandle from the
> >> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
> >> +
> >> +The method which the per-platform method is using to select the quirk to apply
> >> +is out of the scope of the DT quirk definition, but possible methods include
> >> +and are not limited to: revision encoding in a GPIO input range, board id
> >> +located in external or internal EEPROM or flash, DMI board ids, etc.
> > 
> > It seems rather unfortunate that to get a useful device tree we have to
> > resort to board-specific mechanisms. That means yet more platform code,
> > which is rather unfortunate. This would also require any other DT users
> > to understand and potentially have to sanitize any quirks (e.g. in the
> > case of Xen).
> 
> The original internal version of the patches used early platform devices and
> a generic firmware quirk mechanism, but I was directed to the per-platform
> method instead. It is perfectly doable to go back at the original implementation
> but I need to get the ball rolling with a discussion about the internals.

Either way my concern is the same; some platform-specific code (be it in
arch/arm or drivers/) is potentially required in order to get a usable
DTB of any level of functionality.

Thanks,
Mark.

Guenter Roeck Feb. 18, 2015, 7:32 p.m. UTC | #8

On Wed, Feb 18, 2015 at 05:31:16PM +0000, Mark Rutland wrote:
> > >> +While this may in theory work, in practice it is very cumbersome
> > >> +for the following reasons:
> > >> +
> > >> +1. The act of selecting a different boot device tree blob requires
> > >> +a reasonably advanced bootloader with some kind of configuration or
> > >> +scripting capabilities. Sadly this is not the case many times, the
> > >> +bootloader is extremely dumb and can only use a single dt blob.
> > > 
> > > You can have several bootloader builds, or even a single build with
> > > something like appended DTB to get an appropriate DTB if the same binary
> > > will otherwise work across all variants of a board.
> > > 
> > 
> > No, the same DTB will not work across all the variants of a board.
> 
> I wasn't on about the DTB. I was on about the loader binary, in the case
> the FW/bootloader could be common even if the DTB couldn't.
> 
> To some extent there must be a DTB that will work across all variants
> (albeit with limited utility) or the quirk approach wouldn't work...
> 

Not necessarily. I have another use case: A cpu card can be plugged into
multiple carrier card variants, each with different functionality.
At production time, it is not known which carrier card the CPU card
will be plugged in. It is not feasible for manufacturing to update
the dtb files after the cpu card has been plugged into the carrier,
since both are manufactured separately and plugged together at a
later step in the production process (and may be swapped around later).

External overlays do not work in this case because those have to be loaded
through the carrier card. A common dtb file as proposed here would be an
elegant solution for that problem.

Guenter

Frank Rowand Feb. 19, 2015, 2:08 a.m. UTC | #9

On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
> Implement a method of applying DT quirks early in the boot sequence.
> 
> A DT quirk is a subtree of the boot DT that can be applied to
> a target in the base DT resulting in a modification of the live
> tree. The format of the quirk nodes is that of a device tree overlay.

The use of the word "quirk" is a different mental model for me than what
this patch series appears to be addressing.  I would suggest totally
removing the word "quirk" from this proposal to avoid confusing the
mental models of future generations of kernel folks.

What this patch series seems to be proposing is a method to apply DT
overlays as soon as unflatten_device_tree() completes.  In other words,
making the device tree a dynamic object, that is partially defined by
the kernel during boot.  Well, to be fair, the kernel chooses among
several possible alternatives encoded in the DT blob.  So the device
tree is no longer a static object that describes the hardware of the
system.  It may not sound like a big deal, but it seems to me to be
a fundamental shift in what the device tree blob is.  Something that
should be thought about carefully and not just applied as a patch to
solve a point problem.

The stated use of this proposal is to create dynamic DT blobs that can
describe many similar variants of a given system instead of creating
unique DT blobs for each different system.

I obviously have not thought through the architectural implications yet,
but just a quick example.  One of the issues we have been trying to fix
is device tree validation.  The not yet existent (except as a few proof
of concept attempts) validator would need to validate a device tree
for each dynamic variant.  Probably not a big deal, but an example of
the ripple effects this conceptual change implies.

A second function that this patch is proposing is a method to enable
or disable devices via command line options.  If I understand
correctly, this is meant to solve a problem with run time overlays
that require disabling a device previously enabled by the DT blob.
If so, it seems like it could easily be implemented in a simpler
generic way than in the board specific code in this patch series.

I share the concerns that Mark Rutland has expressed in his comments
about this series.

< snip >

I have read through the patches and will have comments on the code
later if this proposal is seen as a good idea.

-Frank

Pantelis Antoniou Feb. 19, 2015, 2:29 p.m. UTC | #10

Hi Mark,

> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
> 
>>>> +While this may in theory work, in practice it is very cumbersome
>>>> +for the following reasons:
>>>> +
>>>> +1. The act of selecting a different boot device tree blob requires
>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>> 
>>> You can have several bootloader builds, or even a single build with
>>> something like appended DTB to get an appropriate DTB if the same binary
>>> will otherwise work across all variants of a board.
>>> 
>> 
>> No, the same DTB will not work across all the variants of a board.
> 
> I wasn't on about the DTB. I was on about the loader binary, in the case
> the FW/bootloader could be common even if the DTB couldn't.
> 
> To some extent there must be a DTB that will work across all variants
> (albeit with limited utility) or the quirk approach wouldn't work…
> 

That’s not correct; the only part of the DTB that needs to be common
is the model property that would allow the quirk detection logic to fire.

So, there is a base DTB that will work on all variants, but that only means
that it will work only up to the point that the quirk detector method
can work. So while in recommended practice there are common subsets
of the DTB that might work, they might be unsafe.

For instance on the beaglebone the regulator configuration is different
between white and black, it is imperative you get them right otherwise
you risk board damage.

>>> So it's not necessarily true that you need a complex bootloader.
>>> 
>> 
>>>> +2. On many instances boot time is extremely critical; in some cases
>>>> +there are hard requirements like having working video feeds in under
>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>> +is by removing the standard bootloader from the normal boot sequence
>>>> +altogether by having a very small boot shim that loads the kernel and
>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>> 
>>> Given my previous comments above I don't see why this is relevant.
>>> You're already passing _some_ DTB here, so if you can organise for the
>>> board to statically provide a sane DTB that's fine, or you can resort to
>>> appended DTB if it's not possible to update the board configuration.
>>> 
>> 
>> You’re missing the point. I can’t use the same DTB for each revision of the
>> board. Each board is similar but it’s not identical.
> 
> I think you've misunderstood my point. If you program the board with the
> relevant DTB, or use appended DTB, then you will pass the correct DTB to
> the kernel without need for quirks.
> 
> I understand that each variant is somewhat incompatible (and hence needs
> its own DTB).

In theory it might work, in practice this does not. Ludovic mentioned that they
have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
that’s 27x60k = 1.6MB of DTBs, that need to be installed.

> 
>>>> +3. Having different DTBs/DTSs for different board revisions easily leads to
>>>> +drift between versions. Since no developer is expected to have every single
>>>> +board revision at hand, things are easy to get out of sync, with board versions
>>>> +failing to boot even though the kernel is up to date.
>>> 
>>> I'm not sure I follow. Surely if you don't have every board revision at
>>> hand you can't test quirks exhaustively either?
>>> 
>> 
>> It’s one less thing to worry about. For example in the current mainline kernel
>> already there is a drift between the beaglebone white and the beaglebone black.
> 
> What's currently out of sync?
> 
>> Having the same DTS is just easier to keep things in sync.
> 
> We have dtsi files to keep this common stuff in sync currently. It
> sounds like we need to factor something out to bring beaglebone black
> and beaglebone white back in sync.
> 

Yes, they need to. Unfortunately it is very easy for DTBs to get out of sync
as it appears in practice.

>>> Additionally you face the problem that two boards of the same variant
>>> could have different base DTBs that you would need to test that each
>>> board's quirks worked for a range of base DTBs.
>>> 
>> 
>> This is not a valid case. This patch is about boards that have the same base DTB.
> 
> Hmm. I mangled my words here.
> 
> Say you have a DTB for some set of boards, relying on quirks. A new
> board comes out requiring both updates to the DT and the quirk detection
> mechanism. Now you need to test that the new detection code works with
> the old boards with the old DTB, in addition to all the boards with the
> new DTB.
> 
> If anything is updated, testing can't be avoided, regardless of this
> quirks mechanism.
> 

Obviously testing is needed everytime something is updated. In practice
the way the boards are identified tend to be similar; on am33xx boards it’s
a standard formatted EEPROM at a known address on a specific i2c bus.

On ATMEL boards it’s the same, but the EEPROM is 1wire, and so on.

>>>> +4. One final problem is the static way that device tree works.
>>>> +For example it might be desirable for various boards to have a way to
>>>> +selectively configure the boot device tree, possibly by the use of command
>>>> +line options.  For instance a device might be disabled if a given command line
>>>> +option is present, different configuration to various devices for debugging
>>>> +purposes can be selected and so on. Currently the only way to do so is by
>>>> +recompiling the DTS and installing, which is an chore for developers and
>>>> +a completely unreasonable expectation from end-users.
>>> 
>>> I'm not sure I follow here.
>>> 
>>> Which devices do you envisage this being the case for?
>>> 
>>> Outside of debug scenarios when would you envisage we do this?
>>> 
>> 
>> We already have to do this on the beaglebone black. The onboard EMMC and HDMI
>> devices conflict with any capes that use the same pins. So you have to
>> have a way to disable them so that the attached cape will work.
> 
> Surely the boot-time quirks are independent of the cape/overlay cases?
> 
> If you pass a sane DTB to begin with for the board then things should
> work, and if you override with an overlay then that should work
> regardless of whatever the boot-time configuration was, no?
> 

No. There is no such thing as a ‘sane’ DTB that will work under every case.

Let me expand on the specific HDMI case for the beaglebone black.

The black has an on board HDMI framer that’s connected to the LCDC interface.
However the LCDC pins are brought out to the dual connectors and a add-on
LCD cape can be attached.

For it to work, the LCDC driver must be disabled in the boot DTB so that
the overlay that enables the LCD cape can work.

But since the image drop for both case is the same you have to be able to
disable the HDMI using a command line option, instead of installing a different
DTB.

>>> For the debug case it seems reasonable to have command line parameters
>>> to get the kernel to do what we want. Just because there's a device in
>>> the DTB that's useful in a debug scenario doesn't mean we have to use it
>>> by default.
>> 
>> I don’t follow. Users need this functionality to work. I.e. pass a command
>> line option to use different OPPs etc. Real world usage is messy.
> 
> This was all about the debug case; for anything that's required for
> normal operation I agree that a command line argument doesn't work.
> 
>>>> +Device Tree quirks solve all those problems by having an in-kernel interface
>>>> +which per-board/platform method can use to selectively modify the device tree
>>>> +right after unflattening.
>>>> +
>>>> +A DT quirk is a subtree of the boot DT that can be applied to
>>>> +a target in the base DT resulting in a modification of the live
>>>> +tree. The format of the quirk nodes is that of a device tree overlay.
>>>> +
>>>> +As an example the following DTS contains a quirk.
>>>> +
>>>> +/ {
>>>> +       foo: foo-node {
>>>> +               bar = <10>;
>>>> +       };
>>>> +
>>>> +       select-quirk = <&quirk>;
>>>> +
>>>> +       quirk: quirk {
>>>> +               fragment@0 {
>>>> +                       target = <&foo>;
>>>> +                       __overlay {
>>>> +                               bar = <0xf00>;
>>>> +                               baz = <11>;
>>>> +                       };
>>>> +               };
>>>> +       };
>>>> +};
>>>> +
>>>> +The quirk when applied would result at the following tree:
>>>> +
>>>> +/ {
>>>> +       foo: foo-node {
>>>> +               bar = <0xf00>;
>>>> +               baz = <11>;
>>>> +       };
>>>> +
>>>> +       select-quirk = <&quirk>;
>>>> +
>>>> +       quirk: quirk {
>>>> +               fragment@0 {
>>>> +                       target = <&foo>;
>>>> +                       __overlay {
>>>> +                               bar = <0xf00>;
>>>> +                               baz = <11>;
>>>> +                       };
>>>> +               };
>>>> +       };
>>>> +
>>>> +};
>>>> +
>>>> +The two public method used to accomplish this are of_quirk_apply_by_node()
>>>> +and of_quirk_apply_by_phandle();
>>>> +
>>>> +To apply the quirk, a per-platform method can retrieve the phandle from the
>>>> +select-quirk property and pass it to the of_quirk_apply_by_phandle() node.
>>>> +
>>>> +The method which the per-platform method is using to select the quirk to apply
>>>> +is out of the scope of the DT quirk definition, but possible methods include
>>>> +and are not limited to: revision encoding in a GPIO input range, board id
>>>> +located in external or internal EEPROM or flash, DMI board ids, etc.
>>> 
>>> It seems rather unfortunate that to get a useful device tree we have to
>>> resort to board-specific mechanisms. That means yet more platform code,
>>> which is rather unfortunate. This would also require any other DT users
>>> to understand and potentially have to sanitize any quirks (e.g. in the
>>> case of Xen).
>> 
>> The original internal version of the patches used early platform devices and
>> a generic firmware quirk mechanism, but I was directed to the per-platform
>> method instead. It is perfectly doable to go back at the original implementation
>> but I need to get the ball rolling with a discussion about the internals.
> 
> Either way my concern is the same; some platform-specific code (be it in
> arch/arm or drivers/) is potentially required in order to get a usable
> DTB of any level of functionality.
> 

Yes, platform specific code is required, but that’s a given for any platform no?

> Thanks,
> Mark.

Regards

— Pantelis

Pantelis Antoniou Feb. 19, 2015, 2:41 p.m. UTC | #11

Hi Frank,

> On Feb 19, 2015, at 04:08 , Frank Rowand <frowand.list@gmail.com> wrote:
> 
> On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
>> Implement a method of applying DT quirks early in the boot sequence.
>> 
>> A DT quirk is a subtree of the boot DT that can be applied to
>> a target in the base DT resulting in a modification of the live
>> tree. The format of the quirk nodes is that of a device tree overlay.
> 
> The use of the word "quirk" is a different mental model for me than what
> this patch series appears to be addressing.  I would suggest totally
> removing the word "quirk" from this proposal to avoid confusing the
> mental models of future generations of kernel folks.
> 

Naming things is hard to do. Suggestions?

> What this patch series seems to be proposing is a method to apply DT
> overlays as soon as unflatten_device_tree() completes.  In other words,
> making the device tree a dynamic object, that is partially defined by
> the kernel during boot.  Well, to be fair, the kernel chooses among
> several possible alternatives encoded in the DT blob.  So the device
> tree is no longer a static object that describes the hardware of the
> system.  It may not sound like a big deal, but it seems to me to be
> a fundamental shift in what the device tree blob is.  Something that
> should be thought about carefully and not just applied as a patch to
> solve a point problem.
> 

There is a fundamental shift going on about what hardware is. It is nowhere
as static as it used to be. It is time for the kernel to keep up.

> The stated use of this proposal is to create dynamic DT blobs that can
> describe many similar variants of a given system instead of creating
> unique DT blobs for each different system.
> 

Yes.

> I obviously have not thought through the architectural implications yet,
> but just a quick example.  One of the issues we have been trying to fix
> is device tree validation.  The not yet existent (except as a few proof
> of concept attempts) validator would need to validate a device tree
> for each dynamic variant.  Probably not a big deal, but an example of
> the ripple effects this conceptual change implies.
> 

I don’t see what the big problem with the validator is. The ‘quirk’
are easily identified by the presence of the __overlay__ nodes and
the validator can apply each overlay and perform the validation check 
at each resultant tree.
 
> A second function that this patch is proposing is a method to enable
> or disable devices via command line options.  If I understand
> correctly, this is meant to solve a problem with run time overlays
> that require disabling a device previously enabled by the DT blob.
> If so, it seems like it could easily be implemented in a simpler
> generic way than in the board specific code in this patch series.
> 

Disabling a device is the most common case, but other options are desired
too. For instance changing OPPs by a command line option, etc.

> I share the concerns that Mark Rutland has expressed in his comments
> about this series.
> 
> < snip >
> 
> I have read through the patches and will have comments on the code
> later if this proposal is seen as a good idea.
> 

OK

> -Frank

Regards

— Pantelis

Frank Rowand Feb. 19, 2015, 4:40 p.m. UTC | #12

On 2/19/2015 6:41 AM, Pantelis Antoniou wrote:
> Hi Frank,
> 
>> On Feb 19, 2015, at 04:08 , Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
>>> Implement a method of applying DT quirks early in the boot sequence.
>>>
>>> A DT quirk is a subtree of the boot DT that can be applied to
>>> a target in the base DT resulting in a modification of the live
>>> tree. The format of the quirk nodes is that of a device tree overlay.
>>
>> The use of the word "quirk" is a different mental model for me than what
>> this patch series appears to be addressing.  I would suggest totally
>> removing the word "quirk" from this proposal to avoid confusing the
>> mental models of future generations of kernel folks.
>>
> 
> Naming things is hard to do. Suggestions?

You are inviting me to bikeshed.  I'll leave that to you.

> 
>> What this patch series seems to be proposing is a method to apply DT
>> overlays as soon as unflatten_device_tree() completes.  In other words,
>> making the device tree a dynamic object, that is partially defined by
>> the kernel during boot.  Well, to be fair, the kernel chooses among
>> several possible alternatives encoded in the DT blob.  So the device
>> tree is no longer a static object that describes the hardware of the
>> system.  It may not sound like a big deal, but it seems to me to be
>> a fundamental shift in what the device tree blob is.  Something that
>> should be thought about carefully and not just applied as a patch to
>> solve a point problem.
>>
> 
> There is a fundamental shift going on about what hardware is. It is nowhere
> as static as it used to be. It is time for the kernel to keep up.

Run time overlays do that.

The problem you seem to be dealing with here is that you want a single
DT blob that can be installed on many different _physical_ boards.


> 
>> The stated use of this proposal is to create dynamic DT blobs that can
>> describe many similar variants of a given system instead of creating
>> unique DT blobs for each different system.
>>
> 
> Yes.
> 
>> I obviously have not thought through the architectural implications yet,
>> but just a quick example.  One of the issues we have been trying to fix
>> is device tree validation.  The not yet existent (except as a few proof
>> of concept attempts) validator would need to validate a device tree
>> for each dynamic variant.  Probably not a big deal, but an example of
>> the ripple effects this conceptual change implies.
>>
> 
> I don’t see what the big problem with the validator is. The ‘quirk’
> are easily identified by the presence of the __overlay__ nodes and
> the validator can apply each overlay and perform the validation check 
> at each resultant tree.

I said "not a big deal".  I was trying to make people think about the
bigger picture.  Defending that this is a non-issue for the validator
is totally missing my point.  Step back and think architecturally
about the big picture.  I do _not_ know if this is a problem, but
they will be ripples from this proposal.

>  
>> A second function that this patch is proposing is a method to enable
>> or disable devices via command line options.  If I understand
>> correctly, this is meant to solve a problem with run time overlays
>> that require disabling a device previously enabled by the DT blob.
>> If so, it seems like it could easily be implemented in a simpler
>> generic way than in the board specific code in this patch series.
>>
> 
> Disabling a device is the most common case, but other options are desired
> too. For instance changing OPPs by a command line option, etc.

The device tree is supposed to describe what the hardware is.  Why would
you want a command line option to change what OPPs are possible for the
hardware?

> 
>> I share the concerns that Mark Rutland has expressed in his comments
>> about this series.
>>
>> < snip >
>>
>> I have read through the patches and will have comments on the code
>> later if this proposal is seen as a good idea.
>>
> 
> OK
> 
>> -Frank
> 
> Regards
> 
> — Pantelis
> 
>

Frank Rowand Feb. 19, 2015, 4:48 p.m. UTC | #13

On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
> Hi Mark,
> 
>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>
>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>> +for the following reasons:
>>>>> +
>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>
>>>> You can have several bootloader builds, or even a single build with
>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>> will otherwise work across all variants of a board.
>>>>
>>>
>>> No, the same DTB will not work across all the variants of a board.
>>
>> I wasn't on about the DTB. I was on about the loader binary, in the case
>> the FW/bootloader could be common even if the DTB couldn't.
>>
>> To some extent there must be a DTB that will work across all variants
>> (albeit with limited utility) or the quirk approach wouldn't work…
>>
> 
> That’s not correct; the only part of the DTB that needs to be common
> is the model property that would allow the quirk detection logic to fire.
> 
> So, there is a base DTB that will work on all variants, but that only means
> that it will work only up to the point that the quirk detector method
> can work. So while in recommended practice there are common subsets
> of the DTB that might work, they might be unsafe.
> 
> For instance on the beaglebone the regulator configuration is different
> between white and black, it is imperative you get them right otherwise
> you risk board damage.
> 
>>>> So it's not necessarily true that you need a complex bootloader.
>>>>
>>>
>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>> +there are hard requirements like having working video feeds in under
>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>
>>>> Given my previous comments above I don't see why this is relevant.
>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>> appended DTB if it's not possible to update the board configuration.
>>>>
>>>
>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>> board. Each board is similar but it’s not identical.
>>
>> I think you've misunderstood my point. If you program the board with the
>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>> the kernel without need for quirks.
>>
>> I understand that each variant is somewhat incompatible (and hence needs
>> its own DTB).
> 
> In theory it might work, in practice this does not. Ludovic mentioned that they
> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
> that’s 27x60k = 1.6MB of DTBs, that need to be installed.

< snip >

Or you can install the correct DTB on the board.  You trust your manufacturing line
to install the correct resistors.  You trust your manufacturing line to install the
correct kernel version (eg an updated version to resolve a security issue).

I thought the DT blob was supposed to follow the same standard that other OS's or
bootloaders understood.  Are you willing to break that?  (This is one of those
ripples I mentioned in my other emails.)

-Frank

Frank Rowand Feb. 19, 2015, 4:51 p.m. UTC | #14

On 2/19/2015 8:40 AM, Frank Rowand wrote:
> On 2/19/2015 6:41 AM, Pantelis Antoniou wrote:
>> Hi Frank,
>>
>>> On Feb 19, 2015, at 04:08 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>
>>> On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
>>>> Implement a method of applying DT quirks early in the boot sequence.
>>>>
>>>> A DT quirk is a subtree of the boot DT that can be applied to
>>>> a target in the base DT resulting in a modification of the live
>>>> tree. The format of the quirk nodes is that of a device tree overlay.
>>>
>>> The use of the word "quirk" is a different mental model for me than what
>>> this patch series appears to be addressing.  I would suggest totally
>>> removing the word "quirk" from this proposal to avoid confusing the
>>> mental models of future generations of kernel folks.
>>>
>>
>> Naming things is hard to do. Suggestions?
> 
> You are inviting me to bikeshed.  I'll leave that to you.
> 
>>
>>> What this patch series seems to be proposing is a method to apply DT
>>> overlays as soon as unflatten_device_tree() completes.  In other words,
>>> making the device tree a dynamic object, that is partially defined by
>>> the kernel during boot.  Well, to be fair, the kernel chooses among
>>> several possible alternatives encoded in the DT blob.  So the device
>>> tree is no longer a static object that describes the hardware of the
>>> system.  It may not sound like a big deal, but it seems to me to be
>>> a fundamental shift in what the device tree blob is.  Something that
>>> should be thought about carefully and not just applied as a patch to
>>> solve a point problem.
>>>
>>
>> There is a fundamental shift going on about what hardware is. It is nowhere
>> as static as it used to be. It is time for the kernel to keep up.
> 
> Run time overlays do that.
> 
> The problem you seem to be dealing with here is that you want a single
> DT blob that can be installed on many different _physical_ boards.
> 
> 
>>
>>> The stated use of this proposal is to create dynamic DT blobs that can
>>> describe many similar variants of a given system instead of creating
>>> unique DT blobs for each different system.
>>>
>>
>> Yes.
>>
>>> I obviously have not thought through the architectural implications yet,
>>> but just a quick example.  One of the issues we have been trying to fix
>>> is device tree validation.  The not yet existent (except as a few proof
>>> of concept attempts) validator would need to validate a device tree
>>> for each dynamic variant.  Probably not a big deal, but an example of
>>> the ripple effects this conceptual change implies.
>>>
>>
>> I don’t see what the big problem with the validator is. The ‘quirk’
>> are easily identified by the presence of the __overlay__ nodes and
>> the validator can apply each overlay and perform the validation check 
>> at each resultant tree.
> 
> I said "not a big deal".  I was trying to make people think about the
> bigger picture.  Defending that this is a non-issue for the validator
> is totally missing my point.  Step back and think architecturally
> about the big picture.  I do _not_ know if this is a problem, but
> they will be ripples from this proposal.

oops:  there will be ripples from this proposal.

> 
>>  
>>> A second function that this patch is proposing is a method to enable
>>> or disable devices via command line options.  If I understand
>>> correctly, this is meant to solve a problem with run time overlays
>>> that require disabling a device previously enabled by the DT blob.
>>> If so, it seems like it could easily be implemented in a simpler
>>> generic way than in the board specific code in this patch series.
>>>
>>
>> Disabling a device is the most common case, but other options are desired
>> too. For instance changing OPPs by a command line option, etc.
> 
> The device tree is supposed to describe what the hardware is.  Why would
> you want a command line option to change what OPPs are possible for the
> hardware?
> 
>>
>>> I share the concerns that Mark Rutland has expressed in his comments
>>> about this series.
>>>
>>> < snip >
>>>
>>> I have read through the patches and will have comments on the code
>>> later if this proposal is seen as a good idea.
>>>
>>
>> OK
>>
>>> -Frank
>>
>> Regards
>>
>> — Pantelis
>>
>>
>

Pantelis Antoniou Feb. 19, 2015, 5 p.m. UTC | #15

Hi Frank,

> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
> 
> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>> Hi Mark,
>> 
>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>> 
>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>> +for the following reasons:
>>>>>> +
>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>> 
>>>>> You can have several bootloader builds, or even a single build with
>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>> will otherwise work across all variants of a board.
>>>>> 
>>>> 
>>>> No, the same DTB will not work across all the variants of a board.
>>> 
>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>> the FW/bootloader could be common even if the DTB couldn't.
>>> 
>>> To some extent there must be a DTB that will work across all variants
>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>> 
>> 
>> That’s not correct; the only part of the DTB that needs to be common
>> is the model property that would allow the quirk detection logic to fire.
>> 
>> So, there is a base DTB that will work on all variants, but that only means
>> that it will work only up to the point that the quirk detector method
>> can work. So while in recommended practice there are common subsets
>> of the DTB that might work, they might be unsafe.
>> 
>> For instance on the beaglebone the regulator configuration is different
>> between white and black, it is imperative you get them right otherwise
>> you risk board damage.
>> 
>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>> 
>>>> 
>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>> +there are hard requirements like having working video feeds in under
>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>> 
>>>>> Given my previous comments above I don't see why this is relevant.
>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>> appended DTB if it's not possible to update the board configuration.
>>>>> 
>>>> 
>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>> board. Each board is similar but it’s not identical.
>>> 
>>> I think you've misunderstood my point. If you program the board with the
>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>> the kernel without need for quirks.
>>> 
>>> I understand that each variant is somewhat incompatible (and hence needs
>>> its own DTB).
>> 
>> In theory it might work, in practice this does not. Ludovic mentioned that they
>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
> 
> < snip >
> 
> Or you can install the correct DTB on the board.  You trust your manufacturing line
> to install the correct resistors.  You trust your manufacturing line to install the
> correct kernel version (eg an updated version to resolve a security issue).
> 
> I thought the DT blob was supposed to follow the same standard that other OS's or
> bootloaders understood.  Are you willing to break that?  (This is one of those
> ripples I mentioned in my other emails.)
> 

Trust no-one.

This is one of those things that the kernel community doesn’t understand which makes people
who push product quite mad.

Engineering a product is not only about meeting customer spec, in order to turn a profit
the whole endeavor must be engineered as well for manufacturability.

Yes, you can always manually install files in the bootloader. For 1 board no problem.
For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
instead of turning a profit you’re losing money if you only have a few cents of profit
per unit.

No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
for a few million units. 

And frankly I don’t care what other OSes do. If you were to take a look at the sorry DT support
they have you’d be amazed.

I would be very surprised if there’s another OS out there that can boot with a late Linux DTB.
 
> -Frank

Regards

— Pantelis

PS. For a real use case please take a look at the answer Guenter gave on this thread a little
while back.

Frank Rowand Feb. 19, 2015, 5:30 p.m. UTC | #16

On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
> Hi Frank,
> 
>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>> Hi Mark,
>>>
>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>
>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>> +for the following reasons:
>>>>>>> +
>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>
>>>>>> You can have several bootloader builds, or even a single build with
>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>> will otherwise work across all variants of a board.
>>>>>>
>>>>>
>>>>> No, the same DTB will not work across all the variants of a board.
>>>>
>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>
>>>> To some extent there must be a DTB that will work across all variants
>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>
>>>
>>> That’s not correct; the only part of the DTB that needs to be common
>>> is the model property that would allow the quirk detection logic to fire.
>>>
>>> So, there is a base DTB that will work on all variants, but that only means
>>> that it will work only up to the point that the quirk detector method
>>> can work. So while in recommended practice there are common subsets
>>> of the DTB that might work, they might be unsafe.
>>>
>>> For instance on the beaglebone the regulator configuration is different
>>> between white and black, it is imperative you get them right otherwise
>>> you risk board damage.
>>>
>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>
>>>>>
>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>
>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>
>>>>>
>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>> board. Each board is similar but it’s not identical.
>>>>
>>>> I think you've misunderstood my point. If you program the board with the
>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>> the kernel without need for quirks.
>>>>
>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>> its own DTB).
>>>
>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>
>> < snip >
>>
>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>> to install the correct resistors.  You trust your manufacturing line to install the
>> correct kernel version (eg an updated version to resolve a security issue).
>>
>> I thought the DT blob was supposed to follow the same standard that other OS's or
>> bootloaders understood.  Are you willing to break that?  (This is one of those
>> ripples I mentioned in my other emails.)
>>
> 
> Trust no-one.
> 
> This is one of those things that the kernel community doesn’t understand which makes people
> who push product quite mad.
> 
> Engineering a product is not only about meeting customer spec, in order to turn a profit
> the whole endeavor must be engineered as well for manufacturability.
> 
> Yes, you can always manually install files in the bootloader. For 1 board no problem.
> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
> instead of turning a profit you’re losing money if you only have a few cents of profit
> per unit.

I'm not installing physical components manually.  Why would I be installing software
manually?  (rhetorical question)

> 
> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
> for a few million units. 

And you produce a few million units before testing that the first one off the line works?

> 
> And frankly I don’t care what other OSes do. If you were to take a look at the sorry DT support
> they have you’d be amazed.
> 
> I would be very surprised if there’s another OS out there that can boot with a late Linux DTB.
>  
>> -Frank
> 
> Regards
> 
> — Pantelis
> 
> PS. For a real use case please take a look at the answer Guenter gave on this thread a little
> while back.
> 

My previous comments were written after reading Guenter's comment.

-Frank

Pantelis Antoniou Feb. 19, 2015, 5:38 p.m. UTC | #17

> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
> 
> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>> Hi Frank,
>> 
>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>> 
>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>> Hi Mark,
>>>> 
>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>> 
>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>> +for the following reasons:
>>>>>>>> +
>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>> 
>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>> will otherwise work across all variants of a board.
>>>>>>> 
>>>>>> 
>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>> 
>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>> 
>>>>> To some extent there must be a DTB that will work across all variants
>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>> 
>>>> 
>>>> That’s not correct; the only part of the DTB that needs to be common
>>>> is the model property that would allow the quirk detection logic to fire.
>>>> 
>>>> So, there is a base DTB that will work on all variants, but that only means
>>>> that it will work only up to the point that the quirk detector method
>>>> can work. So while in recommended practice there are common subsets
>>>> of the DTB that might work, they might be unsafe.
>>>> 
>>>> For instance on the beaglebone the regulator configuration is different
>>>> between white and black, it is imperative you get them right otherwise
>>>> you risk board damage.
>>>> 
>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>> 
>>>>>> 
>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>> 
>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>> 
>>>>>> 
>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>> board. Each board is similar but it’s not identical.
>>>>> 
>>>>> I think you've misunderstood my point. If you program the board with the
>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>> the kernel without need for quirks.
>>>>> 
>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>> its own DTB).
>>>> 
>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>> 
>>> < snip >
>>> 
>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>> to install the correct resistors.  You trust your manufacturing line to install the
>>> correct kernel version (eg an updated version to resolve a security issue).
>>> 
>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>> ripples I mentioned in my other emails.)
>>> 
>> 
>> Trust no-one.
>> 
>> This is one of those things that the kernel community doesn’t understand which makes people
>> who push product quite mad.
>> 
>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>> the whole endeavor must be engineered as well for manufacturability.
>> 
>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>> instead of turning a profit you’re losing money if you only have a few cents of profit
>> per unit.
> 
> I'm not installing physical components manually.  Why would I be installing software
> manually?  (rhetorical question)
> 

Because on high volume product runs the flash comes preprogrammed and is soldered as is.

Having a single binary to flash to every revision of the board makes logistics considerably
easier.

Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
on the flash medium) takes time and is error-prone.

Factory time == money, errors == money.

>> 
>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>> for a few million units. 
> 
> And you produce a few million units before testing that the first one off the line works?
> 

The first one off the line works. The rest will get some burn in and functional testing if you’re
lucky. In many cases where the product is very cheap it might make financial sense to just ship
as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.

Hardware is hard :)

>> 
>> And frankly I don’t care what other OSes do. If you were to take a look at the sorry DT support
>> they have you’d be amazed.
>> 
>> I would be very surprised if there’s another OS out there that can boot with a late Linux DTB.
>> 
>>> -Frank
>> 
>> Regards
>> 
>> — Pantelis
>> 
>> PS. For a real use case please take a look at the answer Guenter gave on this thread a little
>> while back.
>> 
> 
> My previous comments were written after reading Guenter's comment.
> 

OK, sorry for coming off a bit strong there :)

> -Frank
> 

Regards

— Pantelis

Rob Herring Feb. 19, 2015, 6:01 p.m. UTC | #18

On Wed, Feb 18, 2015 at 8:08 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
>> Implement a method of applying DT quirks early in the boot sequence.
>>
>> A DT quirk is a subtree of the boot DT that can be applied to
>> a target in the base DT resulting in a modification of the live
>> tree. The format of the quirk nodes is that of a device tree overlay.
>
> The use of the word "quirk" is a different mental model for me than what
> this patch series appears to be addressing.  I would suggest totally
> removing the word "quirk" from this proposal to avoid confusing the
> mental models of future generations of kernel folks.

This comes from me as quirks are a different usecase I had in mind,
but one that could use a similar mechanism. Although, in the case of
quirks, I would expect them to be overlays built into the kernel. It
would be more a way to update old dtbs.

> What this patch series seems to be proposing is a method to apply DT
> overlays as soon as unflatten_device_tree() completes.  In other words,
> making the device tree a dynamic object, that is partially defined by
> the kernel during boot.  Well, to be fair, the kernel chooses among
> several possible alternatives encoded in the DT blob.  So the device
> tree is no longer a static object that describes the hardware of the
> system.  It may not sound like a big deal, but it seems to me to be
> a fundamental shift in what the device tree blob is.  Something that
> should be thought about carefully and not just applied as a patch to
> solve a point problem.

I agree. I would not want to see every board for an SOC become an
overlay for example. I think it has to be limited to truly plugable
h/w (e.g. capes) or minor changes. We just have to define what is
minor. :)

Rob

Maxime Bizon Feb. 19, 2015, 6:01 p.m. UTC | #19

On Thu, 2015-02-19 at 19:38 +0200, Pantelis Antoniou wrote:

> Having to boot and tweak the bootloader settings to select the correct
> dtb (even if it’s present on the flash medium) takes time and is
> error-prone.

Dedicate a set of GPIO for board/PCB revision detection (it only costs a
few resistors connected to VCC or GND), then use that information to
choose the correct DTB.

Sylvain Rochet Feb. 19, 2015, 6:12 p.m. UTC | #20

Hello,

On Thu, Feb 19, 2015 at 07:01:59PM +0100, Maxime Bizon wrote:
> On Thu, 2015-02-19 at 19:38 +0200, Pantelis Antoniou wrote:
> 
> > Having to boot and tweak the bootloader settings to select the correct
> > dtb (even if it’s present on the flash medium) takes time and is
> > error-prone.
> 
> Dedicate a set of GPIO for board/PCB revision detection (it only costs a
> few resistors connected to VCC or GND), then use that information to
> choose the correct DTB.

Or use a 1-wire or I2C EEPROM to store your board information.

Or, even better, if you have an I2C device, just chose a different 
address on each board for this device and then probe I2C devices in your 
boot loader until you found one you are looking for, this way, you don't 
need spare GPIO at all. You don't even need to populate the same I2C 
device on all boards, you can actually probe anything.

Sylvain

Guenter Roeck Feb. 19, 2015, 6:12 p.m. UTC | #21

On Thu, Feb 19, 2015 at 12:01:14PM -0600, Rob Herring wrote:
> On Wed, Feb 18, 2015 at 8:08 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> > On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
> >> Implement a method of applying DT quirks early in the boot sequence.
> >>
> >> A DT quirk is a subtree of the boot DT that can be applied to
> >> a target in the base DT resulting in a modification of the live
> >> tree. The format of the quirk nodes is that of a device tree overlay.
> >
> > The use of the word "quirk" is a different mental model for me than what
> > this patch series appears to be addressing.  I would suggest totally
> > removing the word "quirk" from this proposal to avoid confusing the
> > mental models of future generations of kernel folks.
> 
> This comes from me as quirks are a different usecase I had in mind,
> but one that could use a similar mechanism. Although, in the case of
> quirks, I would expect them to be overlays built into the kernel. It
> would be more a way to update old dtbs.
> 
> > What this patch series seems to be proposing is a method to apply DT
> > overlays as soon as unflatten_device_tree() completes.  In other words,
> > making the device tree a dynamic object, that is partially defined by
> > the kernel during boot.  Well, to be fair, the kernel chooses among
> > several possible alternatives encoded in the DT blob.  So the device
> > tree is no longer a static object that describes the hardware of the
> > system.  It may not sound like a big deal, but it seems to me to be
> > a fundamental shift in what the device tree blob is.  Something that
> > should be thought about carefully and not just applied as a patch to
> > solve a point problem.
> 
> I agree. I would not want to see every board for an SOC become an
> overlay for example. I think it has to be limited to truly plugable
> h/w (e.g. capes) or minor changes. We just have to define what is
> minor. :)
> 

Everything that isn't OIR capable but fixed at boot time. OIR can and
should be handled with "real" overlays. OIR implies removal; I would
assume that what is discused here is insertion-only, and runtime removal
is neither required nor wanted. Or at least that is our use case.

Guenter

Maxime Bizon Feb. 19, 2015, 6:22 p.m. UTC | #22

On Thu, 2015-02-19 at 19:12 +0100, Sylvain Rochet wrote:

> Or use a 1-wire or I2C EEPROM to store your board information.

no, you don't reduce the human error probability.

eeprom needs to be preprogrammed, factory will at some point have a lot
of eeprom with different version, and will potentially equip the wrong
one.

much more error prone than them putting the resistor at the wrong place.

> Or, even better, if you have an I2C device, just chose a different 
> address on each board for this device and then probe I2C devices in your 
> boot loader until you found one you are looking for, this way, you don't 
> need spare GPIO at all. You don't even need to populate the same I2C 
> device on all boards, you can actually probe anything.

I'm not a fan of schemes where not probing something is not a definitive
sign of hardware failure.

Ludovic Desroches Feb. 20, 2015, 8:04 a.m. UTC | #23

On Thu, Feb 19, 2015 at 09:30:58AM -0800, Frank Rowand wrote:
> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
> > Hi Frank,
> > 
> >> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
> >>> Hi Mark,
> >>>
> >>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
> >>>>
> >>>>>>> +While this may in theory work, in practice it is very cumbersome
> >>>>>>> +for the following reasons:
> >>>>>>> +
> >>>>>>> +1. The act of selecting a different boot device tree blob requires
> >>>>>>> +a reasonably advanced bootloader with some kind of configuration or
> >>>>>>> +scripting capabilities. Sadly this is not the case many times, the
> >>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
> >>>>>>
> >>>>>> You can have several bootloader builds, or even a single build with
> >>>>>> something like appended DTB to get an appropriate DTB if the same binary
> >>>>>> will otherwise work across all variants of a board.
> >>>>>>
> >>>>>
> >>>>> No, the same DTB will not work across all the variants of a board.
> >>>>
> >>>> I wasn't on about the DTB. I was on about the loader binary, in the case
> >>>> the FW/bootloader could be common even if the DTB couldn't.
> >>>>
> >>>> To some extent there must be a DTB that will work across all variants
> >>>> (albeit with limited utility) or the quirk approach wouldn't work…
> >>>>
> >>>
> >>> That’s not correct; the only part of the DTB that needs to be common
> >>> is the model property that would allow the quirk detection logic to fire.
> >>>
> >>> So, there is a base DTB that will work on all variants, but that only means
> >>> that it will work only up to the point that the quirk detector method
> >>> can work. So while in recommended practice there are common subsets
> >>> of the DTB that might work, they might be unsafe.
> >>>
> >>> For instance on the beaglebone the regulator configuration is different
> >>> between white and black, it is imperative you get them right otherwise
> >>> you risk board damage.
> >>>
> >>>>>> So it's not necessarily true that you need a complex bootloader.
> >>>>>>
> >>>>>
> >>>>>>> +2. On many instances boot time is extremely critical; in some cases
> >>>>>>> +there are hard requirements like having working video feeds in under
> >>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
> >>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >>>>>>> +is by removing the standard bootloader from the normal boot sequence
> >>>>>>> +altogether by having a very small boot shim that loads the kernel and
> >>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> >>>>>>
> >>>>>> Given my previous comments above I don't see why this is relevant.
> >>>>>> You're already passing _some_ DTB here, so if you can organise for the
> >>>>>> board to statically provide a sane DTB that's fine, or you can resort to
> >>>>>> appended DTB if it's not possible to update the board configuration.
> >>>>>>
> >>>>>
> >>>>> You’re missing the point. I can’t use the same DTB for each revision of the
> >>>>> board. Each board is similar but it’s not identical.
> >>>>
> >>>> I think you've misunderstood my point. If you program the board with the
> >>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
> >>>> the kernel without need for quirks.
> >>>>
> >>>> I understand that each variant is somewhat incompatible (and hence needs
> >>>> its own DTB).
> >>>
> >>> In theory it might work, in practice this does not. Ludovic mentioned that they
> >>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
> >>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
> >>
> >> < snip >
> >>
> >> Or you can install the correct DTB on the board.  You trust your manufacturing line
> >> to install the correct resistors.  You trust your manufacturing line to install the
> >> correct kernel version (eg an updated version to resolve a security issue).
> >>
> >> I thought the DT blob was supposed to follow the same standard that other OS's or
> >> bootloaders understood.  Are you willing to break that?  (This is one of those
> >> ripples I mentioned in my other emails.)
> >>
> > 
> > Trust no-one.
> > 
> > This is one of those things that the kernel community doesn’t understand which makes people
> > who push product quite mad.
> > 
> > Engineering a product is not only about meeting customer spec, in order to turn a profit
> > the whole endeavor must be engineered as well for manufacturability.
> > 
> > Yes, you can always manually install files in the bootloader. For 1 board no problem.
> > For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
> > instead of turning a profit you’re losing money if you only have a few cents of profit
> > per unit.
> 
> I'm not installing physical components manually.  Why would I be installing software
> manually?  (rhetorical question)

It is not only about manufacturing. You can provide software updates and
trust me even if it seems easy to identify which dtb you have to load
with a good naming, some customers will use the bad one.

Other use case, we have a cpu module with the nand flash and a mother
board, we put the cpu module on another mother board with a different
revision, you don't have to update your dtb.

Maybe it is not necessary but it is a ease of use. I don't understand
how we could want a single zImage (even if it helps to clean the code) and we
don't take care about dtb stuff.

> 
> > 
> > No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
> > for a few million units. 
> 
> And you produce a few million units before testing that the first one off the line works?
> 
> > 
> > And frankly I don’t care what other OSes do. If you were to take a look at the sorry DT support
> > they have you’d be amazed.
> > 
> > I would be very surprised if there’s another OS out there that can boot with a late Linux DTB.
> >  
> >> -Frank
> > 
> > Regards
> > 
> > — Pantelis
> > 
> > PS. For a real use case please take a look at the answer Guenter gave on this thread a little
> > while back.
> > 
> 
> My previous comments were written after reading Guenter's comment.
> 
> -Frank
>

Ludovic Desroches Feb. 20, 2015, 8:16 a.m. UTC | #24

On Thu, Feb 19, 2015 at 12:01:14PM -0600, Rob Herring wrote:
> On Wed, Feb 18, 2015 at 8:08 PM, Frank Rowand <frowand.list@gmail.com> wrote:
> > On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
> >> Implement a method of applying DT quirks early in the boot sequence.
> >>
> >> A DT quirk is a subtree of the boot DT that can be applied to
> >> a target in the base DT resulting in a modification of the live
> >> tree. The format of the quirk nodes is that of a device tree overlay.
> >
> > The use of the word "quirk" is a different mental model for me than what
> > this patch series appears to be addressing.  I would suggest totally
> > removing the word "quirk" from this proposal to avoid confusing the
> > mental models of future generations of kernel folks.
> 
> This comes from me as quirks are a different usecase I had in mind,
> but one that could use a similar mechanism. Although, in the case of
> quirks, I would expect them to be overlays built into the kernel. It
> would be more a way to update old dtbs.
> 
> > What this patch series seems to be proposing is a method to apply DT
> > overlays as soon as unflatten_device_tree() completes.  In other words,
> > making the device tree a dynamic object, that is partially defined by
> > the kernel during boot.  Well, to be fair, the kernel chooses among
> > several possible alternatives encoded in the DT blob.  So the device
> > tree is no longer a static object that describes the hardware of the
> > system.  It may not sound like a big deal, but it seems to me to be
> > a fundamental shift in what the device tree blob is.  Something that
> > should be thought about carefully and not just applied as a patch to
> > solve a point problem.
> 
> I agree. I would not want to see every board for an SOC become an
> overlay for example. I think it has to be limited to truly plugable
> h/w (e.g. capes) or minor changes. We just have to define what is
> minor. :)

I don't think it is the goal. It should allow to deal with board
revisions, for example some components put on an other i2c bus to
isolate an i2c device corrupting the bus in some cases. It should
concern plugable h/w which is only needed at boot time for instance a
display panel if we want to show a splash screen.

Ludovic

Peter Hurley Feb. 20, 2015, 1:23 p.m. UTC | #25

On 02/19/2015 09:41 AM, Pantelis Antoniou wrote:
> Hi Frank,
> 
>> On Feb 19, 2015, at 04:08 , Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 2/18/2015 6:59 AM, Pantelis Antoniou wrote:
>>> Implement a method of applying DT quirks early in the boot sequence.
>>>
>>> A DT quirk is a subtree of the boot DT that can be applied to
>>> a target in the base DT resulting in a modification of the live
>>> tree. The format of the quirk nodes is that of a device tree overlay.
>>
>> The use of the word "quirk" is a different mental model for me than what
>> this patch series appears to be addressing.  I would suggest totally
>> removing the word "quirk" from this proposal to avoid confusing the
>> mental models of future generations of kernel folks.
>>
> 
> Naming things is hard to do. Suggestions?

variant

Peter Hurley Feb. 20, 2015, 2:21 p.m. UTC | #26

On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
> 
>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>> Hi Frank,
>>>
>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>
>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>> Hi Mark,
>>>>>
>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>>>
>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>> +for the following reasons:
>>>>>>>>> +
>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>
>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>
>>>>>>>
>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>
>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>
>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>
>>>>>
>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>
>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>> that it will work only up to the point that the quirk detector method
>>>>> can work. So while in recommended practice there are common subsets
>>>>> of the DTB that might work, they might be unsafe.
>>>>>
>>>>> For instance on the beaglebone the regulator configuration is different
>>>>> between white and black, it is imperative you get them right otherwise
>>>>> you risk board damage.
>>>>>
>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>
>>>>>>>
>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>
>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>
>>>>>>>
>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>
>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>> the kernel without need for quirks.
>>>>>>
>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>> its own DTB).
>>>>>
>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>
>>>> < snip >
>>>>
>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>
>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>> ripples I mentioned in my other emails.)
>>>>
>>>
>>> Trust no-one.
>>>
>>> This is one of those things that the kernel community doesn’t understand which makes people
>>> who push product quite mad.
>>>
>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>> the whole endeavor must be engineered as well for manufacturability.
>>>
>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>> per unit.
>>
>> I'm not installing physical components manually.  Why would I be installing software
>> manually?  (rhetorical question)
>>
> 
> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
> 
> Having a single binary to flash to every revision of the board makes logistics considerably
> easier.
> 
> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
> on the flash medium) takes time and is error-prone.
> 
> Factory time == money, errors == money.
> 
>>>
>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>> for a few million units. 
>>
>> And you produce a few million units before testing that the first one off the line works?
>>
> 
> The first one off the line works. The rest will get some burn in and functional testing if you’re
> lucky. In many cases where the product is very cheap it might make financial sense to just ship
> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
> 
> Hardware is hard :)

I'm failing to see how this series improves your manufacturing process at all.

1. Won't you have to provide the factory with different eeprom images for the
   White and Black?  You _trust_ them to get that right, or more likely, you
   have process control procedures in place so that you don't get 1 million Blacks
   flashed with the White eeprom image.

2. The White and Black use different memory technology so it's not as if the
   eMMC from the Black will end up on the White SMT line (or vice versa).

3  For that matter, why wouldn't you worry that all the microSD cards intended
   for the White were accidentally assembled with the first 50,000 Blacks; at
   that point you're losing a lot more than a few cents of profit. And that has
   nothing to do with what image you provided.

3. The factory is just as likely to use some other customer's image by accident,
   so you're just as likely to have the same failure rate if you have no test
   process at the factory.

4. If you're using offline programming, the image has to be tested after
   reflow anyway.

IOW, your QA process will not change at all == same cost.

Regards,
Peter Hurley

Ludovic Desroches Feb. 20, 2015, 2:35 p.m. UTC | #27

On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
> > 
> >> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
> >>> Hi Frank,
> >>>
> >>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
> >>>>
> >>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
> >>>>> Hi Mark,
> >>>>>
> >>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
> >>>>>>
> >>>>>>>>> +While this may in theory work, in practice it is very cumbersome
> >>>>>>>>> +for the following reasons:
> >>>>>>>>> +
> >>>>>>>>> +1. The act of selecting a different boot device tree blob requires
> >>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
> >>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
> >>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
> >>>>>>>>
> >>>>>>>> You can have several bootloader builds, or even a single build with
> >>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
> >>>>>>>> will otherwise work across all variants of a board.
> >>>>>>>>
> >>>>>>>
> >>>>>>> No, the same DTB will not work across all the variants of a board.
> >>>>>>
> >>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
> >>>>>> the FW/bootloader could be common even if the DTB couldn't.
> >>>>>>
> >>>>>> To some extent there must be a DTB that will work across all variants
> >>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
> >>>>>>
> >>>>>
> >>>>> That’s not correct; the only part of the DTB that needs to be common
> >>>>> is the model property that would allow the quirk detection logic to fire.
> >>>>>
> >>>>> So, there is a base DTB that will work on all variants, but that only means
> >>>>> that it will work only up to the point that the quirk detector method
> >>>>> can work. So while in recommended practice there are common subsets
> >>>>> of the DTB that might work, they might be unsafe.
> >>>>>
> >>>>> For instance on the beaglebone the regulator configuration is different
> >>>>> between white and black, it is imperative you get them right otherwise
> >>>>> you risk board damage.
> >>>>>
> >>>>>>>> So it's not necessarily true that you need a complex bootloader.
> >>>>>>>>
> >>>>>>>
> >>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
> >>>>>>>>> +there are hard requirements like having working video feeds in under
> >>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
> >>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
> >>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
> >>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> >>>>>>>>
> >>>>>>>> Given my previous comments above I don't see why this is relevant.
> >>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
> >>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
> >>>>>>>> appended DTB if it's not possible to update the board configuration.
> >>>>>>>>
> >>>>>>>
> >>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
> >>>>>>> board. Each board is similar but it’s not identical.
> >>>>>>
> >>>>>> I think you've misunderstood my point. If you program the board with the
> >>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
> >>>>>> the kernel without need for quirks.
> >>>>>>
> >>>>>> I understand that each variant is somewhat incompatible (and hence needs
> >>>>>> its own DTB).
> >>>>>
> >>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
> >>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
> >>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
> >>>>
> >>>> < snip >
> >>>>
> >>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
> >>>> to install the correct resistors.  You trust your manufacturing line to install the
> >>>> correct kernel version (eg an updated version to resolve a security issue).
> >>>>
> >>>> I thought the DT blob was supposed to follow the same standard that other OS's or
> >>>> bootloaders understood.  Are you willing to break that?  (This is one of those
> >>>> ripples I mentioned in my other emails.)
> >>>>
> >>>
> >>> Trust no-one.
> >>>
> >>> This is one of those things that the kernel community doesn’t understand which makes people
> >>> who push product quite mad.
> >>>
> >>> Engineering a product is not only about meeting customer spec, in order to turn a profit
> >>> the whole endeavor must be engineered as well for manufacturability.
> >>>
> >>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
> >>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
> >>> instead of turning a profit you’re losing money if you only have a few cents of profit
> >>> per unit.
> >>
> >> I'm not installing physical components manually.  Why would I be installing software
> >> manually?  (rhetorical question)
> >>
> > 
> > Because on high volume product runs the flash comes preprogrammed and is soldered as is.
> > 
> > Having a single binary to flash to every revision of the board makes logistics considerably
> > easier.
> > 
> > Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
> > on the flash medium) takes time and is error-prone.
> > 
> > Factory time == money, errors == money.
> > 
> >>>
> >>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
> >>> for a few million units. 
> >>
> >> And you produce a few million units before testing that the first one off the line works?
> >>
> > 
> > The first one off the line works. The rest will get some burn in and functional testing if you’re
> > lucky. In many cases where the product is very cheap it might make financial sense to just ship
> > as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
> > 
> > Hardware is hard :)
> 
> I'm failing to see how this series improves your manufacturing process at all.
> 
> 1. Won't you have to provide the factory with different eeprom images for the
>    White and Black?  You _trust_ them to get that right, or more likely, you
>    have process control procedures in place so that you don't get 1 million Blacks
>    flashed with the White eeprom image.
> 
> 2. The White and Black use different memory technology so it's not as if the
>    eMMC from the Black will end up on the White SMT line (or vice versa).
> 
> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>    for the White were accidentally assembled with the first 50,000 Blacks; at
>    that point you're losing a lot more than a few cents of profit. And that has
>    nothing to do with what image you provided.
> 

As you said, we can imagine many reasons to have a failure during the
production, having several DTB files will increase the risk.

> 3. The factory is just as likely to use some other customer's image by accident,
>    so you're just as likely to have the same failure rate if you have no test
>    process at the factory.
> 
> 4. If you're using offline programming, the image has to be tested after
>    reflow anyway.
> 
> IOW, your QA process will not change at all == same cost.
> 
> Regards,
> Peter Hurley

Pantelis Antoniou Feb. 20, 2015, 2:38 p.m. UTC | #28

Hi Peter,

> On Feb 20, 2015, at 16:21 , Peter Hurley <peter@hurleysoftware.com> wrote:
> 
> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>> 
>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>> 
>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>> Hi Frank,
>>>> 
>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>> 
>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>> Hi Mark,
>>>>>> 
>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>>>> 
>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>> +for the following reasons:
>>>>>>>>>> +
>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>> 
>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>> 
>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>> 
>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>> 
>>>>>> 
>>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>> 
>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>> that it will work only up to the point that the quirk detector method
>>>>>> can work. So while in recommended practice there are common subsets
>>>>>> of the DTB that might work, they might be unsafe.
>>>>>> 
>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>> you risk board damage.
>>>>>> 
>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>> 
>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>> 
>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>> the kernel without need for quirks.
>>>>>>> 
>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>> its own DTB).
>>>>>> 
>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>> 
>>>>> < snip >
>>>>> 
>>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>> 
>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>>> ripples I mentioned in my other emails.)
>>>>> 
>>>> 
>>>> Trust no-one.
>>>> 
>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>> who push product quite mad.
>>>> 
>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>> the whole endeavor must be engineered as well for manufacturability.
>>>> 
>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>> per unit.
>>> 
>>> I'm not installing physical components manually.  Why would I be installing software
>>> manually?  (rhetorical question)
>>> 
>> 
>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>> 
>> Having a single binary to flash to every revision of the board makes logistics considerably
>> easier.
>> 
>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>> on the flash medium) takes time and is error-prone.
>> 
>> Factory time == money, errors == money.
>> 
>>>> 
>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>> for a few million units. 
>>> 
>>> And you produce a few million units before testing that the first one off the line works?
>>> 
>> 
>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>> 
>> Hardware is hard :)
> 
> I'm failing to see how this series improves your manufacturing process at all.
> 
> 1. Won't you have to provide the factory with different eeprom images for the
>   White and Black?  You _trust_ them to get that right, or more likely, you
>   have process control procedures in place so that you don't get 1 million Blacks
>   flashed with the White eeprom image.
> 
> 2. The White and Black use different memory technology so it's not as if the
>   eMMC from the Black will end up on the White SMT line (or vice versa).
> 
> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>   for the White were accidentally assembled with the first 50,000 Blacks; at
>   that point you're losing a lot more than a few cents of profit. And that has
>   nothing to do with what image you provided.
> 
> 3. The factory is just as likely to use some other customer's image by accident,
>   so you're just as likely to have the same failure rate if you have no test
>   process at the factory.
> 
> 4. If you're using offline programming, the image has to be tested after
>   reflow anyway.
> 
> IOW, your QA process will not change at all == same cost.
> 

I never said this fixes every case where someone might screw up, I just said
that it makes it extremely less likely.

And no-one is holding the beaglebone as the paragon of good design process for
ease of manufacturing as far as I know.
 
> Regards,
> Peter Hurley

Regards

— Pantelis

Peter Hurley Feb. 20, 2015, 3 p.m. UTC | #29

On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>
>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>
>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>> Hi Frank,
>>>>>
>>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>
>>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>>> Hi Mark,
>>>>>>>
>>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>>>>>
>>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>>> +for the following reasons:
>>>>>>>>>>> +
>>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>>>
>>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>>>
>>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>>>
>>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>>>
>>>>>>>
>>>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>>>
>>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>>> that it will work only up to the point that the quirk detector method
>>>>>>> can work. So while in recommended practice there are common subsets
>>>>>>> of the DTB that might work, they might be unsafe.
>>>>>>>
>>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>>> you risk board damage.
>>>>>>>
>>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>>>
>>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>>>
>>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>>> the kernel without need for quirks.
>>>>>>>>
>>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>>> its own DTB).
>>>>>>>
>>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>>>
>>>>>> < snip >
>>>>>>
>>>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>>>
>>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>>>> ripples I mentioned in my other emails.)
>>>>>>
>>>>>
>>>>> Trust no-one.
>>>>>
>>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>>> who push product quite mad.
>>>>>
>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>
>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>>> per unit.
>>>>
>>>> I'm not installing physical components manually.  Why would I be installing software
>>>> manually?  (rhetorical question)
>>>>
>>>
>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>
>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>> easier.
>>>
>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>>> on the flash medium) takes time and is error-prone.
>>>
>>> Factory time == money, errors == money.
>>>
>>>>>
>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>> for a few million units. 
>>>>
>>>> And you produce a few million units before testing that the first one off the line works?
>>>>
>>>
>>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>>>
>>> Hardware is hard :)
>>
>> I'm failing to see how this series improves your manufacturing process at all.
>>
>> 1. Won't you have to provide the factory with different eeprom images for the
>>    White and Black?  You _trust_ them to get that right, or more likely, you
>>    have process control procedures in place so that you don't get 1 million Blacks
>>    flashed with the White eeprom image.
>>
>> 2. The White and Black use different memory technology so it's not as if the
>>    eMMC from the Black will end up on the White SMT line (or vice versa).
>>
>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>    for the White were accidentally assembled with the first 50,000 Blacks; at
>>    that point you're losing a lot more than a few cents of profit. And that has
>>    nothing to do with what image you provided.
>>
> 
> As you said, we can imagine many reasons to have a failure during the
> production, having several DTB files will increase the risk.

It's interesting that you don't see the added complexity of open-coding
the i2c driver or mixing DTS fragments for different designs as increased risk
(for us all).


>> 3. The factory is just as likely to use some other customer's image by accident,
>>    so you're just as likely to have the same failure rate if you have no test
>>    process at the factory.
>>
>> 4. If you're using offline programming, the image has to be tested after
>>    reflow anyway.
>>
>> IOW, your QA process will not change at all == same cost.
>>
>> Regards,
>> Peter Hurley

Pantelis Antoniou Feb. 20, 2015, 3:02 p.m. UTC | #30

Hi Peter,

> On Feb 20, 2015, at 17:00 , Peter Hurley <peter@hurleysoftware.com> wrote:
> 
> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
>> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>> 
>>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>> 
>>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>>> Hi Frank,
>>>>>> 
>>>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>> 
>>>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>>>> Hi Mark,
>>>>>>>> 
>>>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>>>>>> 
>>>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>>>> +for the following reasons:
>>>>>>>>>>>> +
>>>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>>>> 
>>>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>>>> 
>>>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>>>> 
>>>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>>>> 
>>>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>>>> that it will work only up to the point that the quirk detector method
>>>>>>>> can work. So while in recommended practice there are common subsets
>>>>>>>> of the DTB that might work, they might be unsafe.
>>>>>>>> 
>>>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>>>> you risk board damage.
>>>>>>>> 
>>>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>>>> 
>>>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>>>> 
>>>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>>>> the kernel without need for quirks.
>>>>>>>>> 
>>>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>>>> its own DTB).
>>>>>>>> 
>>>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>>>> 
>>>>>>> < snip >
>>>>>>> 
>>>>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>>>> 
>>>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>>>>> ripples I mentioned in my other emails.)
>>>>>>> 
>>>>>> 
>>>>>> Trust no-one.
>>>>>> 
>>>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>>>> who push product quite mad.
>>>>>> 
>>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>> 
>>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>>>> per unit.
>>>>> 
>>>>> I'm not installing physical components manually.  Why would I be installing software
>>>>> manually?  (rhetorical question)
>>>>> 
>>>> 
>>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>> 
>>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>>> easier.
>>>> 
>>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>>>> on the flash medium) takes time and is error-prone.
>>>> 
>>>> Factory time == money, errors == money.
>>>> 
>>>>>> 
>>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>>> for a few million units. 
>>>>> 
>>>>> And you produce a few million units before testing that the first one off the line works?
>>>>> 
>>>> 
>>>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>>>> 
>>>> Hardware is hard :)
>>> 
>>> I'm failing to see how this series improves your manufacturing process at all.
>>> 
>>> 1. Won't you have to provide the factory with different eeprom images for the
>>>   White and Black?  You _trust_ them to get that right, or more likely, you
>>>   have process control procedures in place so that you don't get 1 million Blacks
>>>   flashed with the White eeprom image.
>>> 
>>> 2. The White and Black use different memory technology so it's not as if the
>>>   eMMC from the Black will end up on the White SMT line (or vice versa).
>>> 
>>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>>   for the White were accidentally assembled with the first 50,000 Blacks; at
>>>   that point you're losing a lot more than a few cents of profit. And that has
>>>   nothing to do with what image you provided.
>>> 
>> 
>> As you said, we can imagine many reasons to have a failure during the
>> production, having several DTB files will increase the risk.
> 
> It's interesting that you don't see the added complexity of open-coding
> the i2c driver or mixing DTS fragments for different designs as increased risk
> (for us all).
> 
> 

You don’t have to use it. Some people really do though. As for increased risk
I expect to see arguments instead of a statement.

>>> 3. The factory is just as likely to use some other customer's image by accident,
>>>   so you're just as likely to have the same failure rate if you have no test
>>>   process at the factory.
>>> 
>>> 4. If you're using offline programming, the image has to be tested after
>>>   reflow anyway.
>>> 
>>> IOW, your QA process will not change at all == same cost.
>>> 
>>> Regards,
>>> Peter Hurley
> 

Regards

— Pantelis

Peter Hurley Feb. 20, 2015, 3:24 p.m. UTC | #31

On 02/20/2015 10:02 AM, Pantelis Antoniou wrote:
> Hi Peter,
> 
>> On Feb 20, 2015, at 17:00 , Peter Hurley <peter@hurleysoftware.com> wrote:
>>
>> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
>>> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>>>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>>>
>>>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>
>>>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>>>> Hi Frank,
>>>>>>>
>>>>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>>>
>>>>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>>>>> Hi Mark,
>>>>>>>>>
>>>>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>>>>> +for the following reasons:
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>>>>>
>>>>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>>>>>
>>>>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>>>>>
>>>>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>>>>>
>>>>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>>>>> that it will work only up to the point that the quirk detector method
>>>>>>>>> can work. So while in recommended practice there are common subsets
>>>>>>>>> of the DTB that might work, they might be unsafe.
>>>>>>>>>
>>>>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>>>>> you risk board damage.
>>>>>>>>>
>>>>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>>>>>
>>>>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>>>>>
>>>>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>>>>> the kernel without need for quirks.
>>>>>>>>>>
>>>>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>>>>> its own DTB).
>>>>>>>>>
>>>>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>>>>>
>>>>>>>> < snip >
>>>>>>>>
>>>>>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>>>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>>>>>
>>>>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>>>>>> ripples I mentioned in my other emails.)
>>>>>>>>
>>>>>>>
>>>>>>> Trust no-one.
>>>>>>>
>>>>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>>>>> who push product quite mad.
>>>>>>>
>>>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>>>
>>>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>>>>> per unit.
>>>>>>
>>>>>> I'm not installing physical components manually.  Why would I be installing software
>>>>>> manually?  (rhetorical question)
>>>>>>
>>>>>
>>>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>>>
>>>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>>>> easier.
>>>>>
>>>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>>>>> on the flash medium) takes time and is error-prone.
>>>>>
>>>>> Factory time == money, errors == money.
>>>>>
>>>>>>>
>>>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>>>> for a few million units. 
>>>>>>
>>>>>> And you produce a few million units before testing that the first one off the line works?
>>>>>>
>>>>>
>>>>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>>>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>>>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>>>>>
>>>>> Hardware is hard :)
>>>>
>>>> I'm failing to see how this series improves your manufacturing process at all.
>>>>
>>>> 1. Won't you have to provide the factory with different eeprom images for the
>>>>   White and Black?  You _trust_ them to get that right, or more likely, you
>>>>   have process control procedures in place so that you don't get 1 million Blacks
>>>>   flashed with the White eeprom image.
>>>>
>>>> 2. The White and Black use different memory technology so it's not as if the
>>>>   eMMC from the Black will end up on the White SMT line (or vice versa).
>>>>
>>>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>>>   for the White were accidentally assembled with the first 50,000 Blacks; at
>>>>   that point you're losing a lot more than a few cents of profit. And that has
>>>>   nothing to do with what image you provided.
>>>>
>>>
>>> As you said, we can imagine many reasons to have a failure during the
>>> production, having several DTB files will increase the risk.
>>
>> It's interesting that you don't see the added complexity of open-coding
>> the i2c driver or mixing DTS fragments for different designs as increased risk
>> (for us all).
>>
>>
> 
> You don’t have to use it.

> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
> index 5d27dfd..02129e7 100644
> --- a/arch/arm/mach-omap2/Makefile
> +++ b/arch/arm/mach-omap2/Makefile
> @@ -259,6 +259,11 @@ obj-$(CONFIG_MACH_CRANEBOARD)		+= board-am3517crane.o
>  
>  obj-$(CONFIG_MACH_SBC3530)		+= board-omap3stalker.o
>  
> +# DT quirks
> +ifneq ($(CONFIG_OF_DYNAMIC),)
> +obj-$(CONFIG_SOC_AM33XX)		+= am33xx-dt-quirks.o
> +endif

Won't this automatically be included on my Black that supports DT overlays?


> Some people really do though. As for increased risk
> I expect to see arguments instead of a statement.

No one is wasting your time with random arguments. Please keep your tone civil.

Regards,
Peter Hurley


>>>> 3. The factory is just as likely to use some other customer's image by accident,
>>>>   so you're just as likely to have the same failure rate if you have no test
>>>>   process at the factory.
>>>>
>>>> 4. If you're using offline programming, the image has to be tested after
>>>>   reflow anyway.
>>>>
>>>> IOW, your QA process will not change at all == same cost.
>>>>
>>>> Regards,
>>>> Peter Hurley
>>
> 
> Regards
> 
> — Pantelis
>

Pantelis Antoniou Feb. 20, 2015, 3:38 p.m. UTC | #32

Hi Peter,

> On Feb 20, 2015, at 17:24 , Peter Hurley <peter@hurleysoftware.com> wrote:
> 
> On 02/20/2015 10:02 AM, Pantelis Antoniou wrote:
>> Hi Peter,
>> 
>>> On Feb 20, 2015, at 17:00 , Peter Hurley <peter@hurleysoftware.com> wrote:
>>> 
>>> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
>>>> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>>>>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>>>> 
>>>>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>> 
>>>>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>>>>> Hi Frank,
>>>>>>>> 
>>>>>>>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
>>>>>>>>>> Hi Mark,
>>>>>>>>>> 
>>>>>>>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>>>> +While this may in theory work, in practice it is very cumbersome
>>>>>>>>>>>>>> +for the following reasons:
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +1. The act of selecting a different boot device tree blob requires
>>>>>>>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
>>>>>>>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
>>>>>>>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> You can have several bootloader builds, or even a single build with
>>>>>>>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
>>>>>>>>>>>>> will otherwise work across all variants of a board.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> No, the same DTB will not work across all the variants of a board.
>>>>>>>>>>> 
>>>>>>>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
>>>>>>>>>>> the FW/bootloader could be common even if the DTB couldn't.
>>>>>>>>>>> 
>>>>>>>>>>> To some extent there must be a DTB that will work across all variants
>>>>>>>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> That’s not correct; the only part of the DTB that needs to be common
>>>>>>>>>> is the model property that would allow the quirk detection logic to fire.
>>>>>>>>>> 
>>>>>>>>>> So, there is a base DTB that will work on all variants, but that only means
>>>>>>>>>> that it will work only up to the point that the quirk detector method
>>>>>>>>>> can work. So while in recommended practice there are common subsets
>>>>>>>>>> of the DTB that might work, they might be unsafe.
>>>>>>>>>> 
>>>>>>>>>> For instance on the beaglebone the regulator configuration is different
>>>>>>>>>> between white and black, it is imperative you get them right otherwise
>>>>>>>>>> you risk board damage.
>>>>>>>>>> 
>>>>>>>>>>>>> So it's not necessarily true that you need a complex bootloader.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
>>>>>>>>>>>>>> +there are hard requirements like having working video feeds in under
>>>>>>>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
>>>>>>>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
>>>>>>>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
>>>>>>>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
>>>>>>>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Given my previous comments above I don't see why this is relevant.
>>>>>>>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
>>>>>>>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
>>>>>>>>>>>>> appended DTB if it's not possible to update the board configuration.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
>>>>>>>>>>>> board. Each board is similar but it’s not identical.
>>>>>>>>>>> 
>>>>>>>>>>> I think you've misunderstood my point. If you program the board with the
>>>>>>>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
>>>>>>>>>>> the kernel without need for quirks.
>>>>>>>>>>> 
>>>>>>>>>>> I understand that each variant is somewhat incompatible (and hence needs
>>>>>>>>>>> its own DTB).
>>>>>>>>>> 
>>>>>>>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
>>>>>>>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
>>>>>>>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
>>>>>>>>> 
>>>>>>>>> < snip >
>>>>>>>>> 
>>>>>>>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
>>>>>>>>> to install the correct resistors.  You trust your manufacturing line to install the
>>>>>>>>> correct kernel version (eg an updated version to resolve a security issue).
>>>>>>>>> 
>>>>>>>>> I thought the DT blob was supposed to follow the same standard that other OS's or
>>>>>>>>> bootloaders understood.  Are you willing to break that?  (This is one of those
>>>>>>>>> ripples I mentioned in my other emails.)
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Trust no-one.
>>>>>>>> 
>>>>>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>>>>>> who push product quite mad.
>>>>>>>> 
>>>>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>>>> 
>>>>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>>>>>> per unit.
>>>>>>> 
>>>>>>> I'm not installing physical components manually.  Why would I be installing software
>>>>>>> manually?  (rhetorical question)
>>>>>>> 
>>>>>> 
>>>>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>>>> 
>>>>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>>>>> easier.
>>>>>> 
>>>>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>>>>>> on the flash medium) takes time and is error-prone.
>>>>>> 
>>>>>> Factory time == money, errors == money.
>>>>>> 
>>>>>>>> 
>>>>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>>>>> for a few million units. 
>>>>>>> 
>>>>>>> And you produce a few million units before testing that the first one off the line works?
>>>>>>> 
>>>>>> 
>>>>>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>>>>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>>>>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>>>>>> 
>>>>>> Hardware is hard :)
>>>>> 
>>>>> I'm failing to see how this series improves your manufacturing process at all.
>>>>> 
>>>>> 1. Won't you have to provide the factory with different eeprom images for the
>>>>>  White and Black?  You _trust_ them to get that right, or more likely, you
>>>>>  have process control procedures in place so that you don't get 1 million Blacks
>>>>>  flashed with the White eeprom image.
>>>>> 
>>>>> 2. The White and Black use different memory technology so it's not as if the
>>>>>  eMMC from the Black will end up on the White SMT line (or vice versa).
>>>>> 
>>>>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>>>>  for the White were accidentally assembled with the first 50,000 Blacks; at
>>>>>  that point you're losing a lot more than a few cents of profit. And that has
>>>>>  nothing to do with what image you provided.
>>>>> 
>>>> 
>>>> As you said, we can imagine many reasons to have a failure during the
>>>> production, having several DTB files will increase the risk.
>>> 
>>> It's interesting that you don't see the added complexity of open-coding
>>> the i2c driver or mixing DTS fragments for different designs as increased risk
>>> (for us all).
>>> 
>>> 
>> 
>> You don’t have to use it.
> 
>> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
>> index 5d27dfd..02129e7 100644
>> --- a/arch/arm/mach-omap2/Makefile
>> +++ b/arch/arm/mach-omap2/Makefile
>> @@ -259,6 +259,11 @@ obj-$(CONFIG_MACH_CRANEBOARD)		+= board-am3517crane.o
>> 
>> obj-$(CONFIG_MACH_SBC3530)		+= board-omap3stalker.o
>> 
>> +# DT quirks
>> +ifneq ($(CONFIG_OF_DYNAMIC),)
>> +obj-$(CONFIG_SOC_AM33XX)		+= am33xx-dt-quirks.o
>> +endif
> 
> Won't this automatically be included on my Black that supports DT overlays?
> 

Yes it will. It is a grand total of 498 lines of code, and the total size of
the code segment is about 2.2K.

You do realize that you’re probably booting a multi-platform kernel on the 
black right? Including things for all 2xxx/3xxx and 44xx platforms?
For instance:

> ~/ti/kernels/linux-github.git $ wc -l arch/arm/mach-omap2/*44xx*.c
>    443 arch/arm/mach-omap2/clockdomains44xx_data.c
>    526 arch/arm/mach-omap2/cminst44xx.c
>    251 arch/arm/mach-omap2/cpuidle44xx.c
>    250 arch/arm/mach-omap2/dpll44xx.c
>   4864 arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>    295 arch/arm/mach-omap2/pm44xx.c
>    358 arch/arm/mach-omap2/powerdomains44xx_data.c
>     62 arch/arm/mach-omap2/prcm_mpu44xx.c
>    770 arch/arm/mach-omap2/prm44xx.c
>    210 arch/arm/mach-omap2/prminst44xx.c
>    117 arch/arm/mach-omap2/vc44xx_data.c
>    130 arch/arm/mach-omap2/voltagedomains44xx_data.c
>    104 arch/arm/mach-omap2/vp44xx_data.c
>   8380 total

I bet those things are far larger than 2.2K. And I also bet that in the
tradeoff analysis that the board maintainer did things came down to 
increasing complexity so that he can consolidate the kernels for all the
other platforms he has to support besides the black.

> 
>> Some people really do though. As for increased risk
>> I expect to see arguments instead of a statement.
> 
> No one is wasting your time with random arguments. Please keep your tone civil.
> 

A statement like 'increasing risk for all of us' is very open ended. What is
the risk? How much of it exists?

If I offended you I’m really sorry though, I meant no such thing.

> Regards,
> Peter Hurley
> 

Regards

— Pantelis

> 
>>>>> 3. The factory is just as likely to use some other customer's image by accident,
>>>>>  so you're just as likely to have the same failure rate if you have no test
>>>>>  process at the factory.
>>>>> 
>>>>> 4. If you're using offline programming, the image has to be tested after
>>>>>  reflow anyway.
>>>>> 
>>>>> IOW, your QA process will not change at all == same cost.
>>>>> 
>>>>> Regards,
>>>>> Peter Hurley
>>> 
>> 
>> Regards
>> 
>> — Pantelis

Peter Hurley Feb. 20, 2015, 4:34 p.m. UTC | #33

On 02/20/2015 10:38 AM, Pantelis Antoniou wrote:
> Hi Peter,
> 
>> On Feb 20, 2015, at 17:24 , Peter Hurley <peter@hurleysoftware.com> wrote:
>>
>> On 02/20/2015 10:02 AM, Pantelis Antoniou wrote:
>>> Hi Peter,
>>>
>>>> On Feb 20, 2015, at 17:00 , Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>
>>>> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:

[...]

>>>>> As you said, we can imagine many reasons to have a failure during the
>>>>> production, having several DTB files will increase the risk.
>>>>
>>>> It's interesting that you don't see the added complexity of open-coding
>>>> the i2c driver or mixing DTS fragments for different designs as increased risk
>>>> (for us all).
>>>>
>>>>
>>>
>>> You don’t have to use it.
>>
>>> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
>>> index 5d27dfd..02129e7 100644
>>> --- a/arch/arm/mach-omap2/Makefile
>>> +++ b/arch/arm/mach-omap2/Makefile
>>> @@ -259,6 +259,11 @@ obj-$(CONFIG_MACH_CRANEBOARD)		+= board-am3517crane.o
>>>
>>> obj-$(CONFIG_MACH_SBC3530)		+= board-omap3stalker.o
>>>
>>> +# DT quirks
>>> +ifneq ($(CONFIG_OF_DYNAMIC),)
>>> +obj-$(CONFIG_SOC_AM33XX)		+= am33xx-dt-quirks.o
>>> +endif
>>
>> Won't this automatically be included on my Black that supports DT overlays?
>>
> 
> Yes it will. It is a grand total of 498 lines of code, and the total size of
> the code segment is about 2.2K.
> 
> You do realize that you’re probably booting a multi-platform kernel on the 
> black right? Including things for all 2xxx/3xxx and 44xx platforms?
> For instance:
> 
>> ~/ti/kernels/linux-github.git $ wc -l arch/arm/mach-omap2/*44xx*.c
>>    443 arch/arm/mach-omap2/clockdomains44xx_data.c
>>    526 arch/arm/mach-omap2/cminst44xx.c
>>    251 arch/arm/mach-omap2/cpuidle44xx.c
>>    250 arch/arm/mach-omap2/dpll44xx.c
>>   4864 arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>>    295 arch/arm/mach-omap2/pm44xx.c
>>    358 arch/arm/mach-omap2/powerdomains44xx_data.c
>>     62 arch/arm/mach-omap2/prcm_mpu44xx.c
>>    770 arch/arm/mach-omap2/prm44xx.c
>>    210 arch/arm/mach-omap2/prminst44xx.c
>>    117 arch/arm/mach-omap2/vc44xx_data.c
>>    130 arch/arm/mach-omap2/voltagedomains44xx_data.c
>>    104 arch/arm/mach-omap2/vp44xx_data.c
>>   8380 total
> 
> I bet those things are far larger than 2.2K. And I also bet that in the
> tradeoff analysis that the board maintainer did things came down to 
> increasing complexity so that he can consolidate the kernels for all the
> other platforms he has to support besides the black.

Not that it really matters, but I'm not using any of that.


>>> Some people really do though. As for increased risk
>>> I expect to see arguments instead of a statement.
>>
>> No one is wasting your time with random arguments. Please keep your tone civil.
>>
> 
> A statement like 'increasing risk for all of us' is very open ended. What is
> the risk? How much of it exists?

My point was simply that this trades reduced complexity in one area
with increased complexity in another area.

For you, that trade-off is worth it, but for others, not so much.

FWIW, I agree that some mechanism is required to support the other
use cases. I just don't think ease of manufacturing, when the
submit configuration is the BeagleBone, is where I would hang my hat.


> If I offended you I’m really sorry though, I meant no such thing.

In re-reading it, I realize I shouldn't have taken offense. Thanks anyway
for the apology.

Regards,
Peter Hurley

Guenter Roeck Feb. 20, 2015, 4:47 p.m. UTC | #34

On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
> > 
> >> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
> >>> Hi Frank,
> >>>
> >>>> On Feb 19, 2015, at 18:48 , Frank Rowand <frowand.list@gmail.com> wrote:
> >>>>
> >>>> On 2/19/2015 6:29 AM, Pantelis Antoniou wrote:
> >>>>> Hi Mark,
> >>>>>
> >>>>>> On Feb 18, 2015, at 19:31 , Mark Rutland <mark.rutland@arm.com> wrote:
> >>>>>>
> >>>>>>>>> +While this may in theory work, in practice it is very cumbersome
> >>>>>>>>> +for the following reasons:
> >>>>>>>>> +
> >>>>>>>>> +1. The act of selecting a different boot device tree blob requires
> >>>>>>>>> +a reasonably advanced bootloader with some kind of configuration or
> >>>>>>>>> +scripting capabilities. Sadly this is not the case many times, the
> >>>>>>>>> +bootloader is extremely dumb and can only use a single dt blob.
> >>>>>>>>
> >>>>>>>> You can have several bootloader builds, or even a single build with
> >>>>>>>> something like appended DTB to get an appropriate DTB if the same binary
> >>>>>>>> will otherwise work across all variants of a board.
> >>>>>>>>
> >>>>>>>
> >>>>>>> No, the same DTB will not work across all the variants of a board.
> >>>>>>
> >>>>>> I wasn't on about the DTB. I was on about the loader binary, in the case
> >>>>>> the FW/bootloader could be common even if the DTB couldn't.
> >>>>>>
> >>>>>> To some extent there must be a DTB that will work across all variants
> >>>>>> (albeit with limited utility) or the quirk approach wouldn't work…
> >>>>>>
> >>>>>
> >>>>> That’s not correct; the only part of the DTB that needs to be common
> >>>>> is the model property that would allow the quirk detection logic to fire.
> >>>>>
> >>>>> So, there is a base DTB that will work on all variants, but that only means
> >>>>> that it will work only up to the point that the quirk detector method
> >>>>> can work. So while in recommended practice there are common subsets
> >>>>> of the DTB that might work, they might be unsafe.
> >>>>>
> >>>>> For instance on the beaglebone the regulator configuration is different
> >>>>> between white and black, it is imperative you get them right otherwise
> >>>>> you risk board damage.
> >>>>>
> >>>>>>>> So it's not necessarily true that you need a complex bootloader.
> >>>>>>>>
> >>>>>>>
> >>>>>>>>> +2. On many instances boot time is extremely critical; in some cases
> >>>>>>>>> +there are hard requirements like having working video feeds in under
> >>>>>>>>> +2 seconds from power-up. This leaves an extremely small time budget for
> >>>>>>>>> +boot-up, as low as 500ms to kernel entry. The sanest way to get there
> >>>>>>>>> +is by removing the standard bootloader from the normal boot sequence
> >>>>>>>>> +altogether by having a very small boot shim that loads the kernel and
> >>>>>>>>> +immediately jumps to kernel, like falcon-boot mode in u-boot does.
> >>>>>>>>
> >>>>>>>> Given my previous comments above I don't see why this is relevant.
> >>>>>>>> You're already passing _some_ DTB here, so if you can organise for the
> >>>>>>>> board to statically provide a sane DTB that's fine, or you can resort to
> >>>>>>>> appended DTB if it's not possible to update the board configuration.
> >>>>>>>>
> >>>>>>>
> >>>>>>> You’re missing the point. I can’t use the same DTB for each revision of the
> >>>>>>> board. Each board is similar but it’s not identical.
> >>>>>>
> >>>>>> I think you've misunderstood my point. If you program the board with the
> >>>>>> relevant DTB, or use appended DTB, then you will pass the correct DTB to
> >>>>>> the kernel without need for quirks.
> >>>>>>
> >>>>>> I understand that each variant is somewhat incompatible (and hence needs
> >>>>>> its own DTB).
> >>>>>
> >>>>> In theory it might work, in practice this does not. Ludovic mentioned that they
> >>>>> have 27 different DTBs in use at the moment. At a relatively common 60k per DTB
> >>>>> that’s 27x60k = 1.6MB of DTBs, that need to be installed.
> >>>>
> >>>> < snip >
> >>>>
> >>>> Or you can install the correct DTB on the board.  You trust your manufacturing line
> >>>> to install the correct resistors.  You trust your manufacturing line to install the
> >>>> correct kernel version (eg an updated version to resolve a security issue).
> >>>>
> >>>> I thought the DT blob was supposed to follow the same standard that other OS's or
> >>>> bootloaders understood.  Are you willing to break that?  (This is one of those
> >>>> ripples I mentioned in my other emails.)
> >>>>
> >>>
> >>> Trust no-one.
> >>>
> >>> This is one of those things that the kernel community doesn’t understand which makes people
> >>> who push product quite mad.
> >>>
> >>> Engineering a product is not only about meeting customer spec, in order to turn a profit
> >>> the whole endeavor must be engineered as well for manufacturability.
> >>>
> >>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
> >>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
> >>> instead of turning a profit you’re losing money if you only have a few cents of profit
> >>> per unit.
> >>
> >> I'm not installing physical components manually.  Why would I be installing software
> >> manually?  (rhetorical question)
> >>
> > 
> > Because on high volume product runs the flash comes preprogrammed and is soldered as is.
> > 
> > Having a single binary to flash to every revision of the board makes logistics considerably
> > easier.
> > 
> > Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
> > on the flash medium) takes time and is error-prone.
> > 
> > Factory time == money, errors == money.
> > 
> >>>
> >>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
> >>> for a few million units. 
> >>
> >> And you produce a few million units before testing that the first one off the line works?
> >>
> > 
> > The first one off the line works. The rest will get some burn in and functional testing if you’re
> > lucky. In many cases where the product is very cheap it might make financial sense to just ship
> > as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
> > 
> > Hardware is hard :)
> 
> I'm failing to see how this series improves your manufacturing process at all.
> 
> 1. Won't you have to provide the factory with different eeprom images for the
>    White and Black?  You _trust_ them to get that right, or more likely, you
>    have process control procedures in place so that you don't get 1 million Blacks
>    flashed with the White eeprom image.
> 

I am open to hearing your suggestions for our use case, where the CPU card with
the eeprom is manufactured separately from its carier cards.

I assume you might suggest that manufacturing should (re-)program the EEPROM
on the CPU card after it was inserted into the carrier.

Problem is though that the CPU card may be inserted into ts carrier outside
manufacturing, at the final stages of assembly or in product repair. Those
groups would typically not even have the means to (re-)program the eeprom.
Besides, manufacturing would, quite understandably, go ballistic if we demand
that they start programming EEPROMs after insertion into carrier, and no longer
use pre-programmed EEPROMs.

Note that it is not feasible to put the necessary EEPROM onto the carrier
either. Maybe in a later design. Maybe that makes sense, and we will go along
that route at some point. However, forcing a specific hardware solution
due to software limitations, ie lack of ability by core software to handle
the different carries, seems to be not the right decision to make on an
OS level.

In the PCI world it has long since been accepted that the world is not perfect.  
The argument here is pretty much equivalent to demanding that PCI drop its
quirks mechanism, to force the HW manufacturers to finally get it right from
the beginning. I somehow suspect that this won't happen.

Instead of questioning the need for a mechanism such as the one proposed by
Pantelis, I think our time would be better spent arguing if it is the right
mechanism and, if not, how it can be improved.

Thanks,
Guenter

Pantelis Antoniou Feb. 20, 2015, 4:49 p.m. UTC | #35

Hi Peter,

> On Feb 20, 2015, at 18:34 , Peter Hurley <peter@hurleysoftware.com> wrote:
> 
> On 02/20/2015 10:38 AM, Pantelis Antoniou wrote:
>> Hi Peter,
>> 
>>> On Feb 20, 2015, at 17:24 , Peter Hurley <peter@hurleysoftware.com> wrote:
>>> 
>>> On 02/20/2015 10:02 AM, Pantelis Antoniou wrote:
>>>> Hi Peter,
>>>> 
>>>>> On Feb 20, 2015, at 17:00 , Peter Hurley <peter@hurleysoftware.com> wrote:
>>>>> 
>>>>> On 02/20/2015 09:35 AM, Ludovic Desroches wrote:
> 
> [...]
> 
>>>>>> As you said, we can imagine many reasons to have a failure during the
>>>>>> production, having several DTB files will increase the risk.
>>>>> 
>>>>> It's interesting that you don't see the added complexity of open-coding
>>>>> the i2c driver or mixing DTS fragments for different designs as increased risk
>>>>> (for us all).
>>>>> 
>>>>> 
>>>> 
>>>> You don’t have to use it.
>>> 
>>>> diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
>>>> index 5d27dfd..02129e7 100644
>>>> --- a/arch/arm/mach-omap2/Makefile
>>>> +++ b/arch/arm/mach-omap2/Makefile
>>>> @@ -259,6 +259,11 @@ obj-$(CONFIG_MACH_CRANEBOARD)		+= board-am3517crane.o
>>>> 
>>>> obj-$(CONFIG_MACH_SBC3530)		+= board-omap3stalker.o
>>>> 
>>>> +# DT quirks
>>>> +ifneq ($(CONFIG_OF_DYNAMIC),)
>>>> +obj-$(CONFIG_SOC_AM33XX)		+= am33xx-dt-quirks.o
>>>> +endif
>>> 
>>> Won't this automatically be included on my Black that supports DT overlays?
>>> 
>> 
>> Yes it will. It is a grand total of 498 lines of code, and the total size of
>> the code segment is about 2.2K.
>> 
>> You do realize that you’re probably booting a multi-platform kernel on the 
>> black right? Including things for all 2xxx/3xxx and 44xx platforms?
>> For instance:
>> 
>>> ~/ti/kernels/linux-github.git $ wc -l arch/arm/mach-omap2/*44xx*.c
>>>   443 arch/arm/mach-omap2/clockdomains44xx_data.c
>>>   526 arch/arm/mach-omap2/cminst44xx.c
>>>   251 arch/arm/mach-omap2/cpuidle44xx.c
>>>   250 arch/arm/mach-omap2/dpll44xx.c
>>>  4864 arch/arm/mach-omap2/omap_hwmod_44xx_data.c
>>>   295 arch/arm/mach-omap2/pm44xx.c
>>>   358 arch/arm/mach-omap2/powerdomains44xx_data.c
>>>    62 arch/arm/mach-omap2/prcm_mpu44xx.c
>>>   770 arch/arm/mach-omap2/prm44xx.c
>>>   210 arch/arm/mach-omap2/prminst44xx.c
>>>   117 arch/arm/mach-omap2/vc44xx_data.c
>>>   130 arch/arm/mach-omap2/voltagedomains44xx_data.c
>>>   104 arch/arm/mach-omap2/vp44xx_data.c
>>>  8380 total
>> 
>> I bet those things are far larger than 2.2K. And I also bet that in the
>> tradeoff analysis that the board maintainer did things came down to 
>> increasing complexity so that he can consolidate the kernels for all the
>> other platforms he has to support besides the black.
> 
> Not that it really matters, but I'm not using any of that.
> 
> 
>>>> Some people really do though. As for increased risk
>>>> I expect to see arguments instead of a statement.
>>> 
>>> No one is wasting your time with random arguments. Please keep your tone civil.
>>> 
>> 
>> A statement like 'increasing risk for all of us' is very open ended. What is
>> the risk? How much of it exists?
> 
> My point was simply that this trades reduced complexity in one area
> with increased complexity in another area.
> 
> For you, that trade-off is worth it, but for others, not so much.
> 

Of course and that’s the point. No-one is going to remove the vanilla
DTSs for the beaglebone black. If you don’t want to use this capability
there won’t be any impact.

The EEPROM probe is only triggered by the presence of the quirk node,
so if you don’t boot with a DTB that contains it there is no change
in the boot process.

> FWIW, I agree that some mechanism is required to support the other
> use cases. I just don't think ease of manufacturing, when the
> submit configuration is the BeagleBone, is where I would hang my hat.
> 

The beaglebone is just an open source friendly platform that I can
demonstrate the capability.

> 
>> If I offended you I’m really sorry though, I meant no such thing.
> 
> In re-reading it, I realize I shouldn't have taken offense. Thanks anyway
> for the apology.
> 
> Regards,
> Peter Hurley

Regards

— Pantelis

Rob Herring Feb. 20, 2015, 5:30 p.m. UTC | #36

On Fri, Feb 20, 2015 at 8:35 AM, Ludovic Desroches
<ludovic.desroches@atmel.com> wrote:
> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>> >
>> >> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>> >>
>> >> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>> >>> Hi Frank,

[...]

>> >>> This is one of those things that the kernel community doesn’t understand which makes people
>> >>> who push product quite mad.
>> >>>
>> >>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>> >>> the whole endeavor must be engineered as well for manufacturability.
>> >>>
>> >>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>> >>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>> >>> instead of turning a profit you’re losing money if you only have a few cents of profit
>> >>> per unit.
>> >>
>> >> I'm not installing physical components manually.  Why would I be installing software
>> >> manually?  (rhetorical question)
>> >>
>> >
>> > Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>> >
>> > Having a single binary to flash to every revision of the board makes logistics considerably
>> > easier.
>> >
>> > Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>> > on the flash medium) takes time and is error-prone.
>> >
>> > Factory time == money, errors == money.
>> >
>> >>>
>> >>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>> >>> for a few million units.
>> >>
>> >> And you produce a few million units before testing that the first one off the line works?
>> >>
>> >
>> > The first one off the line works. The rest will get some burn in and functional testing if you’re
>> > lucky. In many cases where the product is very cheap it might make financial sense to just ship
>> > as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>> >
>> > Hardware is hard :)
>>
>> I'm failing to see how this series improves your manufacturing process at all.
>>
>> 1. Won't you have to provide the factory with different eeprom images for the
>>    White and Black?  You _trust_ them to get that right, or more likely, you
>>    have process control procedures in place so that you don't get 1 million Blacks
>>    flashed with the White eeprom image.
>>
>> 2. The White and Black use different memory technology so it's not as if the
>>    eMMC from the Black will end up on the White SMT line (or vice versa).
>>
>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>    for the White were accidentally assembled with the first 50,000 Blacks; at
>>    that point you're losing a lot more than a few cents of profit. And that has
>>    nothing to do with what image you provided.
>>
>
> As you said, we can imagine many reasons to have a failure during the
> production, having several DTB files will increase the risk.

Then package them as a single file. You can even use DT to do that.
See u-boot FIT image.

Rob

Pantelis Antoniou Feb. 20, 2015, 5:37 p.m. UTC | #37

Hi Rob,

> On Feb 20, 2015, at 19:30 , Rob Herring <robherring2@gmail.com> wrote:
> 
> On Fri, Feb 20, 2015 at 8:35 AM, Ludovic Desroches
> <ludovic.desroches@atmel.com> wrote:
>> On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
>>> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
>>>> 
>>>>> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
>>>>> 
>>>>> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
>>>>>> Hi Frank,
> 
> [...]
> 
>>>>>> This is one of those things that the kernel community doesn’t understand which makes people
>>>>>> who push product quite mad.
>>>>>> 
>>>>>> Engineering a product is not only about meeting customer spec, in order to turn a profit
>>>>>> the whole endeavor must be engineered as well for manufacturability.
>>>>>> 
>>>>>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
>>>>>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
>>>>>> instead of turning a profit you’re losing money if you only have a few cents of profit
>>>>>> per unit.
>>>>> 
>>>>> I'm not installing physical components manually.  Why would I be installing software
>>>>> manually?  (rhetorical question)
>>>>> 
>>>> 
>>>> Because on high volume product runs the flash comes preprogrammed and is soldered as is.
>>>> 
>>>> Having a single binary to flash to every revision of the board makes logistics considerably
>>>> easier.
>>>> 
>>>> Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
>>>> on the flash medium) takes time and is error-prone.
>>>> 
>>>> Factory time == money, errors == money.
>>>> 
>>>>>> 
>>>>>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
>>>>>> for a few million units.
>>>>> 
>>>>> And you produce a few million units before testing that the first one off the line works?
>>>>> 
>>>> 
>>>> The first one off the line works. The rest will get some burn in and functional testing if you’re
>>>> lucky. In many cases where the product is very cheap it might make financial sense to just ship
>>>> as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
>>>> 
>>>> Hardware is hard :)
>>> 
>>> I'm failing to see how this series improves your manufacturing process at all.
>>> 
>>> 1. Won't you have to provide the factory with different eeprom images for the
>>>   White and Black?  You _trust_ them to get that right, or more likely, you
>>>   have process control procedures in place so that you don't get 1 million Blacks
>>>   flashed with the White eeprom image.
>>> 
>>> 2. The White and Black use different memory technology so it's not as if the
>>>   eMMC from the Black will end up on the White SMT line (or vice versa).
>>> 
>>> 3  For that matter, why wouldn't you worry that all the microSD cards intended
>>>   for the White were accidentally assembled with the first 50,000 Blacks; at
>>>   that point you're losing a lot more than a few cents of profit. And that has
>>>   nothing to do with what image you provided.
>>> 
>> 
>> As you said, we can imagine many reasons to have a failure during the
>> production, having several DTB files will increase the risk.
> 
> Then package them as a single file. You can even use DT to do that.
> See u-boot FIT image.
> 

In the ideal case there is no u-boot, and no bootloader.

Packaging 27 difference DTBs, as in the Atmel people case, differing in only a few 
properties seems a waste of space, no?  

We keep on dancing around the issue, namely that DT does not have a quirk/variant
mechanism. I feel that it is a glaring omission. We can’t keep shoveling crap over
the fence to firmware and expect it to get buried there.

> Rob

Regards

— Pantelis

Peter Hurley Feb. 20, 2015, 6:09 p.m. UTC | #38

Hi Guenter,

On 02/20/2015 11:47 AM, Guenter Roeck wrote:

[...]

> I am open to hearing your suggestions for our use case, where the CPU card with
> the eeprom is manufactured separately from its carier cards.

I think your use case may be more compelling than two flavors of Beaglebone
(one of which is pretty much a dead stick), but it's also less clear what your
design constraints are (not that I really want to know, 'cause I don't).

But the logical extension of this is an N-way dtb that supports unrelated SOCs,
and I think most would agree that's not an acceptable outcome.

My thought was that every design that can afford an EEPROM to probe can afford
a bootloader to select the appropriate dtb, and can afford the extra space
required for multiple dtbs.

I'm not naysaying; I just want to elicit enough information so the community
can make informed decisions.

> I assume you might suggest that manufacturing should (re-)program the EEPROM
> on the CPU card after it was inserted into the carrier.
> 
> Problem is though that the CPU card may be inserted into ts carrier outside
> manufacturing, at the final stages of assembly or in product repair. Those
> groups would typically not even have the means to (re-)program the eeprom.
> Besides, manufacturing would, quite understandably, go ballistic if we demand
> that they start programming EEPROMs after insertion into carrier, and no longer
> use pre-programmed EEPROMs.

I agree; that would be The Wrong Way.

> Note that it is not feasible to put the necessary EEPROM onto the carrier
> either. Maybe in a later design. Maybe that makes sense, and we will go along
> that route at some point. However, forcing a specific hardware solution
> due to software limitations, ie lack of ability by core software to handle
> the different carries, seems to be not the right decision to make on an
> OS level.

Agreed; hardware is what it is.

> In the PCI world it has long since been accepted that the world is not perfect.  
> The argument here is pretty much equivalent to demanding that PCI drop its
> quirks mechanism, to force the HW manufacturers to finally get it right from
> the beginning. I somehow suspect that this won't happen.

I was thinking back to the introductions of fast DEVSEL# and AGP :(

> Instead of questioning the need for a mechanism such as the one proposed by
> Pantelis, I think our time would be better spent arguing if it is the right
> mechanism and, if not, how it can be improved.

My thoughts exactly. Apologies if something I wrote came across as
"You Shall Not Pass" :)

One issue seems to be the moving target that is the compelling use case(s).
The initial submission implied it was the Beaglebone, which comes with
4GB eMMC/microSD so naturally the argument for space-savings with DTBs doesn't
fly. That has since been clarified so no need to rehash that.

Regards,
Peter Hurley

Guenter Roeck Feb. 20, 2015, 6:48 p.m. UTC | #39

On Fri, Feb 20, 2015 at 01:09:58PM -0500, Peter Hurley wrote:
> Hi Guenter,
> 
> On 02/20/2015 11:47 AM, Guenter Roeck wrote:
> 
> [...]
> 
> > I am open to hearing your suggestions for our use case, where the CPU card with
> > the eeprom is manufactured separately from its carier cards.
> 
> I think your use case may be more compelling than two flavors of Beaglebone
> (one of which is pretty much a dead stick), but it's also less clear what your
> design constraints are (not that I really want to know, 'cause I don't).
> 
> But the logical extension of this is an N-way dtb that supports unrelated SOCs,
> and I think most would agree that's not an acceptable outcome.
> 
With this logic you can pretty much refuse to accept anything new, anywhere.
Including everything old, for that matter.

Food can be abused to poison people, therefore no one should be permitted to
distribute food. Houses can be abused to falsely imprison people, therefore
no one should live in houses. And don't even start talking about guns.
Everything can be abused, therefore we should not do anything.

Discussions about possible abuse are for sure valid, and reducing the potential
for abuse is a worthy goal, but not as argument to reject a solution for an
existing and real roblem outright.

> My thought was that every design that can afford an EEPROM to probe can afford
> a bootloader to select the appropriate dtb, and can afford the extra space
> required for multiple dtbs.
> 
There are many more use cases where this is simply not the case. Another one,
for example, is a system where the devicetree is loaded through u-boot using
tftp. In this case, u-boot would have some information about the hardware
to request the correct devicetree file, but it may not know about hardware
variants.

Sure, one could solve that problem by making u-boot aware of such variants
and maintaining a large number of dtb files as you suggest. That means,
however, that there would be that need to maintain all those dtb files
just to address minor differences, and having modify every piece of the
system for each new variant. In essence you put a lot of burden on pretty
much everyone from software to manufacturing to testing, plus possibly
hardware, just for the purpose of rejecting a relatively simple solution
for the problem Pantelis' patch set is trying to address.

Ultimately, no matter what the kernel community ends up doing, Pantelis'
solution or something similar _will_ find its way into our system.
I would very much prefer to have the code upstream, but we will carry
his patches along in our vendor branch if necessary. The functionality
and its benefits for our development, manufacturing, and testing process
are way too valuable to ignore, and it actually solves problems for which
we don't have an acceptable solution today. I would be quite surprised
if other vendors would not end up doing the same.

Guenter

Ludovic Desroches Feb. 23, 2015, 7 a.m. UTC | #40

Hi Rob,

On Fri, Feb 20, 2015 at 11:30:12AM -0600, Rob Herring wrote:
> On Fri, Feb 20, 2015 at 8:35 AM, Ludovic Desroches
> <ludovic.desroches@atmel.com> wrote:
> > On Fri, Feb 20, 2015 at 09:21:38AM -0500, Peter Hurley wrote:
> >> On 02/19/2015 12:38 PM, Pantelis Antoniou wrote:
> >> >
> >> >> On Feb 19, 2015, at 19:30 , Frank Rowand <frowand.list@gmail.com> wrote:
> >> >>
> >> >> On 2/19/2015 9:00 AM, Pantelis Antoniou wrote:
> >> >>> Hi Frank,
> 
> [...]
> 
> >> >>> This is one of those things that the kernel community doesn’t understand which makes people
> >> >>> who push product quite mad.
> >> >>>
> >> >>> Engineering a product is not only about meeting customer spec, in order to turn a profit
> >> >>> the whole endeavor must be engineered as well for manufacturability.
> >> >>>
> >> >>> Yes, you can always manually install files in the bootloader. For 1 board no problem.
> >> >>> For 10 doable. For 100 I guess you can hire an extra guy. For 1 million? Guess what,
> >> >>> instead of turning a profit you’re losing money if you only have a few cents of profit
> >> >>> per unit.
> >> >>
> >> >> I'm not installing physical components manually.  Why would I be installing software
> >> >> manually?  (rhetorical question)
> >> >>
> >> >
> >> > Because on high volume product runs the flash comes preprogrammed and is soldered as is.
> >> >
> >> > Having a single binary to flash to every revision of the board makes logistics considerably
> >> > easier.
> >> >
> >> > Having to boot and tweak the bootloader settings to select the correct dtb (even if it’s present
> >> > on the flash medium) takes time and is error-prone.
> >> >
> >> > Factory time == money, errors == money.
> >> >
> >> >>>
> >> >>> No knobs to tweak means no knobs to break. And a broken knob can have pretty bad consequences
> >> >>> for a few million units.
> >> >>
> >> >> And you produce a few million units before testing that the first one off the line works?
> >> >>
> >> >
> >> > The first one off the line works. The rest will get some burn in and functional testing if you’re
> >> > lucky. In many cases where the product is very cheap it might make financial sense to just ship
> >> > as is and deal with recalls, if you’re reasonably happy after a little bit of statistical sampling.
> >> >
> >> > Hardware is hard :)
> >>
> >> I'm failing to see how this series improves your manufacturing process at all.
> >>
> >> 1. Won't you have to provide the factory with different eeprom images for the
> >>    White and Black?  You _trust_ them to get that right, or more likely, you
> >>    have process control procedures in place so that you don't get 1 million Blacks
> >>    flashed with the White eeprom image.
> >>
> >> 2. The White and Black use different memory technology so it's not as if the
> >>    eMMC from the Black will end up on the White SMT line (or vice versa).
> >>
> >> 3  For that matter, why wouldn't you worry that all the microSD cards intended
> >>    for the White were accidentally assembled with the first 50,000 Blacks; at
> >>    that point you're losing a lot more than a few cents of profit. And that has
> >>    nothing to do with what image you provided.
> >>
> >
> > As you said, we can imagine many reasons to have a failure during the
> > production, having several DTB files will increase the risk.
> 
> Then package them as a single file. You can even use DT to do that.
> See u-boot FIT image.
> 
> Rob

It is acualyy what we did but we are not happy with this solution
because as said previously we rely on U-Boot and on dts/dtsi side we
have too many files.

Regards

Ludovic

Ludovic Desroches Feb. 23, 2015, 7:30 a.m. UTC | #41

On Fri, Feb 20, 2015 at 10:48:18AM -0800, Guenter Roeck wrote:
> On Fri, Feb 20, 2015 at 01:09:58PM -0500, Peter Hurley wrote:
> > Hi Guenter,
> > 
> > On 02/20/2015 11:47 AM, Guenter Roeck wrote:
> > 
> > [...]
> > 
> > > I am open to hearing your suggestions for our use case, where the CPU card with
> > > the eeprom is manufactured separately from its carier cards.
> > 
> > I think your use case may be more compelling than two flavors of Beaglebone
> > (one of which is pretty much a dead stick), but it's also less clear what your
> > design constraints are (not that I really want to know, 'cause I don't).
> > 
> > But the logical extension of this is an N-way dtb that supports unrelated SOCs,
> > and I think most would agree that's not an acceptable outcome.
> > 
> With this logic you can pretty much refuse to accept anything new, anywhere.
> Including everything old, for that matter.
> 
> Food can be abused to poison people, therefore no one should be permitted to
> distribute food. Houses can be abused to falsely imprison people, therefore
> no one should live in houses. And don't even start talking about guns.
> Everything can be abused, therefore we should not do anything.
> 
> Discussions about possible abuse are for sure valid, and reducing the potential
> for abuse is a worthy goal, but not as argument to reject a solution for an
> existing and real roblem outright.
> 
> > My thought was that every design that can afford an EEPROM to probe can afford
> > a bootloader to select the appropriate dtb, and can afford the extra space
> > required for multiple dtbs.
> > 
> There are many more use cases where this is simply not the case. Another one,
> for example, is a system where the devicetree is loaded through u-boot using
> tftp. In this case, u-boot would have some information about the hardware
> to request the correct devicetree file, but it may not know about hardware
> variants.
> 
> Sure, one could solve that problem by making u-boot aware of such variants
> and maintaining a large number of dtb files as you suggest. That means,
> however, that there would be that need to maintain all those dtb files
> just to address minor differences, and having modify every piece of the
> system for each new variant. In essence you put a lot of burden on pretty
> much everyone from software to manufacturing to testing, plus possibly
> hardware, just for the purpose of rejecting a relatively simple solution
> for the problem Pantelis' patch set is trying to address.
> 

Other example that show how it becomes a mess. Tell me if we do that in
a wrong way but I don't think.

We have the following source files:
- a dtsi file for the SoC family
- a dtsi file for SoC variant enabling the devices available
- a dtsi file for the CPU module describing components on this module:
  the ethernet phy, the nand flash, etc.
- a dtsi file for the motherboard
- several dtsi files for different kind of display modules
- many dts file for the final board, it includes the SoC variant, the
  CPU module, the motherboard and one of the display module if needed.

At the end we have something like this (from our github, not all these
boards have been sent to the mainline)

arch/arm/boot/dts/sama5d3.dtsi
arch/arm/boot/dts/sama5d34ek_pda7.dts
arch/arm/boot/dts/sama5d36ek_revc_pda7.dts
arch/arm/boot/dts/sama5d31ek.dts
arch/arm/boot/dts/sama5d34ek_revc.dts
arch/arm/boot/dts/sama5d3xcm.dtsi
arch/arm/boot/dts/sama5d31ek_pda4.dts
arch/arm/boot/dts/sama5d34ek_revc_pda4.dts
arch/arm/boot/dts/sama5d3xdm.dtsi
arch/arm/boot/dts/sama5d31ek_pda7.dts
arch/arm/boot/dts/sama5d34ek_revc_pda7.dts
arch/arm/boot/dts/sama5d3xdm_pda4.dtsi
arch/arm/boot/dts/sama5d31ek_revc.dts
arch/arm/boot/dts/sama5d35ek.dts
arch/arm/boot/dts/sama5d3xdm_pda7.dtsi
arch/arm/boot/dts/sama5d31ek_revc_pda4.dts
arch/arm/boot/dts/sama5d35ek_revc.dts
arch/arm/boot/dts/sama5d3xek.its
arch/arm/boot/dts/sama5d31ek_revc_pda7.dts
arch/arm/boot/dts/sama5d36ek.dts
arch/arm/boot/dts/sama5d3xek_pda4.its
arch/arm/boot/dts/sama5d33ek.dts
arch/arm/boot/dts/sama5d36ek_cmp.dts
arch/arm/boot/dts/sama5d3xek_pda7.its
arch/arm/boot/dts/sama5d33ek_pda4.dts
arch/arm/boot/dts/sama5d36ek_cmp_pins_sleep.dtsi
arch/arm/boot/dts/sama5d3xmb.dtsi
arch/arm/boot/dts/sama5d33ek_pda7.dts
arch/arm/boot/dts/sama5d36ek_hdmi.dts
arch/arm/boot/dts/sama5d3xmb_revc.dtsi
arch/arm/boot/dts/sama5d33ek_revc.dts
arch/arm/boot/dts/sama5d36ek_pda4.dts
arch/arm/boot/dts/sama5d3xmb_revc_isi.dtsi
arch/arm/boot/dts/sama5d33ek_revc_pda4.dts
arch/arm/boot/dts/sama5d36ek_pda7.dts
arch/arm/boot/dts/sama5d4.dtsi
arch/arm/boot/dts/sama5d33ek_revc_pda7.dts
arch/arm/boot/dts/sama5d36ek_revc.dts
arch/arm/boot/dts/sama5d4ek.dts
arch/arm/boot/dts/sama5d34ek.dts
arch/arm/boot/dts/sama5d36ek_revc_cmp.dts
arch/arm/boot/dts/sama5d4ek_hdmi.dts
arch/arm/boot/dts/sama5d34ek_pda4.dts
arch/arm/boot/dts/sama5d36ek_revc_pda4.dts
arch/arm/boot/dts/sama5d4ek_pin_sleep_state.dtsi


> Ultimately, no matter what the kernel community ends up doing, Pantelis'
> solution or something similar _will_ find its way into our system.
> I would very much prefer to have the code upstream, but we will carry
> his patches along in our vendor branch if necessary. The functionality
> and its benefits for our development, manufacturing, and testing process
> are way too valuable to ignore, and it actually solves problems for which
> we don't have an acceptable solution today. I would be quite surprised
> if other vendors would not end up doing the same.

We'll probably do the same. Beaglebone is only an example.

> 
> Guenter


Regards

Ludovic

[2/4] of: DT quirks infrastructure

Commit Message

Comments

Patch