diff mbox

[1/3] dt-bindings: arm/marvell: ABI unstability warning about Marvell 7K/8K

Message ID 1456327007-31008-2-git-send-email-thomas.petazzoni@free-electrons.com (mailing list archive)
State New, archived
Headers show

Commit Message

Thomas Petazzoni Feb. 24, 2016, 3:16 p.m. UTC
Since we are still very early in the support of Marvell Armada 7/8K
chips and those chips are significantly different from earlier 32 bits
Armada SOCs, it is difficult to commit to Device Tree ABI stability at
this point.

Therefore, this commit adds a warning to the Marvell 7K/8K DT binding
documentation to indicate that the bindings are for the moment
unstable.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 .../devicetree/bindings/arm/marvell/armada-7k-8k.txt         | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Mark Rutland Feb. 24, 2016, 5:25 p.m. UTC | #1
Hi,

[adding the device tree list to Cc]

On Wed, Feb 24, 2016 at 04:16:45PM +0100, Thomas Petazzoni wrote:
> Since we are still very early in the support of Marvell Armada 7/8K
> chips and those chips are significantly different from earlier 32 bits
> Armada SOCs, it is difficult to commit to Device Tree ABI stability at
> this point.

I don't see why difference to an existing SoC means that we cannot
guarantee support for this DT. Why does that contribute to instability?

If there's something we're unsure about, we should be seeding the DT
with sufficient information to do the right thing(TM) in future. What
areas in particular do you think are likely to be problematic? Is there
anything in particular that you think that approach doesn't work for?

From my PoV, if we're so uncertain about a DT binding that it cannot be
kept supported, it is a prototype not yet fit for mainline.

> Therefore, this commit adds a warning to the Marvell 7K/8K DT binding
> documentation to indicate that the bindings are for the moment
> unstable.

As with the other thread, I'm strongly opposed to this mentality.

This is really the last thing we want to see for new platforms which are
DT from day one. The messy problems with platform data => DT don't
apply, and I really can't see a compelling reason to not put the effort
in here to do things "properly".

Thanks.
Mark.

> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  .../devicetree/bindings/arm/marvell/armada-7k-8k.txt         | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt b/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt
> index df98a9c..758b3ae 100644
> --- a/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt
> +++ b/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt
> @@ -1,6 +1,18 @@
>  Marvell Armada 7K/8K Platforms Device Tree Bindings
>  ---------------------------------------------------
>  
> +Work in progress statement:
> +
> +Device tree files and bindings applying to Marvell Armada 7K and 8K
> +SoCs and boards are considered "unstable". Any Marvell Armada 7K/8K
> +device tree binding may change at any time. Be sure to use a device
> +tree binary and a kernel image generated from the same source tree.
> +
> +Please refer to Documentation/devicetree/bindings/ABI.txt for a
> +definition of a stable binding/ABI.
> +
> +---------------------------------------------------------------
> +
>  Boards using a SoC of the Marvell Armada 7K or 8K families must carry
>  the following root node property:
>  
> -- 
> 2.6.4
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Thomas Petazzoni Feb. 24, 2016, 8:10 p.m. UTC | #2
Hello,

On Wed, 24 Feb 2016 17:25:46 +0000, Mark Rutland wrote:

> On Wed, Feb 24, 2016 at 04:16:45PM +0100, Thomas Petazzoni wrote:
> > Since we are still very early in the support of Marvell Armada 7/8K
> > chips and those chips are significantly different from earlier 32 bits
> > Armada SOCs, it is difficult to commit to Device Tree ABI stability at
> > this point.
> 
> I don't see why difference to an existing SoC means that we cannot
> guarantee support for this DT. Why does that contribute to instability?
> 
> If there's something we're unsure about, we should be seeding the DT
> with sufficient information to do the right thing(TM) in future. What
> areas in particular do you think are likely to be problematic? Is there
> anything in particular that you think that approach doesn't work for?

At this point, we have incomplete information about the HW capabilities
and register layout. Our past experience has shown that when we start
supporting SoC that early, 

> From my PoV, if we're so uncertain about a DT binding that it cannot be
> kept supported, it is a prototype not yet fit for mainline.

Are you really serious when you say that ?

The Linux kernel community has been encouraging hardware vendor for
*years* to get their code merged as early as possible. And now that
they are doing it, you are telling them that they shouldn't submit
their code to mainline ?

If that's what you want, then sorry, but I don't buy this argument, and
I will continue to encourage HW vendors to submit their code early,
even if it means that indeed in the early stages, things remain a bit
in-flux and need to be refined progressively.

You seem to really be in your ivory tower and not seeing the reality of
an SoC development and how to push the support for it in mainline as
early as possible, so that the BSPs shipped to actual customers are as
close to mainline as possible. Together with the Marvell engineers, we
are doing a huge amount of work to ensure that this vendor BSP is as
close as mainline as possible. And you're basically telling us we
should this effort ?

Come on, be serious.

> > Therefore, this commit adds a warning to the Marvell 7K/8K DT binding
> > documentation to indicate that the bindings are for the moment
> > unstable.
> 
> As with the other thread, I'm strongly opposed to this mentality.

Many other platforms have made this choice: Atmel, Marvell Berlin, and
probably others.

I think it should be up to each platform vendor to decide whether they
care about DT stability or not, and at which point in time in the life
cycle of their platform they want to claim the DT bindings as stable.
You might believe that DT bindings are stable, but they are not. They
change constantly, in very subtle way, and nobody ever tests DT from
kernel version N-x with kernel version N. On any real-life platform, it
simply doesn't work, unless that platform is 5+ years old.

I made a talk at ELC about DT stability being a fairy tale, and I
remember very well Rob Herring coming to me during the social event
afterwards, telling me that it has never been said that all platforms
should conform strictly to this DT stability requirement.

So please get to an agreement between DT binding maintainers. And stop
saying this ridiculous statement that HW vendors should stop submitting
their code to the mainline.

Best regards,

Thomas
Rob Herring Feb. 24, 2016, 10:07 p.m. UTC | #3
On Wed, Feb 24, 2016 at 2:10 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> Hello,
>
> On Wed, 24 Feb 2016 17:25:46 +0000, Mark Rutland wrote:
>
>> On Wed, Feb 24, 2016 at 04:16:45PM +0100, Thomas Petazzoni wrote:
>> > Since we are still very early in the support of Marvell Armada 7/8K
>> > chips and those chips are significantly different from earlier 32 bits
>> > Armada SOCs, it is difficult to commit to Device Tree ABI stability at
>> > this point.
>>
>> I don't see why difference to an existing SoC means that we cannot
>> guarantee support for this DT. Why does that contribute to instability?
>>
>> If there's something we're unsure about, we should be seeding the DT
>> with sufficient information to do the right thing(TM) in future. What
>> areas in particular do you think are likely to be problematic? Is there
>> anything in particular that you think that approach doesn't work for?
>
> At this point, we have incomplete information about the HW capabilities
> and register layout. Our past experience has shown that when we start
> supporting SoC that early,

that early ...?

You should fix the incomplete information problem. I pushed back on
this, you got more information and already the binding is closer to
reality.

>> From my PoV, if we're so uncertain about a DT binding that it cannot be
>> kept supported, it is a prototype not yet fit for mainline.
>
> Are you really serious when you say that ?
>
> The Linux kernel community has been encouraging hardware vendor for
> *years* to get their code merged as early as possible. And now that
> they are doing it, you are telling them that they shouldn't submit
> their code to mainline ?

Mark didn't say don't submit. First, there is value in submitting even
if not accepted immediately and you have to carry some patches for
some time. It was also suggested that you err on the side of less
information in DT if things are uncertain and you have not done that.

> If that's what you want, then sorry, but I don't buy this argument, and
> I will continue to encourage HW vendors to submit their code early,
> even if it means that indeed in the early stages, things remain a bit
> in-flux and need to be refined progressively.
>
> You seem to really be in your ivory tower and not seeing the reality of
> an SoC development and how to push the support for it in mainline as
> early as possible, so that the BSPs shipped to actual customers are as
> close to mainline as possible. Together with the Marvell engineers, we
> are doing a huge amount of work to ensure that this vendor BSP is as
> close as mainline as possible. And you're basically telling us we
> should this effort ?
>
> Come on, be serious.
>
>> > Therefore, this commit adds a warning to the Marvell 7K/8K DT binding
>> > documentation to indicate that the bindings are for the moment
>> > unstable.
>>
>> As with the other thread, I'm strongly opposed to this mentality.
>
> Many other platforms have made this choice: Atmel, Marvell Berlin, and
> probably others.

Atmel has had a year now with that statement. Time to remove it IMO.

I think we need to only allow this at a binding level and not at SoC
family level and also require a timeline to marking stable.

> I think it should be up to each platform vendor to decide whether they
> care about DT stability or not, and at which point in time in the life
> cycle of their platform they want to claim the DT bindings as stable.
> You might believe that DT bindings are stable, but they are not. They
> change constantly, in very subtle way, and nobody ever tests DT from
> kernel version N-x with kernel version N. On any real-life platform, it
> simply doesn't work, unless that platform is 5+ years old.
>
> I made a talk at ELC about DT stability being a fairy tale, and I
> remember very well Rob Herring coming to me during the social event
> afterwards, telling me that it has never been said that all platforms
> should conform strictly to this DT stability requirement.

True, but it is still an exception and I never said I wouldn't push back.

> So please get to an agreement between DT binding maintainers. And stop
> saying this ridiculous statement that HW vendors should stop submitting
> their code to the mainline.

You want us to agree, then you won't like the answer. Bindings must be
stable. I'll allow exceptions to that, but not without push back.

In general, if there is disagreement about whether stability is
required, then it is required. See the recent sunxi discussion. That's
more between users and platform maintainers though.

Rob
Thomas Petazzoni Feb. 25, 2016, 8:09 a.m. UTC | #4
Hello Rob,

On Wed, 24 Feb 2016 16:07:04 -0600, Rob Herring wrote:

> You should fix the incomplete information problem. I pushed back on
> this, you got more information and already the binding is closer to
> reality.

"You should fix the incomplete information problem". You seem to think
that it is a simple thing that can be fixed with a magic wand. But it's
not.

Either because the internal processes are complicated, or simply
because the Linux kernel support is done without cooperation from the
HW vendor (it's not the case of this Marvell platform, but it's the
case of many other platforms).

> Mark didn't say don't submit. First, there is value in submitting even
> if not accepted immediately and you have to carry some patches for
> some time. It was also suggested that you err on the side of less
> information in DT if things are uncertain and you have not done that.

Submitting without merging is useless. The point of submitting is to
get the code merged, to reduce the amount of out-of-tree patches we
have to carry, and to allow users of the platform to simply run
mainline on their platform.

So this proposal really doesn't make any sense. Just like Mark initial
statement of not submitting code so early.

> > So please get to an agreement between DT binding maintainers. And stop
> > saying this ridiculous statement that HW vendors should stop submitting
> > their code to the mainline.
> 
> You want us to agree, then you won't like the answer. Bindings must be
> stable. I'll allow exceptions to that, but not without push back.
> 
> In general, if there is disagreement about whether stability is
> required, then it is required. See the recent sunxi discussion. That's
> more between users and platform maintainers though.

Do you realize that this all DT binding stuff is today the *biggest* to
getting HW support in the Linux kernel? It has become more complicated
to merge a 4 properties DT binding than to merge multiple thousands of
lines of driver code. Simply because of this DT stability fairy tale.

In the mean time, I'm withdrawing this patch. I'll let Mark and you in
your Alice in Wonderland world, where you believe that DT bindings are
stable while the reality is that they are not and that their stability
is actually unneeded for most platforms.

Gregory, can you skip patch 1/3, and merge patches 2/3 and 3/3, once
the new clock DT binding has been approved.

Best regards,

Thomas
Gregory CLEMENT Feb. 25, 2016, 3:04 p.m. UTC | #5
Hi Thomas,
 
 On jeu., févr. 25 2016, Thomas Petazzoni <thomas.petazzoni@free-electrons.com> wrote:

>
> Gregory, can you skip patch 1/3, and merge patches 2/3 and 3/3, once
> the new clock DT binding has been approved.

I applied patch 2 and 3 on mvebu/dt64 and I amended the title by using
"arm64: dts : marvell:" instead of "arm64: marvell:" as requested by
Olof on ly first mvebu/dt64 pull request.

Thanks,

Gregory

>
> Best regards,
>
> Thomas
> -- 
> Thomas Petazzoni, CTO, Free Electrons
> Embedded Linux, Kernel and Android engineering
> http://free-electrons.com
Thomas Petazzoni Feb. 25, 2016, 3:27 p.m. UTC | #6
Hello,

On Thu, 25 Feb 2016 16:04:49 +0100, Gregory CLEMENT wrote:

> > Gregory, can you skip patch 1/3, and merge patches 2/3 and 3/3, once
> > the new clock DT binding has been approved.
> 
> I applied patch 2 and 3 on mvebu/dt64 and I amended the title by using
> "arm64: dts : marvell:" instead of "arm64: marvell:" as requested by
> Olof on ly first mvebu/dt64 pull request.

Thanks, I'll try to remember this convention for the next patches!

Thomas
Mark Rutland Feb. 25, 2016, 4:16 p.m. UTC | #7
On Thu, Feb 25, 2016 at 09:09:27AM +0100, Thomas Petazzoni wrote:
> Hello Rob,
> 
> On Wed, 24 Feb 2016 16:07:04 -0600, Rob Herring wrote:
> 
> > You should fix the incomplete information problem. I pushed back on
> > this, you got more information and already the binding is closer to
> > reality.
> 
> "You should fix the incomplete information problem". You seem to think
> that it is a simple thing that can be fixed with a magic wand. But it's
> not.
> 
> Either because the internal processes are complicated, or simply
> because the Linux kernel support is done without cooperation from the
> HW vendor (it's not the case of this Marvell platform, but it's the
> case of many other platforms).

Yes, this is a problem in some cases, and that should be considered in
those cases. There are always shades of grey.

Per the above, that isn't relevant in this case. This is a pretty
black-and-white stand against the usual rules.

> > Mark didn't say don't submit. First, there is value in submitting even
> > if not accepted immediately and you have to carry some patches for
> > some time. It was also suggested that you err on the side of less
> > information in DT if things are uncertain and you have not done that.
> 
> Submitting without merging is useless. The point of submitting is to
> get the code merged, to reduce the amount of out-of-tree patches we
> have to carry, and to allow users of the platform to simply run
> mainline on their platform.

Submitting prototypes and RFCs is the usual way we get things reviewed
early, and to allow maintainers and others to get a feel for things
earlier. Submitting patches _for merging_ when you're not sure about
things and don't want to agree to support them is what's being pushed
back on.

> So this proposal really doesn't make any sense. Just like Mark initial
> statement of not submitting code so early.

As what I said was evidently ambiguous, I'm on about code being _merged_
prior to being ready. Code and bindings should certainly be posted for
review as soon as possible. However, it should be recognised when things
aren't quite ready for mainline.

Even if something's unclear about a device, you can highlight more
general issues (e.g. problematic edge cases in common subsystem code),
and that's possible to have merged even if the binding isn't ready.

If you're unsure about something, but still want it merged, then you
have to commit to maintaining that as far as reasonably possible, even
if it turns out to not be quite right.

> > > So please get to an agreement between DT binding maintainers. And stop
> > > saying this ridiculous statement that HW vendors should stop submitting
> > > their code to the mainline.
> > 
> > You want us to agree, then you won't like the answer. Bindings must be
> > stable. I'll allow exceptions to that, but not without push back.
> > 
> > In general, if there is disagreement about whether stability is
> > required, then it is required. See the recent sunxi discussion. That's
> > more between users and platform maintainers though.
> 
> Do you realize that this all DT binding stuff is today the *biggest* to
> getting HW support in the Linux kernel? It has become more complicated
> to merge a 4 properties DT binding than to merge multiple thousands of
> lines of driver code. 

As times have changed, pain points have moved around.

To some extent that is unavoidable; more up-front effort is required
where crutches we previously relied on are not applicable.

Elsewhere we can certainly do better.

Throwing your hands up and stating "this is unstable, it might change"
is a crutch. It prevents any real solution to the pain points you
encounter, and creates pain points for others. It only provides the
_illusion_ of support.

Deliberately violating the rules and then complaining that the rules
don't work because you don't care for them is a terrible argument.

Thanks,
Mark.
Thomas Petazzoni Feb. 25, 2016, 4:38 p.m. UTC | #8
Hello Mark,

On Thu, 25 Feb 2016 16:16:47 +0000, Mark Rutland wrote:

> > Either because the internal processes are complicated, or simply
> > because the Linux kernel support is done without cooperation from the
> > HW vendor (it's not the case of this Marvell platform, but it's the
> > case of many other platforms).
> 
> Yes, this is a problem in some cases, and that should be considered in
> those cases. There are always shades of grey.

Sure.

> Per the above, that isn't relevant in this case. This is a pretty
> black-and-white stand against the usual rules.

I don't see why. The datasheets have not been completely written yet
for this particular chip. For other chips we've worked on where we
collaborated with the SoC vendor, we never had any datasheet, simply
because the SoC vendor doesn't have any. They have the digital logic
source code, and then tons of spreadsheets or text documents that are
not proper datasheets and that they generally cannot share with
third-parties.

Hence, even when the support for a SoC is being done in collaboration
with the SoC vendor, we don't always have a nice full datasheet that
tells us what all the registers are doing and how they are organized.
We discover things as we go.

Yes, this might be surprising to you, working at ARM, where the
technical documentation is awesome and very detailed. But trust me,
this is *NOT* what you get from many SoC vendors.

> > Submitting without merging is useless. The point of submitting is to
> > get the code merged, to reduce the amount of out-of-tree patches we
> > have to carry, and to allow users of the platform to simply run
> > mainline on their platform.
> 
> Submitting prototypes and RFCs is the usual way we get things reviewed
> early, and to allow maintainers and others to get a feel for things
> earlier. Submitting patches _for merging_ when you're not sure about
> things and don't want to agree to support them is what's being pushed
> back on.

This simply doesn't work. This initial support of a few patches (clock,
basic DT, irqchip, dmaengine) is going to be followed very soon by lots
of other patches to enable more aspects of the SoC. And we should keep
all of those patches out-of-tree, piling up hundreds of out-of-tree
patches ? Not practical at all.

And then when we'll submit them, they will all be accepted in one go,
in one kernel cycle ? Clearly not, so we would have to wait several
kernel cycles, which is clearly not what we want.

Instead, what we want is to submit the basic stuff early, and then
progressively build on top of this basic stuff by merging more and more
features. This way:

 * We don't have to pile up hundreds of out of tree patches;

 * We have a support in the mainline kernel that is progressively
   getting better as we enable more and more feature. We can show that
   4.6 has this small set of features, 4.7 has this slightly extended
   set of features supported, and so on.

So no, we clearly do not want to keep things out of tree.

> > So this proposal really doesn't make any sense. Just like Mark initial
> > statement of not submitting code so early.
> 
> As what I said was evidently ambiguous, I'm on about code being _merged_
> prior to being ready. Code and bindings should certainly be posted for
> review as soon as possible. However, it should be recognised when things
> aren't quite ready for mainline.
> 
> Even if something's unclear about a device, you can highlight more
> general issues (e.g. problematic edge cases in common subsystem code),
> and that's possible to have merged even if the binding isn't ready.
> 
> If you're unsure about something, but still want it merged, then you
> have to commit to maintaining that as far as reasonably possible, even
> if it turns out to not be quite right.

We are perfectly fine with maintaining *code*. And we have been doing
so for several years on Marvell platforms: caring about older
platforms, converting old legacy code and legacy platforms to the
Device Tree, etc.

What we don't want to commit to is to maintain the DT stability
*before* the support for a given device is sufficiently stable/sane.

> > Do you realize that this all DT binding stuff is today the *biggest* to
> > getting HW support in the Linux kernel? It has become more complicated
> > to merge a 4 properties DT binding than to merge multiple thousands of
> > lines of driver code. 
> 
> As times have changed, pain points have moved around.
> 
> To some extent that is unavoidable; more up-front effort is required
> where crutches we previously relied on are not applicable.
> 
> Elsewhere we can certainly do better.
> 
> Throwing your hands up and stating "this is unstable, it might change"
> is a crutch. It prevents any real solution to the pain points you
> encounter, and creates pain points for others. It only provides the
> _illusion_ of support.

Could you please explicit which pain points it creates for others ?

Having unstable DT bindings specific to a platform does not create any
single pain point for anyone.

Why are you talking about "illusion" of support ? Sorry, but with
unstable DT bindings, as long as you use the DT that comes with the
kernel sources, everything works perfectly fine, and is perfectly
supported.

Even Fedora is installing DTBs in a directory that is kernel-version
specific!

Best regards,

Thomas
Mark Rutland Feb. 25, 2016, 5:56 p.m. UTC | #9
Hi,

On Thu, Feb 25, 2016 at 05:38:11PM +0100, Thomas Petazzoni wrote:
> Hello Mark,
> 
> On Thu, 25 Feb 2016 16:16:47 +0000, Mark Rutland wrote:
> 
> > > Either because the internal processes are complicated, or simply
> > > because the Linux kernel support is done without cooperation from the
> > > HW vendor (it's not the case of this Marvell platform, but it's the
> > > case of many other platforms).
> > 
> > Yes, this is a problem in some cases, and that should be considered in
> > those cases. There are always shades of grey.
> 
> Sure.
> 
> > Per the above, that isn't relevant in this case. This is a pretty
> > black-and-white stand against the usual rules.
> 
> I don't see why. The datasheets have not been completely written yet
> for this particular chip. For other chips we've worked on where we
> collaborated with the SoC vendor, we never had any datasheet, simply
> because the SoC vendor doesn't have any. They have the digital logic
> source code, and then tons of spreadsheets or text documents that are
> not proper datasheets and that they generally cannot share with
> third-parties.
> 
> Hence, even when the support for a SoC is being done in collaboration
> with the SoC vendor, we don't always have a nice full datasheet that
> tells us what all the registers are doing and how they are organized.
> We discover things as we go.

So as and when you believe you need to break a binding, we can have a
discussion on that case-by-case basis.

That's different to stating that the usual rules don't apply at all,
which is black-and-white stand I'm referring to.

> > Submitting prototypes and RFCs is the usual way we get things reviewed
> > early, and to allow maintainers and others to get a feel for things
> > earlier. Submitting patches _for merging_ when you're not sure about
> > things and don't want to agree to support them is what's being pushed
> > back on.
> 
> This simply doesn't work. This initial support of a few patches (clock,
> basic DT, irqchip, dmaengine) is going to be followed very soon by lots
> of other patches to enable more aspects of the SoC. And we should keep
> all of those patches out-of-tree, piling up hundreds of out-of-tree
> patches ? Not practical at all.
> 
> And then when we'll submit them, they will all be accepted in one go,
> in one kernel cycle ? Clearly not, so we would have to wait several
> kernel cycles, which is clearly not what we want.
> 
> Instead, what we want is to submit the basic stuff early, and then
> progressively build on top of this basic stuff by merging more and more
> features. This way:
> 
>  * We don't have to pile up hundreds of out of tree patches;

Then _support_ what you have upstreamed already. Either it's ready to be
supported or it is not. If it is, there's no need for DT breakage, and
if it isn't, then it's not ready for mainline. That's all this boils
down to. 

If it must go in, then it must be supported.

>  * We have a support in the mainline kernel that is progressively
>    getting better as we enable more and more feature. We can show that
>    4.6 has this small set of features, 4.7 has this slightly extended
>    set of features supported, and so on.

It's not "progressively getting better", if DTBs are arbitrarily getting
broken.

It's possible to incrementally add support without having to break
existing DTBs.

> > If you're unsure about something, but still want it merged, then you
> > have to commit to maintaining that as far as reasonably possible, even
> > if it turns out to not be quite right.
> 
> We are perfectly fine with maintaining *code*. And we have been doing
> so for several years on Marvell platforms: caring about older
> platforms, converting old legacy code and legacy platforms to the
> Device Tree, etc.

The parsing and handling of the existing binding is code, which you are
arguing you do not need to support.

> What we don't want to commit to is to maintain the DT stability
> *before* the support for a given device is sufficiently stable/sane.

I do not follow what you mean by support being *not sane*. If it isn't
sane, why was it merged?

If it was previously considered sane, and the code was considered
supportable, why is it necessary to break a DT that it supported?

> > > Do you realize that this all DT binding stuff is today the *biggest* to
> > > getting HW support in the Linux kernel? It has become more complicated
> > > to merge a 4 properties DT binding than to merge multiple thousands of
> > > lines of driver code. 
> > 
> > As times have changed, pain points have moved around.
> > 
> > To some extent that is unavoidable; more up-front effort is required
> > where crutches we previously relied on are not applicable.
> > 
> > Elsewhere we can certainly do better.
> > 
> > Throwing your hands up and stating "this is unstable, it might change"
> > is a crutch. It prevents any real solution to the pain points you
> > encounter, and creates pain points for others. It only provides the
> > _illusion_ of support.
> 
> Could you please explicit which pain points it creates for others ?
> 
> Having unstable DT bindings specific to a platform does not create any
> single pain point for anyone.
> 
> Why are you talking about "illusion" of support ? Sorry, but with
> unstable DT bindings, as long as you use the DT that comes with the
> kernel sources, everything works perfectly fine, and is perfectly
> supported.

A few examples:

* Distro installation (in the absence of a stable DT and a consistent
  boot environment, you have no idea how to get a kernel up and
  running). Either your user has to be intimately familiar with the
  platform, or the distro needs a separate release for each and every
  platform.

* Distro maintenance. I just upgraded to a point-release kernel and now
  have to pull down the DTBs. That required the distro to _somehow_
  identify the relevant DTBs, how to install this in the correct
  location such that the firmware and/or bootloader can pick it up, etc.

* Other developers tracking down problems that inevitable crop up on the
  platform, who now have to jump through the hoops above. As the DTBs
  change, so may the user-facing elements and/or kernel behaviour, so
  bisecting is extremely painful (though admittedly possible).

  Notice that in this case, the user has to go back in time for their
  starting point. So it doesn't matter if they're not a user _yet_.

* Portable hypervisors, bootloaders, or other system-level software
  can't add any (DTB-aware) support for a platform until the kernel
  support is considered "sane" per your argument above. That time is an
  arbitrary unknown in the future (which may be at infinity), so they
  are either indefinitely delayed or end up having attempt to have
  stable support for a variety of unstable bindings anyway, which may or
  may not be possible depending on how bindings got broken.

These may not affect _you_ directly, but they do hinder realistic
user-facing support.

> Even Fedora is installing DTBs in a directory that is kernel-version
> specific!

This is a _symptom_ of the problem. Read this as:

	Even Fedora _have to_ install DTBs in a directory that is
	kernel-version specific!

Notice how that does not sound great. Distros are being _forced_ to do
this because stuff gets broken. As covered above, this doesn't always
work anyway, so it's really not an argument in favour of breaking
things.

Thanks,
Mark.
Maxime Ripard Feb. 25, 2016, 9:09 p.m. UTC | #10
Hi,

On Thu, Feb 25, 2016 at 04:16:47PM +0000, Mark Rutland wrote:
> On Thu, Feb 25, 2016 at 09:09:27AM +0100, Thomas Petazzoni wrote:
> > Hello Rob,
> > 
> > On Wed, 24 Feb 2016 16:07:04 -0600, Rob Herring wrote:
> > 
> > > You should fix the incomplete information problem. I pushed back on
> > > this, you got more information and already the binding is closer to
> > > reality.
> > 
> > "You should fix the incomplete information problem". You seem to think
> > that it is a simple thing that can be fixed with a magic wand. But it's
> > not.
> > 
> > Either because the internal processes are complicated, or simply
> > because the Linux kernel support is done without cooperation from the
> > HW vendor (it's not the case of this Marvell platform, but it's the
> > case of many other platforms).
> 
> Yes, this is a problem in some cases, and that should be considered in
> those cases. There are always shades of grey.
> 
> Per the above, that isn't relevant in this case. This is a pretty
> black-and-white stand against the usual rules.

Unless your stance, at least on the sunxi discussion was pretty much
black-and-white too. You basically said that we were supposed to
maintain a stable ABI starting right now, even though we know for a
fact that we have bindings that are simply not working.

> > > Mark didn't say don't submit. First, there is value in submitting even
> > > if not accepted immediately and you have to carry some patches for
> > > some time. It was also suggested that you err on the side of less
> > > information in DT if things are uncertain and you have not done that.
> > 
> > Submitting without merging is useless. The point of submitting is to
> > get the code merged, to reduce the amount of out-of-tree patches we
> > have to carry, and to allow users of the platform to simply run
> > mainline on their platform.
> 
> Submitting prototypes and RFCs is the usual way we get things reviewed
> early, and to allow maintainers and others to get a feel for things
> earlier. Submitting patches _for merging_ when you're not sure about
> things and don't want to agree to support them is what's being pushed
> back on.

Just to make sure we're on the same page, are you saying that we must
not merge code unless we're 100% sure of how it works, in its
entirety?

> > So this proposal really doesn't make any sense. Just like Mark initial
> > statement of not submitting code so early.
> 
> As what I said was evidently ambiguous, I'm on about code being _merged_
> prior to being ready. Code and bindings should certainly be posted for
> review as soon as possible. However, it should be recognised when things
> aren't quite ready for mainline.
> 
> Even if something's unclear about a device, you can highlight more
> general issues (e.g. problematic edge cases in common subsystem code),
> and that's possible to have merged even if the binding isn't ready.
> 
> If you're unsure about something, but still want it merged, then you
> have to commit to maintaining that as far as reasonably possible, even
> if it turns out to not be quite right.

Except that *you* are forcing us to maintain code to deal with a use
case that simply doesn't happen, which is the definition of dead code.

Would you be ok to be forced to maintain dead code?

You've been talking about theorical use-cases, do you have any real
world one where someone actually made a case for this DT as an ABI
non-sense?

I mean, we've had the U-Boot sunxi maintainer saying they didn't care,
and actually arguing against it.

As it turns out, he's also involved in the Fedora port. If it was a
pain point for them, don't you think he would have raised it?

The only ones that seem to have a problem with it are the one not
involved in it in any way.

> 
> > > > So please get to an agreement between DT binding maintainers. And stop
> > > > saying this ridiculous statement that HW vendors should stop submitting
> > > > their code to the mainline.
> > > 
> > > You want us to agree, then you won't like the answer. Bindings must be
> > > stable. I'll allow exceptions to that, but not without push back.
> > > 
> > > In general, if there is disagreement about whether stability is
> > > required, then it is required. See the recent sunxi discussion. That's
> > > more between users and platform maintainers though.
> > 
> > Do you realize that this all DT binding stuff is today the *biggest* to
> > getting HW support in the Linux kernel? It has become more complicated
> > to merge a 4 properties DT binding than to merge multiple thousands of
> > lines of driver code. 
> 
> As times have changed, pain points have moved around.
> 
> To some extent that is unavoidable; more up-front effort is required
> where crutches we previously relied on are not applicable.
> 
> Elsewhere we can certainly do better.
> 
> Throwing your hands up and stating "this is unstable, it might change"
> is a crutch. It prevents any real solution to the pain points you
> encounter, and creates pain points for others. It only provides the
> _illusion_ of support.

https://kernelci.org/soc/mvebu/
https://kernelci.org/soc/sunxi/

I guess we should become magicians.

Maxime
Thomas Petazzoni Feb. 26, 2016, 8:55 a.m. UTC | #11
Hello Mark,

On Thu, 25 Feb 2016 17:56:04 +0000, Mark Rutland wrote:

> > Hence, even when the support for a SoC is being done in collaboration
> > with the SoC vendor, we don't always have a nice full datasheet that
> > tells us what all the registers are doing and how they are organized.
> > We discover things as we go.
> 
> So as and when you believe you need to break a binding, we can have a
> discussion on that case-by-case basis.
> 
> That's different to stating that the usual rules don't apply at all,
> which is black-and-white stand I'm referring to.

This would at least be a somewhat reasonable compromise, if that's your
preference.

However, my proposal had the advantage that potential users of the
platform are *aware* that the DT bindings are in-flux and that they may
change.

With your proposal, there is no such statement, even if we reserve the
right, on case-by-case basis to have a discussion about potentially
breaking the DT bindings.

But if that's OK for you, I'm happy with that as well.

> > Instead, what we want is to submit the basic stuff early, and then
> > progressively build on top of this basic stuff by merging more and more
> > features. This way:
> > 
> >  * We don't have to pile up hundreds of out of tree patches;
> 
> Then _support_ what you have upstreamed already. Either it's ready to be
> supported or it is not. If it is, there's no need for DT breakage, and
> if it isn't, then it's not ready for mainline. That's all this boils
> down to. 
> 
> If it must go in, then it must be supported.

Supporting something has absolutely nothing to do with keeping the DT
bindings unchanged. Supporting something means that I'll watch for
build failures and fix them, I'll watch for bug reports and act on
them, I'll watch on subsystem/infrastructure changes and change my
driver code accordingly. This is what supporting means.

> >  * We have a support in the mainline kernel that is progressively
> >    getting better as we enable more and more feature. We can show that
> >    4.6 has this small set of features, 4.7 has this slightly extended
> >    set of features supported, and so on.
> 
> It's not "progressively getting better", if DTBs are arbitrarily getting
> broken.
> 
> It's possible to incrementally add support without having to break
> existing DTBs.

Except if:

 1/ Your final production chip has some differences from your test chip
    which do affect the DT representation. The kernel community has
    asked for years that HW vendors start submitting code as early as
    possible, and they are now doing it. In order to do it as soon as
    possible, they start doing it based on test chips, which may be
    slightly different from the final chips. The kernel community
    should then understand this and be a bit flexible in that the
    kernel support might change a bit, as the HW might change a bit
    before it reaches production.

 2/ You have no documentation for your HW, and you're simply
    discovering how it works progressively. Look at all the folks who
    write nice upstream code solely based on crappy vendor BSPs, with
    no datasheet whatsoever. Do you think they can sanely have a good
    overall understanding of the HW from day 1 ? Certainly not. They
    will discover that such or such HW block is not only a timer, but
    also a watchdog. That this other HW block not only contains
    clocks, but reset and pinctrl lines.

I don't like to self-promote my own stuff, but I think you should read
the slides of my talk "Device Tree as a stable ABI: a fairy
tale" (http://free-electrons.com/pub/conferences/2015/elc/petazzoni-dt-as-stable-abi-fairy-tale/petazzoni-dt-as-stable-abi-fairy-tale.pdf).
I will clearly not suggest you to watch the video recording, as I don't
want you to suffer from one hour of my terrible English accent.

> > We are perfectly fine with maintaining *code*. And we have been doing
> > so for several years on Marvell platforms: caring about older
> > platforms, converting old legacy code and legacy platforms to the
> > Device Tree, etc.
> 
> The parsing and handling of the existing binding is code, which you are
> arguing you do not need to support.

So you like to have tons and tons of essentially dead code to parse old
versions of DT bindings?

What really strikes me here is that the Linux kernel always had a
version strong position, detailed in
Documentation/stable_api_nonsense.txt, that the kernel developers don't
want to maintain a stable API for kernel modules, because keeping old
APIs around means keeping old code, that isn't tested, confuses people,
etc. Read the file if you've never done so.

And now, for something pretty much as complicated (if not more) as the
kernel module API, the DT bindings, you want to do *exactly* what this
stable_api_nonsense.txt says we shouldn't do ?

> > What we don't want to commit to is to maintain the DT stability
> > *before* the support for a given device is sufficiently stable/sane.
> 
> I do not follow what you mean by support being *not sane*. If it isn't
> sane, why was it merged?
> 
> If it was previously considered sane, and the code was considered
> supportable, why is it necessary to break a DT that it supported?

See above: the HW has changed, or the HW is not working as it was
originally understood when the DT binding was initially designed.

> > Why are you talking about "illusion" of support ? Sorry, but with
> > unstable DT bindings, as long as you use the DT that comes with the
> > kernel sources, everything works perfectly fine, and is perfectly
> > supported.
> 
> A few examples:
> 
> * Distro installation (in the absence of a stable DT and a consistent
>   boot environment, you have no idea how to get a kernel up and
>   running). Either your user has to be intimately familiar with the
>   platform, or the distro needs a separate release for each and every
>   platform.

Irrelevant: the distro shall just ship with all the Device Trees for
all supported platforms, and that's it. Exactly like the x86 kernels
shipped by distros have essentially all kernel drivers enabled in order
to be able to have a kernel that works on all platforms.

> * Distro maintenance. I just upgraded to a point-release kernel and now
>   have to pull down the DTBs. That required the distro to _somehow_
>   identify the relevant DTBs, how to install this in the correct
>   location such that the firmware and/or bootloader can pick it up, etc.

Same, irrelevant: the DTB comes with the distro kernel package, and like
kernel modules, they are installed in a per-kernel version directory.

> * Other developers tracking down problems that inevitable crop up on the
>   platform, who now have to jump through the hoops above. As the DTBs
>   change, so may the user-facing elements and/or kernel behaviour, so
>   bisecting is extremely painful (though admittedly possible).

Exactly like they have to change their kernel modules so that they are
compatible with the kernel version they run.

> * Portable hypervisors, bootloaders, or other system-level software
>   can't add any (DTB-aware) support for a platform until the kernel
>   support is considered "sane" per your argument above. That time is an
>   arbitrary unknown in the future (which may be at infinity), so they
>   are either indefinitely delayed or end up having attempt to have
>   stable support for a variety of unstable bindings anyway, which may or
>   may not be possible depending on how bindings got broken.

Those hypervisors, bootloaders or system-level should not hardcode the
platform DTB, but simply use the one that is provided together with the
kernel.

Best regards,

Thomas
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt b/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt
index df98a9c..758b3ae 100644
--- a/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt
+++ b/Documentation/devicetree/bindings/arm/marvell/armada-7k-8k.txt
@@ -1,6 +1,18 @@ 
 Marvell Armada 7K/8K Platforms Device Tree Bindings
 ---------------------------------------------------
 
+Work in progress statement:
+
+Device tree files and bindings applying to Marvell Armada 7K and 8K
+SoCs and boards are considered "unstable". Any Marvell Armada 7K/8K
+device tree binding may change at any time. Be sure to use a device
+tree binary and a kernel image generated from the same source tree.
+
+Please refer to Documentation/devicetree/bindings/ABI.txt for a
+definition of a stable binding/ABI.
+
+---------------------------------------------------------------
+
 Boards using a SoC of the Marvell Armada 7K or 8K families must carry
 the following root node property: