diff mbox series

[RFC,3/6] dt-bindings: axi-fan-control: add tacho properties

Message ID 20210708120111.519444-4-nuno.sa@analog.com (mailing list archive)
State Changes Requested
Headers show
Series AXI FAN new features and improvements | expand

Commit Message

Nuno Sa July 8, 2021, 12:01 p.m. UTC
Add the bindings for the tacho signal evaluation parameters which depend
on the FAN being used.

Signed-off-by: Nuno Sá <nuno.sa@analog.com>
---
 .../bindings/hwmon/adi,axi-fan-control.yaml          | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Rob Herring July 12, 2021, 5:26 p.m. UTC | #1
On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote:
> Add the bindings for the tacho signal evaluation parameters which depend
> on the FAN being used.
> 
> Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> ---
>  .../bindings/hwmon/adi,axi-fan-control.yaml          | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml
> index 6747b870f297..0481eb34d9f1 100644
> --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml
> +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml
> @@ -37,6 +37,18 @@ properties:
>      $ref: /schemas/types.yaml#/definitions/uint32
>      enum: [1, 2, 4]
>  
> +  adi,tacho-25-us:
> +    description: Expected tacho signal when the PWM is at 25%.
> +
> +  adi,tacho-50-us:
> +    description: Expected tacho signal when the PWM is at 50%.
> +
> +  adi,tacho-75-us:
> +    description: Expected tacho signal when the PWM is at 75%.
> +
> +  adi,tacho-100-us:
> +    description: Expected tacho signal when the PWM is at 100%.

This looks like it should be common. But having PWM percents in the 
property names doesn't scale. This is also a property of the fan, not 
the fan controller.

There's only so many ways a fan can be controlled and I'm going to keep 
telling folks to make a common fan binding. There's some start to it, 
but it needs some work.

Rob
Nuno Sa July 15, 2021, 10:26 a.m. UTC | #2
> From: Rob Herring <robh@kernel.org>
> Sent: Monday, July 12, 2021 7:27 PM
> To: Sa, Nuno <Nuno.Sa@analog.com>
> Cc: linux-hwmon@vger.kernel.org; devicetree@vger.kernel.org;
> Guenter Roeck <linux@roeck-us.net>; Jean Delvare
> <jdelvare@suse.com>
> Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho
> properties
> 
> [External]
> 
> On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote:
> > Add the bindings for the tacho signal evaluation parameters which
> depend
> > on the FAN being used.
> >
> > Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> > ---
> >  .../bindings/hwmon/adi,axi-fan-control.yaml          | 12
> ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi-
> fan-control.yaml
> > index 6747b870f297..0481eb34d9f1 100644
> > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> control.yaml
> > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> control.yaml
> > @@ -37,6 +37,18 @@ properties:
> >      $ref: /schemas/types.yaml#/definitions/uint32
> >      enum: [1, 2, 4]
> >
> > +  adi,tacho-25-us:
> > +    description: Expected tacho signal when the PWM is at 25%.
> > +
> > +  adi,tacho-50-us:
> > +    description: Expected tacho signal when the PWM is at 50%.
> > +
> > +  adi,tacho-75-us:
> > +    description: Expected tacho signal when the PWM is at 75%.
> > +
> > +  adi,tacho-100-us:
> > +    description: Expected tacho signal when the PWM is at 100%.
> 
> This looks like it should be common. But having PWM percents in the
> property names doesn't scale. This is also a property of the fan, not
> the fan controller.

Yes, I see that these parameters are definitely a property of the attached
fan but the evaluation of these timings are very specific to this controller
(I think). The way it works is that the HW can fully operate without any
runtime SW configuration. In this case, it will use the values in these
registers to evaluate the tacho signal coming from the FAN. And the HW
really uses the evaluation points like this (0, 25%, 50% and 100%). It has
some predefined values that work for the FAN that was used to develop
the IP but naturally the evaluation will fail as soon as other FAN is attached
(resulting in fan fault interrupts). And yes, writing these parameters is 
already SW configuration but what I mean with "runtime" is after probe :). 

So, I honestly do not know how we could name this better... Maybe a
'tacho-eval-points-us' array? The question would be the min/max
elements? Do you have any suggestion for a more generic property?

> There's only so many ways a fan can be controlled and I'm going to
> keep
> telling folks to make a common fan binding. There's some start to it,
> but it needs some work.

You mean the pwm-fan.txt? I gave a look to the driver and I don't think
it fully fits this controller. I'm ok in sending an fan.yaml binding if you
prefer it but I'm just not sure what we would need there... Maybe
pulses-per-revolution and the above property
(whatever the decided name) would be a starting point? 

- Nuno Sá
Guenter Roeck July 15, 2021, 8:39 p.m. UTC | #3
On Thu, Jul 15, 2021 at 10:26:05AM +0000, Sa, Nuno wrote:
> > From: Rob Herring <robh@kernel.org>
> > Sent: Monday, July 12, 2021 7:27 PM
> > To: Sa, Nuno <Nuno.Sa@analog.com>
> > Cc: linux-hwmon@vger.kernel.org; devicetree@vger.kernel.org;
> > Guenter Roeck <linux@roeck-us.net>; Jean Delvare
> > <jdelvare@suse.com>
> > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho
> > properties
> > 
> > [External]
> > 
> > On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote:
> > > Add the bindings for the tacho signal evaluation parameters which
> > depend
> > > on the FAN being used.
> > >
> > > Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> > > ---
> > >  .../bindings/hwmon/adi,axi-fan-control.yaml          | 12
> > ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> > control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi-
> > fan-control.yaml
> > > index 6747b870f297..0481eb34d9f1 100644
> > > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> > control.yaml
> > > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> > control.yaml
> > > @@ -37,6 +37,18 @@ properties:
> > >      $ref: /schemas/types.yaml#/definitions/uint32
> > >      enum: [1, 2, 4]
> > >
> > > +  adi,tacho-25-us:
> > > +    description: Expected tacho signal when the PWM is at 25%.
> > > +
> > > +  adi,tacho-50-us:
> > > +    description: Expected tacho signal when the PWM is at 50%.
> > > +
> > > +  adi,tacho-75-us:
> > > +    description: Expected tacho signal when the PWM is at 75%.
> > > +
> > > +  adi,tacho-100-us:
> > > +    description: Expected tacho signal when the PWM is at 100%.
> > 
> > This looks like it should be common. But having PWM percents in the
> > property names doesn't scale. This is also a property of the fan, not
> > the fan controller.
> 
> Yes, I see that these parameters are definitely a property of the attached
> fan but the evaluation of these timings are very specific to this controller
> (I think). The way it works is that the HW can fully operate without any
> runtime SW configuration. In this case, it will use the values in these
> registers to evaluate the tacho signal coming from the FAN. And the HW
> really uses the evaluation points like this (0, 25%, 50% and 100%). It has
> some predefined values that work for the FAN that was used to develop
> the IP but naturally the evaluation will fail as soon as other FAN is attached
> (resulting in fan fault interrupts). And yes, writing these parameters is 
> already SW configuration but what I mean with "runtime" is after probe :). 
> 

Are you sure you can ever get this stable ? Each fan has its own properties
and tolerances. If you replace a fan in a given system, you might get
different RPM numbers. The RPM will differ widely from system to system
and from fan to fan. Anything that assumes a specific RPM in devicetree
data seems to be quite vulnerable to failures. I have experienced that
recently with a different chip which also tries to correlate RPM and PWM
and fails quite miserably.

In my experience, anything other than minimum fan speed is really a recipe
for instability and sporadic false failures. Even setting a minimum fan speed
is tricky because it depends a lot on the fan.

> So, I honestly do not know how we could name this better... Maybe a
> 'tacho-eval-points-us' array? The question would be the min/max
> elements? Do you have any suggestion for a more generic property?
> 
I am guessing that the "us" refers to the time between pulses from the
fan. I think this is a bad value to start with - anything fan speed
related should really be expressed in RPM, not in time between pulses.

Overall I don't think this should be handled as generic set of properties.
Whatever we come up with as standard set of pwm or fan related properties
should not be an expected correlation between pwm and rpm. Assuming such
a property is needed here (after all, the controller is what it is),
maybe a set of tuples makes sense, such as

	adi,pwm-rpm-map = <
		25, 250,
		50, 500,
		75, 750,
		100, 1000
	>;

though I think that each of those should also include the tolerance
instead of just assuming that a 25% tolerance (as implemented in patch
2/6) would work for all fans.

Guenter

> > There's only so many ways a fan can be controlled and I'm going to
> > keep
> > telling folks to make a common fan binding. There's some start to it,
> > but it needs some work.
> 
> You mean the pwm-fan.txt? I gave a look to the driver and I don't think
> it fully fits this controller. I'm ok in sending an fan.yaml binding if you
> prefer it but I'm just not sure what we would need there... Maybe
> pulses-per-revolution and the above property
> (whatever the decided name) would be a starting point? 
> 
> - Nuno Sá
Nuno Sa July 16, 2021, 7:44 a.m. UTC | #4
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter
> Roeck
> Sent: Thursday, July 15, 2021 10:40 PM
> To: Sa, Nuno <Nuno.Sa@analog.com>
> Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org;
> devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com>
> Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho
> properties
> 
> On Thu, Jul 15, 2021 at 10:26:05AM +0000, Sa, Nuno wrote:
> > > From: Rob Herring <robh@kernel.org>
> > > Sent: Monday, July 12, 2021 7:27 PM
> > > To: Sa, Nuno <Nuno.Sa@analog.com>
> > > Cc: linux-hwmon@vger.kernel.org; devicetree@vger.kernel.org;
> > > Guenter Roeck <linux@roeck-us.net>; Jean Delvare
> > > <jdelvare@suse.com>
> > > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add
> tacho
> > > properties
> > >
> > > [External]
> > >
> > > On Thu, Jul 08, 2021 at 02:01:08PM +0200, Nuno Sá wrote:
> > > > Add the bindings for the tacho signal evaluation parameters
> which
> > > depend
> > > > on the FAN being used.
> > > >
> > > > Signed-off-by: Nuno Sá <nuno.sa@analog.com>
> > > > ---
> > > >  .../bindings/hwmon/adi,axi-fan-control.yaml          | 12
> > > ++++++++++++
> > > >  1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-
> fan-
> > > control.yaml
> b/Documentation/devicetree/bindings/hwmon/adi,axi-
> > > fan-control.yaml
> > > > index 6747b870f297..0481eb34d9f1 100644
> > > > --- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> > > control.yaml
> > > > +++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-
> > > control.yaml
> > > > @@ -37,6 +37,18 @@ properties:
> > > >      $ref: /schemas/types.yaml#/definitions/uint32
> > > >      enum: [1, 2, 4]
> > > >
> > > > +  adi,tacho-25-us:
> > > > +    description: Expected tacho signal when the PWM is at 25%.
> > > > +
> > > > +  adi,tacho-50-us:
> > > > +    description: Expected tacho signal when the PWM is at 50%.
> > > > +
> > > > +  adi,tacho-75-us:
> > > > +    description: Expected tacho signal when the PWM is at 75%.
> > > > +
> > > > +  adi,tacho-100-us:
> > > > +    description: Expected tacho signal when the PWM is at 100%.
> > >
> > > This looks like it should be common. But having PWM percents in
> the
> > > property names doesn't scale. This is also a property of the fan, not
> > > the fan controller.
> >
> > Yes, I see that these parameters are definitely a property of the
> attached
> > fan but the evaluation of these timings are very specific to this
> controller
> > (I think). The way it works is that the HW can fully operate without
> any
> > runtime SW configuration. In this case, it will use the values in these
> > registers to evaluate the tacho signal coming from the FAN. And the
> HW
> > really uses the evaluation points like this (0, 25%, 50% and 100%). It
> has
> > some predefined values that work for the FAN that was used to
> develop
> > the IP but naturally the evaluation will fail as soon as other FAN is
> attached
> > (resulting in fan fault interrupts). And yes, writing these parameters
> is
> > already SW configuration but what I mean with "runtime" is after
> probe :).
> >
> 
> Are you sure you can ever get this stable ? Each fan has its own
> properties
> and tolerances. If you replace a fan in a given system, you might get
> different RPM numbers. The RPM will differ widely from system to
> system
> and from fan to fan. Anything that assumes a specific RPM in
> devicetree
> data seems to be quite vulnerable to failures. I have experienced that
> recently with a different chip which also tries to correlate RPM and
> PWM
> and fails quite miserably.
> 
> In my experience, anything other than minimum fan speed is really a
> recipe
> for instability and sporadic false failures. Even setting a minimum fan
> speed
> is tricky because it depends a lot on the fan.

I see what you mean. So, I had to go through this process when testing
this changes because the fan I'm using is different from the default one
used to develop and stablish the default values in the IP core. The core
provides you with a register which contains the tacho measurements in
clock cycles. You can read that for all the PWM points of interest
(with devmem2 for example) and make your own "calibration". I assume
that people have to go through this process before putting some values
in the devicetree. I'm aware this is not the neatest process but I guess it's
acceptable...

> > So, I honestly do not know how we could name this better... Maybe
> a
> > 'tacho-eval-points-us' array? The question would be the min/max
> > elements? Do you have any suggestion for a more generic property?
> >
> I am guessing that the "us" refers to the time between pulses from the
> fan. I think this is a bad value to start with - anything fan speed
> related should really be expressed in RPM, not in time between
> pulses.
> 
> Overall I don't think this should be handled as generic set of
> properties.
> Whatever we come up with as standard set of pwm or fan related
> properties
> should not be an expected correlation between pwm and rpm.
> Assuming such
> a property is needed here (after all, the controller is what it is),
> maybe a set of tuples makes sense, such as
> 
> 	adi,pwm-rpm-map = <
> 		25, 250,
> 		50, 500,
> 		75, 750,
> 		100, 1000
> 	>;
> 
> though I think that each of those should also include the tolerance
> instead of just assuming that a 25% tolerance (as implemented in patch
> 2/6) would work for all fans.

Yes, this makes sense thanks. As the HW default tolerance is 25% I was 
somehow attached to that. Since all these values are correlated it makes
complete sense to also give the tolerance here as it makes things more
flexible. The map is also a good tip, I just have to see if there is a nice
way to specify that the pwm column is constant...

Thanks!
- Nuno Sá
Guenter Roeck July 16, 2021, 3:03 p.m. UTC | #5
On 7/16/21 12:44 AM, Sa, Nuno wrote:
[ ... ]
>>
>> Are you sure you can ever get this stable ? Each fan has its own
>> properties
>> and tolerances. If you replace a fan in a given system, you might get
>> different RPM numbers. The RPM will differ widely from system to
>> system
>> and from fan to fan. Anything that assumes a specific RPM in
>> devicetree
>> data seems to be quite vulnerable to failures. I have experienced that
>> recently with a different chip which also tries to correlate RPM and
>> PWM
>> and fails quite miserably.
>>
>> In my experience, anything other than minimum fan speed is really a
>> recipe
>> for instability and sporadic false failures. Even setting a minimum fan
>> speed
>> is tricky because it depends a lot on the fan.
> 
> I see what you mean. So, I had to go through this process when testing
> this changes because the fan I'm using is different from the default one
> used to develop and stablish the default values in the IP core. The core

Exactly my point.

> provides you with a register which contains the tacho measurements in
> clock cycles. You can read that for all the PWM points of interest
> (with devmem2 for example) and make your own "calibration". I assume
> that people have to go through this process before putting some values
> in the devicetree. I'm aware this is not the neatest process but I guess it's
> acceptable...
> 

Do you really expect everyone using a system with this chip to go through
this process and update its devicetree configuration, and then repeat it
whenever a fan is changed ? Given how dynamic this is, I really wonder
if that information should be in devicetree in the first place.

Guenter
Nuno Sa July 19, 2021, 7:46 a.m. UTC | #6
> -----Original Message-----
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter
> Roeck
> Sent: Friday, July 16, 2021 5:04 PM
> To: Sa, Nuno <Nuno.Sa@analog.com>
> Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org;
> devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com>
> Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho
> properties
> 
> [External]
> 
> On 7/16/21 12:44 AM, Sa, Nuno wrote:
> [ ... ]
> >>
> >> Are you sure you can ever get this stable ? Each fan has its own
> >> properties
> >> and tolerances. If you replace a fan in a given system, you might get
> >> different RPM numbers. The RPM will differ widely from system to
> >> system
> >> and from fan to fan. Anything that assumes a specific RPM in
> >> devicetree
> >> data seems to be quite vulnerable to failures. I have experienced
> that
> >> recently with a different chip which also tries to correlate RPM and
> >> PWM
> >> and fails quite miserably.
> >>
> >> In my experience, anything other than minimum fan speed is really
> a
> >> recipe
> >> for instability and sporadic false failures. Even setting a minimum
> fan
> >> speed
> >> is tricky because it depends a lot on the fan.
> >
> > I see what you mean. So, I had to go through this process when
> testing
> > this changes because the fan I'm using is different from the default
> one
> > used to develop and stablish the default values in the IP core. The
> core
> 
> Exactly my point.
> 
> > provides you with a register which contains the tacho measurements
> in
> > clock cycles. You can read that for all the PWM points of interest
> > (with devmem2 for example) and make your own "calibration". I
> assume
> > that people have to go through this process before putting some
> values
> > in the devicetree. I'm aware this is not the neatest process but I
> guess it's
> > acceptable...
> >
> 
> Do you really expect everyone using a system with this chip to go
> through
> this process and update its devicetree configuration, and then repeat it
> whenever a fan is changed ? Given how dynamic this is, I really wonder
> if that information should be in devicetree in the first place.
> 

My naive assumption was that we would only do this work at evaluation
time. After that and after we settled with a fan for some system, I expected
that changing to a different fan is not that likely. From your inputs, I guess
this is not really the case which makes this process more cumbersome (as it
also implies recompiling the devicetree for your system).

However, even if we export these as runtime parameters, services/daemons
will also have an hard time doing this "calibration" in a dynamic way. The reason
is because the way the controller works is that it only accepts a new PWM
request if it is an higher value than whatever the HW has at that moment. Thus,
going through the calibration points might be very cumbersome. I can see some
ways of handling this though but not very neat...

Since this is a FPGA core, we might have some flexibility here. Something that
came to my mind would be to have a calibration mode in the HW that would
allow us to freely control the PWM values. In that way we could go freely over
the calibration points. I guess, for safety reasons, this calibration mode would
expire after some reasonable time (that give us enough time for doing the whole
thing). The best place for doing the calibration, I guess it would be directly in the
driver since we do receive the interrupts about new tacho measurements making
things easier to sync and handle. However, given the time that takes for a new
PWM to settle + new tacho measurements, it would not be very acceptable to do this 
during probe which is definitely also not ideal (we could defer this to a worker/timer).

I'm not sure if the above makes much sense to you and it also depends on the HW
guys being on board with this mechanism.

- Nuno Sá
Guenter Roeck July 21, 2021, 3 p.m. UTC | #7
On Mon, Jul 19, 2021 at 07:46:41AM +0000, Sa, Nuno wrote:
> 
> 
> > -----Original Message-----
> > From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter
> > Roeck
> > Sent: Friday, July 16, 2021 5:04 PM
> > To: Sa, Nuno <Nuno.Sa@analog.com>
> > Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org;
> > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com>
> > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho
> > properties
> > 
> > [External]
> > 
> > On 7/16/21 12:44 AM, Sa, Nuno wrote:
> > [ ... ]
> > >>
> > >> Are you sure you can ever get this stable ? Each fan has its own
> > >> properties
> > >> and tolerances. If you replace a fan in a given system, you might get
> > >> different RPM numbers. The RPM will differ widely from system to
> > >> system
> > >> and from fan to fan. Anything that assumes a specific RPM in
> > >> devicetree
> > >> data seems to be quite vulnerable to failures. I have experienced
> > that
> > >> recently with a different chip which also tries to correlate RPM and
> > >> PWM
> > >> and fails quite miserably.
> > >>
> > >> In my experience, anything other than minimum fan speed is really
> > a
> > >> recipe
> > >> for instability and sporadic false failures. Even setting a minimum
> > fan
> > >> speed
> > >> is tricky because it depends a lot on the fan.
> > >
> > > I see what you mean. So, I had to go through this process when
> > testing
> > > this changes because the fan I'm using is different from the default
> > one
> > > used to develop and stablish the default values in the IP core. The
> > core
> > 
> > Exactly my point.
> > 
> > > provides you with a register which contains the tacho measurements
> > in
> > > clock cycles. You can read that for all the PWM points of interest
> > > (with devmem2 for example) and make your own "calibration". I
> > assume
> > > that people have to go through this process before putting some
> > values
> > > in the devicetree. I'm aware this is not the neatest process but I
> > guess it's
> > > acceptable...
> > >
> > 
> > Do you really expect everyone using a system with this chip to go
> > through
> > this process and update its devicetree configuration, and then repeat it
> > whenever a fan is changed ? Given how dynamic this is, I really wonder
> > if that information should be in devicetree in the first place.
> > 
> 
> My naive assumption was that we would only do this work at evaluation
> time. After that and after we settled with a fan for some system, I expected
> that changing to a different fan is not that likely. From your inputs, I guess
> this is not really the case which makes this process more cumbersome (as it
> also implies recompiling the devicetree for your system).
> 
> However, even if we export these as runtime parameters, services/daemons
> will also have an hard time doing this "calibration" in a dynamic way. The reason
> is because the way the controller works is that it only accepts a new PWM
> request if it is an higher value than whatever the HW has at that moment. Thus,
> going through the calibration points might be very cumbersome. I can see some
> ways of handling this though but not very neat...
> 
> Since this is a FPGA core, we might have some flexibility here. Something that
> came to my mind would be to have a calibration mode in the HW that would
> allow us to freely control the PWM values. In that way we could go freely over
> the calibration points. I guess, for safety reasons, this calibration mode would
> expire after some reasonable time (that give us enough time for doing the whole
> thing). The best place for doing the calibration, I guess it would be directly in the
> driver since we do receive the interrupts about new tacho measurements making
> things easier to sync and handle. However, given the time that takes for a new
> PWM to settle + new tacho measurements, it would not be very acceptable to do this 
> during probe which is definitely also not ideal (we could defer this to a worker/timer).
> 
> I'm not sure if the above makes much sense to you and it also depends on the HW
> guys being on board with this mechanism.
> 

I don't really know what to say or recommend here. Personally I think any
attempt to tie PWM values to RPM are doomed to fail. Here are a couple of
examples:

Take your test system and move the fan to a restricted place (eg close to a
wall). You'll see the fan RPM change, potentially significantly. Put it into
some place with airflow towards or away from the system (eg blow air into
the system from another place, which may happen if the system is installed
in a lab), and again you'll see fan speed changes. Open the chassis, and
the fan speed will change. I have seen fan speeds vary by up to 50% when
changing airflow.

That doesn't even take into account that replacing a fan even with a similar
model (eg after a fan failed) will likely result in potentially significant
rpm changes.

Ultimately, anything that does more than determine if a fan is still running
is potentially unstable.

Having said all that, it is really your call to decide how you want to detect
fan failures. 

Thanks,
Guenter
Nuno Sa July 22, 2021, 1 p.m. UTC | #8
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter
> Roeck
> Sent: Wednesday, July 21, 2021 5:00 PM
> To: Sa, Nuno <Nuno.Sa@analog.com>
> Cc: Rob Herring <robh@kernel.org>; linux-hwmon@vger.kernel.org;
> devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com>
> Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add tacho
> properties
> 
> 
> On Mon, Jul 19, 2021 at 07:46:41AM +0000, Sa, Nuno wrote:
> >
> >
> > > -----Original Message-----
> > > From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter
> > > Roeck
> > > Sent: Friday, July 16, 2021 5:04 PM
> > > To: Sa, Nuno <Nuno.Sa@analog.com>
> > > Cc: Rob Herring <robh@kernel.org>; linux-
> hwmon@vger.kernel.org;
> > > devicetree@vger.kernel.org; Jean Delvare <jdelvare@suse.com>
> > > Subject: Re: [RFC PATCH 3/6] dt-bindings: axi-fan-control: add
> tacho
> > > properties
> > >
> > > [External]
> > >
> > > On 7/16/21 12:44 AM, Sa, Nuno wrote:
> > > [ ... ]
> > > >>
> > > >> Are you sure you can ever get this stable ? Each fan has its own
> > > >> properties
> > > >> and tolerances. If you replace a fan in a given system, you might
> get
> > > >> different RPM numbers. The RPM will differ widely from system
> to
> > > >> system
> > > >> and from fan to fan. Anything that assumes a specific RPM in
> > > >> devicetree
> > > >> data seems to be quite vulnerable to failures. I have
> experienced
> > > that
> > > >> recently with a different chip which also tries to correlate RPM
> and
> > > >> PWM
> > > >> and fails quite miserably.
> > > >>
> > > >> In my experience, anything other than minimum fan speed is
> really
> > > a
> > > >> recipe
> > > >> for instability and sporadic false failures. Even setting a
> minimum
> > > fan
> > > >> speed
> > > >> is tricky because it depends a lot on the fan.
> > > >
> > > > I see what you mean. So, I had to go through this process when
> > > testing
> > > > this changes because the fan I'm using is different from the
> default
> > > one
> > > > used to develop and stablish the default values in the IP core.
> The
> > > core
> > >
> > > Exactly my point.
> > >
> > > > provides you with a register which contains the tacho
> measurements
> > > in
> > > > clock cycles. You can read that for all the PWM points of interest
> > > > (with devmem2 for example) and make your own "calibration". I
> > > assume
> > > > that people have to go through this process before putting some
> > > values
> > > > in the devicetree. I'm aware this is not the neatest process but I
> > > guess it's
> > > > acceptable...
> > > >
> > >
> > > Do you really expect everyone using a system with this chip to go
> > > through
> > > this process and update its devicetree configuration, and then
> repeat it
> > > whenever a fan is changed ? Given how dynamic this is, I really
> wonder
> > > if that information should be in devicetree in the first place.
> > >
> >
> > My naive assumption was that we would only do this work at
> evaluation
> > time. After that and after we settled with a fan for some system, I
> expected
> > that changing to a different fan is not that likely. From your inputs, I
> guess
> > this is not really the case which makes this process more
> cumbersome (as it
> > also implies recompiling the devicetree for your system).
> >
> > However, even if we export these as runtime parameters,
> services/daemons
> > will also have an hard time doing this "calibration" in a dynamic way.
> The reason
> > is because the way the controller works is that it only accepts a new
> PWM
> > request if it is an higher value than whatever the HW has at that
> moment. Thus,
> > going through the calibration points might be very cumbersome. I
> can see some
> > ways of handling this though but not very neat...
> >
> > Since this is a FPGA core, we might have some flexibility here.
> Something that
> > came to my mind would be to have a calibration mode in the HW that
> would
> > allow us to freely control the PWM values. In that way we could go
> freely over
> > the calibration points. I guess, for safety reasons, this calibration
> mode would
> > expire after some reasonable time (that give us enough time for
> doing the whole
> > thing). The best place for doing the calibration, I guess it would be
> directly in the
> > driver since we do receive the interrupts about new tacho
> measurements making
> > things easier to sync and handle. However, given the time that takes
> for a new
> > PWM to settle + new tacho measurements, it would not be very
> acceptable to do this
> > during probe which is definitely also not ideal (we could defer this to
> a worker/timer).
> >
> > I'm not sure if the above makes much sense to you and it also
> depends on the HW
> > guys being on board with this mechanism.
> >
> 
> I don't really know what to say or recommend here. Personally I think
> any
> attempt to tie PWM values to RPM are doomed to fail. Here are a
> couple of
> examples:
> 
> Take your test system and move the fan to a restricted place (eg close
> to a
> wall). You'll see the fan RPM change, potentially significantly. Put it into
> some place with airflow towards or away from the system (eg blow air
> into
> the system from another place, which may happen if the system is
> installed
> in a lab), and again you'll see fan speed changes. Open the chassis, and
> the fan speed will change. I have seen fan speeds vary by up to 50%
> when
> changing airflow.

Here we can at least control the tolerance for each PWM vs RPM point but
I can image this as a very painful process to get these values right and no
one will think in setting tolerances of 50%...

> That doesn't even take into account that replacing a fan even with a
> similar
> model (eg after a fan failed) will likely result in potentially significant
> rpm changes.
> 
> Ultimately, anything that does more than determine if a fan is still
> running
> is potentially unstable.

Yeah, I understand your points. The HW does the evaluation and of 
course it also looks for the presence of a signal... So, in your opinion,
not even setting a minimum fan speed is likely to be stable?

> Having said all that, it is really your call to decide how you want to
> detect
> fan failures.
> 

Well, my hands are also tied here. The core is supposed to work without
any SW interaction in which case the tacho evaluation is always done. The
only thing I could do is to completely ignore fan faults which is also bad... 

I can try to persuade the HW guy to completely remove the evaluation and
just give fan fauts in case there's no signal but I'm not really sure he will go
for it. In that case, I'm tempted to just leave this as-is (with the extra bindings
for the tolerance and turn these bindings into a map) if you're willing to take it...
The reason is that, as you said, this is likely to be unstable any ways so that the
added complexity in the SW does not really pay off (better keep at least the SW
simple)...

- Nuno Sá
Guenter Roeck July 22, 2021, 3:23 p.m. UTC | #9
On 7/22/21 6:00 AM, Sa, Nuno wrote:
[ ... ]

>> I don't really know what to say or recommend here. Personally I think
>> any
>> attempt to tie PWM values to RPM are doomed to fail. Here are a
>> couple of
>> examples:
>>
>> Take your test system and move the fan to a restricted place (eg close
>> to a
>> wall). You'll see the fan RPM change, potentially significantly. Put it into
>> some place with airflow towards or away from the system (eg blow air
>> into
>> the system from another place, which may happen if the system is
>> installed
>> in a lab), and again you'll see fan speed changes. Open the chassis, and
>> the fan speed will change. I have seen fan speeds vary by up to 50%
>> when
>> changing airflow.
> 
> Here we can at least control the tolerance for each PWM vs RPM point but
> I can image this as a very painful process to get these values right and no
> one will think in setting tolerances of 50%...
> 
>> That doesn't even take into account that replacing a fan even with a
>> similar
>> model (eg after a fan failed) will likely result in potentially significant
>> rpm changes.
>>
>> Ultimately, anything that does more than determine if a fan is still
>> running
>> is potentially unstable.
> 
> Yeah, I understand your points. The HW does the evaluation and of
> course it also looks for the presence of a signal... So, in your opinion,
> not even setting a minimum fan speed is likely to be stable?
> 
Using the minimum fan speed as detection mechanism for fan failures is ok
and widely used. My concern is the desire to associate it with pwm values.

>> Having said all that, it is really your call to decide how you want to
>> detect
>> fan failures.
>>
> 
> Well, my hands are also tied here. The core is supposed to work without
> any SW interaction in which case the tacho evaluation is always done. The
> only thing I could do is to completely ignore fan faults which is also bad...
> 
As I said above, it would be perfectly fine to have a parameter that reflects
minimum fan speed (or, translated into chip speak, minimum number of pulses
per minute).

> I can try to persuade the HW guy to completely remove the evaluation and
> just give fan fauts in case there's no signal but I'm not really sure he will go
> for it. In that case, I'm tempted to just leave this as-is (with the extra bindings
> for the tolerance and turn these bindings into a map) if you're willing to take it...
> The reason is that, as you said, this is likely to be unstable any ways so that the
> added complexity in the SW does not really pay off (better keep at least the SW
> simple)...
> 
Sure, I'll take it, as long as you find a binding that is acceptable for Rob.
It is your funeral, after all :-).

Thanks,
Guenter
diff mbox series

Patch

diff --git a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml
index 6747b870f297..0481eb34d9f1 100644
--- a/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml
+++ b/Documentation/devicetree/bindings/hwmon/adi,axi-fan-control.yaml
@@ -37,6 +37,18 @@  properties:
     $ref: /schemas/types.yaml#/definitions/uint32
     enum: [1, 2, 4]
 
+  adi,tacho-25-us:
+    description: Expected tacho signal when the PWM is at 25%.
+
+  adi,tacho-50-us:
+    description: Expected tacho signal when the PWM is at 50%.
+
+  adi,tacho-75-us:
+    description: Expected tacho signal when the PWM is at 75%.
+
+  adi,tacho-100-us:
+    description: Expected tacho signal when the PWM is at 100%.
+
 required:
   - compatible
   - reg