[v2,3/6] PowerCap: Added to drivers build

Message ID 1380904616-17519-4-git-send-email-srinivas.pandruvada@linux.intel.com (mailing list archive)
State Changes Requested, archived
Commit Message

Srinivas Pandruvada Oct. 4, 2013, 4:36 p.m. UTC
Add powercap to drivers/Makefile and drivers/Kconfig so that it is included in the driver build.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/Kconfig  | 2 ++
 drivers/Makefile | 1 +
 2 files changed, 3 insertions(+)

Comments

Gene Heskett Oct. 4, 2013, 7:24 p.m. UTC | #1
On Friday 04 October 2013, Srinivas Pandruvada wrote:
>Added changes to Makefile and Kconfig to include in driver build.
>
>Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
>Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>---
> drivers/Kconfig  | 2 ++
> drivers/Makefile | 1 +
> 2 files changed, 3 insertions(+)
>
>diff --git a/drivers/Kconfig b/drivers/Kconfig
>index aa43b91..969e987 100644
>--- a/drivers/Kconfig
>+++ b/drivers/Kconfig
>@@ -166,4 +166,6 @@ source "drivers/reset/Kconfig"
>
> source "drivers/fmc/Kconfig"
>
>+source "drivers/powercap/Kconfig"
>+
> endmenu
>diff --git a/drivers/Makefile b/drivers/Makefile
>index ab93de8..34c1d55 100644
>--- a/drivers/Makefile
>+++ b/drivers/Makefile
>@@ -152,3 +152,4 @@ obj-$(CONFIG_VME_BUS)		+= vme/
> obj-$(CONFIG_IPACK_BUS)		+= ipack/
> obj-$(CONFIG_NTB)		+= ntb/
> obj-$(CONFIG_FMC)		+= fmc/
>+obj-$(CONFIG_POWERCAP)		+= powercap/

I would object to this whole premise if it is not under the absolute
control of the user's program. LinuxCNC, for instance, is intimately married
to a parport whose status for writes is absolutely stable from write to
write, and whose status may, in some cases, be read at sub-20-microsecond
intervals.  Anything which would impede this puts the life of the operator
of a 50+ ton piece of machinery in jeopardy, because its status is not
readable in as close to real time as possible, as may be required to
initiate a stop from some alarm condition before it has moved far enough to
injure, maim or kill. 50 thousandths of an inch of further movement while
moving a 70-ton milling machine's table at 150 inches a minute is not an
unreasonable expectation for us.
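
For scale, the figures above can be turned into a time budget. This is a minimal sketch, assuming the table keeps moving at a constant feed rate until the stop takes effect (no deceleration is modeled); the function name is illustrative only.

```python
def reaction_budget_s(stop_distance_in: float, feed_in_per_min: float) -> float:
    """Time in seconds for the table to travel stop_distance_in at a constant feed rate."""
    feed_in_per_s = feed_in_per_min / 60.0  # 150 in/min -> 2.5 in/s
    return stop_distance_in / feed_in_per_s


# 0.050 in of allowed travel at 150 in/min
budget = reaction_budget_s(0.050, 150.0)
print(f"{budget * 1000:.1f} ms")  # prints "20.0 ms"
```

Under that assumption, an alarm has to be seen and acted on within roughly 20 ms, which is about a thousand of the 20-microsecond port reads mentioned above.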

Cheers, Gene
Srinivas Pandruvada Oct. 4, 2013, 7:38 p.m. UTC | #2
On 10/04/2013 12:24 PM, Gene Heskett wrote:
> On Friday 04 October 2013, Srinivas Pandruvada wrote:
>> Added changes to Makefile and Kconfig to include in driver build.
>>
>> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>> drivers/Kconfig  | 2 ++
>> drivers/Makefile | 1 +
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/drivers/Kconfig b/drivers/Kconfig
>> index aa43b91..969e987 100644
>> --- a/drivers/Kconfig
>> +++ b/drivers/Kconfig
>> @@ -166,4 +166,6 @@ source "drivers/reset/Kconfig"
>>
>> source "drivers/fmc/Kconfig"
>>
>> +source "drivers/powercap/Kconfig"
>> +
>> endmenu
>> diff --git a/drivers/Makefile b/drivers/Makefile
>> index ab93de8..34c1d55 100644
>> --- a/drivers/Makefile
>> +++ b/drivers/Makefile
>> @@ -152,3 +152,4 @@ obj-$(CONFIG_VME_BUS)		+= vme/
>> obj-$(CONFIG_IPACK_BUS)		+= ipack/
>> obj-$(CONFIG_NTB)		+= ntb/
>> obj-$(CONFIG_FMC)		+= fmc/
>> +obj-$(CONFIG_POWERCAP)		+= powercap/
> I would object to this whole premise if it is not under the absolute
> control of the users program. Linuxcnc for instance is intimately married
> to a parport whose status for writes is absolutely stable from write to
> write, and whose status may be in some cases, read at sub 20 u-second
> intervals.  Anything which would imped this, puts the operator of a 50+ ton
> piece of machinery's life in jeopardy because its status is not readable in
> as close to real time as possible as it may be required to initiate a stop
> from some alarm condition before it has moved far enough to injure, maim or
> kill. 50 thousandths of an inch in further movement while moving a 70 ton
> milling machines table at 150 inches a minute is not an unreasonable
> expectation for us.
Sorry, I didn't understand. Are you pointing out a problem in this patch
or in the patch set in general?
This change adds the powercap directory to the kernel build. Is something
wrong with it, or is there another way to do that?
>
> Cheers, Gene

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Srinivas Pandruvada Oct. 4, 2013, 8:55 p.m. UTC | #3
On 10/04/2013 01:22 PM, Gene Heskett wrote:
> On Friday 04 October 2013, Srinivas Pandruvada wrote:
>> On 10/04/2013 12:24 PM, Gene Heskett wrote:
>>> On Friday 04 October 2013, Srinivas Pandruvada wrote:
>>>> Added changes to Makefile and Kconfig to include in driver build.
>>>>
>>>> Signed-off-by: Srinivas Pandruvada
>>>> <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jacob Pan
>>>> <jacob.jun.pan@linux.intel.com>
>>>> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> ---
>>>> drivers/Kconfig  | 2 ++
>>>> drivers/Makefile | 1 +
>>>> 2 files changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/Kconfig b/drivers/Kconfig
>>>> index aa43b91..969e987 100644
>>>> --- a/drivers/Kconfig
>>>> +++ b/drivers/Kconfig
>>>> @@ -166,4 +166,6 @@ source "drivers/reset/Kconfig"
>>>>
>>>> source "drivers/fmc/Kconfig"
>>>>
>>>> +source "drivers/powercap/Kconfig"
>>>> +
>>>> endmenu
>>>> diff --git a/drivers/Makefile b/drivers/Makefile
>>>> index ab93de8..34c1d55 100644
>>>> --- a/drivers/Makefile
>>>> +++ b/drivers/Makefile
>>>> @@ -152,3 +152,4 @@ obj-$(CONFIG_VME_BUS)		+= vme/
>>>> obj-$(CONFIG_IPACK_BUS)		+= ipack/
>>>> obj-$(CONFIG_NTB)		+= ntb/
>>>> obj-$(CONFIG_FMC)		+= fmc/
>>>> +obj-$(CONFIG_POWERCAP)		+= powercap/
>>> I would object to this whole premise if it is not under the absolute
>>> control of the users program. Linuxcnc for instance is intimately
>>> married to a parport whose status for writes is absolutely stable from
>>> write to write, and whose status may be in some cases, read at sub 20
>>> u-second intervals.  Anything which would imped this, puts the
>>> operator of a 50+ ton piece of machinery's life in jeopardy because
>>> its status is not readable in as close to real time as possible as it
>>> may be required to initiate a stop from some alarm condition before it
>>> has moved far enough to injure, maim or kill. 50 thousandths of an
>>> inch in further movement while moving a 70 ton milling machines table
>>> at 150 inches a minute is not an unreasonable expectation for us.
>> <Sorry, I didn't understand. Are you pointing any problem in this patch
>> or patch-set in general?
> Not that my relatively untrained eyes can spot.
>
>> This change added powercap directory to the kernel build. Is something
>> wrong with it or any other way to do that?
> The prospect of having a poorly configured way to power down a port that is
> running heavy machinery under real time control scares me.  And that is
> what this patch series seems to be leading up to if I am reading the patch
> headers correctly.  If I am not reading it correctly, then assume I am
> issuing a pre-emptive strike just to make sure you folks trying to save a
> milliwatt here and there, and there is not a thing wrong with the basic
> idea, are made aware of the potential for maiming mischief should you
> decide to power down a port just because its last access was 5 milliseconds
> ago.  Even a completely servo driven configuration will tickle it faster
> than that, however an e-stop condition, which might shut down a charge pump
> pulse generator must be maintained until cleared by the operator, which
> means the control channel to the machine, whatever port it is, must be kept
> alive to be human safe around the machine.  The capability to do that to a
> given port should therefore be made a kernel .config selection incapable of
> being overridden by some other perceived dependency in kconfig.
>
> I hope this is a better explanation. :)
The idea of power capping is to cap total power, not to power anything down,
and root-level access is needed to modify the limits.
Minimum and maximum values are also defined, which should make sure that
the system runs with reduced performance rather than powering down.
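
One way to picture the minimum/maximum bounds is as a clamp on whatever limit is requested. This is a minimal sketch, not the framework's code: the function name and the microwatt units are assumptions, and whether an out-of-range request is clamped or rejected is not stated in this thread.

```python
def clamp_power_limit_uw(requested_uw: int, min_uw: int, max_uw: int) -> int:
    """Return requested_uw limited to the range [min_uw, max_uw] (hypothetical helper)."""
    if min_uw > max_uw:
        raise ValueError("min_uw must not exceed max_uw")
    return max(min_uw, min(requested_uw, max_uw))


# A request of 0 uW is raised to the minimum instead of powering anything off.
print(clamp_power_limit_uw(0, 5_000_000, 15_000_000))  # prints 5000000
```
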
This patch does not affect Linux PM. Powering down a port is triggered by
the PM framework in coordination with the driver controlling the port.
If some port is capable of power capping and a client driver has been
implemented for it, such systems can disable power capping entirely by
setting CONFIG_POWERCAP=n.
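
Whether a given kernel was built with the option can be checked from its configuration file. This is a minimal sketch, assuming the usual .config text format ("CONFIG_FOO=y" for enabled symbols, "# CONFIG_FOO is not set" for disabled ones); the function name is illustrative only.

```python
def kconfig_state(config_text: str, symbol: str) -> str:
    """Return the value of symbol in a kernel .config text, or "n" if unset or absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {symbol} is not set":
            return "n"
        if line.startswith(f"{symbol}="):
            return line.split("=", 1)[1]
    return "n"


sample = "CONFIG_FMC=m\n# CONFIG_POWERCAP is not set\n"
print(kconfig_state(sample, "CONFIG_POWERCAP"))  # prints n
```
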
> And please, lets keep such discussion on the list where it belongs.
>
I am still replying to all to keep everyone on the mailing list in the loop.



Thanks,
Srinivas
Gene Heskett Oct. 4, 2013, 11:17 p.m. UTC | #4
On Friday 04 October 2013, Srinivas Pandruvada wrote:
>On 10/04/2013 01:22 PM, Gene Heskett wrote:

[and snipped]

>>> <Sorry, I didn't understand. Are you pointing any problem in this
>>> patch or patch-set in general?
>> 
>> Not that my relatively untrained eyes can spot.
>> 
>>> This change added powercap directory to the kernel build. Is something
>>> wrong with it or any other way to do that?
>> 
>> The prospect of having a poorly configured way to power down a port
>> that is running heavy machinery under real time control scares me. 
>> And that is what this patch series seems to be leading up to if I am
>> reading the patch headers correctly.  If I am not reading it
>> correctly, then assume I am issuing a pre-emptive strike just to make
>> sure you folks trying to save a milliwatt here and there, and there is
>> not a thing wrong with the basic idea, are made aware of the potential
>> for maiming mischief should you decide to power down a port just
>> because its last access was 5 milliseconds ago.  Even a completely
>> servo driven configuration will tickle it faster than that, however an
>> e-stop condition, which might shut down a charge pump pulse generator
>> must be maintained until cleared by the operator, which means the
>> control channel to the machine, whatever port it is, must be kept
>> alive to be human safe around the machine.  The capability to do that
>> to a given port should therefore be made a kernel .config selection
>> incapable of being overridden by some other perceived dependency in
>> kconfig.
>> 
>> I hope this is a better explanation. :)
>
>The idea of power capping is to cap total power not power down and also
>need root level access to modify.

No.  Restricting it to root control only is NOT an option.  There has to be
some mechanism whereby the user's non-root program can control it.  We don't
run this software as root, ever.  And the part of this software that needs
the parport (or PCI card access) is running on a CPU core that has been
isolated for its use by an isolcpus= statement, not visible to top or any
other system monitoring utility, so you would never know we are pounding on
that port, both reads and multiple writes, at least 3 times every 23
microseconds.  So you might see it as idle and turn it off.

>There is a minimum and maximum values is also defined, which should make
>sure that the system is
>running with reduced performance, not power down.

When cutting steel, or anything else for that matter, power used is not a 
consideration as long as there is enough, so in this case, reduced 
performance is not a viable option particularly if stepper motors are 
involved as they need a very steady heartbeat. And it's just as important in 
the cpu driving the machine as it is in the motors driving the machine.  
Right now, there are only about 4 or 5 motherboards on the planet that can 
do this job fairly well for stepper based systems.

2 of them are your Atom-powered boards.  One of them is the BeagleBone
Black, but we are still designing the I/O cape for that so it's not cutting
steel daily, yet.  Had you not discontinued the D-525-MW boards, the market
channel for those could easily absorb 10 or more per week for a nearly
indefinite period.  You accidentally made the nearly ideal board and it's
not a power hog.  The BIG reason? IRQ latency is 2 to 3 microseconds, and
there is nothing else x86 based that can touch that figure that we have
found so far.

When I say you, I am of course referring to Intel, since you are coming in 
from an Intel address. :)

>This patch is not affecting Linux PM. Power down a port trigger
>generated by PM framework in coordination
>with the driver controlling the port.

But what if you cannot detect that driver, and its port use, because it's
behind an isolcpus=0 statement on the kernel command line in grub?
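
A tool can at least tell that cores have been isolated by looking at the kernel command line. This is a minimal sketch with an illustrative function name; it assumes the isolcpus= value is a comma-separated list of CPU numbers and ranges, and it simply skips any non-numeric entries.

```python
from typing import Set


def parse_isolcpus(cmdline: str) -> Set[int]:
    """Return the CPU numbers named by isolcpus= on a kernel command line."""
    cpus: Set[int] = set()
    for token in cmdline.split():
        if not token.startswith("isolcpus="):
            continue
        for item in token.split("=", 1)[1].split(","):
            if "-" in item:
                lo, hi = item.split("-", 1)
                if lo.isdigit() and hi.isdigit():
                    cpus.update(range(int(lo), int(hi) + 1))
            elif item.isdigit():
                cpus.add(int(item))
    return cpus


print(parse_isolcpus("BOOT_IMAGE=/vmlinuz root=/dev/sda1 isolcpus=0"))  # prints {0}
```

On a running system the input would typically come from /proc/cmdline. This only shows which cores are isolated; it says nothing about how busy they are.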

This is what scares me spitless.  It could get somebody injured/killed, or 
wreck a $500,000 machine.

>If some port is capable of powercapping and implemented a client driver
>for this, they can disable the whole
>power capping by setting CONFIG_POWERCAP=n for such systems.

Good.

>> And please, lets keep such discussion on the list where it belongs.

I clicked on reply to list and got no To: line contents at all.  KMail does
that to me when it's a PM. Only been using it since '98, I really should
learn to use it right someday. :)

>I am still doing reply to all to keep everyone in the mailing list in
>loop.

So am I, now.
>
>
>
>Thanks,
>Srinivas


Cheers, Gene
Arjan van de Ven Oct. 6, 2013, 3:50 p.m. UTC | #5
On 10/4/2013 4:17 PM, Gene Heskett wrote:
>>> I hope this is a better explanation. :)
>>
>> The idea of power capping is to cap total power not power down and also
>> need root level access to modify.
>
> No.  Restricting it to root control only is NOT an option.  There has to be
> some mechanism whereby the users non-root program can control it.  We don't
> run this software as root, ever.  And the part of this software that needs
> the parport (or a pci card access) is running on a cpu core that has been
> isolated for its use by an isocpus= statement, not visible to top or any
> other system monitoring utility, so you would never know we are pounding on
> that port, both reads and multiple writes, at least 3 times every 23
> microseconds.  So you might see it as idle and turn it off.

I understand that you do not want to see powercapping in effect.
I think I mostly understand the realtime angle you're coming from as well.

However, powercapping is not done for energy savings, it is done for SURVIVAL.
It is not something optional that you can just turn off and ignore;
if you ignore it... something either has a thermal meltdown or trips a circuit breaker... or
in the case of a laptop/tablet kind of shape, you give the user burn blisters.

(the thermal meltdown effect can be either damage to the system or a hard reset done by a hardware safety
mechanism.. neither is what you want for your realtime workload)

The solution to not use powercapping in combination with realtime is to make sure there
is ample cooling for the system, and to make sure the circuit breakers are big enough...
.... not ways to try to turn it off from non-root.

(and note that powerclamp for example takes realtime priority into account by only running at "half priority"...
... but if the real realtime prevents clamping altogether, other, more draconian things will kick in)


and if you wonder what Linux does today without the framework; there are mechanisms that kick in
at the very end of the range, that are very draconian like taking the 3.0GHz processor down to
effectively 100MHz, or even a system reboot. The point of what Jacob and Srinivas are trying to add
is to intervene slightly earlier (these failsafe mechanisms are still there) but much much more gently.

Gene Heskett Oct. 6, 2013, 8:15 p.m. UTC | #6
On Sunday 06 October 2013, Arjan van de Ven wrote:
>On 10/4/2013 4:17 PM, Gene Heskett wrote:
>>>> I hope this is a better explanation. :)
>>> 
>>> The idea of power capping is to cap total power not power down 

What is the difference to us if it wrecks a $1000 part, or a $100,000 
machine?

>>> and
>>> also need root level access to modify.
>> 
>> No.  Restricting it to root control only is NOT an option.  There has
>> to be some mechanism whereby the users non-root program can control
>> it.  We don't run this software as root, ever.  And the part of this
>> software that needs the parport (or a pci card access) is running on a
>> cpu core that has been isolated for its use by an isocpus= statement,
>> not visible to top or any other system monitoring utility, so you
>> would never know we are pounding on that port, both reads and multiple
>> writes, at least 3 times every 23 microseconds.  So you might see it
>> as idle and turn it off.
>
>I understand that you do not want to see powercapping in effect.
>I think I mostly understand the realtime angle you're coming from as
>well.
>
>However, powercapping is not done for energy savings, it is done for
>SURVIVAL. It is not something optional that you can just turn off and
>ignore; if you ignore it... something either has a thermal meltdown or
>trips a circuit breaker... or in the case of a laptop/tablet kind of
>shape, you give the user burn blisters.

Nobody puts an accessible I/O port on a laptop or netbook, in this case an
EPP-capable parport, or, except for the card slot on some of them, any port
we can use for real-time control. So obviously we aren't using any laptops
or netbooks in such a system, and those concerns are completely out of our
playing field.  They simply don't apply.

>(the thermal meltdown effect can be either damage to the system or a hard
>reset done by a hardware safety mechanism.. neither is what you want for
>your realtime workload)

No it surely isn't, but we are comparing the worth of replacing a failed 
motherboard that sells for less than 100 bucks, with the worth of a machine 
that may be carving a Toyota O.R.R. engine block at the time of the 
failure.  We can buy a couple cases of those motherboards without raising 
the price of that engine block to the racer; it's simply not that big a 
factor.  The ruined but 99% finished engine block now is, so it had better 
not be a weekly occurrence. It is also not something that any of our group 
has ever experienced and gone public with.

>The solution to not use powercapping in combination with realtime is to
>make sure there is ample cooling for the system, and to make sure the
>circuit breakers are big enough... .... not ways to try to turn it off
>from non-root.
>
>(and note that powerclamp for example takes realtime priority into
>account by only running at "half priority"... ... but if the real
>realtime prevents clamping altogether, other, more dracionian things
>will kick in)
>
>
>and if you wonder what linux does today without the framework; there are
>mechanisms that kick in at the very end of the range, that are very
>draconian like taking the 3.0Ghz processor down to effectively 100MHz,
>or even a system reboot. The point of what Jacob and Srinivas are trying
>to add is to intervene slightly earlier (these failsafe mechanisms are
>still there) but much much more gently.

First off, we are not using the type of boards for controllers that would
burn anything up sans its normal cooling, which is entirely passive on an
Atom-powered board as you well know.  So there is no fan to fail and start
your doomsday scenario in about 30% of the cases now, but there is still a
rather mixed bag of other boards in use.  Those will be replaced, probably
with Atoms or BBBs, in due time as they fail, or when the IRQ latency
finally starts costing the shop owner money because the machine can't be
run at the optimum speed with that poorly architected board.

So, let me ask, will your patches initiate a parport hardware shutdown, 
when that port is in fact being used at 1 millisecond intervals best case, 
20 u-sec worst case, by a process you can't see because it is behind an 
isolcpus= statement naming the processor core that is using it?

We can't see past that isolcpus= statement to see how hard that core is 
running, nor can we see the port activity without wasting a pin to drive an 
enabling charge pump.

If you insist on doing this, in the face of ample evidence it's nothing but
a feel-good action on your part, then the least we ask is for a tally
signal output, far enough in advance, say 0.25 seconds, to do a graceful,
controlled e-stop before the machine self-destructs or kills somebody
standing just past the normal travel turnaround point. Without that warning
the table could go 2 meters past the turnaround because we didn't have time
to run all the servo outputs to 0.000 volts, which stops the machine in a
reasonable time frame that doesn't shear the 3" bolts anchoring it to the
floor.  We wouldn't care if the seismographs 20 miles away record that stop,
which they will & have done quite a few times already in the Cincinnati
area, but it's a safe stop except for the potential damage to the workpiece
on the table, because the cutting motions during the stop would be outside
the normal path tolerance window.

In fact, I'd go so far as to say that any hardware capable of
self-destructing in normal operation does not need to be guarded by this
proposed function, but blacklisted instead; it is patently a defective
design from square one regardless of the brand name on the box.  Or just
let it burn up, the warranty returns will educate the maker/designer soon
enough.

Maybe the best compromise is to just put a switch, either on the kernel 
command line, or in kconfig, allowing us to shut this function off on 
installs where this would be dangerous.

LinuxCNC, because of the truly invasive RTAI patches that often take
months to properly apply, does not build a new kernel very often, but we
could shut it off in either of those places and be happy.  We are currently
running 90% of the machines on a 2.6.32-128-RTAI patched kernel, but recent 
experiments with the 3.4.xx + xenomai patch kit have also shown promise.
 
Cheers, Gene
Arjan van de Ven Oct. 7, 2013, 3:46 p.m. UTC | #7
On 10/6/2013 1:15 PM, Gene Heskett wrote:
>> and if you wonder what linux does today without the framework; there are
>> mechanisms that kick in at the very end of the range, that are very
>> draconian like taking the 3.0Ghz processor down to effectively 100MHz,
>> or even a system reboot. The point of what Jacob and Srinivas are trying
>> to add is to intervene slightly earlier (these failsafe mechanisms are
>> still there) but much much more gently.
>
> First off, we are not using the type of boards for controllers that would
> burn anything up sans its normal cooling, which is entirely passive on an
> atom powered board as you well know.  So there is no fan to fail and start
> your doomsday scenario in abut 30% of the cases now, but there are a rather
> dukes mixture of other boards being used yet.  Those will be replaced in
> due time as they fail, or the IRQ latency finally starts costing the shop
> owner money because the machine can't be run at the optimum speed with that
> poorly architect-ed board, probably with Atoms or BBB's.

so if your system today never hits the thermal shutdown...
... you're not going to hit anything powercapping either.



> If you insist on doing this, in the face of ample evidence its nothing but
> a feel good action on your part, then the least we ask is for a tally
> signal output, far enough in advance, say 0.25 seconds, to do a graceful,

btw one thing to note is that this is just the kernel mechanism; the actual
knobs that it provides get turned by some userspace daemon..
I would fully expect that if you even ship such a daemon on your realtime device,
you would build in the notification for sure.
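
The notify-then-act ordering such a daemon could use can be sketched as follows. Everything here is hypothetical: the function and callback names are made up for illustration, and the 0.25 second lead time is simply the figure requested earlier in the thread; nothing below is part of the powercap framework.

```python
import time
from typing import Callable


def lower_limit_with_notice(notify: Callable[[float], None],
                            apply_limit: Callable[[], None],
                            lead_s: float = 0.25,
                            sleep: Callable[[float], None] = time.sleep) -> None:
    """Warn the realtime side, wait lead_s seconds, then apply the lower limit."""
    notify(lead_s)   # e.g. raise a tally signal so the controller can start an e-stop
    sleep(lead_s)    # give the controller its advance warning
    apply_limit()    # only now turn the power-capping knob


lower_limit_with_notice(
    notify=lambda lead: print(f"limit change in {lead} s"),
    apply_limit=lambda: print("limit applied"),
)
```

The sleep function is a parameter only so the ordering can be exercised without real delays.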


> In fact, I'd go so far as to say that any hardware capable of self-
> destructing in normal operation, does not need to guarded by this proposed
> function, but blacklisted instead, it is patently a defective design from
> square one regardless of the brand name on the box.  Or just let it burn
> up, the warranty returns will educate the maker/designer soon enough.

self-destruct or reboot... either case you will not like it.


Patch

diff --git a/drivers/Kconfig b/drivers/Kconfig
index aa43b91..969e987 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -166,4 +166,6 @@  source "drivers/reset/Kconfig"
 
 source "drivers/fmc/Kconfig"
 
+source "drivers/powercap/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index ab93de8..34c1d55 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -152,3 +152,4 @@  obj-$(CONFIG_VME_BUS)		+= vme/
 obj-$(CONFIG_IPACK_BUS)		+= ipack/
 obj-$(CONFIG_NTB)		+= ntb/
 obj-$(CONFIG_FMC)		+= fmc/
+obj-$(CONFIG_POWERCAP)		+= powercap/