diff mbox

ACPICA: Revert "ACPICA: Remove obsolete acpi_os_validate_address interface"

Message ID 1246415391.24831.30.camel@minggr.sh.intel.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Lin Ming July 1, 2009, 2:29 a.m. UTC
Revert "ACPICA: Remove obsolete acpi_os_validate_address interface"
    
    This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
    
    The quick fix of bug 13620 would be to revert the change.
    http://bugzilla.kernel.org/show_bug.cgi?id=13620
    
    Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
    "ACPI: track opregion names to avoid driver resource conflicts."
    
    But there are problems we need to address:
    
    1. We need to enhance the mechanism of avoiding driver resource conflicts
       to base a "resource" on Field definitions instead of Operation Region definitions.
    
    2. For dynamic region, we need an interface to call when an operation region (field) is deleted,
       in order to delete it from the resource list.
    
    3. If the same region is created and added to resource list over and over again,
       this is have the potential to be a memory leak by growing the list every time
    
    CC: Thomas Renninger <trenn@suse.de>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
---
 drivers/acpi/acpica/acobject.h |    1 +
 drivers/acpi/acpica/dsopcode.c |   24 ++++++++++++++++++++++++
 drivers/acpi/acpica/exfldio.c  |    6 ++++++
 include/acpi/acpiosxf.h        |    4 ++++
 4 files changed, 35 insertions(+), 0 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Thomas Renninger July 1, 2009, 8:56 a.m. UTC | #1
Hi Lin,

thanks for adding me.
This is not "that" sever, but IMO this one should also be submitted
to 2.6.30 stable kernels as it is a riskless revert of a patch
fixing a regression.

On Wednesday 01 July 2009 04:29:51 Lin Ming wrote:
>     Revert "ACPICA: Remove obsolete acpi_os_validate_address interface"
>     
>     This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
>     
>     The quick fix of bug 13620 would be to revert the change.
>     http://bugzilla.kernel.org/show_bug.cgi?id=13620
>     
>     Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
>     "ACPI: track opregion names to avoid driver resource conflicts."
>     
>     But there are problems we need to address:
>     
>     1. We need to enhance the mechanism of avoiding driver resource
>     conflicts
>        to base a "resource" on Field definitions instead of Operation Region
>        definitions.
Good idea, this could avoid some false positive detected region conflicts.
>     
>     2. For dynamic region, we need an interface to call when an operation
>     region (field) is deleted,
>        in order to delete it from the resource list.
>     
>     3. If the same region is created and added to resource list over and
>     over again,
>        this is have the potential to be a memory leak by growing the list
>        every time
Region or field or both?
How can this happen, can you show a little ASL snippet or explain this a bit 
more detailed, please. I do not fully understand what is meant in 3.

Thanks,

  Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lin Ming July 1, 2009, 9:23 a.m. UTC | #2
On Wed, 2009-07-01 at 16:56 +0800, Thomas Renninger wrote:
> Hi Lin,
> 
> thanks for adding me.
> This is not "that" sever, but IMO this one should also be submitted
> to 2.6.30 stable kernels as it is a riskless revert of a patch
> fixing a regression.
> 
> On Wednesday 01 July 2009 04:29:51 Lin Ming wrote:
> >     Revert "ACPICA: Remove obsolete acpi_os_validate_address interface"
> >     
> >     This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
> >     
> >     The quick fix of bug 13620 would be to revert the change.
> >     http://bugzilla.kernel.org/show_bug.cgi?id=13620
> >     
> >     Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
> >     "ACPI: track opregion names to avoid driver resource conflicts."
> >     
> >     But there are problems we need to address:
> >     
> >     1. We need to enhance the mechanism of avoiding driver resource
> >     conflicts
> >        to base a "resource" on Field definitions instead of Operation Region
> >        definitions.
> Good idea, this could avoid some false positive detected region conflicts.
> >     
> >     2. For dynamic region, we need an interface to call when an operation
> >     region (field) is deleted,
> >        in order to delete it from the resource list.
> >     
> >     3. If the same region is created and added to resource list over and
> >     over again,
> >        this is have the potential to be a memory leak by growing the list
> >        every time
> Region or field or both?

Currently, it's region.
If we change the mechanism to base on "Field", as item 1 above, then
it's field.

> How can this happen, can you show a little ASL snippet or explain this a bit 
> more detailed, please. I do not fully understand what is meant in 3.

For example, the dynamic region which defined in a method,

Method(m000)
{
	OperationRegion (xxxx, SystemMemory, 0xFED11000, 0xFF)
	.....
}

If method m000 is called multiple times, then the region xxxx will be
inserted to the resource list again and again.

So we need item 2 above to delete the dynamic region from the resource
list.

Thanks for comments.

Lin Ming

> 
> Thanks,
> 
>   Thomas

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Renninger July 1, 2009, 9:35 a.m. UTC | #3
On Wednesday 01 July 2009 11:23:38 Lin Ming wrote:
> On Wed, 2009-07-01 at 16:56 +0800, Thomas Renninger wrote:
> > Hi Lin,
> > 
> > thanks for adding me.
> > This is not "that" sever, but IMO this one should also be submitted
> > to 2.6.30 stable kernels as it is a riskless revert of a patch
> > fixing a regression.
> > 
> > On Wednesday 01 July 2009 04:29:51 Lin Ming wrote:
> > >     Revert "ACPICA: Remove obsolete acpi_os_validate_address interface"
> > >     
> > >     This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
> > >     
> > >     The quick fix of bug 13620 would be to revert the change.
> > >     http://bugzilla.kernel.org/show_bug.cgi?id=13620
> > >     
> > >     Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
> > >     "ACPI: track opregion names to avoid driver resource conflicts."
> > >     
> > >     But there are problems we need to address:
> > >     
> > >     1. We need to enhance the mechanism of avoiding driver resource
> > >     conflicts
> > >        to base a "resource" on Field definitions instead of Operation 
Region
> > >        definitions.
> > Good idea, this could avoid some false positive detected region conflicts.
> > >     
> > >     2. For dynamic region, we need an interface to call when an 
operation
> > >     region (field) is deleted,
> > >        in order to delete it from the resource list.
> > >     
> > >     3. If the same region is created and added to resource list over and
> > >     over again,
> > >        this is have the potential to be a memory leak by growing the 
list
> > >        every time
> > Region or field or both?
> 
> Currently, it's region.
> If we change the mechanism to base on "Field", as item 1 above, then
> it's field.
> 
> > How can this happen, can you show a little ASL snippet or explain this a 
bit 
> > more detailed, please. I do not fully understand what is meant in 3.
> 
> For example, the dynamic region which defined in a method,
> 
> Method(m000)
> {
> 	OperationRegion (xxxx, SystemMemory, 0xFED11000, 0xFF)
> 	.....
> }
> 
> If method m000 is called multiple times, then the region xxxx will be
> inserted to the resource list again and again.
> 
> So we need item 2 above to delete the dynamic region from the resource
> list.
Ouch, I thought OperationRegions must be declared in global namespace.
I agree, we have a memleak and this looks rather sever.
Grmpfl, that makes the resource conflict checking even more ugly.
Thanks for spotting this,

     Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Moore, Robert July 1, 2009, 3:29 p.m. UTC | #4
>-----Original Message-----
>From: Thomas Renninger [mailto:trenn@suse.de]
>Sent: Wednesday, July 01, 2009 2:35 AM
>To: Lin, Ming M
>Cc: Len Brown; Moore, Robert; Zhao, Yakui; linux-acpi
>Subject: Re: [PATCH] ACPICA: Revert "ACPICA: Remove obsolete
>acpi_os_validate_address interface"
>
>On Wednesday 01 July 2009 11:23:38 Lin Ming wrote:
>> On Wed, 2009-07-01 at 16:56 +0800, Thomas Renninger wrote:
>> > Hi Lin,
>> >
>> > thanks for adding me.
>> > This is not "that" sever, but IMO this one should also be submitted
>> > to 2.6.30 stable kernels as it is a riskless revert of a patch
>> > fixing a regression.
>> >
>> > On Wednesday 01 July 2009 04:29:51 Lin Ming wrote:
>> > >     Revert "ACPICA: Remove obsolete acpi_os_validate_address
>interface"
>> > >
>> > >     This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
>> > >
>> > >     The quick fix of bug 13620 would be to revert the change.
>> > >     http://bugzilla.kernel.org/show_bug.cgi?id=13620
>> > >
>> > >     Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
>> > >     "ACPI: track opregion names to avoid driver resource conflicts."
>> > >
>> > >     But there are problems we need to address:
>> > >
>> > >     1. We need to enhance the mechanism of avoiding driver resource
>> > >     conflicts
>> > >        to base a "resource" on Field definitions instead of Operation
>Region
>> > >        definitions.
>> > Good idea, this could avoid some false positive detected region
>conflicts.
>> > >
>> > >     2. For dynamic region, we need an interface to call when an
>operation
>> > >     region (field) is deleted,
>> > >        in order to delete it from the resource list.
>> > >
>> > >     3. If the same region is created and added to resource list over
>and
>> > >     over again,
>> > >        this is have the potential to be a memory leak by growing the
>list
>> > >        every time
>> > Region or field or both?
>>
>> Currently, it's region.
>> If we change the mechanism to base on "Field", as item 1 above, then
>> it's field.
>>
>> > How can this happen, can you show a little ASL snippet or explain this
>a
>bit
>> > more detailed, please. I do not fully understand what is meant in 3.
>>
>> For example, the dynamic region which defined in a method,
>>
>> Method(m000)
>> {
>> 	OperationRegion (xxxx, SystemMemory, 0xFED11000, 0xFF)
>> 	.....
>> }
>>
>> If method m000 is called multiple times, then the region xxxx will be
>> inserted to the resource list again and again.
>>
>> So we need item 2 above to delete the dynamic region from the resource
>> list.
>Ouch, I thought OperationRegions must be declared in global namespace.
>I agree, we have a memleak and this looks rather sever.
>Grmpfl, that makes the resource conflict checking even more ugly.
>Thanks for spotting this,
>
>     Thomas


OperationRegions/Fields can be dynamically created/deleted in two ways:
1) Control method execution
2) Dynamic ACPI table load/unload (for example, hotplug)

So, in order to track this, the resource list must also be dynamic, with the ability to add and delete entries.

It begins to sound like the resource list is becoming a "mini-namespace" that consists of only ACPI field objects. Might it not be more efficient to simply query the existing ACPI namespace when you need to? A walk_namespace for field objects would give you the same information as the resource list, and it is automatically guaranteed to be current.

However, the may be some more fundamental issues. I have not looked at exactly how the resource list is used, but since the list is dynamic, if the worry concerns address collisions between a driver and the AML interpreter, it may not be enough to simply check for this at the time the driver is loaded. You would really need to check before *every* I/O or memory access. Which does not sound feasible.

I guess that I would like to understand the exact issue(s) that are behind the creation of the resource list in the first place. AFAIK, any AML code that reads/writes to operation regions usually assumes exclusive access to these regions. The ACPI Global Lock is used when exclusive access cannot be assumed.

Bob

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Renninger July 1, 2009, 9:19 p.m. UTC | #5
On Wednesday 01 July 2009 05:29:57 pm Moore, Robert wrote:
> >-----Original Message-----
> >From: Thomas Renninger [mailto:trenn@suse.de]
> >Sent: Wednesday, July 01, 2009 2:35 AM
> >To: Lin, Ming M
> >Cc: Len Brown; Moore, Robert; Zhao, Yakui; linux-acpi
> >Subject: Re: [PATCH] ACPICA: Revert "ACPICA: Remove obsolete
> >acpi_os_validate_address interface"
> >
> >On Wednesday 01 July 2009 11:23:38 Lin Ming wrote:
> >> On Wed, 2009-07-01 at 16:56 +0800, Thomas Renninger wrote:
> >> > Hi Lin,
> >> >
> >> > thanks for adding me.
> >> > This is not "that" sever, but IMO this one should also be submitted
> >> > to 2.6.30 stable kernels as it is a riskless revert of a patch
> >> > fixing a regression.
> >> >
> >> > On Wednesday 01 July 2009 04:29:51 Lin Ming wrote:
> >> > >     Revert "ACPICA: Remove obsolete acpi_os_validate_address
> >
> >interface"
> >
> >> > >     This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
> >> > >
> >> > >     The quick fix of bug 13620 would be to revert the change.
> >> > >     http://bugzilla.kernel.org/show_bug.cgi?id=13620
> >> > >
> >> > >     Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
> >> > >     "ACPI: track opregion names to avoid driver resource conflicts."
> >> > >

< Cut out some badly formatted text >

Jean, for you info: We have two problems here, not sure you saw this:
1) The acpica method to tell the os which IO addresses might get used got reverted,
     thus the acpi_enforce_resources=strict/lax/no functionality does not exist any more.
2) Our, or say may part of the acpi_enforce_resources= implementation introduces a
     sever memleak. Opregion declarations can be inside of functions and can be called
     again and again. The list I introduced may increase forever if often called ACPI functions
     include OpRegion declarations...
> >> For example, the dynamic region which defined in a method,
> >>
> >> Method(m000)
> >> {
> >> 	OperationRegion (xxxx, SystemMemory, 0xFED11000, 0xFF)
> >> 	.....
> >> }
> >>
> >> If method m000 is called multiple times, then the region xxxx will be
> >> inserted to the resource list again and again.
> >>
> >> So we need item 2 above to delete the dynamic region from the resource
> >> list.
> >
> >Ouch, I thought OperationRegions must be declared in global namespace.
> >I agree, we have a memleak and this looks rather sever.

An idea for a quick fix which may be suitable for stable kernels (without
checking acpica code..):
Is it easily possible to check whether the Opregion is declared in global
namespace or inside a method at the place where acpi_os_validate_address is called
from acpica?
In the latter case it should not be called and we avoid the memleak and
could still detect most i2c/smbus/hwmon devices conflicts.

> >Grmpfl, that makes the resource conflict checking even more ugly.
> >Thanks for spotting this,
> >
> >     Thomas
>
> OperationRegions/Fields can be dynamically created/deleted in two ways:
> 1) Control method execution
> 2) Dynamic ACPI table load/unload (for example, hotplug)
>
> So, in order to track this, the resource list must also be dynamic, with
> the ability to add and delete entries.
>
> It begins to sound like the resource list is becoming a "mini-namespace"
> that consists of only ACPI field objects. Might it not be more efficient to
> simply query the existing ACPI namespace when you need to? A walk_namespace
> for field objects would give you the same information as the resource list,
> and it is automatically guaranteed to be current.
>
> However, the may be some more fundamental issues. I have not looked at
> exactly how the resource list is used, but since the list is dynamic, if
> the worry concerns address collisions between a driver and the AML
> interpreter, it may not be enough to simply check for this at the time the
> driver is loaded. You would really need to check before *every* I/O or
> memory access. Which does not sound feasible.
>
> I guess that I would like to understand the exact issue(s) that are behind
> the creation of the resource list in the first place. AFAIK, any AML code
> that reads/writes to operation regions usually assumes exclusive access to
> these regions. The ACPI Global Lock is used when exclusive access cannot be
> assumed.

Thinking a bit more about this problem is probably a good idea.
The resource checking  always was a workaround which may avoid loading of
drivers because Opregion declarations may never get used.
When I looked at DSDTs I had the problem that BIOSes declare all kind of
things like overlapping and double declarations. I initially wanted to
add things somehow to:
/proc/iomem and /proc/ioports, but I finally didn't see a way to do that.

    Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Moore, Robert July 1, 2009, 10:07 p.m. UTC | #6
could still detect most i2c/smbus/hwmon devices conflicts.

What exactly are these conflicts?


>-----Original Message-----
>From: Thomas Renninger [mailto:trenn@suse.de]
>Sent: Wednesday, July 01, 2009 2:20 PM
>To: Moore, Robert
>Cc: Lin, Ming M; Len Brown; Zhao, Yakui; linux-acpi; jdelvare@suse.de
>Subject: Re: [PATCH] ACPICA: Revert "ACPICA: Remove obsolete
>acpi_os_validate_address interface"
>
>On Wednesday 01 July 2009 05:29:57 pm Moore, Robert wrote:
>> >-----Original Message-----
>> >From: Thomas Renninger [mailto:trenn@suse.de]
>> >Sent: Wednesday, July 01, 2009 2:35 AM
>> >To: Lin, Ming M
>> >Cc: Len Brown; Moore, Robert; Zhao, Yakui; linux-acpi
>> >Subject: Re: [PATCH] ACPICA: Revert "ACPICA: Remove obsolete
>> >acpi_os_validate_address interface"
>> >
>> >On Wednesday 01 July 2009 11:23:38 Lin Ming wrote:
>> >> On Wed, 2009-07-01 at 16:56 +0800, Thomas Renninger wrote:
>> >> > Hi Lin,
>> >> >
>> >> > thanks for adding me.
>> >> > This is not "that" sever, but IMO this one should also be submitted
>> >> > to 2.6.30 stable kernels as it is a riskless revert of a patch
>> >> > fixing a regression.
>> >> >
>> >> > On Wednesday 01 July 2009 04:29:51 Lin Ming wrote:
>> >> > >     Revert "ACPICA: Remove obsolete acpi_os_validate_address
>> >
>> >interface"
>> >
>> >> > >     This reverts commit f9ca058430333c9a24c5ca926aa445125f88df18.
>> >> > >
>> >> > >     The quick fix of bug 13620 would be to revert the change.
>> >> > >     http://bugzilla.kernel.org/show_bug.cgi?id=13620
>> >> > >
>> >> > >     Also, see the commit df92e695998e1bc6e426a840eb86d6d1ee87e2a5
>> >> > >     "ACPI: track opregion names to avoid driver resource
>conflicts."
>> >> > >
>
>< Cut out some badly formatted text >
>
>Jean, for you info: We have two problems here, not sure you saw this:
>1) The acpica method to tell the os which IO addresses might get used got
>reverted,
>     thus the acpi_enforce_resources=strict/lax/no functionality does not
>exist any more.
>2) Our, or say may part of the acpi_enforce_resources= implementation
>introduces a
>     sever memleak. Opregion declarations can be inside of functions and
>can be called
>     again and again. The list I introduced may increase forever if often
>called ACPI functions
>     include OpRegion declarations...
>> >> For example, the dynamic region which defined in a method,
>> >>
>> >> Method(m000)
>> >> {
>> >> 	OperationRegion (xxxx, SystemMemory, 0xFED11000, 0xFF)
>> >> 	.....
>> >> }
>> >>
>> >> If method m000 is called multiple times, then the region xxxx will be
>> >> inserted to the resource list again and again.
>> >>
>> >> So we need item 2 above to delete the dynamic region from the resource
>> >> list.
>> >
>> >Ouch, I thought OperationRegions must be declared in global namespace.
>> >I agree, we have a memleak and this looks rather sever.
>
>An idea for a quick fix which may be suitable for stable kernels (without
>checking acpica code..):
>Is it easily possible to check whether the Opregion is declared in global
>namespace or inside a method at the place where acpi_os_validate_address is
>called
>from acpica?
>In the latter case it should not be called and we avoid the memleak and
>could still detect most i2c/smbus/hwmon devices conflicts.
>
>> >Grmpfl, that makes the resource conflict checking even more ugly.
>> >Thanks for spotting this,
>> >
>> >     Thomas
>>
>> OperationRegions/Fields can be dynamically created/deleted in two ways:
>> 1) Control method execution
>> 2) Dynamic ACPI table load/unload (for example, hotplug)
>>
>> So, in order to track this, the resource list must also be dynamic, with
>> the ability to add and delete entries.
>>
>> It begins to sound like the resource list is becoming a "mini-namespace"
>> that consists of only ACPI field objects. Might it not be more efficient
>to
>> simply query the existing ACPI namespace when you need to? A
>walk_namespace
>> for field objects would give you the same information as the resource
>list,
>> and it is automatically guaranteed to be current.
>>
>> However, the may be some more fundamental issues. I have not looked at
>> exactly how the resource list is used, but since the list is dynamic, if
>> the worry concerns address collisions between a driver and the AML
>> interpreter, it may not be enough to simply check for this at the time
>the
>> driver is loaded. You would really need to check before *every* I/O or
>> memory access. Which does not sound feasible.
>>
>> I guess that I would like to understand the exact issue(s) that are
>behind
>> the creation of the resource list in the first place. AFAIK, any AML code
>> that reads/writes to operation regions usually assumes exclusive access
>to
>> these regions. The ACPI Global Lock is used when exclusive access cannot
>be
>> assumed.
>
>Thinking a bit more about this problem is probably a good idea.
>The resource checking  always was a workaround which may avoid loading of
>drivers because Opregion declarations may never get used.
>When I looked at DSDTs I had the problem that BIOSes declare all kind of
>things like overlapping and double declarations. I initially wanted to
>add things somehow to:
>/proc/iomem and /proc/ioports, but I finally didn't see a way to do that.
>
>    Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Len Brown July 2, 2009, 2:03 a.m. UTC | #7
I can't apply a patch that adds a known memory leak,
even if it removes a conflict between drivers.

The reason is that there is a workaround for the driver conflict,
but no workaround for the memory leak.

thanks,
Len Brown, Intel Open Source Technology Center

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jean Delvare July 2, 2009, 8:20 a.m. UTC | #8
Hi Robert,

Le jeudi 02 juillet 2009, Moore, Robert a écrit :
> could still detect most i2c/smbus/hwmon devices conflicts.
> 
> What exactly are these conflicts?

Native Linux drivers trying to access devices which ACPI also wants
to access. Most frequently these are SMBus master drivers or hardware
monitoring drivers (for Super I/O chips) but virtually any other
device type could be affected, due to the fact that ACPI I/O
resources do not show up in /proc/ioports and /proc/iomem.

The workaround written by Thomas would let the native drivers which
are more frequently affected by the problem ask the ACPI subsystem,
on a voluntary basis, if a given I/O range is safe to use. While not
perfect, it worked reasonably well in practice.

Your upstream commit broke this safety mechanism, and now all the
past bugs resulting of native driver vs. ACPI conflicts show up
again, so you will have to fix it now. Either by reverting your
commit, or by coming up with a better safety. I don't know much about
ACPI so I can't really comment on a technical perspective. All I care
about is the functionality: either concurrent access from native
drivers and ACPI must be doable safely, or it must be prevented
altogether (which Thomas' code was doing.)

I will be happy to answer your questions about the native Linux
drivers for SMBus master devices and hardware monitoring devices.
I can also help with testing if you want.
Jean Delvare July 2, 2009, 8:30 a.m. UTC | #9
Le mercredi 01 juillet 2009, Thomas Renninger a écrit :
> On Wednesday 01 July 2009 05:29:57 pm Moore, Robert wrote:
> > >-----Original Message-----
> > >From: Thomas Renninger [mailto:trenn@suse.de]
> > >Ouch, I thought OperationRegions must be declared in global namespace.
> > >I agree, we have a memleak and this looks rather sever.
> 
> An idea for a quick fix which may be suitable for stable kernels (without
> checking acpica code..):
> Is it easily possible to check whether the Opregion is declared in global
> namespace or inside a method at the place where acpi_os_validate_address is called
> from acpica?
> In the latter case it should not be called and we avoid the memleak and
> could still detect most i2c/smbus/hwmon devices conflicts.

Can't you just walk the list before you add an entry, and if the
entry is already present, do not add it again?

> > >Grmpfl, that makes the resource conflict checking even more ugly.
> > >Thanks for spotting this,
> > >
> > >     Thomas
> >
> > OperationRegions/Fields can be dynamically created/deleted in two ways:
> > 1) Control method execution
> > 2) Dynamic ACPI table load/unload (for example, hotplug)
> >
> > So, in order to track this, the resource list must also be dynamic, with
> > the ability to add and delete entries.
> >
> > It begins to sound like the resource list is becoming a "mini-namespace"
> > that consists of only ACPI field objects. Might it not be more efficient to
> > simply query the existing ACPI namespace when you need to? A walk_namespace
> > for field objects would give you the same information as the resource list,
> > and it is automatically guaranteed to be current.
> >
> > However, the may be some more fundamental issues. I have not looked at
> > exactly how the resource list is used, but since the list is dynamic, if
> > the worry concerns address collisions between a driver and the AML
> > interpreter, it may not be enough to simply check for this at the time the
> > driver is loaded. You would really need to check before *every* I/O or
> > memory access. Which does not sound feasible.

This is indeed not feasible, the cost would be much too high. Not to
mention it would still be racy by design.

Please remember that Thomas' code was meant to solve real-world
problems which had been there for long and nobody else was able to
solve so far. It wasn't meant to be bullet-proof, it was not perfect,
but it did help in practice.

> > I guess that I would like to understand the exact issue(s) that are behind
> > the creation of the resource list in the first place. AFAIK, any AML code
> > that reads/writes to operation regions usually assumes exclusive access to
> > these regions. The ACPI Global Lock is used when exclusive access cannot be
> > assumed.

I could change the affected native drivers to take the ACPI Global
Lock if needed. But what guarantees that the ACPI code which accesses
the device in question is taking the ACPI Global Lock too? It can
only work if everybody plays the game. My understanding was that most
ACPI implementation did not take the ACPI Global Lock when accessing
the devices and thus this approach wouldn't work in practice. Was I
wrong?
Thomas Renninger July 2, 2009, 10:22 a.m. UTC | #10
On Thursday 02 July 2009 04:03:55 Len Brown wrote:
> I can't apply a patch that adds a known memory leak,
> even if it removes a conflict between drivers.
Can you revert it with Lin's added on top (after discussing it), please.
It's always a good idea (or a must) to have a stable patch in mainline first.
I doubt we find a proper solution for 2.6.31.

> The reason is that there is a workaround for the driver conflict,
> but no workaround for the memory leak.
There is one now.

Thanks,

   Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Moore, Robert July 2, 2009, 3:49 p.m. UTC | #11
>-----Original Message-----
>From: Jean Delvare [mailto:jdelvare@suse.de]
...
>Native Linux drivers trying to access devices which ACPI also wants
>to access. Most frequently these are SMBus master drivers or hardware
>monitoring drivers (for Super I/O chips) but virtually any other
>device type could be affected, due to the fact that ACPI I/O
>resources do not show up in /proc/ioports and /proc/iomem.


I have some concern with this. The implication is that there is a serious hole in ACPI that can only be solved by monitoring exactly what the AML code is doing, because access to resources may conflict with device drivers. I have to think that if this were truly the case, the operating system vendors and the ACPI contributors would have pushed a fix for this problem into the ACPI specification by now (in the 13 years since the first release of ACPI).

The SMBus driver issue is very interesting. I have not heard of any conflicts between AML code and SMBus drivers on other operating systems, nor have we received any other complaints about removing the acpi_os_validate_address interface. I wonder if the Linux SMBus driver is doing something that it should not be doing.

I'd like to see a concrete example of the problem. If someone can send me a DSDT that contains such an example, I would appreciate it. I want to see the exact ASL code that is causing the problem along with the driver code that conflicts with it. This information will help me understand the problem in relation to the ACPICA code, and I can also present this to other ACPI and BIOS experts for their opinions.

As far as an immediate fix for the problem, I suggest that you think about using acpi_walk_namespace to retrieve all objects of type Region (or field). It seems a waste to build an entire list of objects that duplicates information that already exists in the namespace. 

Bob

>-----Original Message-----
>From: Thomas Renninger [mailto:trenn@suse.de]
>Sent: Thursday, July 02, 2009 3:22 AM
>To: Len Brown
>Cc: Lin, Ming M; Moore, Robert; Zhao, Yakui; linux-acpi
>Subject: Re: [PATCH] ACPICA: Revert "ACPICA: Remove obsolete
>acpi_os_validate_address interface"
>
>On Thursday 02 July 2009 04:03:55 Len Brown wrote:
>> I can't apply a patch that adds a known memory leak,
>> even if it removes a conflict between drivers.
>Can you revert it with Lin's added on top (after discussing it), please.
>It's always a good idea (or a must) to have a stable patch in mainline
>first.
>I doubt we find a proper solution for 2.6.31.
>
>> The reason is that there is a workaround for the driver conflict,
>> but no workaround for the memory leak.
>There is one now.
>
>Thanks,
>
>   Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Robert Hancock July 4, 2009, 1:29 a.m. UTC | #12
On 07/02/2009 09:49 AM, Moore, Robert wrote:
>> -----Original Message-----
>> From: Jean Delvare [mailto:jdelvare@suse.de]
> ...
>> Native Linux drivers trying to access devices which ACPI also wants
>> to access. Most frequently these are SMBus master drivers or hardware
>> monitoring drivers (for Super I/O chips) but virtually any other
>> device type could be affected, due to the fact that ACPI I/O
>> resources do not show up in /proc/ioports and /proc/iomem.
>
>
> I have some concern with this. The implication is that there is a serious hole in ACPI that can only be solved by monitoring exactly what the AML code is doing, because access to resources may conflict with device drivers. I have to think that if this were truly the case, the operating system vendors and the ACPI contributors would have pushed a fix for this problem into the ACPI specification by now (in the 13 years since the first release of ACPI).
>
> The SMBus driver issue is very interesting. I have not heard of any conflicts between AML code and SMBus drivers on other operating systems, nor have we received any other complaints about removing the acpi_os_validate_address interface. I wonder if the Linux SMBus driver is doing something that it should not be doing.
>
> I'd like to see a concrete example of the problem. If someone can send me a DSDT that contains such an example, I would appreciate it. I want to see the exact ASL code that is causing the problem along with the driver code that conflicts with it. This information will help me understand the problem in relation to the ACPICA code, and I can also present this to other ACPI and BIOS experts for their opinions.
>
> As far as an immediate fix for the problem, I suggest that you think about using acpi_walk_namespace to retrieve all objects of type Region (or field). It seems a waste to build an entire list of objects that duplicates information that already exists in the namespace.

My guess is that exactly the same thing will happen if you install 
add-on hardware monitoring software on Windows that loads drivers that 
access SMBus hardware directly on such systems where ACPI accesses the 
same devices. The ACPI code and the native driver will likely stomp on 
each other's hardware accesses. Likely on Linux it's more frequent that 
this kind of native hardware monitoring driver gets used (in some setups 
it seems they will auto-load by default..)
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jean Delvare Aug. 30, 2009, 1:43 p.m. UTC | #13
Hi Robert, Robert, all,

Sorry for the late answer. Busy at work then long vacation and here
I am again.

Le samedi 04 juillet 2009, Robert Hancock a écrit :
> On 07/02/2009 09:49 AM, Moore, Robert wrote:
> >> -----Original Message-----
> >> From: Jean Delvare [mailto:jdelvare@suse.de]
> > ...
> >> Native Linux drivers trying to access devices which ACPI also wants
> >> to access. Most frequently these are SMBus master drivers or hardware
> >> monitoring drivers (for Super I/O chips) but virtually any other
> >> device type could be affected, due to the fact that ACPI I/O
> >> resources do not show up in /proc/ioports and /proc/iomem.
> >
> >
> > I have some concern with this. The implication is that there is a
> > serious hole in ACPI that can only be solved by monitoring
> > exactly what the AML code is doing, because access to resources
> > may conflict with device drivers.

This is the basic idea, yes. That being said, if I/O resources used
by ACPI are properly declared, then the problem is partly solved:
ACPI will take care of the devices (either through a dedicated driver
or through a generic ACPI driver such as "thermal") and native
drivers will not attempt to access the device.

Problems arise when:

* ACPI declares I/O regions just because they exist, but doesn't make
use of them. This will prevent native drivers from binding to the
device, leaving the user without the functionality.

* ACPI accesses I/O regions without declaring them. Native drivers
will access them too, causing havoc.

While the two issues above could be worked around by analyzing the
AML code as you suggested, a better fix is to get BIOSes out there
right. I know it's an on-going, never-ending battle, but it is still
preferable to fix the bug where it really is, isn't it?

Then there are two corner cases:

* The ACPI implementation does offer the possibility to write an ACPI
driver for hardware monitoring on a given machine, but we do not yet
have written such a driver. This has been the case for Asus' ATK0110
driver for quite some time. This is a minor issue IMHO, not different
from lack of support of other, hardware devices.

* ACPI does access the hardware monitoring chip (or SMBus controller)
and does something useful with it, but this is only a small subset of
what a native driver would do. A typical example of this is
mainboards implementing ACPI thermal zones out of a full-featured
hardware monitoring chip. I have such a board here (Jetway K8M8MS).
The ACPI "thermal" driver displays one temperature. The native driver
(f71805f) displays 9 voltages, 3 fan speeds and 3 temperature values,
all settable min and max limits, and their associated alarm flags.
The ACPI "thermal" driver will load by default, while the native
driver will be blocked, leaving the user with very reduced
functionality. Me, I have configured by system to blacklist the
thermal driver and load the f71805f driver instead, I _think_ I am
safe doing this, but I'm not even sure.

> > I have to think that if this were truly the case, the operating
> > system vendors and the ACPI contributors would have pushed a fix
> > for this problem into the ACPI specification by now (in the 13
> > years since the first release of ACPI).       

If they did care, yes they would have. But see below.

> > The SMBus driver issue is very interesting. I have not heard of
> > any conflicts between AML code and SMBus drivers on other
> > operating systems, nor have we received any other complaints
> > about removing the acpi_os_validate_address interface. I wonder
> > if the Linux SMBus driver is doing something that it should not
> > be doing.     

I don't think it is. The Linux SMBus drivers (note the plural, we
have one driver for each separate PC SMBus implementation, that's
about a dozen of them) do exactly what they have to, that is: provide
the rest of the kernel with a standard API to access devices
connected to the SMBus.

Of course you can claim that, on ACPI systems, the OS should never
attempt to access thermal-management-related devices. This solves
the resource conflict problem at once, at the price of a loss of
functionality for most users out there (because on most desktop and
server systems, ACPI doesn't care about thermal management at all.)

> > I'd like to see a concrete example of the problem. If someone can
> > send me a DSDT that contains such an example, I would appreciate
> > it. I want to see the exact ASL code that is causing the problem
> > along with the driver code that conflicts with it. This
> > information will help me understand the problem in relation to
> > the ACPICA code, and I can also present this to other ACPI and
> > BIOS experts for their opinions.      

Here you go, attached. This is the only machine I have at home with
an ACPI vs. native driver resource conflict. This is the mainboard I
mentioned above already: Jetway K8M8MS. Conflict is for I/O region
0x295-0x296.

> > As far as an immediate fix for the problem, I suggest that you
> > think about using acpi_walk_namespace to retrieve all objects of
> > type Region (or field). It seems a waste to build an entire list
> > of objects that duplicates information that already exists in the
> > namespace.    
> 
> My guess is that exactly the same thing will happen if you install 
> add-on hardware monitoring software on Windows that loads drivers that 
> access SMBus hardware directly on such systems where ACPI accesses the 
> same devices. The ACPI code and the native driver will likely stomp on 
> each other's hardware accesses. Likely on Linux it's more frequent that 
> this kind of native hardware monitoring driver gets used (in some setups 
> it seems they will auto-load by default..)

This is a very good point, although not totally exact. The key reason
why the problem wasn't solved yet is because Microsoft doesn't give a
damn. Last time I checked (several years ago admittedly) Windows did
not offer support for hardware monitoring at all. Basically the user
must choose between a vendor-provided monitoring applications (most
PC mainboard vendors have one by now) or 3rd party applications such
as MBM (discontinued), SpeedFan or hmonitor.

With vendor applications, the problem doesn't exist, because the
vendor has additional knowledge. They only support their own
products, and they know what their ACPI implementation is doing or
not. So they can easily access custom ACPI code if needed, use
custom locking mechanism if needed, etc.

With 3rd party applications, the problem is exactly the same as with
Linux: this is best effort. The big advantage is a unified hardware
monitoring handling across all vendors and models, but at the price
of a lack of knowledge about system specifics, which means lot of
trouble on some systems.

As you are talking about improving the ACPI standard, here are ideas
of what could be done:

* ACPI could define a standard interface to access hardware
monitoring chips. Something similar in essence to the thermal zones,
but that would cover all the standard features of hardware monitoring
chips (voltages, temperatures, fans, including limits, and if
possible fan speed control.) Vendors would implement that interface
in their BIOSes, and OSes would simply need one driver to support all
systems out there.

* ACPI could define a flag the OS would check, basically declaring
whether thermal management / hardware monitoring should be handled
by ACPI or by the OS on every given system. Then all we have to do
is educate all drivers to check this flag and decide whether to load
or not.

* ACPI could offer a way to share I/O regions between AML code and
OS code. Then both ACPI and native drivers could access the same
device. Of course, driver code would have to be very robust (i.e.
slow) because it couldn't assume _anything_ about register values
and device state between accesses. But that would still be better
than the current situation.

Lastly, one more thought about SMBus-based hardware monitoring chips:
blocking I/O access to them basically means blocking I/O access to
_all_ devices connected to the SMBus. In many cases this means SPD
EEPROMs, which are fun to access at run-time but we can definitely
live without them. But there can be other devices on the SMBus,
for example some FSC systems have a custom chip on the SMBus to
handle custom function keys. If you want to let the OS drive the
chip, then the best (only?) approach is to let ACPI handle the
SMBus for both itself and the OS. That is, have an ACPI driver for
the SMBus implementation, which must support ACPI accesses and OS
accesses alike, gracefully. I'm pretty sure that ACPI _does_ define
a standard SMBus API which would make writing such a driver
possible... I even seem to remember that said driver did exist at
some point in the past. I'm curious why it disappeared?
diff mbox

Patch

diff --git a/drivers/acpi/acpica/acobject.h b/drivers/acpi/acpica/acobject.h
index 544dcf8..eb6f038 100644
--- a/drivers/acpi/acpica/acobject.h
+++ b/drivers/acpi/acpica/acobject.h
@@ -97,6 +97,7 @@ 
 #define AOPOBJ_OBJECT_INITIALIZED   0x08
 #define AOPOBJ_SETUP_COMPLETE       0x10
 #define AOPOBJ_SINGLE_DATUM         0x20
+#define AOPOBJ_INVALID              0x40	/* Used if host OS won't allow an op_region address */
 
 /******************************************************************************
  *
diff --git a/drivers/acpi/acpica/dsopcode.c b/drivers/acpi/acpica/dsopcode.c
index 584d766..b79978f 100644
--- a/drivers/acpi/acpica/dsopcode.c
+++ b/drivers/acpi/acpica/dsopcode.c
@@ -397,6 +397,30 @@  acpi_status acpi_ds_get_region_arguments(union acpi_operand_object *obj_desc)
 	status = acpi_ds_execute_arguments(node, acpi_ns_get_parent_node(node),
 					   extra_desc->extra.aml_length,
 					   extra_desc->extra.aml_start);
+	if (ACPI_FAILURE(status)) {
+		return_ACPI_STATUS(status);
+	}
+
+	/* Validate the region address/length via the host OS */
+
+	status = acpi_os_validate_address(obj_desc->region.space_id,
+					  obj_desc->region.address,
+					  (acpi_size) obj_desc->region.length,
+					  acpi_ut_get_node_name(node));
+
+	if (ACPI_FAILURE(status)) {
+		/*
+		 * Invalid address/length. We will emit an error message and mark
+		 * the region as invalid, so that it will cause an additional error if
+		 * it is ever used. Then return AE_OK.
+		 */
+		ACPI_EXCEPTION((AE_INFO, status,
+				"During address validation of OpRegion [%4.4s]",
+				node->name.ascii));
+		obj_desc->common.flags |= AOPOBJ_INVALID;
+		status = AE_OK;
+	}
+
 	return_ACPI_STATUS(status);
 }
 
diff --git a/drivers/acpi/acpica/exfldio.c b/drivers/acpi/acpica/exfldio.c
index d4075b8..6687be1 100644
--- a/drivers/acpi/acpica/exfldio.c
+++ b/drivers/acpi/acpica/exfldio.c
@@ -113,6 +113,12 @@  acpi_ex_setup_region(union acpi_operand_object *obj_desc,
 		}
 	}
 
+	/* Exit if Address/Length have been disallowed by the host OS */
+
+	if (rgn_desc->common.flags & AOPOBJ_INVALID) {
+		return_ACPI_STATUS(AE_AML_ILLEGAL_ADDRESS);
+	}
+
 	/*
 	 * Exit now for SMBus address space, it has a non-linear address space
 	 * and the request cannot be directly validated
diff --git a/include/acpi/acpiosxf.h b/include/acpi/acpiosxf.h
index 3e79859..ab0b85c 100644
--- a/include/acpi/acpiosxf.h
+++ b/include/acpi/acpiosxf.h
@@ -242,6 +242,10 @@  acpi_os_derive_pci_id(acpi_handle rhandle,
 acpi_status acpi_os_validate_interface(char *interface);
 acpi_status acpi_osi_invalidate(char* interface);
 
+acpi_status
+acpi_os_validate_address(u8 space_id, acpi_physical_address address,
+			 acpi_size length, char *name);
+
 u64 acpi_os_get_timer(void);
 
 acpi_status acpi_os_signal(u32 function, void *info);