diff mbox

x86/pci: don't use crs for root if we only have one root bus

Message ID 4A42AFAC.6000300@kernel.org (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Yinghai Lu June 24, 2009, 10:58 p.m. UTC
for AMD system, when only one PCI root, just set PCI_NO_ROOT_CRS for it

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/pci/amd_bus.c |    4 ++++
 1 file changed, 4 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Linus Torvalds June 24, 2009, 11:09 p.m. UTC | #1
Should we not just revert 9e9f46c44e487af0a82eb61b624553e2f7118f5b?

The thing says:

 "At this point, it seems to solve more problems than it causes, so let's 
  try using it by default.  It's an easy revert if it ends up causing 
  trouble."

and it clearly does _not_ solve more problems than it causes, and the 
whole message in that commit implies we should revert it.

I'm happy to apply various patches to fix it up, but regardless, I thinkwe 
should revert that commit as bogus. We can try making it the default again 
next round, when maybe it will be true that it doesn't cause issues.

What did it even ever help with?

		Linus

On Wed, 24 Jun 2009, Yinghai Lu wrote:
> 
> for AMD system, when only one PCI root, just set PCI_NO_ROOT_CRS for it
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> ---
>  arch/x86/pci/amd_bus.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> Index: linux-2.6/arch/x86/pci/amd_bus.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/pci/amd_bus.c
> +++ linux-2.6/arch/x86/pci/amd_bus.c
> @@ -561,6 +561,10 @@ static int __init early_fill_mp_bus_info
>  		}
>  	}
>  
> +	/* don't use _CRS if we only have one root */
> +	if (pci_root_num <= 1)
> +		pci_probe |= PCI_NO_ROOT_CRS;
> +
>  	return 0;
>  }
>  
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds June 24, 2009, 11:21 p.m. UTC | #2
On Wed, 24 Jun 2009, Linus Torvalds wrote:
>
> I'm happy to apply various patches to fix it up, but regardless, I thinkwe 
> should revert that commit as bogus. We can try making it the default again 
> next round, when maybe it will be true that it doesn't cause issues.

Btw, I really think our _CRS handling sucks.

There's two things that you can do with _CRS:

 - use the _existence_ of it as an indicator of a root bus

 - try to use it to populate the resource tree.

And quite frankly, I think #2 is broken. There's no way in hell that ACPI 
tables are ever going to be better than just asking the hardware. We've 
gone through this before. Trusting ACPI over the hardware is just 
FUNDAMENTALLY WRONG.

So I'm just going to do that revert. I'm not sure if it ever makes sense 
to make that insane _CRS code the default. It seems like a fundamentally 
flawed idea.

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Barnes June 24, 2009, 11:37 p.m. UTC | #3
On Wed, 24 Jun 2009 16:21:09 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Wed, 24 Jun 2009, Linus Torvalds wrote:
> >
> > I'm happy to apply various patches to fix it up, but regardless, I
> > thinkwe should revert that commit as bogus. We can try making it
> > the default again next round, when maybe it will be true that it
> > doesn't cause issues.
> 
> Btw, I really think our _CRS handling sucks.
> 
> There's two things that you can do with _CRS:
> 
>  - use the _existence_ of it as an indicator of a root bus
> 
>  - try to use it to populate the resource tree.
> 
> And quite frankly, I think #2 is broken. There's no way in hell that
> ACPI tables are ever going to be better than just asking the
> hardware. We've gone through this before. Trusting ACPI over the
> hardware is just FUNDAMENTALLY WRONG.
> 
> So I'm just going to do that revert. I'm not sure if it ever makes
> sense to make that insane _CRS code the default. It seems like a
> fundamentally flawed idea.

Yeah, I think it's reasonable to revert, especially given how we do
_CRS handling currently.  I'm hoping at some point we can use the _CRS
data to at least augment the configuration we get from hardware, since
on some machines it seems to be necessary.
Linus Torvalds June 24, 2009, 11:54 p.m. UTC | #4
On Wed, 24 Jun 2009, Jesse Barnes wrote:
> 
> Yeah, I think it's reasonable to revert, especially given how we do
> _CRS handling currently.  I'm hoping at some point we can use the _CRS
> data to at least augment the configuration we get from hardware, since
> on some machines it seems to be necessary.

Agreed. I do think we should take _CRS into account - possibly just as a 
minimal hint to determine which root buses to try to scan (maybe we do 
this already, I really didn't check). Or maybe we could use it to extend 
on our scan information.

But when it seems to have things like "this bus can forward VGA cycles" 
kind of "resources" (which seems to be the main reason Larry Finger has so 
many of them), then that's just worthless knowledge that we're much better 
off just determining on our own.

Anyway, I may feel pretty strongly about things like this, but I'm also 
open to being convinced otherwise for 2.6.32. I wanted to do -rc1 today 
(it's been more than two weeks), and while I don't expect -rc1 to be 
flawless, I also hate to release it with _known_ bugs.

So partly due to timing, I'd rather revert it, and we can revisit it for 
the next release - whatever the actual end result then will be.

[ There's a difference between "we're supposed to find and fix bugs in the 
  -rc series", and "I release known-buggy -rc1's since we're supposed to 
  fix it later". For similar reasons, I hate pulling known-buggy stuff 
  during the merge window - it's ok if it shows itself to be buggy 
  _later_, but if people send me stuff that they know is buggy as they 
  send it to me, then that's a problem. ]

		Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gary Hade June 25, 2009, midnight UTC | #5
On Wed, Jun 24, 2009 at 04:09:36PM -0700, Linus Torvalds wrote:
> 
> Should we not just revert 9e9f46c44e487af0a82eb61b624553e2f7118f5b?
> 
> The thing says:
> 
>  "At this point, it seems to solve more problems than it causes, so let's 
>   try using it by default.  It's an easy revert if it ends up causing 
>   trouble."
> 
> and it clearly does _not_ solve more problems than it causes, and the 
> whole message in that commit implies we should revert it.
> 
> I'm happy to apply various patches to fix it up, but regardless, I thinkwe 
> should revert that commit as bogus. We can try making it the default again 
> next round, when maybe it will be true that it doesn't cause issues.
> 
> What did it even ever help with?

In our case it is needed on some of our systems for PCI hotplug to
avoid MCKs due to a device under one root bus getting a resource
during hotplug that can only be used by devices under a different
root bus.  If the user does not intend to use PCI hotplug, the
function isn't needed (i.e. 'pci=use_crs' can be omitted) because
resources are properly directed to the installed cards via BIOS
pre-assignment.

Gary
Jesse Barnes June 25, 2009, 12:01 a.m. UTC | #6
On Wed, 24 Jun 2009 16:54:08 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, 24 Jun 2009, Jesse Barnes wrote:
> > 
> > Yeah, I think it's reasonable to revert, especially given how we do
> > _CRS handling currently.  I'm hoping at some point we can use the
> > _CRS data to at least augment the configuration we get from
> > hardware, since on some machines it seems to be necessary.
> 
> Agreed. I do think we should take _CRS into account - possibly just
> as a minimal hint to determine which root buses to try to scan (maybe
> we do this already, I really didn't check). Or maybe we could use it
> to extend on our scan information.
> 
> But when it seems to have things like "this bus can forward VGA
> cycles" kind of "resources" (which seems to be the main reason Larry
> Finger has so many of them), then that's just worthless knowledge
> that we're much better off just determining on our own.

Yeah, some of those bits don't seem useful; but OTOH if a given bridge
decodes a certain range in a non-configurable way how else will we get
that info w/o having huge tables of per-chip info?  Those are the sorts
of things I worry about with our current resource handling.  Maybe not
directly _CRS related, but still...

> Anyway, I may feel pretty strongly about things like this, but I'm
> also open to being convinced otherwise for 2.6.32. I wanted to do
> -rc1 today (it's been more than two weeks), and while I don't expect
> -rc1 to be flawless, I also hate to release it with _known_ bugs.

Sure, we'll keep at it and see if we can get something better in place
for .32.

> So partly due to timing, I'd rather revert it, and we can revisit it
> for the next release - whatever the actual end result then will be.
> 
> [ There's a difference between "we're supposed to find and fix bugs
> in the -rc series", and "I release known-buggy -rc1's since we're
> supposed to fix it later". For similar reasons, I hate pulling
> known-buggy stuff during the merge window - it's ok if it shows
> itself to be buggy _later_, but if people send me stuff that they
> know is buggy as they send it to me, then that's a problem. ]

Yeah, 100% agreed.  I didn't hear any reports until after people
started using your tree, so I think this case was handled correctly:
push something that *seems* ok upstream, but with eyes wide open for
the possibility we'd need to revert.

Thanks,
Ingo Molnar June 25, 2009, 7:03 a.m. UTC | #7
* Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> > [ There's a difference between "we're supposed to find and fix bugs
> > in the -rc series", and "I release known-buggy -rc1's since we're
> > supposed to fix it later". For similar reasons, I hate pulling
> > known-buggy stuff during the merge window - it's ok if it shows
> > itself to be buggy _later_, but if people send me stuff that they
> > know is buggy as they send it to me, then that's a problem. ]
> 
> Yeah, 100% agreed.  I didn't hear any reports until after people 
> started using your tree, so I think this case was handled 
> correctly: push something that *seems* ok upstream, but with eyes 
> wide open for the possibility we'd need to revert.

There's only one small gripe i have with the handling of it: the 
timing. "9e9f46c: PCI: use ACPI _CRS data by default" was written 
and committed on June 11th, two days _after_ the merge window 
opened.

That's way too late for maybe-broken changes to x86 lowlevel details 
(especially if it touches hw-environmental interaction - which is 
very hard to test with meaningful coverage), and it's also pretty 
much the worst moment to solicit testing from people who are busy 
getting their stuff to Linus and who are busy testing out any of the 
unexpected interactions and bugs.

So this was, to a certain degree, a predictable outcome. Trees in 
the Linux "critical path" of testing (core kernel, x86, core 
networking, very common drivers, PCI, driver core, VFS, etc.) should 
generally try to cool down 1-2 weeks before the merge window - 
because breakage there can do a lot of knock-on cascading damage. 
Two weeks is not a lot of time and the effects of showstopper bugs 
get magnified disproportionately.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jaswinder Singh Rajput June 25, 2009, 7:28 a.m. UTC | #8
On Thu, 2009-06-25 at 09:03 +0200, Ingo Molnar wrote:
> * Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> 
> > > [ There's a difference between "we're supposed to find and fix bugs
> > > in the -rc series", and "I release known-buggy -rc1's since we're
> > > supposed to fix it later". For similar reasons, I hate pulling
> > > known-buggy stuff during the merge window - it's ok if it shows
> > > itself to be buggy _later_, but if people send me stuff that they
> > > know is buggy as they send it to me, then that's a problem. ]
> > 
> > Yeah, 100% agreed.  I didn't hear any reports until after people 
> > started using your tree, so I think this case was handled 
> > correctly: push something that *seems* ok upstream, but with eyes 
> > wide open for the possibility we'd need to revert.
> 
> There's only one small gripe i have with the handling of it: the 
> timing. "9e9f46c: PCI: use ACPI _CRS data by default" was written 
> and committed on June 11th, two days _after_ the merge window 
> opened.
> 
> That's way too late for maybe-broken changes to x86 lowlevel details 
> (especially if it touches hw-environmental interaction - which is 
> very hard to test with meaningful coverage), and it's also pretty 
> much the worst moment to solicit testing from people who are busy 
> getting their stuff to Linus and who are busy testing out any of the 
> unexpected interactions and bugs.
> 
> So this was, to a certain degree, a predictable outcome. Trees in 
> the Linux "critical path" of testing (core kernel, x86, core 
> networking, very common drivers, PCI, driver core, VFS, etc.) should 
> generally try to cool down 1-2 weeks before the merge window - 
> because breakage there can do a lot of knock-on cascading damage. 
> Two weeks is not a lot of time and the effects of showstopper bugs 
> get magnified disproportionately.
> 

Yes, I was also thinking about this when I checked the commit date. And
totally agree with Ingo's suggestions.

Thanks,
--
JSR

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Barnes June 25, 2009, 4:28 p.m. UTC | #9
On Thu, 25 Jun 2009 09:03:47 +0200
Ingo Molnar <mingo@elte.hu> wrote:

> 
> * Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> 
> > > [ There's a difference between "we're supposed to find and fix
> > > bugs in the -rc series", and "I release known-buggy -rc1's since
> > > we're supposed to fix it later". For similar reasons, I hate
> > > pulling known-buggy stuff during the merge window - it's ok if it
> > > shows itself to be buggy _later_, but if people send me stuff
> > > that they know is buggy as they send it to me, then that's a
> > > problem. ]
> > 
> > Yeah, 100% agreed.  I didn't hear any reports until after people 
> > started using your tree, so I think this case was handled 
> > correctly: push something that *seems* ok upstream, but with eyes 
> > wide open for the possibility we'd need to revert.
> 
> There's only one small gripe i have with the handling of it: the 
> timing. "9e9f46c: PCI: use ACPI _CRS data by default" was written 
> and committed on June 11th, two days _after_ the merge window 
> opened.
> 
> That's way too late for maybe-broken changes to x86 lowlevel details 
> (especially if it touches hw-environmental interaction - which is 
> very hard to test with meaningful coverage), and it's also pretty 
> much the worst moment to solicit testing from people who are busy 
> getting their stuff to Linus and who are busy testing out any of the 
> unexpected interactions and bugs.

True, the patch had been around long before that and I definitely
should have committed it sooner.
diff mbox

Patch

Index: linux-2.6/arch/x86/pci/amd_bus.c
===================================================================
--- linux-2.6.orig/arch/x86/pci/amd_bus.c
+++ linux-2.6/arch/x86/pci/amd_bus.c
@@ -561,6 +561,10 @@  static int __init early_fill_mp_bus_info
 		}
 	}
 
+	/* don't use _CRS if we only have one root */
+	if (pci_root_num <= 1)
+		pci_probe |= PCI_NO_ROOT_CRS;
+
 	return 0;
 }