diff mbox

x86/pci: make pci_mem_start to be aligned only

Message ID 49E4F71F.10107@kernel.org
State Superseded, archived
Headers show

Commit Message

Yinghai Lu April 14, 2009, 8:50 p.m. UTC
Impact: make more big space below 4g for assigning to unassigned pci devices

don't need to reserved one round after the gapstart.

Reported-and-tested-by: Yannick <yannick.roehlly@free.fr>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Linus Torvalds April 14, 2009, 9:10 p.m. UTC | #1
On Tue, 14 Apr 2009, Yinghai Lu wrote:
> 
> Impact: make more big space below 4g for assigning to unassigned pci devices
> 
> don't need to reserved one round after the gapstart.
> 
> Reported-and-tested-by: Yannick <yannick.roehlly@free.fr>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index ef2c356..a0ba9b1 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -642,7 +642,7 @@ __init void e820_setup_gap(void)
>  	while ((gapsize >> 4) > round)
>  		round += round;
>  	/* Fun with two's complement */
> -	pci_mem_start = (gapstart + round) & -round;
> +	pci_mem_start = roundup(gapstart, round);

That thing is called "e820_setup_gap()" for a reason. It's supposed to 
create a _gap_. That's why it historically doesn't just round up (no 
"+round-1" as in round_up()), it rounds up to the _next_ boundary 
("+round") in order to guarantee a gap.

The reason? We've definitely seen ACPI code or integrated graphics stuff 
that steals a lot of memory at the end, which means that end-of-RAM might 
be not at 2GB, but at 2GB-16MB-1MB, for example (1MB of "ACPI data", and 
16MB of "stolen video ram").

Now, the BIOS _hopefully_ marks those areas clearly reserved, and as a 
result we don't end up allocating PCI data in there, but the gap was there 
literally to make sure we always leave that gap, very much on purpose.

So I'm very nervous about this. 

At a minimum, if we do this, I'd like to make sure we round up to a big 
boundary (eg 32MB or something - exactly because a missing 16MB can easily 
be some integrated stolen video memory).

Sure, we do that whole

        while ((gapsize >> 4) > round)
                round += round;

thing, so that if the gap is large, then we'll certainly get to 32MB too, 
but I think your patch matters the most exactly when the gap is small. 
Maybe we could just raise the initial minimum rounding from 1MB to 32MB?

Alternatively, maybe we can make sure that we round up to at least X bytes 
from the end of RAM, and to at least Y bytes from the end of some RESERVED 
thing.

I dunno. Maybe your patch is fine as-is. But I do get nervous.

		Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
H. Peter Anvin April 14, 2009, 9:27 p.m. UTC | #2
Linus Torvalds wrote:
> The reason? We've definitely seen ACPI code or integrated graphics stuff 
> that steals a lot of memory at the end, which means that end-of-RAM might 
> be not at 2GB, but at 2GB-16MB-1MB, for example (1MB of "ACPI data", and 
> 16MB of "stolen video ram").

This is pretty much standard these days.  It's hard to implement ACPI 
without doing so.  Throw in the SMI T-seg for even more fun.

> Now, the BIOS _hopefully_ marks those areas clearly reserved, and as a 
> result we don't end up allocating PCI data in there, but the gap was there 
> literally to make sure we always leave that gap, very much on purpose.

It would be nice if we would mark that memory reserved ourselves.

> thing, so that if the gap is large, then we'll certainly get to 32MB too, 
> but I think your patch matters the most exactly when the gap is small. 
> Maybe we could just raise the initial minimum rounding from 1MB to 32MB?

Since we're talking about address space, not actual memory, it seems 
rather hard to end up in a situation where either one of these is not true:

- we will have real hardware demand for a large alignment datum.
- we will have so much address space available that it doesn't matter.

The latter case would be e.g. a machine with a today-anemic handful of 
megabytes of RAM.

	-hpa[1]

[1] who remembers running a Linux server on a 0.59 bogomips i386/16 with
     3 MB of half-speed memory...

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu April 14, 2009, 9:27 p.m. UTC | #3
Linus Torvalds wrote:
> 

> Alternatively, maybe we can make sure that we round up to at least X bytes 
> from the end of RAM, and to at least Y bytes from the end of some RESERVED 
> thing.

ok, will try to have one updated one with those check...

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yannick Roehlly April 14, 2009, 10:35 p.m. UTC | #4
Le Tuesday 14 April 2009 23:10:38 Linus Torvalds, vous avez écrit :
> The reason? We've definitely seen ACPI code or integrated graphics stuff
> that steals a lot of memory at the end, which means that end-of-RAM might
> be not at 2GB, but at 2GB-16MB-1MB, for example (1MB of "ACPI data", and
> 16MB of "stolen video ram").

Good evening (UTC+0200),

First, I must say that I don't understand 99% of the technical details of this 
discussion. I'm only the bug reporter. ;-)

There's one thing about my computer that may affect, or maybe cause this bug. 
There's no need to make possibly harmful change to the kernel only to fix it on 
a few machines with buggy bioses.

My computer is an Asus M51Se laptop with an HD3470 graphic card. This card is 
an "hypermemory" one with 256MB dedicated RAM and up to 1GB with shared 
memory. The problem is that I don't know how the computer assigns main memory 
to the card. There is nothing in the bios to indicate the amount of shared 
memory (and Asus did not answer my questions).

Once again, I'm not an expert, so if this details are not important, please 
forgive me.

Sincerely,

Yannick
diff mbox

Patch

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index ef2c356..a0ba9b1 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -642,7 +642,7 @@  __init void e820_setup_gap(void)
 	while ((gapsize >> 4) > round)
 		round += round;
 	/* Fun with two's complement */
-	pci_mem_start = (gapstart + round) & -round;
+	pci_mem_start = roundup(gapstart, round);
 
 	printk(KERN_INFO
 	       "Allocating PCI resources starting at %lx (gap: %lx:%lx)\n",