
[v4,2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges

Message ID 20210301083230.30924-3-osalvador@suse.de (mailing list archive)
State New, archived
Series Cleanup and fixups for vmemmap handling

Commit Message

Oscar Salvador March 1, 2021, 8:32 a.m. UTC
We never get to allocate 1GB pages when mapping the vmemmap range.
Drop the dead code both for the aligned and unaligned cases and leave
only the direct map handling.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 arch/x86/mm/init_64.c | 35 +++++++----------------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

Comments

Dave Hansen March 4, 2021, 6:42 p.m. UTC | #1
On 3/1/21 12:32 AM, Oscar Salvador wrote:
> We never get to allocate 1GB pages when mapping the vmemmap range.
> Drop the dead code both for the aligned and unaligned cases and leave
> only the direct map handling.

Could you elaborate a bit on why 1GB pages are never used?  Is it just
unlikely to have a 64GB contiguous area of memory that needs 1GB of
contiguous vmemmap?  Or, does the fact that sections are smaller than
64GB keep this from happening?
Oscar Salvador March 8, 2021, 6:48 p.m. UTC | #2
On Thu, Mar 04, 2021 at 10:42:59AM -0800, Dave Hansen wrote:
> On 3/1/21 12:32 AM, Oscar Salvador wrote:
> > We never get to allocate 1GB pages when mapping the vmemmap range.
> > Drop the dead code both for the aligned and unaligned cases and leave
> > only the direct map handling.
> 
> Could you elaborate a bit on why 1GB pages are never used?  Is it just
> unlikely to have a 64GB contiguous area of memory that needs 1GB of
> contiguous vmemmap?  Or, does the fact that sections are smaller than
> 64GB keep this from happening?

AFAIK, the biggest page size we populate vmemmap with is 2MB, plus the fact
that, as you pointed out, memory sections on x86_64 are 128MB, which is
way smaller than what would be required to justify allocating 1GB for vmemmap pages.

Am I missing something?
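
[Editor's note: a quick back-of-the-envelope sketch of that arithmetic, assuming
4 KiB base pages and a 64-byte struct page (the usual x86_64 values; neither
figure is stated in the thread):

#include <stdio.h>

/* Sketch only: assumes 4 KiB base pages and a 64-byte struct page,
 * the usual values on x86_64. */
#define PAGE_SIZE        (4UL << 10)    /* 4 KiB base page */
#define STRUCT_PAGE_SIZE 64UL           /* assumed sizeof(struct page) */
#define SECTION_SIZE     (128UL << 20)  /* 128 MiB memory section */
#define PUD_SIZE         (1UL << 30)    /* 1 GiB */

int main(void)
{
	/* vmemmap needed to describe one 128 MiB section */
	unsigned long vmemmap_per_section =
		(SECTION_SIZE / PAGE_SIZE) * STRUCT_PAGE_SIZE;
	/* memory that a full 1 GiB of vmemmap could describe */
	unsigned long mem_per_gb_vmemmap =
		(PUD_SIZE / STRUCT_PAGE_SIZE) * PAGE_SIZE;

	printf("vmemmap per 128 MiB section: %lu MiB\n",
	       vmemmap_per_section >> 20);
	printf("memory described by 1 GiB of vmemmap: %lu GiB\n",
	       mem_per_gb_vmemmap >> 30);
	return 0;
}

This prints 2 MiB per section and 64 GiB per 1 GiB of vmemmap: since sections
are added and removed in 128 MiB units, the populate path never sees a request
large and aligned enough to justify a PUD-sized vmemmap mapping.]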
David Hildenbrand March 8, 2021, 7:25 p.m. UTC | #3
On 08.03.21 19:48, Oscar Salvador wrote:
> On Thu, Mar 04, 2021 at 10:42:59AM -0800, Dave Hansen wrote:
>> On 3/1/21 12:32 AM, Oscar Salvador wrote:
>>> We never get to allocate 1GB pages when mapping the vmemmap range.
>>> Drop the dead code both for the aligned and unaligned cases and leave
>>> only the direct map handling.
>>
>> Could you elaborate a bit on why 1GB pages are never used?  Is it just
>> unlikely to have a 64GB contiguous area of memory that needs 1GB of
>> contiguous vmemmap?  Or, does the fact that sections are smaller than
>> 64GB keep this from happening?
> 
> AFAIK, the biggest page size we populate vmemmap with is 2MB, plus the fact
> that, as you pointed out, memory sections on x86_64 are 128MB, which is
> way smaller than what would be required to justify allocating 1GB for vmemmap pages.
> 
> Am I missing something?

Right now, it is dead code that you are removing.

Just like for 2MB vmemmap pages, we would have to proactively populate 1G
pages when adding individual sections. You can easily waste a lot of memory.

Of course, one could also make a final pass over the tables to see where
it makes sense to form 1GB pages.

But then, we would need quite some logic when removing individual 
sections (e.g., a 128 MB DIMM) - and I remember there are corner cases 
where we might have to remove boot memory ...

Long story short, I don't think 1G vmemmap pages are really worth the 
trouble.
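
[Editor's note: to put a rough number on the memory-waste argument, here is a
sketch that assumes 2 MiB of vmemmap per 128 MiB section (per the arithmetic
above) and imagines each hot-added section being backed by its own PUD-sized
vmemmap page:

#include <stdio.h>

/* Sketch of the memory-waste argument: backing a single hot-added
 * 128 MiB section with its own PUD-sized (1 GiB) vmemmap page,
 * assuming 2 MiB of vmemmap per section. */
#define PUD_SIZE            (1UL << 30)  /* 1 GiB */
#define VMEMMAP_PER_SECTION (2UL << 20)  /* 2 MiB */

int main(void)
{
	unsigned long wasted = PUD_SIZE - VMEMMAP_PER_SECTION;

	printf("used:   %lu MiB\n", VMEMMAP_PER_SECTION >> 20);
	printf("wasted: %lu MiB (%.1f%% of the 1 GiB page)\n",
	       wasted >> 20, 100.0 * wasted / PUD_SIZE);
	return 0;
}

Roughly 1022 MiB of each such 1 GiB mapping would sit unused until neighbouring
sections happened to fill it in.]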

Patch

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b0e1d215c83e..9ecb3c488ac8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1062,7 +1062,6 @@  remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
 	unsigned long next, pages = 0;
 	pmd_t *pmd_base;
 	pud_t *pud;
-	void *page_addr;
 
 	pud = pud_start + pud_index(addr);
 	for (; addr < end; addr = next, pud++) {
@@ -1071,33 +1070,13 @@  remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
 		if (!pud_present(*pud))
 			continue;
 
-		if (pud_large(*pud)) {
-			if (IS_ALIGNED(addr, PUD_SIZE) &&
-			    IS_ALIGNED(next, PUD_SIZE)) {
-				if (!direct)
-					free_pagetable(pud_page(*pud),
-						       get_order(PUD_SIZE));
-
-				spin_lock(&init_mm.page_table_lock);
-				pud_clear(pud);
-				spin_unlock(&init_mm.page_table_lock);
-				pages++;
-			} else {
-				/* If here, we are freeing vmemmap pages. */
-				memset((void *)addr, PAGE_INUSE, next - addr);
-
-				page_addr = page_address(pud_page(*pud));
-				if (!memchr_inv(page_addr, PAGE_INUSE,
-						PUD_SIZE)) {
-					free_pagetable(pud_page(*pud),
-						       get_order(PUD_SIZE));
-
-					spin_lock(&init_mm.page_table_lock);
-					pud_clear(pud);
-					spin_unlock(&init_mm.page_table_lock);
-				}
-			}
-
+		if (pud_large(*pud) &&
+		    IS_ALIGNED(addr, PUD_SIZE) &&
+		    IS_ALIGNED(next, PUD_SIZE)) {
+			spin_lock(&init_mm.page_table_lock);
+			pud_clear(pud);
+			spin_unlock(&init_mm.page_table_lock);
+			pages++;
 			continue;
 		}