Message ID | 20170426201126.GA32407@dhcp22.suse.cz (mailing list archive) |
---|---|
State | New, archived |
Ping on this. Andrew, are you going to fold this or should I post a
separate patch?

[...]
> I cannot say I would be really happy about the chosen approach,
> though. Why HASH_ADAPT is not implicit? Which hash table would need
> gigabytes of memory and still benefit from it? Even if there is such an
> example then it should use the explicit high_limit. I do not like this
> opt-in because it is just too easy to miss that and hit the same issue
> again. And in fact only few users of alloc_large_system_hash are using
> the flag. E.g. why {dcache,inode}_init_early do not have the flag? I
> am pretty sure that having a physically contiguous hash table would be
> better over vmalloc from the TLB point of view.
>
> mount_hashtable resp. mountpoint_hashtable are another example. Other
> users just have a reasonable max value. So can we do the following
> on top of your commit? I think that we should rethink the scaling as
> well but I do not have a good answer for the maximum size so let's just
> start with a more reasonable API first.
> ---
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 808ea99062c2..363502faa328 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -3585,7 +3585,7 @@ static void __init dcache_init(void)
>  					sizeof(struct hlist_bl_head),
>  					dhash_entries,
>  					13,
> -					HASH_ZERO | HASH_ADAPT,
> +					HASH_ZERO,
>  					&d_hash_shift,
>  					&d_hash_mask,
>  					0,
> diff --git a/fs/inode.c b/fs/inode.c
> index a9caf53df446..b3c0731ec1fe 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1950,7 +1950,7 @@ void __init inode_init(void)
>  					sizeof(struct hlist_head),
>  					ihash_entries,
>  					14,
> -					HASH_ZERO | HASH_ADAPT,
> +					HASH_ZERO,
>  					&i_hash_shift,
>  					&i_hash_mask,
>  					0,
> diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
> index dbaf312b3317..e223d91b6439 100644
> --- a/include/linux/bootmem.h
> +++ b/include/linux/bootmem.h
> @@ -359,7 +359,6 @@ extern void *alloc_large_system_hash(const char *tablename,
>  #define HASH_SMALL	0x00000002	/* sub-page allocation allowed, min
>  					 * shift passed via *_hash_shift */
>  #define HASH_ZERO	0x00000004	/* Zero allocated hash table */
> -#define HASH_ADAPT	0x00000008	/* Adaptive scale for large memory */
>  
>  /* Only NUMA needs hash distribution. 64bit NUMA architectures have
>   * sufficient vmalloc space.
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index fa752de84eef..3bf60669d200 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7226,7 +7226,7 @@ void *__init alloc_large_system_hash(const char *tablename,
>  		if (PAGE_SHIFT < 20)
>  			numentries = round_up(numentries, (1<<20)/PAGE_SIZE);
>  
> -		if (flags & HASH_ADAPT) {
> +		if (!high_limit) {
>  			unsigned long adapt;
>  
>  			for (adapt = ADAPT_SCALE_NPAGES; adapt < numentries;
>
> -- 
> Michal Hocko
> SUSE Labs
Hi Michal,

I do not really want to impose any hard limit, because I do not know
what it should be.

The owners of the subsystems that use these large hash tables should
make a call, and perhaps pass high_limit, if needed, into
alloc_large_system_hash().

The previous growth rate was unacceptable, because in addition to
allocating large tables (which is acceptable if we take the total system
memory size into account), we also needed to zero them, and zeroing
while we have only one CPU available was significantly increasing the
boot time.

Now, on 32T the hash table is 1G instead of 32G, so the call is 32 times
faster to finish. While it is not a good idea to waste memory, both 1G
and 32G are insignificant amounts of memory compared to the total memory
of such 32T systems (about 0.003% and 0.1% respectively).

Here is the boot log on a 32T system without this fix:
https://hastebin.com/muruzoveno.go

[  769.622359] Dentry cache hash table entries: 2147483648 (order: 21, 17179869184 bytes)
[  791.942136] Inode-cache hash table entries: 2147483648 (order: 21, 17179869184 bytes)
[  810.810745] Mount-cache hash table entries: 67108864 (order: 16, 536870912 bytes)
[  810.922322] Mountpoint-cache hash table entries: 67108864 (order: 16, 536870912 bytes)
[  812.125398] ftrace: allocating 20650 entries in 41 pages

Total time 42.5s

With this fix (and some other fixes unrelated to this interval):
https://hastebin.com/buxucurawa.go

[   12.621164] Dentry cache hash table entries: 134217728 (order: 17, 1073741824 bytes)
[   12.869462] Inode-cache hash table entries: 67108864 (order: 16, 536870912 bytes)
[   13.101963] Mount-cache hash table entries: 67108864 (order: 16, 536870912 bytes)
[   13.331988] Mountpoint-cache hash table entries: 67108864 (order: 16, 536870912 bytes)
[   13.364661] ftrace: allocating 20650 entries in 41 pages

Total time 0.76s.

So, it scales well for 32T systems, and will scale well for the
foreseeable future without adding a hard ceiling limit.

Pasha

On 04/26/2017 04:11 PM, Michal Hocko wrote:
> On Fri 03-03-17 15:32:47, Andrew Morton wrote:
>> On Thu, 2 Mar 2017 00:33:45 -0500 Pavel Tatashin <pasha.tatashin@oracle.com> wrote:
>>
>>> Allow hash tables to scale with memory but at slower pace, when HASH_ADAPT
>>> is provided every time memory quadruples the sizes of hash tables will only
>>> double instead of quadrupling as well. This algorithm starts working only
>>> when memory size reaches a certain point, currently set to 64G.
>>>
>>> This is example of dentry hash table size, before and after four various
>>> memory configurations:
>>>
>>> MEMORY      SCALE         HASH_SIZE
>>>           old   new      old      new
>>>     8G     13    13       8M       8M
>>>    16G     13    13      16M      16M
>>>    32G     13    13      32M      32M
>>>    64G     13    13      64M      64M
>>>   128G     13    14     128M      64M
>>>   256G     13    14     256M     128M
>>>   512G     13    15     512M     128M
>>>  1024G     13    15    1024M     256M
>>>  2048G     13    16    2048M     256M
>>>  4096G     13    16    4096M     512M
>>>  8192G     13    17    8192M     512M
>>> 16384G     13    17   16384M    1024M
>>> 32768G     13    18   32768M    1024M
>>> 65536G     13    18   65536M    2048M
>>
>> OK, but what are the runtime effects? Presumably some workloads will
>> slow down a bit. How much? How do we know that this is a worthwhile
>> tradeoff?
>>
>> If the effect of this change is "undetectable" then those hash tables
>> are simply too large, and additional tuning is needed, yes?
>
> I am playing with a 3TB and have hit the following
> [    0.961309] Dentry cache hash table entries: 536870912 (order: 20, 4294967296 bytes)
> [    2.300012] vmalloc: allocation failure, allocated 1383612416 of 2147487744 bytes
> [    2.307473] swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
> [    2.315101] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.4.49-hotplug19-default #1
> [    2.324017] Hardware name: Huawei 9008/IT91SMUB, BIOS BLXSV607 04/17/2017
> [    2.330775]  ffffffff8101aba5 ffffffff8130efa0 ffffffff81863f48 ffffffff81c03e40
> [    2.338201]  ffffffff8118c9a2 02080020fff00300 ffffffff81863f48 ffffffff81c03de0
> [    2.345628]  0000000000000018 ffffffff81c03e50 ffffffff81c03df8 ffffffff811d28e6
> [    2.353056] Call Trace:
> [    2.355507]  [<ffffffff81019a99>] dump_trace+0x59/0x310
> [    2.360710]  [<ffffffff81019e3a>] show_stack_log_lvl+0xea/0x170
> [    2.366605]  [<ffffffff8101abc1>] show_stack+0x21/0x40
> [    2.371723]  [<ffffffff8130efa0>] dump_stack+0x5c/0x7c
> [    2.376842]  [<ffffffff8118c9a2>] warn_alloc_failed+0xe2/0x150
> [    2.382655]  [<ffffffff811c2a10>] __vmalloc_node_range+0x240/0x280
> [    2.388814]  [<ffffffff811c2a97>] __vmalloc+0x47/0x50
> [    2.393851]  [<ffffffff81da02ae>] alloc_large_system_hash+0x189/0x25d
> [    2.400264]  [<ffffffff81da7625>] inode_init+0x74/0xa3
> [    2.405381]  [<ffffffff81da7483>] vfs_caches_init+0x59/0xe1
> [    2.410930]  [<ffffffff81d6f070>] start_kernel+0x474/0x4d0
> [    2.416392]  [<ffffffff81d6e719>] x86_64_start_kernel+0x147/0x156
>
> Allocating 4G for a hash table is just ridiculous. 512MB which this
> patch should give looks much reasonable, although I would argue it is
> still a _lot_.
> I cannot say I would be really happy about the chosen approach,
> though. Why HASH_ADAPT is not implicit? Which hash table would need
> gigabytes of memory and still benefit from it? Even if there is such an
> example then it should use the explicit high_limit.
> [...]
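The scaling rule from the original patch description (hash tables double each time memory quadruples, starting at 64G) can be sketched in user-space C. This is only an illustration, not the kernel code: the 64G starting point and the quadruple/double rule come from the quoted description, while the 4K page size, the 8-byte bucket size, and the names adaptive_scale() and hash_table_bytes() are assumptions made for the example. It also leaves out the rounding and limit handling that alloc_large_system_hash() performs.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12               /* assumption: 4K pages */
#define SKETCH_ADAPT_BASE   (64ULL << 30)    /* 64G, from the patch description */
#define SKETCH_ADAPT_SHIFT  2                /* one step each time memory quadruples */
#define SKETCH_ADAPT_NPAGES (SKETCH_ADAPT_BASE >> SKETCH_PAGE_SHIFT)

/* Raise the scale by one for every 4x of memory beyond 64G. */
static unsigned int adaptive_scale(uint64_t npages, unsigned int scale)
{
	uint64_t adapt;

	for (adapt = SKETCH_ADAPT_NPAGES; adapt < npages;
	     adapt <<= SKETCH_ADAPT_SHIFT)
		scale++;
	return scale;
}

/* One bucket per 2^scale bytes of memory, bucket_size bytes per bucket. */
static uint64_t hash_table_bytes(uint64_t mem_bytes, unsigned int scale,
				 uint64_t bucket_size)
{
	uint64_t npages = mem_bytes >> SKETCH_PAGE_SHIFT;

	scale = adaptive_scale(npages, scale);
	return (mem_bytes >> scale) * bucket_size;
}
```

With a base scale of 13 and 8-byte buckets, hash_table_bytes(128ULL << 30, 13, 8) gives 64M and hash_table_bytes(32768ULL << 30, 13, 8) gives 1024M, which agrees with the "new" column of the table in the patch description.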
BTW, I am OK with your patch on top of this "Adaptive hash table" patch,
but I do not know what high_limit should be from where HASH_ADAPT will
kick in. Does 128M sound reasonable to you?

Pasha

On 05/04/2017 02:23 PM, Pasha Tatashin wrote:
> [...]
On Thu 04-05-17 14:23:24, Pasha Tatashin wrote:
> Hi Michal,
>
> I do not really want to impose any hard limit, because I do not know
> what it should be.
>
> The owners of the subsystems that use these large hash tables should
> make a call, and perhaps pass high_limit, if needed, into
> alloc_large_system_hash().

Some of them surely should. E.g. mount_hashtable resp. mountpoint_hashtable
really do not need a large hash AFAIU. On the other hand it is somehow
handy to scale dentry and inode hashes according to the amount of
memory. But the scale factor should be much slower than the current
upstream implementation. As I've said I do not want to judge your
scaling change. All I am saying is that making it explicit is just
_wrong_ because it a) doesn't cover all cases, just the two you have
noticed, and b) new users will most probably just copy&paste existing
users so chances are they will introduce the same large hashtables
without a good reason. I would even say that the user shouldn't care
about how the scaling is implemented. There is a way to limit it and if
there is no limit set then just do whatever is appropriate.

> The previous growth rate was unacceptable, because in addition to
> allocating large tables (which is acceptable if we take the total system
> memory size into account), we also needed to zero them, and zeroing
> while we have only one CPU available was significantly increasing the
> boot time.
>
> Now, on 32T the hash table is 1G instead of 32G, so the call is 32 times
> faster to finish. While it is not a good idea to waste memory, both 1G
> and 32G are insignificant amounts of memory compared to the total memory
> of such 32T systems (about 0.003% and 0.1% respectively).

Try to think in terms of hashed objects. How many objects would we need
to hash? Also this might not be a significant portion of the memory but
it is still memory which can be used for other purposes.
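One way to follow the "think in terms of hashed objects" suggestion is to turn a table size into a bucket count and a share of total memory. The sketch below is only an illustration: it assumes every bucket is a single 8-byte list head (which is the ratio of entries to bytes in the boot logs quoted in this thread), and buckets_for_bytes() and table_share_percent() are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Number of hash buckets that fit in a table of table_bytes bytes. */
static uint64_t buckets_for_bytes(uint64_t table_bytes, uint64_t bucket_size)
{
	return table_bytes / bucket_size;
}

/* Share of total memory taken by the table, in percent. */
static double table_share_percent(uint64_t table_bytes, uint64_t mem_bytes)
{
	return 100.0 * (double)table_bytes / (double)mem_bytes;
}
```

For example, buckets_for_bytes(1073741824, 8) is 134217728, the dentry entry count in the second boot log. On a 32T machine a 1G table is roughly 0.003% of memory and a 32G table roughly 0.1%, so the open question is whether that many buckets are ever populated, not whether the memory share is large.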
On Thu 04-05-17 14:28:51, Pasha Tatashin wrote:
> BTW, I am OK with your patch on top of this "Adaptive hash table" patch,
> but I do not know what high_limit should be from where HASH_ADAPT will
> kick in. Does 128M sound reasonable to you?

For simplicity I would just use it unconditionally when no high_limit is
set. What would be the problem with that? If you look at current users
(and there are no new users emerging too often) then most of them just
want _some_ scaling. The original one obviously doesn't scale with large
machines. Are you OK to fold my change to your patch or you want me to
send a separate patch? AFAIK Andrew hasn't posted this patch to Linus
yet.
On 05/05/2017 09:30 AM, Michal Hocko wrote:
> On Thu 04-05-17 14:28:51, Pasha Tatashin wrote:
>> BTW, I am OK with your patch on top of this "Adaptive hash table" patch,
>> but I do not know what high_limit should be from where HASH_ADAPT will
>> kick in. Does 128M sound reasonable to you?
>
> For simplicity I would just use it unconditionally when no high_limit is
> set. What would be the problem with that?

Sure, that sounds good.

> If you look at current users
> (and there are no new users emerging too often) then most of them just
> want _some_ scaling. The original one obviously doesn't scale with large
> machines. Are you OK to fold my change to your patch or you want me to
> send a separate patch? AFAIK Andrew hasn't posted this patch to Linus
> yet.

I would like a separate patch because mine has soaked in mm tree for a
while now.

Thank you,
Pasha
diff --git a/fs/dcache.c b/fs/dcache.c
index 808ea99062c2..363502faa328 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3585,7 +3585,7 @@ static void __init dcache_init(void)
 					sizeof(struct hlist_bl_head),
 					dhash_entries,
 					13,
-					HASH_ZERO | HASH_ADAPT,
+					HASH_ZERO,
 					&d_hash_shift,
 					&d_hash_mask,
 					0,
diff --git a/fs/inode.c b/fs/inode.c
index a9caf53df446..b3c0731ec1fe 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1950,7 +1950,7 @@ void __init inode_init(void)
 					sizeof(struct hlist_head),
 					ihash_entries,
 					14,
-					HASH_ZERO | HASH_ADAPT,
+					HASH_ZERO,
 					&i_hash_shift,
 					&i_hash_mask,
 					0,
diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index dbaf312b3317..e223d91b6439 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -359,7 +359,6 @@ extern void *alloc_large_system_hash(const char *tablename,
 #define HASH_SMALL	0x00000002	/* sub-page allocation allowed, min
 					 * shift passed via *_hash_shift */
 #define HASH_ZERO	0x00000004	/* Zero allocated hash table */
-#define HASH_ADAPT	0x00000008	/* Adaptive scale for large memory */
 
 /* Only NUMA needs hash distribution. 64bit NUMA architectures have
  * sufficient vmalloc space.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fa752de84eef..3bf60669d200 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7226,7 +7226,7 @@ void *__init alloc_large_system_hash(const char *tablename,
 		if (PAGE_SHIFT < 20)
 			numentries = round_up(numentries, (1<<20)/PAGE_SIZE);
 
-		if (flags & HASH_ADAPT) {
+		if (!high_limit) {
 			unsigned long adapt;
 
 			for (adapt = ADAPT_SCALE_NPAGES; adapt < numentries;
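The behavioural change in the mm/page_alloc.c hunk can be illustrated with two small predicates. This is a sketch under assumptions: the flag values are the ones shown in the bootmem.h hunk, and adapts_before() and adapts_after() are hypothetical names introduced only for the comparison, not functions in the kernel.

```c
#include <stdbool.h>

#define HASH_ZERO   0x00000004UL   /* value as shown in the bootmem.h hunk */
#define HASH_ADAPT  0x00000008UL   /* the flag this patch removes */

/* Before the patch: adaptive scaling only for callers that pass HASH_ADAPT. */
static bool adapts_before(unsigned long flags, unsigned long high_limit)
{
	(void)high_limit;   /* the limit played no role in the decision */
	return (flags & HASH_ADAPT) != 0;
}

/* After the patch: adaptive scaling for every caller without a high_limit. */
static bool adapts_after(unsigned long flags, unsigned long high_limit)
{
	(void)flags;        /* no opt-in flag is consulted any more */
	return high_limit == 0;
}
```

A caller that passes only HASH_ZERO and no high_limit, which is how the thread describes the users that were missed, is not adapted before the change but is adapted after it. A caller that sets an explicit high_limit keeps relying on that limit instead.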