Message ID | 20210813164053.51481-1-yanghui.def@bytedance.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task | expand |
On Sat, Aug 14, 2021 at 12:40:53AM +0800, yanghui wrote: > +++ b/mm/mempolicy.c > @@ -193,7 +193,7 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes) > { > if (nodes_empty(*nodes)) > return -EINVAL; > - pol->nodes = *nodes; > + WRITE_ONCE(pol->nodes, *nodes); typedef struct { DECLARE_BITMAP(bits, MAX_NUMNODES); } nodemask_t; If MAX_NUMNODES is large enough, is WRITE_ONCE going to work? It could be 128 bits, and few architectures have an atomic 128-bit store instruction.
Hi yanghui, Thank you for the patch! Yet something to improve: [auto build test ERROR on v5.14-rc5] [cannot apply to hnaz-linux-mm/master next-20210813] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/yanghui/mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task/20210814-004451 base: 36a21d51725af2ce0700c6ebcb6b9594aac658a6 config: x86_64-randconfig-a003-20210813 (attached as .config) compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 62df4df41c939205b2dc0a2a3bfb75b8c1ed74fa) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/99d8d888eb92ea46a5f4883773f3edaee5ccd28e git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review yanghui/mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task/20210814-004451 git checkout 99d8d888eb92ea46a5f4883773f3edaee5ccd28e # save the attached .config to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> ld.lld: error: undefined symbol: __compiletime_assert_467 >>> referenced by mempolicy.c:1968 (mm/mempolicy.c:1968) >>> mempolicy.o:(offset_il_node) in archive mm/built-in.a --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Sat, Aug 14, 2021 at 07:16:24AM +0800, kernel test robot wrote: > Hi yanghui, > > Thank you for the patch! Yet something to improve: > > [auto build test ERROR on v5.14-rc5] > [cannot apply to hnaz-linux-mm/master next-20210813] > [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch] > > url: https://github.com/0day-ci/linux/commits/yanghui/mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task/20210814-004451 > base: 36a21d51725af2ce0700c6ebcb6b9594aac658a6 > config: x86_64-randconfig-a003-20210813 (attached as .config) > compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 62df4df41c939205b2dc0a2a3bfb75b8c1ed74fa) > reproduce (this is a W=1 build): > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # https://github.com/0day-ci/linux/commit/99d8d888eb92ea46a5f4883773f3edaee5ccd28e > git remote add linux-review https://github.com/0day-ci/linux > git fetch --no-tags linux-review yanghui/mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task/20210814-004451 > git checkout 99d8d888eb92ea46a5f4883773f3edaee5ccd28e > # save the attached .config to linux build tree > mkdir build_dir > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot <lkp@intel.com> > > All errors (new ones prefixed by >>): > > >> ld.lld: error: undefined symbol: __compiletime_assert_467 > >>> referenced by mempolicy.c:1968 (mm/mempolicy.c:1968) > >>> mempolicy.o:(offset_il_node) in archive mm/built-in.a Grabbing Nick's patch [1] to support the error function attribute in clang and the kernel side patch [2] turns this into: mm/mempolicy.c:1968:24: error: call to '__compiletime_assert_467' declared with 'error' attribute: Unsupported access size for {READ,WRITE}_ONCE(). nodemask_t nodemask = READ_ONCE(pol->nodes); ^ ./include/asm-generic/rwonce.h:49:2: note: expanded from macro 'READ_ONCE' compiletime_assert_rwonce_type(x); \ ^ ./include/asm-generic/rwonce.h:36:2: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ ./include/linux/compiler_types.h:322:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler_types.h:310:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ ./include/linux/compiler_types.h:303:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:246:1: note: expanded from here __compiletime_assert_467 ^ mm/mempolicy.c:214:2: error: call to '__compiletime_assert_448' declared with 'error' attribute: Unsupported access size for {READ,WRITE}_ONCE(). WRITE_ONCE(pol->nodes, *nodes); ^ ./include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE' compiletime_assert_rwonce_type(x); \ ^ ./include/asm-generic/rwonce.h:36:2: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ ./include/linux/compiler_types.h:322:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler_types.h:310:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ ./include/linux/compiler_types.h:303:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:16:1: note: expanded from here __compiletime_assert_448 ^ mm/mempolicy.c:337:2: error: call to '__compiletime_assert_452' declared with 'error' attribute: Unsupported access size for {READ,WRITE}_ONCE(). WRITE_ONCE(pol->nodes, tmp); ^ ./include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE' compiletime_assert_rwonce_type(x); \ ^ ./include/asm-generic/rwonce.h:36:2: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ ./include/linux/compiler_types.h:322:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler_types.h:310:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ ./include/linux/compiler_types.h:303:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:38:1: note: expanded from here __compiletime_assert_452 ^ mm/mempolicy.c:196:2: error: call to '__compiletime_assert_447' declared with 'error' attribute: Unsupported access size for {READ,WRITE}_ONCE(). WRITE_ONCE(pol->nodes, *nodes); ^ ./include/asm-generic/rwonce.h:60:2: note: expanded from macro 'WRITE_ONCE' compiletime_assert_rwonce_type(x); \ ^ ./include/asm-generic/rwonce.h:36:2: note: expanded from macro 'compiletime_assert_rwonce_type' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ ./include/linux/compiler_types.h:322:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler_types.h:310:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ ./include/linux/compiler_types.h:303:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:13:1: note: expanded from here __compiletime_assert_447 ^ 4 errors generated. Which appears to be what Matthew mentioned, as this config has CONFIG_NODES_SHIFT=10 so nodemask_t is 128 bytes large. [1]: https://reviews.llvm.org/D106030 [2]: https://lore.kernel.org/r/20210802202326.1817503-1-ndesaulniers@google.com/ Cheers, Nathan
Hi yanghui,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on v5.14-rc5]
[cannot apply to hnaz-linux-mm/master next-20210813]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/yanghui/mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task/20210814-004451
base: 36a21d51725af2ce0700c6ebcb6b9594aac658a6
config: ia64-randconfig-c024-20210813 (attached as .config)
compiler: ia64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/99d8d888eb92ea46a5f4883773f3edaee5ccd28e
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review yanghui/mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task/20210814-004451
git checkout 99d8d888eb92ea46a5f4883773f3edaee5ccd28e
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=ia64 SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from arch/ia64/include/asm/pgtable.h:153,
from include/linux/pgtable.h:6,
from arch/ia64/include/asm/uaccess.h:40,
from include/linux/uaccess.h:11,
from include/linux/sched/task.h:11,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/percpu-rwsem.h:7,
from include/linux/fs.h:33,
from include/linux/dax.h:5,
from include/linux/mempolicy.h:11,
from mm/mempolicy.c:70:
arch/ia64/include/asm/mmu_context.h: In function 'reload_context':
arch/ia64/include/asm/mmu_context.h:127:48: warning: variable 'old_rr4' set but not used [-Wunused-but-set-variable]
127 | unsigned long rr0, rr1, rr2, rr3, rr4, old_rr4;
| ^~~~~~~
In file included from <command-line>:
mm/mempolicy.c: In function 'mpol_new_interleave':
>> include/linux/compiler_types.h:328:45: error: call to '__compiletime_assert_319' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:309:25: note: in definition of macro '__compiletime_assert'
309 | prefix ## suffix(); \
| ^~~~~~
include/linux/compiler_types.h:328:9: note: in expansion of macro '_compiletime_assert'
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:60:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
60 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/mempolicy.c:196:9: note: in expansion of macro 'WRITE_ONCE'
196 | WRITE_ONCE(pol->nodes, *nodes);
| ^~~~~~~~~~
mm/mempolicy.c: In function 'mpol_new_bind':
include/linux/compiler_types.h:328:45: error: call to '__compiletime_assert_320' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:309:25: note: in definition of macro '__compiletime_assert'
309 | prefix ## suffix(); \
| ^~~~~~
include/linux/compiler_types.h:328:9: note: in expansion of macro '_compiletime_assert'
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:60:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
60 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/mempolicy.c:214:9: note: in expansion of macro 'WRITE_ONCE'
214 | WRITE_ONCE(pol->nodes, *nodes);
| ^~~~~~~~~~
mm/mempolicy.c: In function 'offset_il_node':
include/linux/compiler_types.h:328:45: error: call to '__compiletime_assert_326' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:309:25: note: in definition of macro '__compiletime_assert'
309 | prefix ## suffix(); \
| ^~~~~~
include/linux/compiler_types.h:328:9: note: in expansion of macro '_compiletime_assert'
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:49:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
49 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/mempolicy.c:1968:31: note: in expansion of macro 'READ_ONCE'
1968 | nodemask_t nodemask = READ_ONCE(pol->nodes);
| ^~~~~~~~~
mm/mempolicy.c: In function 'mpol_rebind_nodemask':
include/linux/compiler_types.h:328:45: error: call to '__compiletime_assert_321' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:309:25: note: in definition of macro '__compiletime_assert'
309 | prefix ## suffix(); \
| ^~~~~~
include/linux/compiler_types.h:328:9: note: in expansion of macro '_compiletime_assert'
328 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:36:9: note: in expansion of macro 'compiletime_assert'
36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
| ^~~~~~~~~~~~~~~~~~
include/asm-generic/rwonce.h:60:9: note: in expansion of macro 'compiletime_assert_rwonce_type'
60 | compiletime_assert_rwonce_type(x); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/mempolicy.c:337:9: note: in expansion of macro 'WRITE_ONCE'
337 | WRITE_ONCE(pol->nodes, tmp);
| ^~~~~~~~~~
vim +/__compiletime_assert_319 +328 include/linux/compiler_types.h
eb5c2d4b45e3d2d Will Deacon 2020-07-21 314
eb5c2d4b45e3d2d Will Deacon 2020-07-21 315 #define _compiletime_assert(condition, msg, prefix, suffix) \
eb5c2d4b45e3d2d Will Deacon 2020-07-21 316 __compiletime_assert(condition, msg, prefix, suffix)
eb5c2d4b45e3d2d Will Deacon 2020-07-21 317
eb5c2d4b45e3d2d Will Deacon 2020-07-21 318 /**
eb5c2d4b45e3d2d Will Deacon 2020-07-21 319 * compiletime_assert - break build and emit msg if condition is false
eb5c2d4b45e3d2d Will Deacon 2020-07-21 320 * @condition: a compile-time constant condition to check
eb5c2d4b45e3d2d Will Deacon 2020-07-21 321 * @msg: a message to emit if condition is false
eb5c2d4b45e3d2d Will Deacon 2020-07-21 322 *
eb5c2d4b45e3d2d Will Deacon 2020-07-21 323 * In tradition of POSIX assert, this macro will break the build if the
eb5c2d4b45e3d2d Will Deacon 2020-07-21 324 * supplied condition is *false*, emitting the supplied error message if the
eb5c2d4b45e3d2d Will Deacon 2020-07-21 325 * compiler has support to do so.
eb5c2d4b45e3d2d Will Deacon 2020-07-21 326 */
eb5c2d4b45e3d2d Will Deacon 2020-07-21 327 #define compiletime_assert(condition, msg) \
eb5c2d4b45e3d2d Will Deacon 2020-07-21 @328 _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
eb5c2d4b45e3d2d Will Deacon 2020-07-21 329
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Sat, Aug 14, 2021 at 12:58 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Sat, Aug 14, 2021 at 12:40:53AM +0800, yanghui wrote: > > +++ b/mm/mempolicy.c > > @@ -193,7 +193,7 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes) > > { > > if (nodes_empty(*nodes)) > > return -EINVAL; > > - pol->nodes = *nodes; > > + WRITE_ONCE(pol->nodes, *nodes); > > typedef struct { DECLARE_BITMAP(bits, MAX_NUMNODES); } nodemask_t; > > If MAX_NUMNODES is large enough, is WRITE_ONCE going to work? It could > be 128 bits, and few architectures have an atomic 128-bit store > instruction. > Hi Matthew, In my memory, the WRITE_ONCE will become memcpy if the size is greater than sizeof(long). But I found that it didn't support the size over sizeof(long) nowadays. So you are right. It doesn't work when MAX_NUMNODES is large enough. If we don't use WRITE_ONCE/READ_ONCE here, it seems like no problem. At least we can sure offset_il_node returns a valid node id. What is your opinion about this fix? Thanks, Muchun
On Sat, Aug 14, 2021 at 10:37:23AM +0800, Muchun Song wrote: > On Sat, Aug 14, 2021 at 12:58 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Sat, Aug 14, 2021 at 12:40:53AM +0800, yanghui wrote: > > > +++ b/mm/mempolicy.c > > > @@ -193,7 +193,7 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes) > > > { > > > if (nodes_empty(*nodes)) > > > return -EINVAL; > > > - pol->nodes = *nodes; > > > + WRITE_ONCE(pol->nodes, *nodes); > > > > typedef struct { DECLARE_BITMAP(bits, MAX_NUMNODES); } nodemask_t; > > > > If MAX_NUMNODES is large enough, is WRITE_ONCE going to work? It could > > be 128 bits, and few architectures have an atomic 128-bit store > > instruction. > > > > Hi Matthew, > > In my memory, the WRITE_ONCE will become memcpy if > the size is greater than sizeof(long). But I found that it didn't > support the size over sizeof(long) nowadays. So you are > right. It doesn't work when MAX_NUMNODES is large enough. > > If we don't use WRITE_ONCE/READ_ONCE here, it seems > like no problem. At least we can sure offset_il_node returns > a valid node id. What is your opinion about this fix? Yes, I think that's a reasonable fix. Include a comment about how copying pol->nodes to the stack might race with a change to pol->nodes, but that doesn't matter as long as we don't look at pol->nodes later?
diff --git a/mm/mempolicy.c b/mm/mempolicy.c index e32360e90274..9c3c168af3e2 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -193,7 +193,7 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes) { if (nodes_empty(*nodes)) return -EINVAL; - pol->nodes = *nodes; + WRITE_ONCE(pol->nodes, *nodes); return 0; } @@ -211,7 +211,7 @@ static int mpol_new_bind(struct mempolicy *pol, const nodemask_t *nodes) { if (nodes_empty(*nodes)) return -EINVAL; - pol->nodes = *nodes; + WRITE_ONCE(pol->nodes, *nodes); return 0; } @@ -334,7 +334,7 @@ static void mpol_rebind_nodemask(struct mempolicy *pol, const nodemask_t *nodes) if (nodes_empty(tmp)) tmp = *nodes; - pol->nodes = tmp; + WRITE_ONCE(pol->nodes, tmp); } static void mpol_rebind_preferred(struct mempolicy *pol, @@ -1965,7 +1965,8 @@ unsigned int mempolicy_slab_node(void) */ static unsigned offset_il_node(struct mempolicy *pol, unsigned long n) { - unsigned nnodes = nodes_weight(pol->nodes); + nodemask_t nodemask = READ_ONCE(pol->nodes); + unsigned nnodes = nodes_weight(nodemask); unsigned target; int i; int nid; @@ -1973,9 +1974,9 @@ static unsigned offset_il_node(struct mempolicy *pol, unsigned long n) if (!nnodes) return numa_node_id(); target = (unsigned int)n % nnodes; - nid = first_node(pol->nodes); + nid = first_node(nodemask); for (i = 0; i < target; i++) - nid = next_node(nid, pol->nodes); + nid = next_node(nid, nodemask); return nid; }
Servers happened below panic: Kernel version:5.4.56 BUG: unable to handle page fault for address: 0000000000002c48 RIP: 0010:__next_zones_zonelist+0x1d/0x40 [264003.977696] RAX: 0000000000002c40 RBX: 0000000000100dca RCX: 0000000000000014 [264003.977872] Call Trace: [264003.977888] __alloc_pages_nodemask+0x277/0x310 [264003.977908] alloc_page_interleave+0x13/0x70 [264003.977926] handle_mm_fault+0xf99/0x1390 [264003.977951] __do_page_fault+0x288/0x500 [264003.977979] ? schedule+0x39/0xa0 [264003.977994] do_page_fault+0x30/0x110 [264003.978010] page_fault+0x3e/0x50 The reason of panic is that MAX_NUMNODES is passd in the third parameter in function __alloc_pages_nodemask(preferred_nid). So if to access zonelist->zoneref->zone_idx in __next_zones_zonelist the panic will happen. In offset_il_node(), first_node() return nid from pol->v.nodes, after this other threads may changed pol->v.nodes before next_node(). This race condition will let next_node return MAX_NUMNODES.So put pol->nodes in a local variable. The race condition is between offset_il_node and cpuset_change_task_nodemask: CPU0: CPU1: alloc_pages_vma() interleave_nid(pol,) offset_il_node(pol,) first_node(pol->v.nodes) cpuset_change_task_nodemask //nodes==0xc mpol_rebind_task mpol_rebind_policy mpol_rebind_nodemask(pol,nodes) //nodes==0x3 next_node(nid, pol->v.nodes)//return MAX_NUMNODES Signed-off-by: yanghui <yanghui.def@bytedance.com> --- mm/mempolicy.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)