Message ID | 20220524103638.473-1-hezhongkun.hzk@bytedance.com (mailing list archive) |
---|---|
State | New |
Series | mm: memcontrol: add the mempolicy interface for cgroup v2. |
On Tue 24-05-22 18:36:38, hezhongkun wrote:
> From: Hezhongkun <hezhongkun.hzk@bytedance.com>
>
> Mempolicy is difficult to use because it is set in-process
> via a system call. We want to make it easier to use mempolicy
> in cgroups, so that we can control low-priority cgroups to
> allocate memory in specified nodes. So this patch adds
> the mempolicy interface.
>
> The mempolicy priority of memcgroup is higher than the priority
> of task. The order of getting the policy is:
> memcgroup->policy, task->policy or vma policy, default policy.
> memcgroup's policy is owned by itself, so descendants will
> not inherit it.

Why cannot you use cpuset cgroup?
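The lookup order described in the quoted patch text can be sketched in plain C. This is only an illustration under stated assumptions: `struct policy`, `struct lookup_ctx`, and `effective_policy()` are hypothetical userspace stand-ins, not kernel types or functions, and the vma-before-task ordering reflects one reading of "task->policy or vma policy".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mempolicy; only a name is kept. */
struct policy { const char *name; };

/* Hypothetical container for the policies that may apply to one allocation. */
struct lookup_ctx {
	const struct policy *memcg_policy; /* proposed cgroup policy, or NULL */
	const struct policy *vma_policy;   /* per-VMA policy, or NULL */
	const struct policy *task_policy;  /* per-task policy, or NULL */
};

static const struct policy default_policy = { "default" };

/* Return the first policy that is set: memcg, then vma, then task, then default. */
static const struct policy *effective_policy(const struct lookup_ctx *c)
{
	assert(c != NULL);
	if (c->memcg_policy)
		return c->memcg_policy;
	if (c->vma_policy)
		return c->vma_policy;
	if (c->task_policy)
		return c->task_policy;
	return &default_policy;
}
```

With this ordering a cgroup policy, once set, hides any policy the task chose for itself, which is the behaviour the patch description states.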
Hi Michal, thanks for your reply.

Mempolicy has two functions: which nodes to choose and how to use those
nodes. cpuset can only decide the first one; it is equivalent to the 'bind'
mempolicy. If cgroups support mempolicy, we can continue to develop more
policy types, for example allocating memory according to node weight.
We would like more precise control over memory allocation on NUMA
servers.

On Tue, May 24, 2022 at 6:47 PM Michal Hocko <mhocko@suse.com> wrote:

> On Tue 24-05-22 18:36:38, hezhongkun wrote:
> > From: Hezhongkun <hezhongkun.hzk@bytedance.com>
> >
> > Mempolicy is difficult to use because it is set in-process
> > via a system call. We want to make it easier to use mempolicy
> > in cgroups, so that we can control low-priority cgroups to
> > allocate memory in specified nodes. So this patch adds
> > the mempolicy interface.
> >
> > The mempolicy priority of memcgroup is higher than the priority
> > of task. The order of getting the policy is:
> > memcgroup->policy, task->policy or vma policy, default policy.
> > memcgroup's policy is owned by itself, so descendants will
> > not inherit it.
>
> Why cannot you use cpuset cgroup?
> --
> Michal Hocko
> SUSE Labs
>
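To make "allocate memory according to node weight" concrete, here is a minimal userspace sketch of one possible weighted selection scheme. It is an assumption for illustration, not something the patch implements: `weighted_node()` and its parameters are hypothetical names, and a real policy would also have to respect the allowed node mask.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the node for allocation number `seq` so that, over every
 * sum(weight) consecutive allocations, node i is chosen weight[i] times.
 * Returns -1 if all weights are zero.
 */
static int weighted_node(const unsigned int *weight, size_t nr_nodes,
			 unsigned long seq)
{
	unsigned long total = 0, slot;
	size_t i;

	assert(weight != NULL);
	for (i = 0; i < nr_nodes; i++)
		total += weight[i];
	if (total == 0)
		return -1;

	/* Map seq onto one slot of the repeating weight pattern. */
	slot = seq % total;
	for (i = 0; i < nr_nodes; i++) {
		if (slot < weight[i])
			return (int)i;
		slot -= weight[i];
	}
	return -1; /* not reached when total > 0 */
}
```

With weights {3, 1}, three out of every four allocations land on node 0 and the fourth on node 1. This is the "how to use the nodes" half of mempolicy that the reply says cpuset cannot express.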
On Tue 24-05-22 19:46:38, 贺中坤 wrote:
> Hi Michal, thanks for your reply.
> Mempolicy has two functions: which nodes to choose and how to use those
> nodes. cpuset can only decide the first one; it is equivalent to the 'bind'
> mempolicy. If cgroups support mempolicy, we can continue to develop more
> policy types, for example allocating memory according to node weight.
> We would like more precise control over memory allocation on NUMA
> servers.

Why cannot the cpuset controller be extended instead?
Hi hezhongkun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.18 next-20220524]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 143a6252e1b8ab424b4b293512a97cca7295c182
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20220524/202205242108.pqUxw2OF-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-1) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/6adb0a02c27c8811bee9783451ee25155baf490e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
        git checkout 6adb0a02c27c8811bee9783451ee25155baf490e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

>> mm/mempolicy.c:179:19: warning: no previous prototype for 'get_cgrp_or_task_policy' [-Wmissing-prototypes]
     179 | struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
         |                   ^~~~~~~~~~~~~~~~~~~~~~~
   mm/mempolicy.c: In function 'get_cgrp_or_task_policy':
>> mm/mempolicy.c:182:36: error: implicit declaration of function 'mem_cgroup_from_task'; did you mean 'perf_cgroup_from_task'? [-Werror=implicit-function-declaration]
     182 |         struct mem_cgroup *memcg = mem_cgroup_from_task(p);
         |                                    ^~~~~~~~~~~~~~~~~~~~
         |                                    perf_cgroup_from_task
>> mm/mempolicy.c:182:36: warning: initialization of 'struct mem_cgroup *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
>> mm/mempolicy.c:184:30: error: invalid use of undefined type 'struct mem_cgroup'
     184 |         pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
         |                              ^~
   mm/mempolicy.c:184:50: error: invalid use of undefined type 'struct mem_cgroup'
     184 |         pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
         |                                                  ^~
   mm/mempolicy.c: In function 'get_cgrp_or_vma_policy':
   mm/mempolicy.c:1799:36: warning: initialization of 'struct mem_cgroup *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    1799 |         struct mem_cgroup *memcg = mem_cgroup_from_task(current);
         |                                    ^~~~~~~~~~~~~~~~~~~~
   mm/mempolicy.c:1801:30: error: invalid use of undefined type 'struct mem_cgroup'
    1801 |         pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_vma_policy(vma, addr);
         |                              ^~
   mm/mempolicy.c:1801:50: error: invalid use of undefined type 'struct mem_cgroup'
    1801 |         pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_vma_policy(vma, addr);
         |                                                  ^~
   cc1: some warnings being treated as errors

vim +182 mm/mempolicy.c

   178
 > 179  struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
   180  {
   181          struct mempolicy *pol;
 > 182          struct mem_cgroup *memcg = mem_cgroup_from_task(p);
   183
 > 184          pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
   185          return pol;
   186  }
   187
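The errors above are what one would expect when the memory controller is compiled out: `struct mem_cgroup` is only forward-declared and `mem_cgroup_from_task()` has no declaration. That this config has CONFIG_MEMCG disabled is an inference from the messages, not something the report states. A common way to keep such call sites building is a config-guarded helper with a trivial fallback. The sketch below shows the pattern in userspace C; `HAVE_MEMCG`, `memcg_policy()`, and the struct names are hypothetical stand-ins, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

struct policy { int mode; };
struct task { struct policy *task_policy; };

#ifdef HAVE_MEMCG /* hypothetical stand-in for a kernel config option */
struct memcg { struct policy *policy; };
struct memcg *memcg_from_task(struct task *t); /* provided elsewhere */

static inline struct policy *memcg_policy(struct task *t)
{
	struct memcg *cg = memcg_from_task(t);

	return cg ? cg->policy : NULL;
}
#else
/* Feature compiled out: callers still build and always fall through. */
static inline struct policy *memcg_policy(struct task *t)
{
	(void)t;
	return NULL;
}
#endif

/* The caller never touches struct memcg directly, so it builds either way. */
static struct policy *cgrp_or_task_policy(struct task *t)
{
	struct policy *pol;

	assert(t != NULL);
	pol = memcg_policy(t);
	return pol ? pol : t->task_policy;
}
```

Built without `HAVE_MEMCG`, `cgrp_or_task_policy()` simply returns the task policy, so no caller has to dereference an incomplete type.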
Hi hezhongkun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.18 next-20220524]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 143a6252e1b8ab424b4b293512a97cca7295c182
config: x86_64-randconfig-a016 (https://download.01.org/0day-ci/archive/20220524/202205242200.VGAUIGvw-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 10c9ecce9f6096e18222a331c5e7d085bd813f75)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6adb0a02c27c8811bee9783451ee25155baf490e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
        git checkout 6adb0a02c27c8811bee9783451ee25155baf490e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

>> mm/mempolicy.c:182:29: error: call to undeclared function 'mem_cgroup_from_task'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
           struct mem_cgroup *memcg = mem_cgroup_from_task(p);
                                      ^
   mm/mempolicy.c:182:29: note: did you mean 'mem_cgroup_from_css'?
   include/linux/memcontrol.h:1267:20: note: 'mem_cgroup_from_css' declared here
   struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css)
                      ^
>> mm/mempolicy.c:182:21: warning: incompatible integer to pointer conversion initializing 'struct mem_cgroup *' with an expression of type 'int' [-Wint-conversion]
           struct mem_cgroup *memcg = mem_cgroup_from_task(p);
                              ^       ~~~~~~~~~~~~~~~~~~~~~~~
>> mm/mempolicy.c:184:23: error: incomplete definition of type 'struct mem_cgroup'
           pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
                           ~~~~~^
   include/linux/mm_types.h:31:8: note: forward declaration of 'struct mem_cgroup'
   struct mem_cgroup;
          ^
   mm/mempolicy.c:184:43: error: incomplete definition of type 'struct mem_cgroup'
           pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
                                               ~~~~~^
   include/linux/mm_types.h:31:8: note: forward declaration of 'struct mem_cgroup'
   struct mem_cgroup;
          ^
   mm/mempolicy.c:179:19: warning: no previous prototype for function 'get_cgrp_or_task_policy' [-Wmissing-prototypes]
   struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
                     ^
   mm/mempolicy.c:179:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
   ^
   static
   mm/mempolicy.c:1799:29: error: call to undeclared function 'mem_cgroup_from_task'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
           struct mem_cgroup *memcg = mem_cgroup_from_task(current);
                                      ^
   mm/mempolicy.c:1799:21: warning: incompatible integer to pointer conversion initializing 'struct mem_cgroup *' with an expression of type 'int' [-Wint-conversion]
           struct mem_cgroup *memcg = mem_cgroup_from_task(current);
                              ^       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   mm/mempolicy.c:1801:23: error: incomplete definition of type 'struct mem_cgroup'
           pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_vma_policy(vma, addr);
                           ~~~~~^
   include/linux/mm_types.h:31:8: note: forward declaration of 'struct mem_cgroup'
   struct mem_cgroup;
          ^
   mm/mempolicy.c:1801:43: error: incomplete definition of type 'struct mem_cgroup'
           pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_vma_policy(vma, addr);
                                               ~~~~~^
   include/linux/mm_types.h:31:8: note: forward declaration of 'struct mem_cgroup'
   struct mem_cgroup;
          ^
   3 warnings and 6 errors generated.

vim +/mem_cgroup_from_task +182 mm/mempolicy.c

   178
   179  struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
   180  {
   181          struct mempolicy *pol;
 > 182          struct mem_cgroup *memcg = mem_cgroup_from_task(p);
   183
 > 184          pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
   185          return pol;
   186  }
   187
Hi hezhongkun,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.18 next-20220524]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 143a6252e1b8ab424b4b293512a97cca7295c182
config: x86_64-randconfig-a014 (https://download.01.org/0day-ci/archive/20220524/202205242316.8f8rvh3s-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 10c9ecce9f6096e18222a331c5e7d085bd813f75)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6adb0a02c27c8811bee9783451ee25155baf490e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
        git checkout 6adb0a02c27c8811bee9783451ee25155baf490e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> mm/mempolicy.c:179:19: warning: no previous prototype for function 'get_cgrp_or_task_policy' [-Wmissing-prototypes]
   struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
                     ^
   mm/mempolicy.c:179:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
   ^
   static
   1 warning generated.

vim +/get_cgrp_or_task_policy +179 mm/mempolicy.c

   178
 > 179  struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
   180  {
   181          struct mempolicy *pol;
   182          struct mem_cgroup *memcg = mem_cgroup_from_task(p);
   183
   184          pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
   185          return pol;
   186  }
   187
Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 6adb0a02c27c8811bee9783451ee25155baf490e ("[PATCH] mm: memcontrol: add the mempolicy interface for cgroup v2.")
url: https://github.com/intel-lab-lkp/linux/commits/hezhongkun/mm-memcontrol-add-the-mempolicy-interface-for-cgroup-v2/20220524-183922
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 143a6252e1b8ab424b4b293512a97cca7295c182
patch link: https://lore.kernel.org/lkml/20220524103638.473-1-hezhongkun.hzk@bytedance.com

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@intel.com>

[    1.775514][    T2] WARNING: suspicious RCU usage
[    1.776115][    T2] 5.18.0-01158-g6adb0a02c27c #10 Not tainted
[    1.776513][    T2] -----------------------------
[    1.777133][    T2] include/linux/cgroup.h:495 suspicious rcu_dereference_check() usage!
[    1.777513][    T2]
[    1.777513][    T2] other info that might help us debug this:
[    1.777513][    T2]
[    1.778513][    T2]
[    1.778513][    T2] rcu_scheduler_active = 1, debug_locks = 1
[    1.779493][    T2] no locks held by kthreadd/2.
[    1.779514][    T2]
[    1.779514][    T2] stack backtrace:
[    1.780272][    T2] CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.18.0-01158-g6adb0a02c27c #10
[    1.780509][    T2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[    1.780509][    T2] Call Trace:
[    1.780509][    T2]  <TASK>
[    1.780509][    T2] dump_stack_lvl (kbuild/src/x86_64-2/lib/dump_stack.c:107 (discriminator 4))
[    1.780509][    T2] mem_cgroup_from_task (kbuild/src/x86_64-2/include/linux/cgroup.h:495 kbuild/src/x86_64-2/mm/memcontrol.c:909)
[    1.780509][    T2] get_cgrp_or_task_policy (kbuild/src/x86_64-2/mm/mempolicy.c:184)
[    1.780509][    T2] alloc_pages (kbuild/src/x86_64-2/mm/mempolicy.c:2280)
[    1.780509][    T2] allocate_slab (kbuild/src/x86_64-2/mm/slub.c:1799 kbuild/src/x86_64-2/mm/slub.c:1944)
[    1.780509][    T2] ___slab_alloc (kbuild/src/x86_64-2/mm/slub.c:3005)
[    1.780509][    T2] ? dup_task_struct (kbuild/src/x86_64-2/kernel/fork.c:172 kbuild/src/x86_64-2/kernel/fork.c:971)
[    1.780509][    T2] kmem_cache_alloc_node (kbuild/src/x86_64-2/mm/slub.c:3092 kbuild/src/x86_64-2/mm/slub.c:3183 kbuild/src/x86_64-2/mm/slub.c:3267)
[    1.780509][    T2] dup_task_struct (kbuild/src/x86_64-2/kernel/fork.c:172 kbuild/src/x86_64-2/kernel/fork.c:971)
[    1.780509][    T2] ? trace_hardirqs_on (kbuild/src/x86_64-2/kernel/trace/trace_preemptirq.c:50 (discriminator 22))
[    1.780509][    T2] copy_process (kbuild/src/x86_64-2/kernel/fork.c:2073)
[    1.780509][    T2] ? alloc_chain_hlocks (kbuild/src/x86_64-2/kernel/locking/lockdep.c:3455)
[    1.780509][    T2] ? add_chain_cache (kbuild/src/x86_64-2/kernel/locking/lockdep.c:3664)
[    1.780509][    T2] ? __lock_acquire (kbuild/src/x86_64-2/kernel/locking/lockdep.c:5029)
[    1.780509][    T2] ? __cleanup_sighand (kbuild/src/x86_64-2/kernel/fork.c:1982)
[    1.780509][    T2] ? finish_task_switch+0x20f/0x900
[    1.780509][    T2] ? check_prev_add (kbuild/src/x86_64-2/kernel/locking/lockdep.c:3759)
[    1.780509][    T2] ? __lock_release (kbuild/src/x86_64-2/kernel/locking/lockdep.c:5317)
[    1.780509][    T2] kernel_clone (kbuild/src/x86_64-2/kernel/fork.c:2644)
[    1.780509][    T2] ? create_io_thread (kbuild/src/x86_64-2/kernel/fork.c:2604)
[    1.780509][    T2] ? __lock_acquire (kbuild/src/x86_64-2/kernel/locking/lockdep.c:5029)
[    1.780509][    T2] ? finish_task_switch+0x214/0x900
[    1.780509][    T2] ? find_held_lock (kbuild/src/x86_64-2/kernel/locking/lockdep.c:5132)
[    1.780509][    T2] kernel_thread (kbuild/src/x86_64-2/kernel/fork.c:2687)
[    1.780509][    T2] ? __ia32_sys_clone3 (kbuild/src/x86_64-2/kernel/fork.c:2687)
[    1.780509][    T2] ? lock_downgrade (kbuild/src/x86_64-2/kernel/locking/lockdep.c:5293)
[    1.780509][    T2] ? kthread_complete_and_exit (kbuild/src/x86_64-2/kernel/kthread.c:331)
[    1.780509][    T2] ? kthreadd (kbuild/src/x86_64-2/kernel/kthread.c:396 kbuild/src/x86_64-2/kernel/kthread.c:745)
[    1.780509][    T2] ? do_raw_spin_unlock (kbuild/src/x86_64-2/arch/x86/include/asm/atomic.h:29 kbuild/src/x86_64-2/include/linux/atomic/atomic-instrumented.h:28 kbuild/src/x86_64-2/include/asm-generic/qspinlock.h:28 kbuild/src/x86_64-2/kernel/locking/spinlock_debug.c:100 kbuild/src/x86_64-2/kernel/locking/spinlock_debug.c:140)
[    1.780509][    T2] kthreadd (kbuild/src/x86_64-2/kernel/kthread.c:400 kbuild/src/x86_64-2/kernel/kthread.c:745)
[    1.780509][    T2] ? kthread_is_per_cpu (kbuild/src/x86_64-2/kernel/kthread.c:718)
[    1.780509][    T2] ret_from_fork (kbuild/src/x86_64-2/arch/x86/entry/entry_64.S:308)
[    1.780509][    T2]  </TASK>
[    1.781590][    T1] cblist_init_generic: Setting adjustable number of callback queues.
[    1.782518][    T1] cblist_init_generic: Setting shift to 1 and lim to 1.
[    1.783730][    T1] cblist_init_generic: Setting shift to 1 and lim to 1.
[    1.784646][    T1] Running RCU-tasks wait API self tests
[    1.785657][    T1] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[    1.787556][    T1] rcu: Hierarchical SRCU implementation.
[    1.791308][    T1] NMI watchdog: Perf NMI watchdog permanently disabled
[    1.792109][    T1] smp: Bringing up secondary CPUs ...
[    1.793634][    T1] x86: Booting SMP configuration:
[    1.794300][    T1] .... node  #0, CPUs:      #1
[    0.090644][    T0] masked ExtINT on CPU#1
[    1.797615][    T1] smp: Brought up 1 node, 2 CPUs
[    1.798527][    T1] smpboot: Max logical packages: 1
[    1.799519][    T1] smpboot: Total of 2 processors activated (8380.31 BogoMIPS)
[    1.802552][   T11] Callback from call_rcu_tasks_trace() invoked.
[    1.898728][   T10] Callback from call_rcu_tasks_rude() invoked.
[    1.998585][   T22] node 0 deferred pages initialised in 196ms
[    2.099652][    T1] allocated 268435456 bytes of page_ext
[    2.100769][    T1] Node 0, zone      DMA: page owner found early allocated 0 pages
[    2.106388][    T1] Node 0, zone    DMA32: page owner found early allocated 0 pages
[    2.143231][    T1] Node 0, zone   Normal: page owner found early allocated 66872 pages
[    2.145828][    T1] devtmpfs: initialized
[    2.147626][    T1] x86/mm: Memory block size: 128MB
[    2.195988][    T1] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    2.197567][    T1] futex hash table entries: 512 (order: 4, 65536 bytes, linear)
[    2.199426][    T1] pinctrl core: initialized pinctrl subsystem
[    2.212984][    T1] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    2.217521][    T1] audit: initializing netlink subsys (disabled)
[    2.219652][   T27] audit: type=2000 audit(1653397015.364:1): state=initialized audit_enabled=0 res=1
[    2.222174][    T1] thermal_sys: Registered thermal governor 'fair_share'
[    2.222184][    T1] thermal_sys: Registered thermal governor 'bang_bang'
[    2.222529][    T1] thermal_sys: Registered thermal governor 'step_wise'
[    2.223522][    T1] thermal_sys: Registered thermal governor 'user_space'
[    2.224756][    T1] cpuidle: using governor menu
[    2.227738][    T1] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    2.229834][    T1] PCI: Using configuration type 1 for base access
[    2.279233][    T1] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    2.281673][    T1] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    2.285568][    T1] cryptd: max_cpu_qlen set to 1000
[    2.291676][    T1] ACPI: Added _OSI(Module Device)
[    2.292523][    T1] ACPI: Added _OSI(Processor Device)
[    2.293523][    T1] ACPI: Added _OSI(3.0 _SCP Extensions)
[    2.294523][    T1] ACPI: Added _OSI(Processor Aggregator Device)
[    2.295547][    T1] ACPI: Added _OSI(Linux-Dell-Video)
[    2.296535][    T1] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    2.297539][    T1] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[    2.347738][    T1] ACPI: 1 ACPI AML tables successfully acquired and loaded
[    2.363724][    T1] ACPI: Interpreter enabled
[    2.364811][    T1] ACPI: PM: (supports S0 S3 S4 S5)
[    2.365567][    T1] ACPI: Using IOAPIC for interrupt routing
[    2.366799][    T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    2.370916][    T1] ACPI: Enabled 2 GPEs in block 00 to 0F

To reproduce:

        # build kernel
        cd linux
        cp config-5.18.0-01158-g6adb0a02c27c .config
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
        make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
        cd <mod-install-dir>
        find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
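The splat above shows `mem_cgroup_from_task()` being reached from `get_cgrp_or_task_policy()` with "no locks held", which is what trips the `rcu_dereference_check()` in `include/linux/cgroup.h`. The usual remedy for that class of warning is to do the lookup inside an RCU read-side critical section. The following is a userspace mock of that shape, not kernel code: `mock_rcu_read_lock()`, `mock_rcu_read_unlock()`, `mock_memcg_from_task()`, and the structs are hypothetical stand-ins, and the depth counter only imitates the lockdep check.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock: counts nesting of read-side critical sections. */
static int rcu_depth;

static void mock_rcu_read_lock(void)   { rcu_depth++; }
static void mock_rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

struct policy { int mode; };
struct memcg { struct policy *policy; };
struct task { struct memcg *cg; struct policy *task_policy; };

/* Imitates the lockdep check: only legal inside a read-side section. */
static struct memcg *mock_memcg_from_task(struct task *t)
{
	assert(rcu_depth > 0);
	return t->cg;
}

static struct policy *cgrp_or_task_policy(struct task *t)
{
	struct policy *pol = NULL;
	struct memcg *cg;

	mock_rcu_read_lock();
	cg = mock_memcg_from_task(t);
	if (cg)
		pol = cg->policy;
	mock_rcu_read_unlock();

	return pol ? pol : t->task_policy;
}
```

In a real kernel the returned policy pointer outlives the read-side section, so it would additionally need a reference count or some other lifetime guarantee. The mock does not model that.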
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 89b14729d59f..2261eeb6100c 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -343,6 +343,7 @@ struct mem_cgroup {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	struct deferred_split deferred_split_queue;
 #endif
+	struct mempolicy *mempolicy;

 	struct mem_cgroup_per_node *nodeinfo[];
 };
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 598fece89e2b..38108fd4df64 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6332,6 +6332,42 @@ static int memory_numa_stat_show(struct seq_file *m, void *v)

 	return 0;
 }
+
+static int memory_policy_show(struct seq_file *m, void *v)
+{
+	char buffer[64];
+	struct mempolicy *mpol = mem_cgroup_from_seq(m)->mempolicy;
+
+	memset(buffer, 0, sizeof(buffer));
+
+	if (!mpol || mpol->mode == MPOL_DEFAULT)
+		return 0;
+
+	mpol_to_str(buffer, sizeof(buffer), mpol);
+	seq_printf(m, buffer);
+	seq_putc(m, '\n');
+	return 0;
+}
+
+static ssize_t memory_policy_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off)
+{
+	int err = 1;
+	struct mempolicy *mpol, *old;
+	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+
+	old = memcg->mempolicy;
+	buf = strstrip(buf);
+	err = mpol_parse_str(buf, &mpol);
+
+	if (err)
+		goto out;
+	mpol_put(old);
+	memcg->mempolicy = mpol;
+out:
+	return nbytes;
+}
+
 #endif

 static int memory_oom_group_show(struct seq_file *m, void *v)
@@ -6416,6 +6452,12 @@ static struct cftype memory_files[] = {
 		.name = "numa_stat",
 		.seq_show = memory_numa_stat_show,
 	},
+	{
+		.name = "policy",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = memory_policy_show,
+		.write = memory_policy_write,
+	},
 #endif
 	{
 		.name = "oom.group",
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 8c74107a2b15..5153b046f8c3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -176,6 +176,16 @@ struct mempolicy *get_task_policy(struct task_struct *p)
 	return &default_policy;
 }

+struct mempolicy *get_cgrp_or_task_policy(struct task_struct *p)
+{
+	struct mempolicy *pol;
+	struct mem_cgroup *memcg = mem_cgroup_from_task(p);
+
+	pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_task_policy(p);
+	return pol;
+}
+
+
 static const struct mempolicy_operations {
 	int (*create)(struct mempolicy *pol, const nodemask_t *nodes);
 	void (*rebind)(struct mempolicy *pol, const nodemask_t *nodes);
@@ -1782,6 +1792,16 @@ static struct mempolicy *get_vma_policy(struct vm_area_struct *vma,
 	return pol;
 }

+static struct mempolicy *get_cgrp_or_vma_policy(struct vm_area_struct *vma,
+						unsigned long addr)
+{
+	struct mempolicy *pol;
+	struct mem_cgroup *memcg = mem_cgroup_from_task(current);
+
+	pol = (memcg && memcg->mempolicy) ? memcg->mempolicy : get_vma_policy(vma, addr);
+	return pol;
+}
+
 bool vma_policy_mof(struct vm_area_struct *vma)
 {
 	struct mempolicy *pol;
@@ -1896,7 +1916,7 @@ unsigned int mempolicy_slab_node(void)
 	if (!in_task())
 		return node;

-	policy = current->mempolicy;
+	policy = get_cgrp_or_task_policy(current);
 	if (!policy)
 		return node;

@@ -2005,7 +2025,7 @@ int huge_node(struct vm_area_struct *vma, unsigned long addr, gfp_t gfp_flags,
 	int nid;
 	int mode;

-	*mpol = get_vma_policy(vma, addr);
+	*mpol = get_cgrp_or_vma_policy(vma, addr);
 	*nodemask = NULL;
 	mode = (*mpol)->mode;

@@ -2158,7 +2178,7 @@ struct page *alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 	int preferred_nid;
 	nodemask_t *nmask;

-	pol = get_vma_policy(vma, addr);
+	pol = get_cgrp_or_vma_policy(vma, addr);

 	if (pol->mode == MPOL_INTERLEAVE) {
 		unsigned nid;
@@ -2257,7 +2277,7 @@ struct page *alloc_pages(gfp_t gfp, unsigned order)
 	struct page *page;

 	if (!in_interrupt() && !(gfp & __GFP_THISNODE))
-		pol = get_task_policy(current);
+		pol = get_cgrp_or_task_policy(current);

 	/*
 	 * No reference counting needed for current->mempolicy
@@ -2562,7 +2582,7 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
 	int polnid = NUMA_NO_NODE;
 	int ret = NUMA_NO_NODE;

-	pol = get_vma_policy(vma, addr);
+	pol = get_cgrp_or_vma_policy(vma, addr);
 	if (!(pol->flags & MPOL_F_MOF))
 		goto out;

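In the diff as posted, `memory_policy_write()` returns `nbytes` on both the success and the error path, so a writer cannot tell that an unparsable string was ignored. A minimal userspace sketch of the more conventional shape follows. `parse_mode()` and `policy_write()` are hypothetical stand-ins for `mpol_parse_str()` and the write handler, and the two accepted strings are assumptions chosen only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical parser: accepts two mode names, rejects everything else. */
static int parse_mode(const char *s, int *mode)
{
	if (strcmp(s, "bind") == 0) {
		*mode = 1;
		return 0;
	}
	if (strcmp(s, "interleave") == 0) {
		*mode = 2;
		return 0;
	}
	return -EINVAL;
}

/*
 * Write handler shape: propagate the parse error to the caller and leave
 * the current setting untouched; report nbytes only on success.
 */
static ssize_t policy_write(int *current_mode, const char *buf, size_t nbytes)
{
	int mode, err;

	assert(current_mode != NULL && buf != NULL);
	err = parse_mode(buf, &mode);
	if (err)
		return err;

	*current_mode = mode;
	return (ssize_t)nbytes;
}
```

A caller writing "bogus" gets `-EINVAL` back and the previously installed mode is preserved, instead of a silent success.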