Message ID | 20220217061616.3303271-1-vipinsh@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] KVM: Move VM's worker kthreads back to the original cgroup before exiting. | expand |
Hi Vipin, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on db6e7adf8de9b3b99a9856acb73870cc3a70e3ca] url: https://github.com/0day-ci/linux/commits/Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroup-before-exiting/20220217-141723 base: db6e7adf8de9b3b99a9856acb73870cc3a70e3ca config: s390-randconfig-s032-20220217 (https://download.01.org/0day-ci/archive/20220217/202202172046.GuW8pHQc-lkp@intel.com/config) compiler: s390-linux-gcc (GCC) 11.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.4-dirty # https://github.com/0day-ci/linux/commit/1abffef71ef85b6fb8f1296e6ef38febc4f2b007 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroup-before-exiting/20220217-141723 git checkout 1abffef71ef85b6fb8f1296e6ef38febc4f2b007 # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=s390 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> sparse warnings: (new ones prefixed by >>) arch/s390/kvm/../../../virt/kvm/kvm_main.c: note: in included file: include/linux/kvm_host.h:1877:54: sparse: sparse: array of flexible structures include/linux/kvm_host.h:1879:56: sparse: sparse: array of flexible structures >> arch/s390/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct task_struct *from @@ got struct task_struct [noderef] __rcu *real_parent @@ arch/s390/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: expected struct task_struct *from arch/s390/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: got struct task_struct [noderef] __rcu *real_parent arch/s390/kvm/../../../virt/kvm/kvm_main.c:2522:9: sparse: sparse: context imbalance in 'hva_to_pfn_remapped' - unexpected unlock vim +5859 arch/s390/kvm/../../../virt/kvm/kvm_main.c 5805 5806 static int kvm_vm_worker_thread(void *context) 5807 { 5808 /* 5809 * The init_context is allocated on the stack of the parent thread, so 5810 * we have to locally copy anything that is needed beyond initialization 5811 */ 5812 struct kvm_vm_worker_thread_context *init_context = context; 5813 struct kvm *kvm = init_context->kvm; 5814 kvm_vm_thread_fn_t thread_fn = init_context->thread_fn; 5815 uintptr_t data = init_context->data; 5816 int err, reattach_err; 5817 5818 err = kthread_park(current); 5819 /* kthread_park(current) is never supposed to return an error */ 5820 WARN_ON(err != 0); 5821 if (err) 5822 goto init_complete; 5823 5824 err = cgroup_attach_task_all(init_context->parent, current); 5825 if (err) { 5826 kvm_err("%s: cgroup_attach_task_all failed with err %d\n", 5827 __func__, err); 5828 goto init_complete; 5829 } 5830 5831 set_user_nice(current, task_nice(init_context->parent)); 5832 5833 init_complete: 5834 init_context->err = err; 5835 complete(&init_context->init_done); 5836 init_context = NULL; 5837 5838 if (err) 5839 goto out; 5840 5841 /* Wait to be woken up by the spawner before proceeding. */ 5842 kthread_parkme(); 5843 5844 if (!kthread_should_stop()) 5845 err = thread_fn(kvm, data); 5846 5847 out: 5848 /* 5849 * Move kthread back to its original cgroup to prevent it lingering in 5850 * the cgroup of the VM process, after the latter finishes its 5851 * execution. 5852 * 5853 * kthread_stop() waits on the 'exited' completion condition which is 5854 * set in exit_mm(), via mm_release(), in do_exit(). However, the 5855 * kthread is removed from the cgroup in the cgroup_exit() which is 5856 * called after the exit_mm(). This causes the kthread_stop() to return 5857 * before the kthread actually quits the cgroup. 5858 */ > 5859 reattach_err = cgroup_attach_task_all(current->real_parent, current); 5860 if (reattach_err) { 5861 kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n", 5862 __func__, reattach_err); 5863 } 5864 return err; 5865 } 5866 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On 2/17/22 13:34, kernel test robot wrote: >> 5859 reattach_err = cgroup_attach_task_all(current->real_parent, current); This needs to be within rcu_dereference(). Paolo > 5860 if (reattach_err) { > 5861 kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n", > 5862 __func__, reattach_err); > 5863 } > 5864 return err; > 5865 } > 5866 > > --- > 0-DAY CI Kernel Test Service, Intel Corporation > https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org >
Hi Vipin, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on db6e7adf8de9b3b99a9856acb73870cc3a70e3ca] url: https://github.com/0day-ci/linux/commits/Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroup-before-exiting/20220217-141723 base: db6e7adf8de9b3b99a9856acb73870cc3a70e3ca config: arm64-randconfig-s032-20220217 (https://download.01.org/0day-ci/archive/20220218/202202180218.msk1UR5R-lkp@intel.com/config) compiler: aarch64-linux-gcc (GCC) 11.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.4-dirty # https://github.com/0day-ci/linux/commit/1abffef71ef85b6fb8f1296e6ef38febc4f2b007 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroup-before-exiting/20220217-141723 git checkout 1abffef71ef85b6fb8f1296e6ef38febc4f2b007 # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> sparse warnings: (new ones prefixed by >>) arch/arm64/kvm/../../../virt/kvm/kvm_main.c: note: in included file: include/linux/kvm_host.h:1877:54: sparse: sparse: array of flexible structures include/linux/kvm_host.h:1879:56: sparse: sparse: array of flexible structures >> arch/arm64/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct task_struct *from @@ got struct task_struct [noderef] __rcu *real_parent @@ arch/arm64/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: expected struct task_struct *from arch/arm64/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: got struct task_struct [noderef] __rcu *real_parent arch/arm64/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_change_pte' - different lock contexts for basic block arch/arm64/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_invalidate_range_start' - different lock contexts for basic block arch/arm64/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_invalidate_range_end' - different lock contexts for basic block arch/arm64/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_clear_flush_young' - different lock contexts for basic block arch/arm64/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_clear_young' - different lock contexts for basic block arch/arm64/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_test_young' - different lock contexts for basic block arch/arm64/kvm/../../../virt/kvm/kvm_main.c:2522:9: sparse: sparse: context imbalance in 'hva_to_pfn_remapped' - unexpected unlock vim +5859 arch/arm64/kvm/../../../virt/kvm/kvm_main.c 5805 5806 static int kvm_vm_worker_thread(void *context) 5807 { 5808 /* 5809 * The init_context is allocated on the stack of the parent thread, so 5810 * we have to locally copy anything that is needed beyond initialization 5811 */ 5812 struct kvm_vm_worker_thread_context *init_context = context; 5813 struct kvm *kvm = init_context->kvm; 5814 kvm_vm_thread_fn_t thread_fn = init_context->thread_fn; 5815 uintptr_t data = init_context->data; 5816 int err, reattach_err; 5817 5818 err = kthread_park(current); 5819 /* kthread_park(current) is never supposed to return an error */ 5820 WARN_ON(err != 0); 5821 if (err) 5822 goto init_complete; 5823 5824 err = cgroup_attach_task_all(init_context->parent, current); 5825 if (err) { 5826 kvm_err("%s: cgroup_attach_task_all failed with err %d\n", 5827 __func__, err); 5828 goto init_complete; 5829 } 5830 5831 set_user_nice(current, task_nice(init_context->parent)); 5832 5833 init_complete: 5834 init_context->err = err; 5835 complete(&init_context->init_done); 5836 init_context = NULL; 5837 5838 if (err) 5839 goto out; 5840 5841 /* Wait to be woken up by the spawner before proceeding. */ 5842 kthread_parkme(); 5843 5844 if (!kthread_should_stop()) 5845 err = thread_fn(kvm, data); 5846 5847 out: 5848 /* 5849 * Move kthread back to its original cgroup to prevent it lingering in 5850 * the cgroup of the VM process, after the latter finishes its 5851 * execution. 5852 * 5853 * kthread_stop() waits on the 'exited' completion condition which is 5854 * set in exit_mm(), via mm_release(), in do_exit(). However, the 5855 * kthread is removed from the cgroup in the cgroup_exit() which is 5856 * called after the exit_mm(). This causes the kthread_stop() to return 5857 * before the kthread actually quits the cgroup. 5858 */ > 5859 reattach_err = cgroup_attach_task_all(current->real_parent, current); 5860 if (reattach_err) { 5861 kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n", 5862 __func__, reattach_err); 5863 } 5864 return err; 5865 } 5866 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Vipin, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on db6e7adf8de9b3b99a9856acb73870cc3a70e3ca] url: https://github.com/0day-ci/linux/commits/Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroup-before-exiting/20220217-141723 base: db6e7adf8de9b3b99a9856acb73870cc3a70e3ca config: x86_64-rhel-8.3-kselftests (https://download.01.org/0day-ci/archive/20220218/202202180730.AAgeOZkF-lkp@intel.com/config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.4-dirty # https://github.com/0day-ci/linux/commit/1abffef71ef85b6fb8f1296e6ef38febc4f2b007 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroup-before-exiting/20220217-141723 git checkout 1abffef71ef85b6fb8f1296e6ef38febc4f2b007 # save the config file to linux build tree mkdir build_dir make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> sparse warnings: (new ones prefixed by >>) arch/x86/kvm/../../../virt/kvm/kvm_main.c: note: in included file: include/linux/kvm_host.h:1877:54: sparse: sparse: array of flexible structures include/linux/kvm_host.h:1879:56: sparse: sparse: array of flexible structures >> arch/x86/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected struct task_struct *from @@ got struct task_struct [noderef] __rcu *real_parent @@ arch/x86/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: expected struct task_struct *from arch/x86/kvm/../../../virt/kvm/kvm_main.c:5859:54: sparse: got struct task_struct [noderef] __rcu *real_parent arch/x86/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_change_pte' - different lock contexts for basic block arch/x86/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_invalidate_range_start' - different lock contexts for basic block arch/x86/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_invalidate_range_end' - different lock contexts for basic block arch/x86/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_clear_flush_young' - different lock contexts for basic block arch/x86/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_clear_young' - different lock contexts for basic block arch/x86/kvm/../../../virt/kvm/kvm_main.c:538:9: sparse: sparse: context imbalance in 'kvm_mmu_notifier_test_young' - different lock contexts for basic block arch/x86/kvm/../../../virt/kvm/kvm_main.c:2522:9: sparse: sparse: context imbalance in 'hva_to_pfn_remapped' - unexpected unlock vim +5859 arch/x86/kvm/../../../virt/kvm/kvm_main.c 5805 5806 static int kvm_vm_worker_thread(void *context) 5807 { 5808 /* 5809 * The init_context is allocated on the stack of the parent thread, so 5810 * we have to locally copy anything that is needed beyond initialization 5811 */ 5812 struct kvm_vm_worker_thread_context *init_context = context; 5813 struct kvm *kvm = init_context->kvm; 5814 kvm_vm_thread_fn_t thread_fn = init_context->thread_fn; 5815 uintptr_t data = init_context->data; 5816 int err, reattach_err; 5817 5818 err = kthread_park(current); 5819 /* kthread_park(current) is never supposed to return an error */ 5820 WARN_ON(err != 0); 5821 if (err) 5822 goto init_complete; 5823 5824 err = cgroup_attach_task_all(init_context->parent, current); 5825 if (err) { 5826 kvm_err("%s: cgroup_attach_task_all failed with err %d\n", 5827 __func__, err); 5828 goto init_complete; 5829 } 5830 5831 set_user_nice(current, task_nice(init_context->parent)); 5832 5833 init_complete: 5834 init_context->err = err; 5835 complete(&init_context->init_done); 5836 init_context = NULL; 5837 5838 if (err) 5839 goto out; 5840 5841 /* Wait to be woken up by the spawner before proceeding. */ 5842 kthread_parkme(); 5843 5844 if (!kthread_should_stop()) 5845 err = thread_fn(kvm, data); 5846 5847 out: 5848 /* 5849 * Move kthread back to its original cgroup to prevent it lingering in 5850 * the cgroup of the VM process, after the latter finishes its 5851 * execution. 5852 * 5853 * kthread_stop() waits on the 'exited' completion condition which is 5854 * set in exit_mm(), via mm_release(), in do_exit(). However, the 5855 * kthread is removed from the cgroup in the cgroup_exit() which is 5856 * called after the exit_mm(). This causes the kthread_stop() to return 5857 * before the kthread actually quits the cgroup. 5858 */ > 5859 reattach_err = cgroup_attach_task_all(current->real_parent, current); 5860 if (reattach_err) { 5861 kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n", 5862 __func__, reattach_err); 5863 } 5864 return err; 5865 } 5866 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Thu, Feb 17, 2022, Paolo Bonzini wrote: > On 2/17/22 13:34, kernel test robot wrote: > > > 5859 reattach_err = cgroup_attach_task_all(current->real_parent, current); > > This needs to be within rcu_dereference(). As Vipin found out the hard way, cgroup_attach_task_all() can't be inside an RCU critical section as it might sleep due to mutex_lock(). My sched knowledge is sketchy at best, but I believe KVM just needs to guarantee the liveliness of real_parent when passing it to cgroup_attach_task_all(). Presumably kthreadd_task can (in theory) be killed/exiting while this is going on, but at that point I doubt anyone cares much about cgroup accuracy. So I think this can be: rcu_read_lock(); parent = rcu_dereference(current->real_parent); get_task_struct(parent); rcu_read_unlock(); (void)cgroup_attach_task_all(parent, current); put_task_struct(parent); > > 5860 if (reattach_err) { > > 5861 kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n", > > 5862 __func__, reattach_err); Eh, I wouldn't bother logging the error, userspace can't take action on it, and if the system is OOM it's just more noise in the log to dig through.
On 2/19/22 01:30, Sean Christopherson wrote: > > So I think this can be: > > rcu_read_lock(); > parent = rcu_dereference(current->real_parent); > get_task_struct(parent); > rcu_read_unlock(); > > (void)cgroup_attach_task_all(parent, current); > put_task_struct(parent); Yes, I agree. Paolo
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 83c57bcc6eb6..2c9dcfffb606 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5813,7 +5813,7 @@ static int kvm_vm_worker_thread(void *context) struct kvm *kvm = init_context->kvm; kvm_vm_thread_fn_t thread_fn = init_context->thread_fn; uintptr_t data = init_context->data; - int err; + int err, reattach_err; err = kthread_park(current); /* kthread_park(current) is never supposed to return an error */ @@ -5836,7 +5836,7 @@ static int kvm_vm_worker_thread(void *context) init_context = NULL; if (err) - return err; + goto out; /* Wait to be woken up by the spawner before proceeding. */ kthread_parkme(); @@ -5844,6 +5844,23 @@ static int kvm_vm_worker_thread(void *context) if (!kthread_should_stop()) err = thread_fn(kvm, data); +out: + /* + * Move kthread back to its original cgroup to prevent it lingering in + * the cgroup of the VM process, after the latter finishes its + * execution. + * + * kthread_stop() waits on the 'exited' completion condition which is + * set in exit_mm(), via mm_release(), in do_exit(). However, the + * kthread is removed from the cgroup in the cgroup_exit() which is + * called after the exit_mm(). This causes the kthread_stop() to return + * before the kthread actually quits the cgroup. + */ + reattach_err = cgroup_attach_task_all(current->real_parent, current); + if (reattach_err) { + kvm_err("%s: cgroup_attach_task_all failed on reattach with err %d\n", + __func__, reattach_err); + } return err; }
VM worker kthreads can linger in the VM process's cgroup for sometime after KVM terminates the VM process. KVM terminates the worker kthreads by calling kthread_stop() which waits on the 'exited' completion, triggered by exit_mm(), via mm_release(), in do_exit() during the kthread's exit. However, these kthreads are removed from the cgroup using the cgroup_exit() which happens after the exit_mm(). Therefore, a VM process can terminate in between the exit_mm() and cgroup_exit() calls, leaving only worker kthreads in the cgroup. Moving worker kthreads back to the original cgroup (kthreadd_task's cgroup) makes sure that the cgroup is empty as soon as the main VM process is terminated. Signed-off-by: Vipin Sharma <vipinsh@google.com> --- v3: - Use 'current->real_parent' (kthreadd_task) in the cgroup_attach_task_all() call. - Revert cgroup APIs changes in v2. Now, patch does not touch cgroup APIs. - Update commit and comment message v2: https://lore.kernel.org/lkml/20211222225350.1912249-1-vipinsh@google.com/ - Use kthreadd_task in the cgroup API to avoid build issue. v1: https://lore.kernel.org/lkml/20211214050708.4040200-1-vipinsh@google.com/ virt/kvm/kvm_main.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) base-commit: db6e7adf8de9b3b99a9856acb73870cc3a70e3ca