Message ID | 20211214050708.4040200-1-vipinsh@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: Move VM's worker kthreads back to the original cgroups before exiting. | expand |
Hi Vipin,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on d8f6ef45a623d650f9b97e11553adb4978f6aa70]
url: https://github.com/0day-ci/linux/commits/Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroups-before-exiting/20211214-130827
base: d8f6ef45a623d650f9b97e11553adb4978f6aa70
config: riscv-defconfig (https://download.01.org/0day-ci/archive/20211214/202112142328.a9ebDmd7-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/fd29d23507ef3f06b61d9de1b7ecd1a0d70136f3
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroups-before-exiting/20211214-130827
git checkout fd29d23507ef3f06b61d9de1b7ecd1a0d70136f3
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=riscv SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "kthreadd_task" [arch/riscv/kvm/kvm.ko] undefined!
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Dec 14, 2021 at 4:13 PM Vipin Sharma <vipinsh@google.com> wrote: > > VM worker kthreads can linger in the VM process's cgroup for sometime > after KVM temrinates the VM process. > > KVM terminates the worker kthreads by calling kthread_stop() which waits > on the signal generated by exit_mm() in do_exit() during kthread's exit. > However, these kthreads are removed from the cgroup using cgroup_exit() > call which happens after exit_mm() in do_exit(). A VM process can > terminate between the time window of exit_mm() to cgroup_exit(), leaving > only worker kthreads in the cgroup. > > Moving worker kthreads back to the original cgroup (kthreadd_task's > cgroup) makes sure that cgroup is empty as soon as the main VM process > is terminated. > > Signed-off-by: Vipin Sharma <vipinsh@google.com> > --- > virt/kvm/kvm_main.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) Hello Off-topic, can this kvm worker_thread and the thread to do async pagefault be possibly changed to use something like io_uring's IOWQ (fs/io-wq.c) created by create_io_thread()? So that every resource the threads used are credited to the process of the vm. Thanks Lai
On Tue, Dec 14, 2021, Vipin Sharma wrote: > VM worker kthreads can linger in the VM process's cgroup for sometime > after KVM temrinates the VM process. > > KVM terminates the worker kthreads by calling kthread_stop() which waits > on the signal generated by exit_mm() in do_exit() during kthread's exit. > However, these kthreads are removed from the cgroup using cgroup_exit() > call which happens after exit_mm() in do_exit(). A VM process can > terminate between the time window of exit_mm() to cgroup_exit(), leaving > only worker kthreads in the cgroup. > > Moving worker kthreads back to the original cgroup (kthreadd_task's > cgroup) makes sure that cgroup is empty as soon as the main VM process > is terminated. > > Signed-off-by: Vipin Sharma <vipinsh@google.com> > --- > virt/kvm/kvm_main.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index b0f7e6eb00ff..edd304a18f16 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -5785,7 +5785,7 @@ static int kvm_vm_worker_thread(void *context) > init_context = NULL; > > if (err) > - return err; > + goto out; > > /* Wait to be woken up by the spawner before proceeding. */ > kthread_parkme(); > @@ -5793,6 +5793,15 @@ static int kvm_vm_worker_thread(void *context) > if (!kthread_should_stop()) > err = thread_fn(kvm, data); > > +out: > + /* > + * We need to move the kthread back to its original cgroups, so that it > + * doesn't linger in the cgroups of the user process after that has > + * already terminated. exit_mm() in do_exit() signals kthread_stop() to > + * return, whereas, removal of the task from the cgroups happens in > + * cgroup_exit() which happens after exit_mm(). > + */ > + WARN_ON(cgroup_attach_task_all(kthreadd_task, current)); As the build bot noted, kthreadd_task isn't exported, and I doubt you'll convince folks to let you export it. Why is it problematic for the kthread to linger in the cgroup? Conceptually, it's not really wrong. > return err; > } > > > base-commit: d8f6ef45a623d650f9b97e11553adb4978f6aa70 > -- > 2.34.1.173.g76aa8bc2d0-goog >
Hi Vipin,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on d8f6ef45a623d650f9b97e11553adb4978f6aa70]
url: https://github.com/0day-ci/linux/commits/Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroups-before-exiting/20211214-130827
base: d8f6ef45a623d650f9b97e11553adb4978f6aa70
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20211215/202112150131.MaZ9xOJx-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/fd29d23507ef3f06b61d9de1b7ecd1a0d70136f3
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Vipin-Sharma/KVM-Move-VM-s-worker-kthreads-back-to-the-original-cgroups-before-exiting/20211214-130827
git checkout fd29d23507ef3f06b61d9de1b7ecd1a0d70136f3
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "kthreadd_task" [arch/x86/kvm/kvm.ko] undefined!
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Dec 14, 2021 at 9:16 AM Sean Christopherson <seanjc@google.com> wrote: > > On Tue, Dec 14, 2021, Vipin Sharma wrote: > > + WARN_ON(cgroup_attach_task_all(kthreadd_task, current)); > > As the build bot noted, kthreadd_task isn't exported, and I doubt you'll convince > folks to let you export it. > > Why is it problematic for the kthread to linger in the cgroup? Conceptually, it's > not really wrong. Issue comes when a process tries to clear up the resources when a VM shutdown/dies. The process sometimes get an EBUSY error when it tries to delete the cgroup directories which were created for that VM. It is also difficult to know how many times to retry or how much time to wait before the cgroup is empty. This issue is not always happening.
On 12/14/21 06:07, Vipin Sharma wrote: > > KVM terminates the worker kthreads by calling kthread_stop() which waits > on the signal generated by exit_mm() in do_exit() during kthread's exit. Instead of "signal", please spell it as "the 'exited' completion, triggered by exit_mm(), via mm_release(), during the kthread's exit". That makes things a bit clearer. So the issue is that the kthread_stop happens around the time exit_task_work() destroys the VM, but the process can go on and signal its demise to the parent process before the kthread has been completely dropped. Not even close() can fix it, though it may reduce the window completely, so I agree that this is a bug and vhost has the same bug too. Due to the issue with kthreadd_task not being exported, perhaps you can change cgroup_attach_task_all to use kthreadd_task if the "from" argument is NULL? Paolo
On Wed, Dec 15, 2021 at 10:27 AM Paolo Bonzini <pbonzini@redhat.com> wrote: > > So the issue is that the kthread_stop happens around the time > exit_task_work() destroys the VM, but the process can go on and signal > its demise to the parent process before the kthread has been completely > dropped. Not even close() can fix it, though it may reduce the window > completely, so I agree that this is a bug and vhost has the same bug too. > > Due to the issue with kthreadd_task not being exported, perhaps you can > change cgroup_attach_task_all to use kthreadd_task if the "from" > argument is NULL? Thanks for the suggestion. This also works, I will send out a patch.
On Tue, Dec 14, 2021 at 8:14 AM Lai Jiangshan <jiangshanlai@gmail.com> wrote: > > Hello > > Off-topic, can this kvm worker_thread and the thread to do async pagefault > be possibly changed to use something like io_uring's IOWQ (fs/io-wq.c) > created by create_io_thread()? > > So that every resource the threads used are credited to the process > of the vm. > Sorry, I am not very familiar with it. Maybe someone from the community knows better about this.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b0f7e6eb00ff..edd304a18f16 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5785,7 +5785,7 @@ static int kvm_vm_worker_thread(void *context) init_context = NULL; if (err) - return err; + goto out; /* Wait to be woken up by the spawner before proceeding. */ kthread_parkme(); @@ -5793,6 +5793,15 @@ static int kvm_vm_worker_thread(void *context) if (!kthread_should_stop()) err = thread_fn(kvm, data); +out: + /* + * We need to move the kthread back to its original cgroups, so that it + * doesn't linger in the cgroups of the user process after that has + * already terminated. exit_mm() in do_exit() signals kthread_stop() to + * return, whereas, removal of the task from the cgroups happens in + * cgroup_exit() which happens after exit_mm(). + */ + WARN_ON(cgroup_attach_task_all(kthreadd_task, current)); return err; }
VM worker kthreads can linger in the VM process's cgroup for sometime after KVM temrinates the VM process. KVM terminates the worker kthreads by calling kthread_stop() which waits on the signal generated by exit_mm() in do_exit() during kthread's exit. However, these kthreads are removed from the cgroup using cgroup_exit() call which happens after exit_mm() in do_exit(). A VM process can terminate between the time window of exit_mm() to cgroup_exit(), leaving only worker kthreads in the cgroup. Moving worker kthreads back to the original cgroup (kthreadd_task's cgroup) makes sure that cgroup is empty as soon as the main VM process is terminated. Signed-off-by: Vipin Sharma <vipinsh@google.com> --- virt/kvm/kvm_main.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) base-commit: d8f6ef45a623d650f9b97e11553adb4978f6aa70