Message ID | 912c7377-26f0-c14a-e3aa-f00a81ed5766@suse.com (mailing list archive) |
---|---|
State | New, archived |
Series | Tentative fix for "out of PoD memory" issue |
On Thu, Oct 21, 2021 at 01:53:06PM +0200, Juergen Gross wrote:
> Marek,
>
> could you please test whether the attached patch is fixing your
> problem?

Sure. In fact, I made a similar patch in the meantime (attached) to
experiment with this a bit.

> BTW, I don't think this couldn't happen before kernel 5.15. I guess
> my modification to use a kernel thread instead of a workqueue just
> made the issue more probable.

I think you are right here. But this all still looks a bit weird.

1. baseline: 5.10.61 (before using kernel thread - which was backported
   to stable branches).

   Here the startup completes successfully (no "out of PoD memory"
   issue) with memory=270MB.

2. 5.10.61 with added boot delay patch:
   The delay is about 18s and the guest boots successfully.

3. 5.10.71 with "xen/balloon: fix cancelled balloon action" but without
   delay patch:
   The domain is killed during startup (in the middle of fsck, I think)

4. 5.10.74 with delay patch:
   The delay is about 19s and the guest boots successfully.

Now the weird part: with memory=270MB with the delay patch, the balloon
delay _fails_ - state=BP_ECANCELED, and credit is -19712 at that time.
In both thread and workqueue balloon variants. Yet, it isn't killed (*).
But with 5.10.61, even without the delay patch, the guest starts
successfully in the end.

Also, I think there was some implicit wait for initial balloon down
before. That was the main motivation for 197ecb3802c0 "xen/balloon: add
runtime control for scrubbing ballooned out pages" - because that
initial balloon down held up the system startup for quite a long time.
Sadly, I can't find my notes from debugging that (especially if I had
written down a stacktrace _where_ exactly it was waiting)...

> I couldn't reproduce the crash you are seeing, but the introduced
> wait was 4.2 seconds on my test system (a PVH guest with 2 GB of
> memory, maxmem 6 GB).

I'm testing it on a much more aggressive setting:
 - memory: 270 MB (the minimum that is sufficient to boot the system)
 - maxmem: 4 GB

The default settings in Qubes are:
 - memory: 400 MB
 - maxmem: 4 GB

That should explain why it happens on Qubes way more often than
elsewhere.

(*) At some point during system boot, qubes memory manager kicks in and
the VM gets more memory. But it's rather late, and definitely after it
is killed with "out of PoD memory" in other cases.

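To put the settings and the reported credit into the same units, here is a small arithmetic sketch. It assumes 4 KiB pages (as on x86) and that the balloon credit is counted in pages (target pages minus current pages), so a negative credit is the number of pages still to be released. The helper names are made up for this illustration and are not kernel functions.

```c
/* Assumption: 4 KiB pages, as on x86. */
#define PAGE_SIZE_KIB 4

/* Pages the guest has to give back at boot: (maxmem - memory) / page size. */
static long pages_to_balloon_out(long memory_mib, long maxmem_mib)
{
	return (maxmem_mib - memory_mib) * 1024 / PAGE_SIZE_KIB;
}

/*
 * Convert a balloon credit to MiB. Assumption: the credit is in pages and
 * a negative value means that many pages are still to be released.
 */
static long credit_to_mib(long credit_pages)
{
	return credit_pages * PAGE_SIZE_KIB / 1024;
}
```

Under those assumptions, memory=270MB with maxmem=4GB means 979456 pages to balloon out, the Qubes default of 400MB means 946176 pages, and a credit of -19712 corresponds to 77 MiB not yet released.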
From 3ee35f6f110e2258ec94f0d1397fac8c26b41761 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
To: linux-kernel@vger.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel@lists.xenproject.org
Date: Thu, 21 Oct 2021 12:51:06 +0200
Subject: [PATCH] xen/balloon: add late_initcall_sync() for initial ballooning done
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.

In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.

In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.

Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 drivers/xen/balloon.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 3a50f097ed3e..d19b851c3d3b 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -765,3 +765,23 @@ static int __init balloon_init(void)
 	return 0;
 }
 subsys_initcall(balloon_init);
+
+static int __init balloon_wait_finish(void)
+{
+	if (!xen_domain())
+		return -ENODEV;
+
+	/* PV guests don't need to wait. */
+	if (xen_pv_domain() || !current_credit())
+		return 0;
+
+	pr_info("Waiting for initial ballooning down having finished.\n");
+
+	while (current_credit())
+		schedule_timeout_interruptible(HZ / 10);
+
+	pr_info("Initial ballooning down finished.\n");
+
+	return 0;
+}
+late_initcall_sync(balloon_wait_finish);
-- 
2.26.2
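
The polling loop in balloon_wait_finish() can be modelled in user space to see how it behaves when the credit stops moving, as in the BP_ECANCELED case reported in the reply. This is only a sketch: struct fake_balloon, fake_balloon_tick() and simulate_wait() are hypothetical names, the release rate is an arbitrary input, and the max_ticks limit is an assumption added so the simulation terminates. The patch itself loops without any limit.

```c
/* Hypothetical stand-in for the balloon driver state; not a kernel structure. */
struct fake_balloon {
	long credit;         /* negative: pages still to release; 0: done */
	long pages_per_tick; /* pages released per 100 ms poll interval */
};

/* Models one poll interval in which the balloon worker releases pages. */
static void fake_balloon_tick(struct fake_balloon *b)
{
	b->credit += b->pages_per_tick;
	if (b->credit > 0)
		b->credit = 0;
}

/*
 * Models the wait loop: poll until the credit reaches zero.
 * max_ticks is an assumption of this simulation only.
 * Returns the number of ticks waited, or -1 if the limit was hit.
 */
static long simulate_wait(struct fake_balloon *b, long max_ticks)
{
	long ticks = 0;

	while (b->credit != 0) {
		if (ticks >= max_ticks)
			return -1;
		fake_balloon_tick(b);
		ticks++;
	}
	return ticks;
}
```

With a credit of -1000 and 100 pages released per tick, simulate_wait() returns 10. With a credit of -19712 and a rate of 0 (ballooning cancelled), it returns -1 once the limit is reached, which is the situation where an unbounded loop would never return.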