diff mbox series

[2/3] x86/mem_sharing: use dom_cow as placeholder parent until fork is complete

Message ID d5d8c7bad025a4ef11bf09ad3a4b23c8b4673ff6.1647970630.git.tamas.lengyel@intel.com (mailing list archive)
State New, archived
Headers show
Series [1/3] x86/mem_sharing: option to skip populating special pages during fork | expand

Commit Message

Tamas K Lengyel March 22, 2022, 5:41 p.m. UTC
For the duration of the fork memop set dom_cow as a placeholder parent. This
gets updated to the real parent when the fork operation completes, or to NULL
in case the fork failed. Doing this allows us to skip populating the physmap
with any entries until the fork operation successfully completes. Currently
bringing up vCPUs may inadvertantly map in some pages that can turn out to be
unecessary, like the CR3 gfn when paging is disabled.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
---
 xen/arch/x86/include/asm/mem_sharing.h | 2 +-
 xen/arch/x86/mm/mem_sharing.c          | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

Comments

Jan Beulich March 28, 2022, 1:32 p.m. UTC | #1
On 22.03.2022 18:41, Tamas K Lengyel wrote:
> For the duration of the fork memop set dom_cow as a placeholder parent. This
> gets updated to the real parent when the fork operation completes, or to NULL
> in case the fork failed.

I am concerned of this, in particular because the state may last across
perhaps a long series of preemptions. Furthermore ...

> --- a/xen/arch/x86/include/asm/mem_sharing.h
> +++ b/xen/arch/x86/include/asm/mem_sharing.h
> @@ -79,7 +79,7 @@ static inline int mem_sharing_unshare_page(struct domain *d,
>  
>  static inline bool mem_sharing_is_fork(const struct domain *d)
>  {
> -    return d->parent;
> +    return d->parent && d->parent != dom_cow;
>  }

... this now makes the function "lie" (the domain is a fork already
while being constructed). Hence at the very least a comment would want
to appear here explaining why this is wanted despite not really being
correct. This "lying" for example means a partly set up fork would be
skipped by domain_relinquish_resources(), in case the tool stack
decided to kill the domain instead of waiting for its creation to
finish.

> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1850,7 +1850,9 @@ static int fork(struct domain *cd, struct domain *d, uint16_t flags)
>          *cd->arch.cpuid = *d->arch.cpuid;
>          *cd->arch.msr = *d->arch.msr;
>          cd->vmtrace_size = d->vmtrace_size;
> -        cd->parent = d;
> +
> +        /* use dom_cow as a placeholder until we are all done */

Nit: As per ./CODING_STYLE you want to at least start the comment with
a capital letter.

Jan
Tamas K Lengyel March 28, 2022, 3:36 p.m. UTC | #2
On Mon, Mar 28, 2022 at 9:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 22.03.2022 18:41, Tamas K Lengyel wrote:
> > For the duration of the fork memop set dom_cow as a placeholder parent. This
> > gets updated to the real parent when the fork operation completes, or to NULL
> > in case the fork failed.
>
> I am concerned of this, in particular because the state may last across
> perhaps a long series of preemptions. Furthermore ...
>
> > --- a/xen/arch/x86/include/asm/mem_sharing.h
> > +++ b/xen/arch/x86/include/asm/mem_sharing.h
> > @@ -79,7 +79,7 @@ static inline int mem_sharing_unshare_page(struct domain *d,
> >
> >  static inline bool mem_sharing_is_fork(const struct domain *d)
> >  {
> > -    return d->parent;
> > +    return d->parent && d->parent != dom_cow;
> >  }
>
> ... this now makes the function "lie" (the domain is a fork already
> while being constructed). Hence at the very least a comment would want
> to appear here explaining why this is wanted despite not really being
> correct. This "lying" for example means a partly set up fork would be
> skipped by domain_relinquish_resources(), in case the tool stack
> decided to kill the domain instead of waiting for its creation to
> finish.

Hm, yea, domain_relinquish_resources really ought to check if there is
any parent at all, while fork_page needs to check whether there is a
parent and it's not dom_cow. I think I just need two separate
mem_sharing_is_fork functions, one would be the current
mem_sharing_is_fork() and a new one mem_sharing_is_fork_complete(),
that would make it a bit clearer on what they do.

>
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1850,7 +1850,9 @@ static int fork(struct domain *cd, struct domain *d, uint16_t flags)
> >          *cd->arch.cpuid = *d->arch.cpuid;
> >          *cd->arch.msr = *d->arch.msr;
> >          cd->vmtrace_size = d->vmtrace_size;
> > -        cd->parent = d;
> > +
> > +        /* use dom_cow as a placeholder until we are all done */
>
> Nit: As per ./CODING_STYLE you want to at least start the comment with
> a capital letter.

Ack.

Thanks,
Tamas
diff mbox series

Patch

diff --git a/xen/arch/x86/include/asm/mem_sharing.h b/xen/arch/x86/include/asm/mem_sharing.h
index cf7a12f4d2..b4a8e8795a 100644
--- a/xen/arch/x86/include/asm/mem_sharing.h
+++ b/xen/arch/x86/include/asm/mem_sharing.h
@@ -79,7 +79,7 @@  static inline int mem_sharing_unshare_page(struct domain *d,
 
 static inline bool mem_sharing_is_fork(const struct domain *d)
 {
-    return d->parent;
+    return d->parent && d->parent != dom_cow;
 }
 
 int mem_sharing_fork_page(struct domain *d, gfn_t gfn,
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 84c04ddfa3..a21c781452 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1850,7 +1850,9 @@  static int fork(struct domain *cd, struct domain *d, uint16_t flags)
         *cd->arch.cpuid = *d->arch.cpuid;
         *cd->arch.msr = *d->arch.msr;
         cd->vmtrace_size = d->vmtrace_size;
-        cd->parent = d;
+
+        /* use dom_cow as a placeholder until we are all done */
+        cd->parent = dom_cow;
     }
 
     /* This is preemptible so it's the first to get done */
@@ -1862,6 +1864,7 @@  static int fork(struct domain *cd, struct domain *d, uint16_t flags)
 
     if ( !(rc = copy_settings(cd, d, skip_special_pages)) )
     {
+        cd->parent = d;
         cd->arch.hvm.mem_sharing.block_interrupts = block_interrupts;
         cd->arch.hvm.mem_sharing.skip_special_pages = skip_special_pages;
         /* skip mapping the vAPIC page on unpause if skipping special pages */