diff mbox series

[v2,1/9] x86/shadow: replace sh_reset_l3_up_pointers()

Message ID d91b5315-a5bb-a6ee-c9bb-58974c733a4e@suse.com (mailing list archive)
State Superseded
Headers show
Series x86/shadow: misc tidying | expand

Commit Message

Jan Beulich Jan. 11, 2023, 1:52 p.m. UTC
Rather than doing a separate hash walk (and then even using the vCPU
variant, which is to go away), do the up-pointer-clearing right in
sh_unpin(), as an alternative to the (now further limited) enlisting on
a "free floating" list fragment. This utilizes the fact that such list
fragments are traversed only for multi-page shadows (in shadow_free()).
Furthermore sh_terminate_list() is a safe guard only anyway, which isn't
in use in the common case (it actually does anything only for BIGMEM
configurations).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Comments

Jan Beulich Jan. 17, 2023, 10:10 a.m. UTC | #1
On 11.01.2023 14:52, Jan Beulich wrote:
> Rather than doing a separate hash walk (and then even using the vCPU
> variant, which is to go away), do the up-pointer-clearing right in
> sh_unpin(), as an alternative to the (now further limited) enlisting on
> a "free floating" list fragment. This utilizes the fact that such list
> fragments are traversed only for multi-page shadows (in shadow_free()).

I've added mention of sh_next_page() here as well. Not sure how I
managed to miss that, but this doesn't change the reasoning.

Jan
George Dunlap Jan. 20, 2023, 5:02 p.m. UTC | #2
On Wed, Jan 11, 2023 at 1:52 PM Jan Beulich <jbeulich@suse.com> wrote:

> Rather than doing a separate hash walk (and then even using the vCPU
> variant, which is to go away), do the up-pointer-clearing right in
> sh_unpin(), as an alternative to the (now further limited) enlisting on
> a "free floating" list fragment. This utilizes the fact that such list
> fragments are traversed only for multi-page shadows (in shadow_free()).
> Furthermore sh_terminate_list() is a safe guard only anyway, which isn't
> in use in the common case (it actually does anything only for BIGMEM
> configurations).
>

One thing that seems strange about this patch is that you're essentially
adding a field to the domain shadow struct in lieu of adding another
another argument to sh_unpin() (unless the bit is referenced elsewhere in
subsequent patches, which I haven't reviewed, in part because about half of
them don't apply cleanly to the current tree).

 -George
Jan Beulich Jan. 23, 2023, 8:41 a.m. UTC | #3
On 20.01.2023 18:02, George Dunlap wrote:
> On Wed, Jan 11, 2023 at 1:52 PM Jan Beulich <jbeulich@suse.com> wrote:
> 
>> Rather than doing a separate hash walk (and then even using the vCPU
>> variant, which is to go away), do the up-pointer-clearing right in
>> sh_unpin(), as an alternative to the (now further limited) enlisting on
>> a "free floating" list fragment. This utilizes the fact that such list
>> fragments are traversed only for multi-page shadows (in shadow_free()).
>> Furthermore sh_terminate_list() is a safe guard only anyway, which isn't
>> in use in the common case (it actually does anything only for BIGMEM
>> configurations).
> 
> One thing that seems strange about this patch is that you're essentially
> adding a field to the domain shadow struct in lieu of adding another
> another argument to sh_unpin() (unless the bit is referenced elsewhere in
> subsequent patches, which I haven't reviewed, in part because about half of
> them don't apply cleanly to the current tree).

Well, to me adding another parameter to sh_unpin() would have looked odd;
the new field looks slightly cleaner to me. But changing that is merely a
matter of taste, so if you and e.g. Andrew think that approach was better,
I could switch to that. And no, I don't foresee further uses of the field.

As to half of the patches not applying: Some where already applied out of
order, and others therefore need re-basing slightly. Till now I saw no
reason to re-send the remaining patches just for that.

Jan
George Dunlap Jan. 23, 2023, 3:20 p.m. UTC | #4
On Mon, Jan 23, 2023 at 8:41 AM Jan Beulich <jbeulich@suse.com> wrote:

> On 20.01.2023 18:02, George Dunlap wrote:
> > On Wed, Jan 11, 2023 at 1:52 PM Jan Beulich <jbeulich@suse.com> wrote:
> >
> >> Rather than doing a separate hash walk (and then even using the vCPU
> >> variant, which is to go away), do the up-pointer-clearing right in
> >> sh_unpin(), as an alternative to the (now further limited) enlisting on
> >> a "free floating" list fragment. This utilizes the fact that such list
> >> fragments are traversed only for multi-page shadows (in shadow_free()).
> >> Furthermore sh_terminate_list() is a safe guard only anyway, which isn't
> >> in use in the common case (it actually does anything only for BIGMEM
> >> configurations).
> >
> > One thing that seems strange about this patch is that you're essentially
> > adding a field to the domain shadow struct in lieu of adding another
> > another argument to sh_unpin() (unless the bit is referenced elsewhere in
> > subsequent patches, which I haven't reviewed, in part because about half
> of
> > them don't apply cleanly to the current tree).
>
> Well, to me adding another parameter to sh_unpin() would have looked odd;
> the new field looks slightly cleaner to me. But changing that is merely a
> matter of taste, so if you and e.g. Andrew think that approach was better,
> I could switch to that. And no, I don't foresee further uses of the field.
>

You're about to call sh_unpin(), and you want to tell that function to
change its behavior.  What's so odd about adding an argument to the
function to indicate the behavior?  Instead you're adding a bit of global
state which is carried around 100% of the time, even when that function
isn't being called.  That's not what people normally expect; it makes the
code harder to reason about.

It would certainly be ugly to have to add "false" to every other instance
of sh_unpin; but the normal way you get around that is to redefine
sh_unpin() as a wrapper which calls the other function with the 'false'
argument set.

You asked me to review this for a second opinion on the safety of clearing
the up-pointer this way, not because you need an ack; so I don't really
want to block the patch for non-functional reasons.  But I think this is
one of the "death by a thousand cuts" that makes the shadow code more
fragile and difficult for new people to approach and understand.

Re the original question: I've stared at the code for a bit now, and I
can't see anything obviously wrong or dangerous about it.

But it does make me ask, why do we need the "unpinning_l3" pseudo-argument
at all?  Is there any reason not to unconditionally zero out sp->up when we
find a head_type of SH_type_l3_64_shadow?  As far as I can tell, sp->list
doesn't require any special state.  Why do we make the effort to leave it
alone when we're not unpinning all l3s?

In fact, is there a way to unpin an l3 shadow *other* than when we're
unpinning all l3's?  If so, then this patch, as written, is broken -- the
original code clears the up-pointer for *all* L3_64 shadows, regardless of
whether they're on the pinned list; the new patch will only clear the ones
on the pinned list.  But unconditionally clearing sp->up could actually fix
that.

Thoughts?

As to half of the patches not applying: Some where already applied out of
> order, and others therefore need re-basing slightly. Till now I saw no
> reason to re-send the remaining patches just for that.
>

Sorry if that sounded like complaining; I was only being preemptively
defensive against the potential accusation that the answer would have been
obvious if I'd just continued reviewing the series. :-). (And indeed if the
whole series had applied I would have checked that the final result didn't
have any other references to it.)

 -George
Jan Beulich Jan. 23, 2023, 4:14 p.m. UTC | #5
On 23.01.2023 16:20, George Dunlap wrote:
> Re the original question: I've stared at the code for a bit now, and I
> can't see anything obviously wrong or dangerous about it.
> 
> But it does make me ask, why do we need the "unpinning_l3" pseudo-argument
> at all?  Is there any reason not to unconditionally zero out sp->up when we
> find a head_type of SH_type_l3_64_shadow?  As far as I can tell, sp->list
> doesn't require any special state.  Why do we make the effort to leave it
> alone when we're not unpinning all l3s?

This was an attempt to retain original behavior as much as possible, but I'm
afraid that, ...

> In fact, is there a way to unpin an l3 shadow *other* than when we're
> unpinning all l3's?

... since the answer here is of course "yes", ...

>  If so, then this patch, as written, is broken -- the
> original code clears the up-pointer for *all* L3_64 shadows, regardless of
> whether they're on the pinned list; the new patch will only clear the ones
> on the pinned list.  But unconditionally clearing sp->up could actually fix
> that.

... you're right, and I failed (went too far) with that attempt. Plus it'll
naturally resolve the parameter-vs-state aspect.

Jan
diff mbox series

Patch

--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -116,6 +116,9 @@  struct shadow_domain {
     /* OOS */
     bool_t oos_active;
 
+    /* Domain is in the process of leaving SHOPT_LINUX_L3_TOPLEVEL mode. */
+    bool unpinning_l3;
+
 #ifdef CONFIG_HVM
     /* Has this domain ever used HVMOP_pagetable_dying? */
     bool_t pagetable_dying_op;
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2302,29 +2302,6 @@  void shadow_prepare_page_type_change(str
 
 /**************************************************************************/
 
-/* Reset the up-pointers of every L3 shadow to 0.
- * This is called when l3 shadows stop being pinnable, to clear out all
- * the list-head bits so the up-pointer field is properly inititalised. */
-static int cf_check sh_clear_up_pointer(
-    struct vcpu *v, mfn_t smfn, mfn_t unused)
-{
-    mfn_to_page(smfn)->up = 0;
-    return 0;
-}
-
-void sh_reset_l3_up_pointers(struct vcpu *v)
-{
-    static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
-        [SH_type_l3_64_shadow] = sh_clear_up_pointer,
-    };
-
-    HASH_CALLBACKS_CHECK(SHF_L3_64);
-    hash_vcpu_foreach(v, SHF_L3_64, callbacks, INVALID_MFN);
-}
-
-
-/**************************************************************************/
-
 static void sh_update_paging_modes(struct vcpu *v)
 {
     struct domain *d = v->domain;
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -960,6 +960,8 @@  sh_make_shadow(struct vcpu *v, mfn_t gmf
         }
         if ( l4count > 2 * d->max_vcpus )
         {
+            d->arch.paging.shadow.unpinning_l3 = true;
+
             /* Unpin all the pinned l3 tables, and don't pin any more. */
             page_list_for_each_safe(sp, t, &d->arch.paging.shadow.pinned_shadows)
             {
@@ -967,7 +969,8 @@  sh_make_shadow(struct vcpu *v, mfn_t gmf
                     sh_unpin(d, page_to_mfn(sp));
             }
             d->arch.paging.shadow.opt_flags &= ~SHOPT_LINUX_L3_TOPLEVEL;
-            sh_reset_l3_up_pointers(v);
+
+            d->arch.paging.shadow.unpinning_l3 = false;
         }
     }
 #endif
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -497,11 +497,6 @@  void shadow_blow_tables(struct domain *d
  */
 int sh_remove_all_mappings(struct domain *d, mfn_t gmfn, gfn_t gfn);
 
-/* Reset the up-pointers of every L3 shadow to 0.
- * This is called when l3 shadows stop being pinnable, to clear out all
- * the list-head bits so the up-pointer field is properly inititalised. */
-void sh_reset_l3_up_pointers(struct vcpu *v);
-
 /******************************************************************************
  * Flags used in the return value of the shadow_set_lXe() functions...
  */
@@ -721,7 +716,7 @@  static inline void sh_unpin(struct domai
 {
     struct page_list_head tmp_list, *pin_list;
     struct page_info *sp, *next;
-    unsigned int i, head_type;
+    unsigned int i, head_type, sz;
 
     ASSERT(mfn_valid(smfn));
     sp = mfn_to_page(smfn);
@@ -733,20 +728,30 @@  static inline void sh_unpin(struct domai
         return;
     sp->u.sh.pinned = 0;
 
-    /* Cut the sub-list out of the list of pinned shadows,
-     * stitching it back into a list fragment of its own. */
+    sz = shadow_size(head_type);
+
+    /*
+     * Cut the sub-list out of the list of pinned shadows, stitching
+     * multi-page shadows back into a list fragment of their own.
+     */
     pin_list = &d->arch.paging.shadow.pinned_shadows;
     INIT_PAGE_LIST_HEAD(&tmp_list);
-    for ( i = 0; i < shadow_size(head_type); i++ )
+    for ( i = 0; i < sz; i++ )
     {
         ASSERT(sp->u.sh.type == head_type);
         ASSERT(!i || !sp->u.sh.head);
         next = page_list_next(sp, pin_list);
         page_list_del(sp, pin_list);
-        page_list_add_tail(sp, &tmp_list);
+        if ( sz > 1 )
+            page_list_add_tail(sp, &tmp_list);
+        else if ( head_type == SH_type_l3_64_shadow &&
+                  d->arch.paging.shadow.unpinning_l3 )
+            sp->up = 0;
         sp = next;
     }
-    sh_terminate_list(&tmp_list);
+
+    if ( sz > 1 )
+        sh_terminate_list(&tmp_list);
 
     sh_put_ref(d, smfn, 0);
 }