Message ID | 1510298286-30952-2-git-send-email-yu.c.zhang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>       */
>      if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
>          continue;
> +    if ( locking )
> +        spin_lock(&map_pgdir_lock);
> +
> +    /* L2E may be cleared on another CPU. */
> +    if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )

I think you also need a PSE check here, or else the l2e_to_l1e() below
may be illegal.

> @@ -5105,11 +5116,16 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>      {
>          /* Empty: zap the L2E and free the L1 page. */
>          l2e_write_atomic(pl2e, l2e_empty());
> +        if ( locking )
> +            spin_unlock(&map_pgdir_lock);
>          flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
>          free_xen_pagetable(pl1e);
>      }
> +    else if ( locking )
> +        spin_unlock(&map_pgdir_lock);
>  }
>
> +check_l3:

Labels indented by at least one space please.

Jan
On 11/10/2017 5:57 PM, Jan Beulich wrote:
>>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>>       */
>>      if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
>>          continue;
>> +    if ( locking )
>> +        spin_lock(&map_pgdir_lock);
>> +
>> +    /* L2E may be cleared on another CPU. */
>> +    if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
> I think you also need a PSE check here, or else the l2e_to_l1e() below
> may be illegal.

Hmm, interesting point, and thanks! :-)
I did not check the PSE, because modify_xen_mappings() will not do the
re-consolidation, and concurrent invocations of this routine will not
change this flag. But now I believe this presumption shall not be made,
because the paging structures may be modified by other routines, like
map_pages_to_xen() on other CPUs.

So yes, I think a _PAGE_PSE check is necessary here. And I suggest we
also check the _PAGE_PRESENT flag as well, for the re-consolidation part
in my first patch for map_pages_to_xen(). Do you agree?

>> @@ -5105,11 +5116,16 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>>      {
>>          /* Empty: zap the L2E and free the L1 page. */
>>          l2e_write_atomic(pl2e, l2e_empty());
>> +        if ( locking )
>> +            spin_unlock(&map_pgdir_lock);
>>          flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
>>          free_xen_pagetable(pl1e);
>>      }
>> +    else if ( locking )
>> +        spin_unlock(&map_pgdir_lock);
>>  }
>>
>> +check_l3:
> Labels indented by at least one space please.

Got it. Thanks.

Yu

> Jan
>>> On 10.11.17 at 15:02, <yu.c.zhang@linux.intel.com> wrote:
> On 11/10/2017 5:57 PM, Jan Beulich wrote:
>>>>> On 10.11.17 at 08:18, <yu.c.zhang@linux.intel.com> wrote:
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
>>>       */
>>>      if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
>>>          continue;
>>> +    if ( locking )
>>> +        spin_lock(&map_pgdir_lock);
>>> +
>>> +    /* L2E may be cleared on another CPU. */
>>> +    if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
>> I think you also need a PSE check here, or else the l2e_to_l1e() below
>> may be illegal.
>
> Hmm, interesting point, and thanks! :-)
> I did not check the PSE, because modify_xen_mappings() will not do the
> re-consolidation, and concurrent invokes of this routine will not change
> this flag. But now I believe this presumption shall not be made, because
> the paging structures may be modified by other routines, like
> map_pages_to_xen() on other CPUs.
>
> So yes, I think a _PAGE_PSE check is necessary here. And I suggest we
> also check the _PAGE_PRESENT flag as well, for the re-consolidation part
> in my first patch for map_pages_to_xen(). Do you agree?

Oh, yes, definitely. I should have noticed this myself.

Jan
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 47855fb..c07c528 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5097,6 +5097,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
          */
         if ( (nf & _PAGE_PRESENT) || ((v != e) && (l1_table_offset(v) != 0)) )
             continue;
+        if ( locking )
+            spin_lock(&map_pgdir_lock);
+
+        /* L2E may be cleared on another CPU. */
+        if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
+        {
+            if ( locking )
+                spin_unlock(&map_pgdir_lock);
+            goto check_l3;
+        }
+
         pl1e = l2e_to_l1e(*pl2e);
         for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ )
             if ( l1e_get_intpte(pl1e[i]) != 0 )
@@ -5105,11 +5116,16 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
         {
             /* Empty: zap the L2E and free the L1 page. */
             l2e_write_atomic(pl2e, l2e_empty());
+            if ( locking )
+                spin_unlock(&map_pgdir_lock);
             flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
             free_xen_pagetable(pl1e);
         }
+        else if ( locking )
+            spin_unlock(&map_pgdir_lock);
     }
 
+check_l3:
     /*
      * If we are not destroying mappings, or not done with the L3E,
      * skip the empty&free check.
@@ -5117,6 +5133,17 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     if ( (nf & _PAGE_PRESENT) ||
          ((v != e) && (l2_table_offset(v) + l1_table_offset(v) != 0)) )
         continue;
+    if ( locking )
+        spin_lock(&map_pgdir_lock);
+
+    /* L3E may be cleared on another CPU. */
+    if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
+    {
+        if ( locking )
+            spin_unlock(&map_pgdir_lock);
+        continue;
+    }
+
     pl2e = l3e_to_l2e(*pl3e);
     for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ )
         if ( l2e_get_intpte(pl2e[i]) != 0 )
@@ -5125,9 +5152,13 @@ int modify_xen_mappings(unsigned long s, unsigned long e, unsigned int nf)
     {
         /* Empty: zap the L3E and free the L2 page. */
         l3e_write_atomic(pl3e, l3e_empty());
+        if ( locking )
+            spin_unlock(&map_pgdir_lock);
         flush_area(NULL, FLUSH_TLB_GLOBAL); /* flush before free */
         free_xen_pagetable(pl2e);
     }
+    else if ( locking )
+        spin_unlock(&map_pgdir_lock);
 }
 
 flush_area(NULL, FLUSH_TLB_GLOBAL);
In modify_xen_mappings(), an L1/L2 page table may be freed if all entries
of that page table are empty. The corresponding L2/L3 PTE then needs to
be cleared.

However, the logic that enumerates the L1/L2 page table and resets the
corresponding L2/L3 PTE needs to be protected with a spinlock. Otherwise,
the paging structure may be freed more than once if the same routine is
invoked simultaneously on different CPUs.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/mm.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)