Message ID | 58F8EF4D0200007800152882@prv-mh.provo.novell.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, Apr 20, 2017 at 5:26 PM, Jan Beulich <JBeulich@suse.com> wrote:
> Jann's explanation of the problem:
>
> "start situation:
> - domain A and domain B are PV domains
> - domain A and B both have currently scheduled vCPUs, and the vCPUs
>   are not scheduled away
> - domain A has XSM_TARGET access to domain B
> - page X is owned by domain B and has no mappings
> - page X is zeroed
>
> steps:
> - domain A uses do_mmu_update() to map page X in domain A as writable
> - domain A accesses page X through the new PTE, creating a TLB entry
> - domain A removes its mapping of page X
> - type count of page X goes to 0
> - tlbflush_timestamp of page X is bumped
> - domain B maps page X as L1 pagetable
> - type of page X changes to PGT_l1_page_table
> - TLB flush is forced using domain_dirty_cpumask of domain B
> - page X is mapped as L1 pagetable in domain B
>
> At this point, domain B's vCPUs are guaranteed to have no
> incorrectly-typed stale TLB entries for page X, but AFAICS domain A's
> vCPUs can still have stale TLB entries that map page X as writable,
> permitting domain A to control a live pagetable of domain B."
>
> Domain A necessarily is Dom0 (DomU-s with XSM_TARGET permission are
> being created only for HVM domains, but domain B needs to be PV here),
> so this is not a security issue, but nevertheless seems desirable to
> correct.
>
> Reported-by: Jann Horn <jannh@google.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -602,6 +602,20 @@ static inline void guest_get_eff_kern_l1
>          TOGGLE_MODE();
>  }
>
> +static const cpumask_t *get_flush_tlb_mask(const struct page_info *page,
> +                                           const struct domain *d)
> +{
> +    cpumask_t *mask = this_cpu(scratch_cpumask);
> +
> +    BUG_ON(in_irq());
> +    cpumask_copy(mask, d->domain_dirty_cpumask);
> +
> +    /* Don't flush if the timestamp is old enough */
> +    tlbflush_filter(mask, page->tlbflush_timestamp);
> +
> +    return mask;
> +}
> +
>  const char __section(".bss.page_aligned.const") __aligned(PAGE_SIZE)
>      zero_page[PAGE_SIZE];
>
> @@ -1266,6 +1280,23 @@ void put_page_from_l1e(l1_pgentry_t l1e,
>      if ( (l1e_get_flags(l1e) & _PAGE_RW) &&
>           ((l1e_owner == pg_owner) || !paging_mode_external(pg_owner)) )
>      {
> +        /*
> +         * Don't leave stale writable TLB entries in the unmapping domain's
> +         * page tables, to prevent them allowing access to pages required to
> +         * be read-only (e.g. after pg_owner changed them to page table or
> +         * segment descriptor pages).
> +         */
> +        if ( unlikely(l1e_owner != pg_owner) )
> +        {
> +            const cpumask_t *mask = get_flush_tlb_mask(page, l1e_owner);
> +
> +            if ( !cpumask_empty(mask) )
> +            {
> +                perfc_incr(need_flush_tlb_flush);
> +                flush_tlb_mask(mask);
> +            }
> +        }

Why does this use a flush masked with page->tlbflush_timestamp? Shouldn't it force an unconditional flush on the whole domain, similar to gnttab_flush_tlb()?

Also: I think the same issue might apply to read-only mappings where the page is released back to the domain heap afterwards. The current fix only covers writable pages.
```diff
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -602,6 +602,20 @@ static inline void guest_get_eff_kern_l1
         TOGGLE_MODE();
 }
 
+static const cpumask_t *get_flush_tlb_mask(const struct page_info *page,
+                                           const struct domain *d)
+{
+    cpumask_t *mask = this_cpu(scratch_cpumask);
+
+    BUG_ON(in_irq());
+    cpumask_copy(mask, d->domain_dirty_cpumask);
+
+    /* Don't flush if the timestamp is old enough */
+    tlbflush_filter(mask, page->tlbflush_timestamp);
+
+    return mask;
+}
+
 const char __section(".bss.page_aligned.const") __aligned(PAGE_SIZE)
     zero_page[PAGE_SIZE];
 
@@ -1266,6 +1280,23 @@ void put_page_from_l1e(l1_pgentry_t l1e,
     if ( (l1e_get_flags(l1e) & _PAGE_RW) &&
          ((l1e_owner == pg_owner) || !paging_mode_external(pg_owner)) )
     {
+        /*
+         * Don't leave stale writable TLB entries in the unmapping domain's
+         * page tables, to prevent them allowing access to pages required to
+         * be read-only (e.g. after pg_owner changed them to page table or
+         * segment descriptor pages).
+         */
+        if ( unlikely(l1e_owner != pg_owner) )
+        {
+            const cpumask_t *mask = get_flush_tlb_mask(page, l1e_owner);
+
+            if ( !cpumask_empty(mask) )
+            {
+                perfc_incr(need_flush_tlb_flush);
+                flush_tlb_mask(mask);
+            }
+        }
+
         put_page_and_type(page);
     }
     else
@@ -2545,13 +2576,7 @@ static int __get_page_type(struct page_i
              * may be unnecessary (e.g., page was GDT/LDT) but those
              * circumstances should be very rare.
              */
-            cpumask_t *mask = this_cpu(scratch_cpumask);
-
-            BUG_ON(in_irq());
-            cpumask_copy(mask, d->domain_dirty_cpumask);
-
-            /* Don't flush if the timestamp is old enough */
-            tlbflush_filter(mask, page->tlbflush_timestamp);
+            const cpumask_t *mask = get_flush_tlb_mask(page, d);
 
             if ( unlikely(!cpumask_empty(mask)) &&
                  /* Shadow mode: track only writable pages. */
```