Message ID | 1602780274-29141-22-git-send-email-olekstysh@gmail.com (mailing list archive)
State | New, archived
Series | IOREQ feature (+ virtio-mmio) on Arm
On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
> @@ -1067,7 +1068,14 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>       */
>      if ( p2m_is_valid(orig_pte) &&
>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
> +    {
> +#ifdef CONFIG_IOREQ_SERVER
> +        if ( domain_has_ioreq_server(p2m->domain) &&
> +             (p2m->domain == current->domain) && p2m_is_ram(orig_pte.p2m.type) )
> +            p2m->domain->qemu_mapcache_invalidate = true;
> +#endif
>          p2m_free_entry(p2m, orig_pte, level);
> +    }

For all I have to say here, please bear in mind that I don't know
the internals of Arm memory management.

The first odd thing here is the merely MFN-based condition. It may
well be that's sufficient, if there's no way to get a "not present"
entry with an MFN matching any valid MFN. (This isn't just with
your addition, but even before.)

Given how p2m_free_entry() works (or is supposed to work in the
long run), is the new code you add guaranteed to only alter leaf
entries? If not, the freeing of page tables needs deferring until
after qemu has dropped its mappings.

And with there being refcounting only for foreign pages, how do
you prevent the freeing of the page just unmapped before qemu has
dropped its possible mapping? On the x86 side this problem is one
of the reasons why PVH Dom0 isn't "supported", yet. At least a
respective code comment would seem advisable, so the issue to be
addressed won't be forgotten.

> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1442,6 +1442,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
>                                const union hsr hsr)
>  {
>      arm_hypercall_fn_t call = NULL;
> +    struct vcpu *v = current;

This ought to be named "curr".

Jan
Hi Jan,

On 16/10/2020 07:29, Jan Beulich wrote:
> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>> @@ -1067,7 +1068,14 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>>       */
>>      if ( p2m_is_valid(orig_pte) &&
>>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>> +    {
>> +#ifdef CONFIG_IOREQ_SERVER
>> +        if ( domain_has_ioreq_server(p2m->domain) &&
>> +             (p2m->domain == current->domain) && p2m_is_ram(orig_pte.p2m.type) )
>> +            p2m->domain->qemu_mapcache_invalidate = true;
>> +#endif
>>          p2m_free_entry(p2m, orig_pte, level);
>> +    }
>
> For all I have to say here, please bear in mind that I don't know
> the internals of Arm memory management.
>
> The first odd thing here is the merely MFN-based condition. It may
> well be that's sufficient, if there's no way to get a "not present"
> entry with an MFN matching any valid MFN. (This isn't just with
> your addition, but even before.)

Invalid entries are always zeroed. So in theory the problem could
arise if MFN 0 is used in the guest. It should not be possible on
staging, but I agree this should be fixed.

> Given how p2m_free_entry() works (or is supposed to work in the
> long run), is the new code you add guaranteed to only alter leaf
> entries?

This path may also be called with tables. I think we want to move the
check in p2m_free_entry() so we can find the correct leaf type.

> If not, the freeing of page tables needs deferring until
> after qemu has dropped its mappings.

Freeing the page tables doesn't release a page. So may I ask why we
would need to defer it?

> And with there being refcounting only for foreign pages, how do
> you prevent the freeing of the page just unmapped before qemu has
> dropped its possible mapping?

QEMU mappings can only be done using the foreign mapping interface.
This means that the page reference count will be incremented for each
QEMU mapping. Therefore the page cannot disappear until QEMU has
dropped the last reference.
> On the x86 side this problem is one
> of the reasons why PVH Dom0 isn't "supported", yet. At least a
> respective code comment would seem advisable, so the issue to be
> addressed won't be forgotten.

Are you sure? Isn't it because you don't take a reference on foreign
pages while mapping them?

Anyway, Arm has supported foreign mapping since its inception. So if
there is a bug, then it should be fixed.

Cheers,
On 16.10.2020 10:41, Julien Grall wrote:
> On 16/10/2020 07:29, Jan Beulich wrote:
>> Given how p2m_free_entry() works (or is supposed to work in the
>> long run), is the new code you add guaranteed to only alter leaf
>> entries?
>
> This path may also be called with tables. I think we want to move the
> check in p2m_free_entry() so we can find the correct leaf type.
>
>> If not, the freeing of page tables needs deferring until
>> after qemu has dropped its mappings.
>
> Freeing the page tables doesn't release a page. So may I ask why we
> would need to defer it?

Oh, sorry - qemu of course doesn't use the same p2m, so the
intermediate page tables are private to the subject guest.

>> And with there being refcounting only for foreign pages, how do
>> you prevent the freeing of the page just unmapped before qemu has
>> dropped its possible mapping?
>
> QEMU mappings can only be done using the foreign mapping interface.
> This means that the page reference count will be incremented for each
> QEMU mapping. Therefore the page cannot disappear until QEMU has
> dropped the last reference.

Okay, sorry for the noise then.

Jan
On 16.10.20 11:41, Julien Grall wrote:

Hi Jan, Julien

Sorry for the late response.

> Hi Jan,
>
> On 16/10/2020 07:29, Jan Beulich wrote:
>> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>>> @@ -1067,7 +1068,14 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>>>       */
>>>      if ( p2m_is_valid(orig_pte) &&
>>>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>>> +    {
>>> +#ifdef CONFIG_IOREQ_SERVER
>>> +        if ( domain_has_ioreq_server(p2m->domain) &&
>>> +             (p2m->domain == current->domain) && p2m_is_ram(orig_pte.p2m.type) )
>>> +            p2m->domain->qemu_mapcache_invalidate = true;
>>> +#endif
>>>          p2m_free_entry(p2m, orig_pte, level);
>>> +    }
>>
>> For all I have to say here, please bear in mind that I don't know
>> the internals of Arm memory management.
>>
>> The first odd thing here is the merely MFN-based condition. It may
>> well be that's sufficient, if there's no way to get a "not present"
>> entry with an MFN matching any valid MFN. (This isn't just with
>> your addition, but even before.)
>
> Invalid entries are always zeroed. So in theory the problem could
> arise if MFN 0 is used in the guest. It should not be possible on
> staging, but I agree this should be fixed.
>
>> Given how p2m_free_entry() works (or is supposed to work in the
>> long run), is the new code you add guaranteed to only alter leaf
>> entries?
>
> This path may also be called with tables. I think we want to move the
> check in p2m_free_entry() so we can find the correct leaf type.

Well, but inside p2m_free_entry() we don't have a new entry in order
to check whether the new MFN is actually different. An extra arg only
comes to mind...
[Didn't update call sites yet and didn't test]

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 5bb23df..4001f46 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -739,7 +739,7 @@ static void p2m_put_l3_page(const lpae_t pte)
 
 /* Free lpae sub-tree behind an entry */
 static void p2m_free_entry(struct p2m_domain *p2m,
-                           lpae_t entry, unsigned int level)
+                           lpae_t entry, lpae_t new_entry, unsigned int level)
 {
     unsigned int i;
     lpae_t *table;
@@ -750,17 +750,19 @@ static void p2m_free_entry(struct p2m_domain *p2m,
     if ( !p2m_is_valid(entry) )
         return;
 
-    /* Nothing to do but updating the stats if the entry is a super-page. */
-    if ( p2m_is_superpage(entry, level) )
+    if ( p2m_is_superpage(entry, level) || (level == 3) )
     {
-        p2m->stats.mappings[level]--;
-        return;
-    }
+#ifdef CONFIG_IOREQ_SERVER
+        if ( domain_has_ioreq_server(p2m->domain) &&
+             (p2m->domain == current->domain) &&
+             !mfn_eq(lpae_get_mfn(new_entry), lpae_get_mfn(entry)) &&
+             p2m_is_ram(entry.p2m.type) )
+            p2m->domain->qemu_mapcache_invalidate = true;
+#endif
 
-    if ( level == 3 )
-    {
         p2m->stats.mappings[level]--;
-        p2m_put_l3_page(entry);
+        if ( level == 3 )
+            p2m_put_l3_page(entry);
         return;
     }
Hi Oleksandr,

On 11/11/2020 00:03, Oleksandr wrote:
> On 16.10.20 11:41, Julien Grall wrote:
>> On 16/10/2020 07:29, Jan Beulich wrote:
>>> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>>>> @@ -1067,7 +1068,14 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>>>>       */
>>>>      if ( p2m_is_valid(orig_pte) &&
>>>>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>>>> +    {
>>>> +#ifdef CONFIG_IOREQ_SERVER
>>>> +        if ( domain_has_ioreq_server(p2m->domain) &&
>>>> +             (p2m->domain == current->domain) && p2m_is_ram(orig_pte.p2m.type) )
>>>> +            p2m->domain->qemu_mapcache_invalidate = true;
>>>> +#endif
>>>>          p2m_free_entry(p2m, orig_pte, level);
>>>> +    }
>>>
>>> For all I have to say here, please bear in mind that I don't know
>>> the internals of Arm memory management.
>>>
>>> The first odd thing here is the merely MFN-based condition. It may
>>> well be that's sufficient, if there's no way to get a "not present"
>>> entry with an MFN matching any valid MFN. (This isn't just with
>>> your addition, but even before.)
>>
>> Invalid entries are always zeroed. So in theory the problem could
>> arise if MFN 0 is used in the guest. It should not be possible on
>> staging, but I agree this should be fixed.
>>
>>> Given how p2m_free_entry() works (or is supposed to work in the
>>> long run), is the new code you add guaranteed to only alter leaf
>>> entries?
>>
>> This path may also be called with tables. I think we want to move the
>> check in p2m_free_entry() so we can find the correct leaf type.
>
> Well, but inside p2m_free_entry() we don't have a new entry in order
> to check whether the new MFN is actually different. An extra arg only
> comes to mind...

Aside from the recursive call, there are two users of p2m_free_entry():
  - When we fail to shatter a superpage in OOM
  - When the entry is replaced by an entry with a different base

I wouldn't be too concerned about sending a spurious mapcache
invalidation in an error path. So I don't think you need to know the
new entry.

What you need to know is the type of the leaf.

Cheers,
On 11.11.20 21:27, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien.

> On 11/11/2020 00:03, Oleksandr wrote:
>> On 16.10.20 11:41, Julien Grall wrote:
>>> On 16/10/2020 07:29, Jan Beulich wrote:
>>>> On 15.10.2020 18:44, Oleksandr Tyshchenko wrote:
>>>>> @@ -1067,7 +1068,14 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>>>>>       */
>>>>>      if ( p2m_is_valid(orig_pte) &&
>>>>>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>>>>> +    {
>>>>> +#ifdef CONFIG_IOREQ_SERVER
>>>>> +        if ( domain_has_ioreq_server(p2m->domain) &&
>>>>> +             (p2m->domain == current->domain) && p2m_is_ram(orig_pte.p2m.type) )
>>>>> +            p2m->domain->qemu_mapcache_invalidate = true;
>>>>> +#endif
>>>>>          p2m_free_entry(p2m, orig_pte, level);
>>>>> +    }
>>>>
>>>> For all I have to say here, please bear in mind that I don't know
>>>> the internals of Arm memory management.
>>>>
>>>> The first odd thing here is the merely MFN-based condition. It may
>>>> well be that's sufficient, if there's no way to get a "not present"
>>>> entry with an MFN matching any valid MFN. (This isn't just with
>>>> your addition, but even before.)
>>>
>>> Invalid entries are always zeroed. So in theory the problem could
>>> arise if MFN 0 is used in the guest. It should not be possible on
>>> staging, but I agree this should be fixed.
>>>
>>>> Given how p2m_free_entry() works (or is supposed to work in the
>>>> long run), is the new code you add guaranteed to only alter leaf
>>>> entries?
>>>
>>> This path may also be called with tables. I think we want to move
>>> the check in p2m_free_entry() so we can find the correct leaf type.
>>
>> Well, but inside p2m_free_entry() we don't have a new entry in order
>> to check whether the new MFN is actually different. An extra arg only
>> comes to mind...

> Aside from the recursive call, there are two users of p2m_free_entry():
>   - When we fail to shatter a superpage in OOM
>   - When the entry is replaced by an entry with a different base
>
> I wouldn't be too concerned about sending a spurious mapcache
> invalidation in an error path. So I don't think you need to know the
> new entry.

Thank you for the clarification, sounds reasonable.

> What you need to know is the type of the leaf.

Yes, to check whether it is a RAM page.
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 370173c..2693b0c 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1,6 +1,7 @@
 #include <xen/cpu.h>
 #include <xen/domain_page.h>
 #include <xen/iocap.h>
+#include <xen/ioreq.h>
 #include <xen/lib.h>
 #include <xen/sched.h>
 #include <xen/softirq.h>
@@ -1067,7 +1068,14 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
      */
     if ( p2m_is_valid(orig_pte) &&
          !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
+    {
+#ifdef CONFIG_IOREQ_SERVER
+        if ( domain_has_ioreq_server(p2m->domain) &&
+             (p2m->domain == current->domain) && p2m_is_ram(orig_pte.p2m.type) )
+            p2m->domain->qemu_mapcache_invalidate = true;
+#endif
         p2m_free_entry(p2m, orig_pte, level);
+    }
 
 out:
     unmap_domain_page(table);
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a8f5fdf..9eaa342 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1442,6 +1442,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
                               const union hsr hsr)
 {
     arm_hypercall_fn_t call = NULL;
+    struct vcpu *v = current;
 
     BUILD_BUG_ON(NR_hypercalls < ARRAY_SIZE(arm_hypercall_table) );
 
@@ -1458,7 +1459,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
         return;
     }
 
-    current->hcall_preempted = false;
+    v->hcall_preempted = false;
 
     perfc_incra(hypercalls, *nr);
     call = arm_hypercall_table[*nr].fn;
@@ -1471,7 +1472,7 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
     HYPERCALL_RESULT_REG(regs) = call(HYPERCALL_ARGS(regs));
 
 #ifndef NDEBUG
-    if ( !current->hcall_preempted )
+    if ( !v->hcall_preempted )
     {
         /* Deliberately corrupt parameter regs used by this hypercall. */
         switch ( arm_hypercall_table[*nr].nr_args ) {
@@ -1488,8 +1489,14 @@ static void do_trap_hypercall(struct cpu_user_regs *regs, register_t *nr,
 #endif
 
     /* Ensure the hypercall trap instruction is re-executed. */
-    if ( current->hcall_preempted )
+    if ( v->hcall_preempted )
         regs->pc -= 4; /* re-execute 'hvc #XEN_HYPERCALL_TAG' */
+
+#ifdef CONFIG_IOREQ_SERVER
+    if ( unlikely(v->domain->qemu_mapcache_invalidate) &&
+         test_and_clear_bool(v->domain->qemu_mapcache_invalidate) )
+        send_invalidate_ioreq();
+#endif
 }
 
 void arch_hypercall_tasklet_result(struct vcpu *v, long res)