Message ID | 20221216114853.8227-7-julien@xen.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Remove the directmap | expand |
On 16.12.2022 12:48, Julien Grall wrote: > --- a/xen/arch/x86/x86_64/entry.S > +++ b/xen/arch/x86/x86_64/entry.S > @@ -165,7 +165,24 @@ restore_all_guest: > and %rsi, %rdi > and %r9, %rsi > add %rcx, %rdi > - add %rcx, %rsi > + > + /* > + * Without a direct map, we have to map first before copying. We only > + * need to map the guest root table but not the per-CPU root_pgt, > + * because the latter is still a xenheap page. > + */ > + pushq %r9 > + pushq %rdx > + pushq %rax > + pushq %rdi > + mov %rsi, %rdi > + shr $PAGE_SHIFT, %rdi > + callq map_domain_page > + mov %rax, %rsi > + popq %rdi > + /* Stash the pointer for unmapping later. */ > + pushq %rax > + > mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx > mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8 > mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi) > @@ -177,6 +194,14 @@ restore_all_guest: > sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \ > ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi > rep movsq > + > + /* Unmap the page. */ > + popq %rdi > + callq unmap_domain_page > + popq %rax > + popq %rdx > + popq %r9 While the PUSH/POP are part of what I dislike here, I think this wants doing differently: Establish a mapping when putting in place a new guest page table, and use the pointer here. This could be a new per-domain mapping, to limit its visibility. Jan
Hi, On 22/12/2022 11:12, Jan Beulich wrote: > On 16.12.2022 12:48, Julien Grall wrote: >> --- a/xen/arch/x86/x86_64/entry.S >> +++ b/xen/arch/x86/x86_64/entry.S >> @@ -165,7 +165,24 @@ restore_all_guest: >> and %rsi, %rdi >> and %r9, %rsi >> add %rcx, %rdi >> - add %rcx, %rsi >> + >> + /* >> + * Without a direct map, we have to map first before copying. We only >> + * need to map the guest root table but not the per-CPU root_pgt, >> + * because the latter is still a xenheap page. >> + */ >> + pushq %r9 >> + pushq %rdx >> + pushq %rax >> + pushq %rdi >> + mov %rsi, %rdi >> + shr $PAGE_SHIFT, %rdi >> + callq map_domain_page >> + mov %rax, %rsi >> + popq %rdi >> + /* Stash the pointer for unmapping later. */ >> + pushq %rax >> + >> mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx >> mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8 >> mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi) >> @@ -177,6 +194,14 @@ restore_all_guest: >> sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \ >> ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi >> rep movsq >> + >> + /* Unmap the page. */ >> + popq %rdi >> + callq unmap_domain_page >> + popq %rax >> + popq %rdx >> + popq %r9 > > While the PUSH/POP are part of what I dislike here, I think this wants > doing differently: Establish a mapping when putting in place a new guest > page table, and use the pointer here. This could be a new per-domain > mapping, to limit its visibility. I have looked at a per-domain approach and this looks way more complex than the few concise lines here (not mentioning the extra amount of memory). So I am not convinced this is worth the effort here. I don't have an other approach in mind. So are you disliking this approach to the point this will be nacked? Cheers,
On 23.12.2022 13:22, Julien Grall wrote: > Hi, > > On 22/12/2022 11:12, Jan Beulich wrote: >> On 16.12.2022 12:48, Julien Grall wrote: >>> --- a/xen/arch/x86/x86_64/entry.S >>> +++ b/xen/arch/x86/x86_64/entry.S >>> @@ -165,7 +165,24 @@ restore_all_guest: >>> and %rsi, %rdi >>> and %r9, %rsi >>> add %rcx, %rdi >>> - add %rcx, %rsi >>> + >>> + /* >>> + * Without a direct map, we have to map first before copying. We only >>> + * need to map the guest root table but not the per-CPU root_pgt, >>> + * because the latter is still a xenheap page. >>> + */ >>> + pushq %r9 >>> + pushq %rdx >>> + pushq %rax >>> + pushq %rdi >>> + mov %rsi, %rdi >>> + shr $PAGE_SHIFT, %rdi >>> + callq map_domain_page >>> + mov %rax, %rsi >>> + popq %rdi >>> + /* Stash the pointer for unmapping later. */ >>> + pushq %rax >>> + >>> mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx >>> mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8 >>> mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi) >>> @@ -177,6 +194,14 @@ restore_all_guest: >>> sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \ >>> ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi >>> rep movsq >>> + >>> + /* Unmap the page. */ >>> + popq %rdi >>> + callq unmap_domain_page >>> + popq %rax >>> + popq %rdx >>> + popq %r9 >> >> While the PUSH/POP are part of what I dislike here, I think this wants >> doing differently: Establish a mapping when putting in place a new guest >> page table, and use the pointer here. This could be a new per-domain >> mapping, to limit its visibility. > > I have looked at a per-domain approach and this looks way more complex > than the few concise lines here (not mentioning the extra amount of > memory). Yes, I do understand that would be a more intrusive change. > So I am not convinced this is worth the effort here. > > I don't have an other approach in mind. So are you disliking this > approach to the point this will be nacked? I guess I wouldn't nack it, but I also wouldn't provide an ack. I'm curious what Andrew or Roger think here... Jan
Hi Jan, On 04/01/2023 10:27, Jan Beulich wrote: > On 23.12.2022 13:22, Julien Grall wrote: >> Hi, >> >> On 22/12/2022 11:12, Jan Beulich wrote: >>> On 16.12.2022 12:48, Julien Grall wrote: >>>> --- a/xen/arch/x86/x86_64/entry.S >>>> +++ b/xen/arch/x86/x86_64/entry.S >>>> @@ -165,7 +165,24 @@ restore_all_guest: >>>> and %rsi, %rdi >>>> and %r9, %rsi >>>> add %rcx, %rdi >>>> - add %rcx, %rsi >>>> + >>>> + /* >>>> + * Without a direct map, we have to map first before copying. We only >>>> + * need to map the guest root table but not the per-CPU root_pgt, >>>> + * because the latter is still a xenheap page. >>>> + */ >>>> + pushq %r9 >>>> + pushq %rdx >>>> + pushq %rax >>>> + pushq %rdi >>>> + mov %rsi, %rdi >>>> + shr $PAGE_SHIFT, %rdi >>>> + callq map_domain_page >>>> + mov %rax, %rsi >>>> + popq %rdi >>>> + /* Stash the pointer for unmapping later. */ >>>> + pushq %rax >>>> + >>>> mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx >>>> mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8 >>>> mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi) >>>> @@ -177,6 +194,14 @@ restore_all_guest: >>>> sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \ >>>> ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi >>>> rep movsq >>>> + >>>> + /* Unmap the page. */ >>>> + popq %rdi >>>> + callq unmap_domain_page >>>> + popq %rax >>>> + popq %rdx >>>> + popq %r9 >>> >>> While the PUSH/POP are part of what I dislike here, I think this wants >>> doing differently: Establish a mapping when putting in place a new guest >>> page table, and use the pointer here. This could be a new per-domain >>> mapping, to limit its visibility. >> >> I have looked at a per-domain approach and this looks way more complex >> than the few concise lines here (not mentioning the extra amount of >> memory). > > Yes, I do understand that would be a more intrusive change. I could be persuaded to look at a more intrusive change if there are a good reason to do it. To me, at the moment, it mostly seem a matter of taste. So what would we gain from a perdomain mapping? > >> So I am not convinced this is worth the effort here. >> >> I don't have an other approach in mind. So are you disliking this >> approach to the point this will be nacked? > > I guess I wouldn't nack it, but I also wouldn't provide an ack. > I'm curious > what Andrew or Roger think here... Unfortunately Roger is on parental leaves for the next couple of months. It would be good to make some progress before hand. Andrew, what do you think?
On 13.01.2023 00:20, Julien Grall wrote: > On 04/01/2023 10:27, Jan Beulich wrote: >> On 23.12.2022 13:22, Julien Grall wrote: >>> On 22/12/2022 11:12, Jan Beulich wrote: >>>> On 16.12.2022 12:48, Julien Grall wrote: >>>>> --- a/xen/arch/x86/x86_64/entry.S >>>>> +++ b/xen/arch/x86/x86_64/entry.S >>>>> @@ -165,7 +165,24 @@ restore_all_guest: >>>>> and %rsi, %rdi >>>>> and %r9, %rsi >>>>> add %rcx, %rdi >>>>> - add %rcx, %rsi >>>>> + >>>>> + /* >>>>> + * Without a direct map, we have to map first before copying. We only >>>>> + * need to map the guest root table but not the per-CPU root_pgt, >>>>> + * because the latter is still a xenheap page. >>>>> + */ >>>>> + pushq %r9 >>>>> + pushq %rdx >>>>> + pushq %rax >>>>> + pushq %rdi >>>>> + mov %rsi, %rdi >>>>> + shr $PAGE_SHIFT, %rdi >>>>> + callq map_domain_page >>>>> + mov %rax, %rsi >>>>> + popq %rdi >>>>> + /* Stash the pointer for unmapping later. */ >>>>> + pushq %rax >>>>> + >>>>> mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx >>>>> mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8 >>>>> mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi) >>>>> @@ -177,6 +194,14 @@ restore_all_guest: >>>>> sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \ >>>>> ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi >>>>> rep movsq >>>>> + >>>>> + /* Unmap the page. */ >>>>> + popq %rdi >>>>> + callq unmap_domain_page >>>>> + popq %rax >>>>> + popq %rdx >>>>> + popq %r9 >>>> >>>> While the PUSH/POP are part of what I dislike here, I think this wants >>>> doing differently: Establish a mapping when putting in place a new guest >>>> page table, and use the pointer here. This could be a new per-domain >>>> mapping, to limit its visibility. >>> >>> I have looked at a per-domain approach and this looks way more complex >>> than the few concise lines here (not mentioning the extra amount of >>> memory). >> >> Yes, I do understand that would be a more intrusive change. > > I could be persuaded to look at a more intrusive change if there are a > good reason to do it. To me, at the moment, it mostly seem a matter of > taste. > > So what would we gain from a perdomain mapping? Rather than mapping/unmapping once per hypervisor entry/exit, we'd map just once per context switch. Plus we'd save ugly/fragile assembly code (apart from the push/pop I also dislike C functions being called from assembly which aren't really meant to be called this way: While these two may indeed be unlikely to ever change, any such change comes with the risk of the assembly callers being missed - the compiler won't tell you that e.g. argument types/count don't match parameters anymore). Jan
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S index ae012851819a..b72abf923d9c 100644 --- a/xen/arch/x86/x86_64/entry.S +++ b/xen/arch/x86/x86_64/entry.S @@ -165,7 +165,24 @@ restore_all_guest: and %rsi, %rdi and %r9, %rsi add %rcx, %rdi - add %rcx, %rsi + + /* + * Without a direct map, we have to map first before copying. We only + * need to map the guest root table but not the per-CPU root_pgt, + * because the latter is still a xenheap page. + */ + pushq %r9 + pushq %rdx + pushq %rax + pushq %rdi + mov %rsi, %rdi + shr $PAGE_SHIFT, %rdi + callq map_domain_page + mov %rax, %rsi + popq %rdi + /* Stash the pointer for unmapping later. */ + pushq %rax + mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8 mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi) @@ -177,6 +194,14 @@ restore_all_guest: sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \ ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi rep movsq + + /* Unmap the page. */ + popq %rdi + callq unmap_domain_page + popq %rax + popq %rdx + popq %r9 + .Lrag_copy_done: mov %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx) movb $1, STACK_CPUINFO_FIELD(use_pv_cr3)(%rdx)