Message ID | 1468416032-7692-8-git-send-email-suravee.suthikulpanit@amd.com (mailing list archive) |
---|---|
State | New, archived |
Hi Radim,

On 7/13/16 21:14, Radim Krčmář wrote:
> [I pasted v3 reviews prefixed with a pipe where I think they still apply.]
>
> 2016-07-13 08:20-0500, Suravee Suthikulpanit:
>> From: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
>>
>> Introduces a new IOMMU API, amd_iommu_update_ga(), which allows
>> KVM (SVM) to update existing posted interrupt IOMMU IRTE when
>> load/unload vcpu.
>>
>> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
>> ---
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> @@ -4461,4 +4461,69 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
>> +int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 vm_id,
>> +			u64 base, bool is_run)
>
> |2016-07-13 15:49+0700, Suravee Suthikulpanit:
> |> On 07/12/2016 01:59 AM, Radim Krčmář wrote:
> |>> Not just in this function does the interface between svm and iommu split
> |>> ga_tag into its two components (vcpu_id and ga_tag), but it seems that
> |>> the combined value could always be used instead ...
> |>> Is there an advantage to passing two values?
> |>
> |> Here, the amd_iommu_update_ga() takes the two separate value for input
> |> parameters. Mainly the ga_tag (which is really the vm_id) and vcpu_id. This
> |> allow IOMMU driver to decide how to encode the GATAG to be programmed into
> |> the IRTE. Currently, the actual GATAG is a 16-bit value, <vm_id><vcpu_id>.
> |> This keeps the interface independent from how we encode the GATAG.
>
> I was thinking about making the IOMMU unaware how SVM or other Linux
> hypervisors use the ga_tag, i.e. passing the final u32 ga_tag.
> For example 32 bit hypervisor doesn't need to use lookup, because any
> pointer can be used as the ga_tag directly.

Ahh....... (w/ a big light bulb)
I get your point now. Let's just have SVM (or other hypervisor) define what
the tag should be and just pass-on the value to IOMMU. IOMMU can just simply
use this w/o knowing what it is. Sorry, I'm slow :)

> And there are other viable algorithms for assigning the ga_tag --
> why isn't the vm_id 24 bits?

Good point! Actually, I am somehow limited to a 30-bit hash value. So, the
VM_ID can be 22 bits. I'll make that change.

>
>> +	unsigned long flags;
>> +	struct amd_iommu *iommu;
>> +
>> +	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
>> +		return 0;
>> +
>> +	for_each_iommu(iommu) {
>> +		struct amd_ir_data *ir_data;
>> +
>> +		spin_lock_irqsave(&iommu->gatag_ir_hash_lock, flags);
>> +
>> +		/* Note:
>> +		 * We need to update all interrupt remapping table entries
>> +		 * for targeting the specified vcpu. Here, we use gatag
>> +		 * as a hash key and iterate through all entries in the bucket.
>> +		 */
>> +		hash_for_each_possible(iommu->gatag_ir_hash, ir_data, hnode,
>> +				       AMD_IOMMU_GATAG(vm_id, vcpu_id)) {
>> +			struct irte_ga *irte = (struct irte_ga *) ir_data->entry;
>
> |>> (The ga_tag check is missing here too.)
> |>
> |> Here, the intention is to update all interrupt remapping entries in the
> |> bucket w/ the same GATAG (i.e. vm_id + vcpu_id), where GATAG =
> |> AMD_IOMMU_GATAG(vm_id, vcpu_id).
>
> Which is why you need to check that
>   AMD_IOMMU_GATAG(vm_id, vcpu_id) == entry->fields_vapic.ga_tag
>
> The hashing function can map two different vm_id + vcpu_id to the same
> bucket and hash_for_each_possible() would return both of them, but only
> one belongs to the VCPU that we want to update.
>
> (And shouldn't there be only one match?)

Actually, with your suggestion above, the hash key would be (vm_id &
0x3FFFFF << 8)| (vcpu_id & 0xFF). So, it should be unique for each vcpu of
each vm, or am I still missing something?

Also, since we will not be passing the vmid and vcpuid as separate value,
and just passing the (u32)ga_tag, we would not be able to do the check you
suggested here.

>
>> +
>> +			if (!irte->lo.fields_vapic.guest_mode)
>> +				continue;
>> +
>> +			update_irte_ga((struct irte_ga *)ir_data->ref,
>> +				       ir_data->irq_2_irte.devid,
>> +				       base, cpu, is_run);
>
> |>> (The lookup leading up to here is avoidable -- svm, the caller, has the
> |>> ability to map ga_tag into irte/ir_data directly with a pointer.

I'm not sure about this optimization to avoid look up.

The struct amd_ir_data is part of the IOMMU driver, and the SVM knows
nothing about it. I don't think it would be able to find out the pointer to
amd_ir_data/irte.

Also, with the current design, each ga_tag can be mapped to different irte
since there could be multiple interrupts targeting a particular cpu. Here,
we would want to update all of the IRTEs with the same ga_tag.

> |>> I'm not sure if the lookup is slow enough to pardon optimization, but
> |>> it might make the code simpler as well.)
> |>
> |> I might have mislead you up to this point. Not sure if the assumption here
> |> still hold with my explanation above. Sorry for confusion.
>
> SVM configures IOMMU with ga_tag, so IOMMU could return the pointer to
> ir_data/irte that was just configured.

Also, IIUC, you want to use the pointer to ir_data/irte as the ga_tag value.
The issue would be ga_tag is a 32-bit value, and this would not work with
64-bit address.

> SVM would couple it with a VCPU
> (and hence a ga_tag) and when amd_iommu_update_ga() was needed, SVM
> would pass the ir_data/irte pointer directly, instead of looking it up
> through a ga_tag.

Please let me know if I am still missing any points.

Thanks,
Suravee
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/14/16 16:13, Suravee Suthikulpanit wrote:
>>> +	unsigned long flags;
>>> +	struct amd_iommu *iommu;
>>> +
>>> +	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
>>> +		return 0;
>>> +
>>> +	for_each_iommu(iommu) {
>>> +		struct amd_ir_data *ir_data;
>>> +
>>> +		spin_lock_irqsave(&iommu->gatag_ir_hash_lock, flags);
>>> +
>>> +		/* Note:
>>> +		 * We need to update all interrupt remapping table entries
>>> +		 * for targeting the specified vcpu. Here, we use gatag
>>> +		 * as a hash key and iterate through all entries in the bucket.
>>> +		 */
>>> +		hash_for_each_possible(iommu->gatag_ir_hash, ir_data, hnode,
>>> +				       AMD_IOMMU_GATAG(vm_id, vcpu_id)) {
>>> +			struct irte_ga *irte = (struct irte_ga *) ir_data->entry;
>>
>> |>> (The ga_tag check is missing here too.)
>> |>
>> |> Here, the intention is to update all interrupt remapping entries in the
>> |> bucket w/ the same GATAG (i.e. vm_id + vcpu_id), where GATAG =
>> |> AMD_IOMMU_GATAG(vm_id, vcpu_id).
>>
>> Which is why you need to check that
>>   AMD_IOMMU_GATAG(vm_id, vcpu_id) == entry->fields_vapic.ga_tag
>>
>> The hashing function can map two different vm_id + vcpu_id to the same
>> bucket and hash_for_each_possible() would return both of them, but only
>> one belongs to the VCPU that we want to update.
>>
>> (And shouldn't there be only one match?)
>
> Actually, with your suggestion above, the hash key would be (vm_id &
> 0x3FFFFF << 8)| (vcpu_id & 0xFF). So, it should be unique for each vcpu
> of each vm, or am I still missing something?

Ok, one scenario would be when SVM runs out of VM_IDs and has to start
re-using them. Since we want SVM to generate the ga_tag and just pass it
into the IOMMU driver for it to program the IRTE, we can probably make an
assumption that SVM would make sure that the ga_tag does not conflict for
each vm_id/vcpu_id.
Thanks,
Suravee
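The tag layout discussed in the two mails above (a 22-bit vm_id placed above an 8-bit vcpu_id, 30 bits in total) can be sketched in plain userspace C. This is only an illustration under those assumptions: make_ga_tag(), the two unpack helpers and the GA_TAG_* macros are hypothetical names and are not part of the patch. One detail worth noting is that the mask has to be applied before the shift; as written in the mail, `vm_id & 0x3FFFFF << 8` parses as `vm_id & (0x3FFFFF << 8)`, because `<<` binds tighter than `&` in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths taken from the discussion:
 * 22-bit vm_id in the upper part, 8-bit vcpu_id in the lower part. */
#define GA_TAG_VCPU_BITS 8u
#define GA_TAG_VM_MASK   0x3FFFFFu
#define GA_TAG_VCPU_MASK 0xFFu

/* Pack vm_id and vcpu_id into one 30-bit tag: mask first, then shift. */
static inline uint32_t make_ga_tag(uint32_t vm_id, uint32_t vcpu_id)
{
	return ((vm_id & GA_TAG_VM_MASK) << GA_TAG_VCPU_BITS) |
	       (vcpu_id & GA_TAG_VCPU_MASK);
}

/* Recover the vm_id part of a tag built by make_ga_tag(). */
static inline uint32_t ga_tag_to_vm_id(uint32_t ga_tag)
{
	return (ga_tag >> GA_TAG_VCPU_BITS) & GA_TAG_VM_MASK;
}

/* Recover the vcpu_id part of a tag built by make_ga_tag(). */
static inline uint32_t ga_tag_to_vcpu_id(uint32_t ga_tag)
{
	return ga_tag & GA_TAG_VCPU_MASK;
}
```

With this layout two tags are equal exactly when both masked ids are equal, so uniqueness of the tag depends on the caller never handing out the same vm_id to two live VMs; a vm_id wider than 22 bits would silently alias a smaller one.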
2016-07-14 16:13+0700, Suravee Suthikulpanit:
> On 7/13/16 21:14, Radim Krčmář wrote:
>> 2016-07-13 08:20-0500, Suravee Suthikulpanit:
>> > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> > @@ -4461,4 +4461,69 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
>> > +int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 vm_id,
>> > +			u64 base, bool is_run)
>>
>> |2016-07-13 15:49+0700, Suravee Suthikulpanit:
>> |> On 07/12/2016 01:59 AM, Radim Krčmář wrote:
>> |>> Not just in this function does the interface between svm and iommu split
>> |>> ga_tag into its two components (vcpu_id and ga_tag), but it seems that
>> |>> the combined value could always be used instead ...
>> |>> Is there an advantage to passing two values?
>> |>
>> |> Here, the amd_iommu_update_ga() takes the two separate value for input
>> |> parameters. Mainly the ga_tag (which is really the vm_id) and vcpu_id. This
>> |> allow IOMMU driver to decide how to encode the GATAG to be programmed into
>> |> the IRTE. Currently, the actual GATAG is a 16-bit value, <vm_id><vcpu_id>.
>> |> This keeps the interface independent from how we encode the GATAG.
>>
>> I was thinking about making the IOMMU unaware how SVM or other Linux
>> hypervisors use the ga_tag, i.e. passing the final u32 ga_tag.
>> For example 32 bit hypervisor doesn't need to use lookup, because any
>> pointer can be used as the ga_tag directly.
>
> Ahh....... (w/ a big light bulb)
> I get your point now. Let's just have SVM (or other hypervisor) define what
> the tag should be and just pass-on the value to IOMMU. IOMMU can just simply
> use this w/o knowing what it is. Sorry, I'm slow :)

That is what I meant, but misunderstanding is a product of both
participants. I didn't write it clearly on the first try.

>> > +		hash_for_each_possible(iommu->gatag_ir_hash, ir_data, hnode,
>> > +				       AMD_IOMMU_GATAG(vm_id, vcpu_id)) {
>> > +			struct irte_ga *irte = (struct irte_ga *) ir_data->entry;
>>
>> |>> (The ga_tag check is missing here too.)
>> |>
>> |> Here, the intention is to update all interrupt remapping entries in the
>> |> bucket w/ the same GATAG (i.e. vm_id + vcpu_id), where GATAG =
>> |> AMD_IOMMU_GATAG(vm_id, vcpu_id).
>>
>> Which is why you need to check that
>>   AMD_IOMMU_GATAG(vm_id, vcpu_id) == entry->fields_vapic.ga_tag
>>
>> The hashing function can map two different vm_id + vcpu_id to the same
>> bucket and hash_for_each_possible() would return both of them, but only
>> one belongs to the VCPU that we want to update.
>>
>> (And shouldn't there be only one match?)
>
> Actually, with your suggestion above, the hash key would be (vm_id &
> 0x3FFFFF << 8)| (vcpu_id & 0xFF). So, it should be unique for each vcpu of
> each vm, or am I still missing something?

[Reply in the followup mail.]

> Also, since we will not be passing the vmid and vcpuid as separate value,
> and just passing the (u32)ga_tag, we would not be able to do the check you
> suggested here.

There will be the u32 ga_tag argument, so you would still do

  ga_tag == entry->fields_vapic.ga_tag

Because even if the ga_tag is unique for every vcpu, the hash table will
mix various vcpus into one bucket and you need to filter them.

>> > +			update_irte_ga((struct irte_ga *)ir_data->ref,
>> > +				       ir_data->irq_2_irte.devid,
>> > +				       base, cpu, is_run);
>>
>> |>> (The lookup leading up to here is avoidable -- svm, the caller, has the
>> |>> ability to map ga_tag into irte/ir_data directly with a pointer.
>
> I'm not sure about this optimization to avoid look up.
>
> The struct amd_ir_data is part of the IOMMU driver, and the SVM knows
> nothing about it. I don't think it would be able to find out the pointer to
> amd_ir_data/irte.

Yeah, SVM would store it in a "void *" pointer, because it doesn't need
to know anything else, but you still need to retrieve it from IOMMU,
which could be done through vcpu_info argument to
amd_ir_set_vcpu_affinity().
(I am not sure if it doesn't breach isolation of IOMMU, so we might not
want to do it in any case ...)

> Also, with the current design, each ga_tag can be mapped to different irte
> since there could be multiple interrupts targeting a particular cpu. Here,
> we would want to update all of the IRTEs with the same ga_tag.

True, that design is good. SVM would need a list of pointers for each
vcpu to cope with it ...

>> |>> I'm not sure if the lookup is slow enough to pardon optimization, but
>> |>> it might make the code simpler as well.)
>> |>
>> |> I might have mislead you up to this point. Not sure if the assumption here
>> |> still hold with my explanation above. Sorry for confusion.
>>
>> SVM configures IOMMU with ga_tag, so IOMMU could return the pointer to
>> ir_data/irte that was just configured.
>
> Also, IIUC, you want to use the pointer to ir_data/irte as the ga_tag value.
> The issue would be ga_tag is a 32-bit value, and this would not work with
> 64-bit address.

I mean something slightly different. Instead of passing ga_tag into
amd_iommu_update_ga(), just pass void * of whatever IOMMU provided back
when SVM configured the interrupt. ga_tag will never come into play.

(The vcpu lookup from ga_tag is necessary, when processing the queue of
undelivered interrupts. ir_data lookup can be avoided.)
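The point about filtering inside a bucket can be illustrated with a small userspace sketch. It does not use the kernel's hashtable.h; struct ir_entry, bucket_of(), ir_hash_add() and update_by_ga_tag() are hypothetical stand-ins, and the table is deliberately tiny so that distinct tags land in the same bucket. The sketch only shows why walking a bucket is not enough: every entry in the bucket is visited, so an explicit tag comparison is needed before an entry is touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 4u	/* tiny on purpose, so collisions are easy to provoke */

/* Hypothetical stand-in for struct amd_ir_data: only what the sketch needs. */
struct ir_entry {
	uint32_t ga_tag;	/* tag programmed into the entry */
	int updated;		/* set once the entry has been "updated" */
	struct ir_entry *next;	/* bucket chain */
};

static struct ir_entry *buckets[NBUCKETS];

/* Map a tag to a bucket; many different tags share one bucket. */
static unsigned int bucket_of(uint32_t ga_tag)
{
	return ga_tag % NBUCKETS;
}

/* Insert an entry at the head of its bucket chain. */
static void ir_hash_add(struct ir_entry *e)
{
	unsigned int b = bucket_of(e->ga_tag);

	e->next = buckets[b];
	buckets[b] = e;
}

/* Walk the bucket selected by ga_tag and update exact matches only.
 * Returns the number of entries that were updated. */
static int update_by_ga_tag(uint32_t ga_tag)
{
	int n = 0;

	for (struct ir_entry *e = buckets[bucket_of(ga_tag)]; e; e = e->next) {
		if (e->ga_tag != ga_tag)
			continue;	/* same bucket, different vcpu */
		e->updated = 1;
		n++;
	}
	return n;
}
```

Several entries may legitimately carry the same tag (multiple interrupts targeting one vcpu), so the walk updates all exact matches rather than stopping at the first one.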
2016-07-14 16:33+0700, Suravee Suthikulpanit:
> On 7/14/16 16:13, Suravee Suthikulpanit wrote:
>> > > +	unsigned long flags;
>> > > +	struct amd_iommu *iommu;
>> > > +
>> > > +	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
>> > > +		return 0;
>> > > +
>> > > +	for_each_iommu(iommu) {
>> > > +		struct amd_ir_data *ir_data;
>> > > +
>> > > +		spin_lock_irqsave(&iommu->gatag_ir_hash_lock, flags);
>> > > +
>> > > +		/* Note:
>> > > +		 * We need to update all interrupt remapping table entries
>> > > +		 * for targeting the specified vcpu. Here, we use gatag
>> > > +		 * as a hash key and iterate through all entries in the bucket.
>> > > +		 */
>> > > +		hash_for_each_possible(iommu->gatag_ir_hash, ir_data, hnode,
>> > > +				       AMD_IOMMU_GATAG(vm_id, vcpu_id)) {
>> > > +			struct irte_ga *irte = (struct irte_ga *) ir_data->entry;
>> >
>> > |>> (The ga_tag check is missing here too.)
>> > |>
>> > |> Here, the intention is to update all interrupt remapping entries in the
>> > |> bucket w/ the same GATAG (i.e. vm_id + vcpu_id), where GATAG =
>> > |> AMD_IOMMU_GATAG(vm_id, vcpu_id).
>> >
>> > Which is why you need to check that
>> >   AMD_IOMMU_GATAG(vm_id, vcpu_id) == entry->fields_vapic.ga_tag
>> >
>> > The hashing function can map two different vm_id + vcpu_id to the same
>> > bucket and hash_for_each_possible() would return both of them, but only
>> > one belongs to the VCPU that we want to update.
>> >
>> > (And shouldn't there be only one match?)
>>
>> Actually, with your suggestion above, the hash key would be (vm_id &
>> 0x3FFFFF << 8)| (vcpu_id & 0xFF). So, it should be unique for each vcpu
>> of each vm, or am I still missing something?
>
> Ok, one scenario would be when SVM runs out of VM_IDs and has to start
> re-using them. Since we want SVM to generate the ga_tag and just pass it
> into the IOMMU driver for it to program the IRTE, we can probably make an
> assumption that SVM would make sure that the ga_tag does not conflict for
> each vm_id/vcpu_id.
I agree, it could enable doorbell to an unscheduled VCPU and therefore
lose the notification. The per-vcpu list of IRTEs would solve it as
well, but making sure that no two VMs have the same id might be easier
and 2^22 active VMs should be more than enough. :)
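The alternative mentioned in this thread, a per-vcpu list of opaque pointers handed back by the IOMMU side, might look roughly like the userspace sketch below. Everything in it is an assumption made for illustration: struct ir_record, struct vcpu_ir_list, the fixed list size and all function names are hypothetical, and the real driver would also need locking and IRT flushing. It only shows the shape of the idea: the hypervisor side stores `void *` handles without looking inside them, and an update on vcpu load/unload walks that list instead of doing a ga_tag lookup.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IRTES_PER_VCPU 8	/* arbitrary limit for the sketch */

/* Hypothetical IOMMU-private record; the hypervisor never looks inside. */
struct ir_record {
	int cpu;
	int is_run;
};

/* Hypothetical per-vcpu bookkeeping kept on the hypervisor side. */
struct vcpu_ir_list {
	void *handles[MAX_IRTES_PER_VCPU];
	size_t count;
};

/* Hypervisor side: remember a handle returned when an interrupt was
 * configured. Returns 0 on success, -1 if the list is full. */
static int vcpu_ir_list_add(struct vcpu_ir_list *l, void *handle)
{
	if (l->count >= MAX_IRTES_PER_VCPU)
		return -1;
	l->handles[l->count++] = handle;
	return 0;
}

/* IOMMU side: update one record through its opaque handle.
 * A negative cpu leaves the destination unchanged, as in the patch. */
static void iommu_update_handle(void *handle, int cpu, int is_run)
{
	struct ir_record *r = handle;

	if (cpu >= 0)
		r->cpu = cpu;
	r->is_run = is_run;
}

/* Hypervisor side: on vcpu load/unload, update every interrupt that
 * targets this vcpu without any tag-based lookup. */
static void vcpu_ir_list_update(const struct vcpu_ir_list *l, int cpu, int is_run)
{
	for (size_t i = 0; i < l->count; i++)
		iommu_update_handle(l->handles[i], cpu, is_run);
}
```

The trade-off raised in the thread still applies: this avoids the lookup and the dependence on unique vm_ids, at the cost of extra per-vcpu state and of exposing IOMMU-owned pointers to the caller.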
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index fe9b005..4a337dc 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4461,4 +4461,69 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)

 	return 0;
 }
+
+static int
+update_irte_ga(struct irte_ga *irte, unsigned int devid,
+	       u64 base, int cpu, bool is_run)
+{
+	struct irq_remap_table *irt = get_irq_table(devid, false);
+	unsigned long flags;
+
+	if (!irt)
+		return -ENODEV;
+
+	spin_lock_irqsave(&irt->lock, flags);
+
+	if (irte->lo.fields_vapic.guest_mode) {
+		irte->hi.fields.ga_root_ptr = (base >> 12);
+		if (cpu >= 0)
+			irte->lo.fields_vapic.destination = cpu;
+		irte->lo.fields_vapic.is_run = is_run;
+		barrier();
+	}
+
+	spin_unlock_irqrestore(&irt->lock, flags);
+
+	return 0;
+}
+
+int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 vm_id,
+			u64 base, bool is_run)
+{
+	unsigned long flags;
+	struct amd_iommu *iommu;
+
+	if (!AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))
+		return 0;
+
+	for_each_iommu(iommu) {
+		struct amd_ir_data *ir_data;
+
+		spin_lock_irqsave(&iommu->gatag_ir_hash_lock, flags);
+
+		/* Note:
+		 * We need to update all interrupt remapping table entries
+		 * for targeting the specified vcpu. Here, we use gatag
+		 * as a hash key and iterate through all entries in the bucket.
+		 */
+		hash_for_each_possible(iommu->gatag_ir_hash, ir_data, hnode,
+				       AMD_IOMMU_GATAG(vm_id, vcpu_id)) {
+			struct irte_ga *irte = (struct irte_ga *) ir_data->entry;
+
+			if (!irte->lo.fields_vapic.guest_mode)
+				continue;
+
+			update_irte_ga((struct irte_ga *)ir_data->ref,
+				       ir_data->irq_2_irte.devid,
+				       base, cpu, is_run);
+			iommu_flush_irt(iommu, ir_data->irq_2_irte.devid);
+			iommu_completion_wait(iommu);
+		}
+
+		spin_unlock_irqrestore(&iommu->gatag_ir_hash_lock, flags);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_update_ga);
 #endif
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 2ed353b..52160c8 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -844,6 +844,7 @@ struct amd_ir_data {
 	union {
 		struct msi_msg msi_entry;
 	};
+	void *ref;	/* Pointer to the actual irte */
 };

 #ifdef CONFIG_IRQ_REMAP
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index 940fdd8..a6fc022 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -179,6 +179,9 @@ static inline int amd_iommu_detect(void) { return -ENODEV; }

 /* IOMMU AVIC Function */
 extern int amd_iommu_register_ga_log_notifier(int (*notifier)(int, int));
+extern int
+amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 vm_id, u64 base, bool is_run);
+
 #else /* defined(CONFIG_AMD_IOMMU) && defined(CONFIG_IRQ_REMAP) */

 static inline int
@@ -187,6 +190,12 @@ amd_iommu_register_ga_log_notifier(int (*notifier)(int, int))
 	return 0;
 }

+static inline int
+amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 vm_id, u64 base, bool is_run)
+{
+	return 0;
+}
+
 #endif /* defined(CONFIG_AMD_IOMMU) && defined(CONFIG_IRQ_REMAP) */

 #endif /* _ASM_X86_AMD_IOMMU_H */