From patchwork Wed Feb 11 08:08:51 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sheng Yang X-Patchwork-Id: 6629 Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by demeter.kernel.org (8.14.2/8.14.2) with ESMTP id n1B89EYn014045 for ; Wed, 11 Feb 2009 08:09:14 GMT Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752645AbZBKII6 (ORCPT ); Wed, 11 Feb 2009 03:08:58 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753861AbZBKII6 (ORCPT ); Wed, 11 Feb 2009 03:08:58 -0500 Received: from mga14.intel.com ([143.182.124.37]:64479 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752645AbZBKII4 (ORCPT ); Wed, 11 Feb 2009 03:08:56 -0500 Received: from azsmga001.ch.intel.com ([10.2.17.19]) by azsmga102.ch.intel.com with ESMTP; 11 Feb 2009 00:08:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.38,191,1233561600"; d="scan'208";a="109465959" Received: from syang10-desktop.sh.intel.com (HELO syang10-desktop) ([10.239.13.102]) by azsmga001.ch.intel.com with ESMTP; 11 Feb 2009 00:08:54 -0800 Received: from yasker by syang10-desktop with local (Exim 4.69) (envelope-from ) id 1LXA9L-0000q1-UL; Wed, 11 Feb 2009 16:08:51 +0800 From: Sheng Yang To: Avi Kivity Cc: kvm@vger.kernel.org, Sheng Yang Subject: [PATCH 3/3] KVM: Enable MSI-X for KVM assigned device Date: Wed, 11 Feb 2009 16:08:51 +0800 Message-Id: <1234339731-3195-4-git-send-email-sheng@linux.intel.com> X-Mailer: git-send-email 1.5.6.3 In-Reply-To: <1234339731-3195-1-git-send-email-sheng@linux.intel.com> References: <1234339731-3195-1-git-send-email-sheng@linux.intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch finally enable MSI-X. What we need for MSI-X: 1. Intercept one page in MMIO region of device. So that we can get guest desired MSI-X table and set up the real one. Now this have been done by guest, and transfer to kernel using ioctl KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY. 2. Information for incoming interrupt. Now one device can have more than one interrupt, and they are all handled by one workqueue structure. So we need to identify them. The previous patch enable gsi_msg_pending_bitmap get this done. 3. Mapping from host IRQ to guest gsi as well as guest gsi to real MSI/MSI-X message address/data. We used same entry number for the host and guest here, so that it's easy to find the correlated guest gsi. What we lack for now: 1. The PCI spec said nothing can existed with MSI-X table in the same page of MMIO region, except pending bits. The patch ignore pending bits as the first step (so they are always 0 - no pending). 2. The PCI spec allowed to change MSI-X table dynamically. That means, the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The patch didn't support this, and Linux also don't work in this way. 3. The patch didn't implement MSI-X mask all and mask single entry. I would implement the former in driver/pci/msi.c later. And for single entry, userspace should have reposibility to handle it. Signed-off-by: Sheng Yang --- include/linux/kvm.h | 8 ++++ virt/kvm/kvm_main.c | 107 ++++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 109 insertions(+), 6 deletions(-) diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 5200768..fcd08da 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -439,6 +439,9 @@ struct kvm_irq_routing { }; #endif +#if defined(CONFIG_X86) +#define KVM_CAP_DEVICE_MSIX 26 +#endif /* * ioctls for VM fds @@ -596,6 +599,11 @@ struct kvm_assigned_irq { #define KVM_DEV_IRQ_ASSIGN_MSI_ACTION KVM_DEV_IRQ_ASSIGN_ENABLE_MSI #define KVM_DEV_IRQ_ASSIGN_ENABLE_MSI (1 << 0) +#define KVM_DEV_IRQ_ASSIGN_MSIX_ACTION (KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX |\ + KVM_DEV_IRQ_ASSIGN_MASK_MSIX) +#define KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX (1 << 1) +#define KVM_DEV_IRQ_ASSIGN_MASK_MSIX (1 << 2) + struct kvm_assigned_msix_nr { __u32 assigned_dev_id; __u16 entry_nr; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 961603f..c774027 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -281,13 +281,33 @@ static void kvm_free_assigned_irq(struct kvm *kvm, * now, the kvm state is still legal for probably we also have to wait * interrupt_work done. */ - disable_irq_nosync(assigned_dev->host_irq); - cancel_work_sync(&assigned_dev->interrupt_work); + if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) { + int i; + for (i = 0; i < assigned_dev->entries_nr; i++) + disable_irq_nosync(assigned_dev-> + host_msix_entries[i].vector); + + cancel_work_sync(&assigned_dev->interrupt_work); + + for (i = 0; i < assigned_dev->entries_nr; i++) + free_irq(assigned_dev->host_msix_entries[i].vector, + (void *)assigned_dev); + + assigned_dev->entries_nr = 0; + kfree(assigned_dev->host_msix_entries); + kfree(assigned_dev->guest_msix_entries); + pci_disable_msix(assigned_dev->dev); + } else { + /* Deal with MSI and INTx */ + disable_irq_nosync(assigned_dev->host_irq); + cancel_work_sync(&assigned_dev->interrupt_work); - free_irq(assigned_dev->host_irq, (void *)assigned_dev); + free_irq(assigned_dev->host_irq, (void *)assigned_dev); - if (assigned_dev->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI) - pci_disable_msi(assigned_dev->dev); + if (assigned_dev->irq_requested_type & + KVM_ASSIGNED_DEV_HOST_MSI) + pci_disable_msi(assigned_dev->dev); + } assigned_dev->irq_requested_type = 0; } @@ -416,6 +436,69 @@ static int assigned_device_update_msi(struct kvm *kvm, adev->irq_requested_type |= KVM_ASSIGNED_DEV_HOST_MSI; return 0; } + +static int assigned_device_update_msix(struct kvm *kvm, + struct kvm_assigned_dev_kernel *adev, + struct kvm_assigned_irq *airq) +{ + /* TODO Deal with KVM_DEV_IRQ_ASSIGNED_MASK_MSIX */ + int i, r; + + adev->ack_notifier.gsi = -1; + + if (irqchip_in_kernel(kvm)) { + if (airq->flags & KVM_DEV_IRQ_ASSIGN_MASK_MSIX) { + printk(KERN_WARNING + "kvm: unsupported mask MSI-X, flags 0x%x!\n", + airq->flags); + return 0; + } + + if (!(airq->flags & KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX)) { + /* Guest disable MSI-X */ + kvm_free_assigned_irq(kvm, adev); + if (msi2intx) { + pci_enable_msi(adev->dev); + if (adev->dev->msi_enabled) + return assigned_device_update_msi(kvm, + adev, airq); + } + return assigned_device_update_intx(kvm, adev, airq); + } + + /* host_msix_entries and guest_msix_entries should have been + * initialized */ + + if (adev->entries_nr == 0) { + printk(KERN_ERR + "kvm: MSI-X entry not initialized!\n"); + return -EFAULT; + } + + kvm_free_assigned_irq(kvm, adev); + + r = pci_enable_msix(adev->dev, adev->host_msix_entries, + adev->entries_nr); + if (r) { + printk(KERN_ERR "Fail to enable MSI-X feature!\n"); + return r; + } + + for (i = 0; i < adev->entries_nr; i++) { + r = request_irq((adev->host_msix_entries + i)->vector, + kvm_assigned_dev_intr, 0, + "kvm_assigned_msix_device", + (void *)adev); + if (r) + return r; + } + } + + adev->irq_requested_type |= KVM_ASSIGNED_DEV_MSIX; + + return 0; +} + #endif static int kvm_vm_ioctl_assign_irq(struct kvm *kvm, @@ -462,12 +545,24 @@ static int kvm_vm_ioctl_assign_irq(struct kvm *kvm, } } - if ((match->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI) && + if (match->irq_requested_type & KVM_ASSIGNED_DEV_MSIX) + current_flags |= KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX; + else if ((match->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI) && (match->irq_requested_type & KVM_ASSIGNED_DEV_GUEST_MSI)) current_flags |= KVM_DEV_IRQ_ASSIGN_ENABLE_MSI; changed_flags = assigned_irq->flags ^ current_flags; +#ifdef CONFIG_X86 + if (changed_flags & KVM_DEV_IRQ_ASSIGN_MSIX_ACTION) { + r = assigned_device_update_msix(kvm, match, assigned_irq); + if (r) { + printk(KERN_WARNING "kvm: failed to execute " + "MSI-X action!\n"); + goto out_release; + } + } else +#endif if ((changed_flags & KVM_DEV_IRQ_ASSIGN_MSI_ACTION) || (msi2intx && match->dev->msi_enabled)) { #ifdef CONFIG_X86