From patchwork Mon Feb 27 01:45:46 2017
X-Patchwork-Submitter: Chao Gao
X-Patchwork-Id: 9592849
From: Chao Gao
To: xen-devel@lists.xen.org
Date: Mon, 27 Feb 2017 09:45:46 +0800
Message-Id: <1488159949-15011-6-git-send-email-chao.gao@intel.com>
In-Reply-To: <1488159949-15011-1-git-send-email-chao.gao@intel.com>
References: <1488159949-15011-1-git-send-email-chao.gao@intel.com>
Cc: Kevin Tian, Feng Wu, Jun Nakajima, George Dunlap, Andrew Cooper,
 Dario Faggioli, Jan Beulich, Chao Gao
Subject: [Xen-devel] [PATCH v9 5/8] VT-d: Introduce a new function
 update_irte_for_msi_common

Both pi_update_irte() and msi_msg_to_remap_entry() update the content of
an IRTE. In addition, the current msi_msg_to_remap_entry() is buggy when
the live IRTE is in posted format. This patch reworks these two functions
and makes them clearer by moving their common part into a new function,
update_irte_for_msi_common().

Signed-off-by: Feng Wu
Signed-off-by: Chao Gao
---
v9:
- Newly added.

 xen/drivers/passthrough/vtd/intremap.c | 232 +++++++++++++++++++--------------
 1 file changed, 131 insertions(+), 101 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index bfd468b..4269cd4 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -420,7 +420,8 @@ void io_apic_write_remap_rte(
     __ioapic_write_entry(apic, ioapic_pin, 1, old_rte);
 }
 
-static void set_msi_source_id(struct pci_dev *pdev, struct iremap_entry *ire)
+static void set_msi_source_id(const struct pci_dev *pdev,
+                              struct iremap_entry *ire)
 {
     u16 seg;
     u8 bus, devfn, secbus;
@@ -547,16 +548,116 @@ static int remap_entry_to_msi_msg(
     return 0;
 }
 
+/*
+ * This function is a common interface to update irte for msi case.
+ *
+ * If @pi_desc != NULL and @gvec != 0, the IRTE will be updated to a posted
+ * format. In this case, @msg is ignored because constructing a posted format
+ * IRTE doesn't need any information about the msi address or msi data.
+ *
+ * If @pi_desc == NULL and @gvec == 0, the IRTE will be updated to a remapped
+ * format. In this case, @msg can't be NULL.
+ *
+ * Assume 'ir_ctrl->iremap_lock' has been acquired and the remap_index
+ * of msi_desc has a benign value.
+ */
+static int update_irte_for_msi_common(
+    struct iommu *iommu, const struct pci_dev *pdev,
+    const struct msi_desc *msi_desc, struct msi_msg *msg,
+    const struct pi_desc *pi_desc, const uint8_t gvec)
+{
+    struct iremap_entry *iremap_entry = NULL, *iremap_entries;
+    struct iremap_entry new_ire = {{0}};
+    unsigned int index = msi_desc->remap_index;
+    struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
+
+    ASSERT( ir_ctrl );
+    ASSERT( spin_is_locked(&ir_ctrl->iremap_lock) );
+    ASSERT( (index >= 0) && (index < IREMAP_ENTRY_NR) );
+
+    if ( (!pi_desc && gvec) || (pi_desc && !gvec) )
+        return -EINVAL;
+
+    if ( !pi_desc && !gvec && !msg )
+        return -EINVAL;
+
+    GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
+                     iremap_entries, iremap_entry);
+
+    if ( !pi_desc )
+    {
+        /* Set interrupt remapping table entry */
+        new_ire.remap.dm = msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT;
+        new_ire.remap.tm = msg->data >> MSI_DATA_TRIGGER_SHIFT;
+        new_ire.remap.dlm = msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT;
+        /* Hardware require RH = 1 for LPR delivery mode */
+        new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
+        new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
+                               MSI_DATA_VECTOR_MASK;
+        if ( x2apic_enabled )
+            new_ire.remap.dst = msg->dest32;
+        else
+            new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
+                                 & 0xff) << 8;
+        new_ire.remap.p = 1;
+    }
+    else
+    {
+        new_ire.post.im = 1;
+        new_ire.post.vector = gvec;
+        new_ire.post.pda_l = virt_to_maddr(pi_desc) >> (32 - PDA_LOW_BIT);
+        new_ire.post.pda_h = virt_to_maddr(pi_desc) >> 32;
+        new_ire.post.p = 1;
+    }
+
+    if ( pdev )
+        set_msi_source_id(pdev, &new_ire);
+    else
+        set_hpet_source_id(msi_desc->hpet_id, &new_ire);
+
+    if ( iremap_entry->val != new_ire.val )
+    {
+        if ( cpu_has_cx16 )
+        {
+            __uint128_t ret;
+            struct iremap_entry old_ire;
+
+            old_ire = *iremap_entry;
+            ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
+
+            /*
+             * In the above, we use cmpxchg16 to atomically update the 128-bit
+             * IRTE, and the hardware cannot update the IRTE behind us, so
+             * the return value of cmpxchg16 should be the same as old_ire.
+             * This ASSERT validate it.
+             */
+            ASSERT(ret == old_ire.val);
+        }
+        else
+        {
+            iremap_entry->lo = new_ire.lo;
+            iremap_entry->hi = new_ire.hi;
+        }
+
+        iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+        iommu_flush_iec_index(iommu, 0, index);
+    }
+
+    unmap_vtd_domain_page(iremap_entries);
+    return 0;
+}
+
 static int msi_msg_to_remap_entry(
-    struct iommu *iommu, struct pci_dev *pdev,
+    struct iommu *iommu, const struct pci_dev *pdev,
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
     struct iremap_entry *iremap_entry = NULL, *iremap_entries;
-    struct iremap_entry new_ire;
     struct msi_msg_remap_entry *remap_rte;
     unsigned int index, i, nr = 1;
     unsigned long flags;
     struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu);
+    void *pi_desc;
+    int gvec;
 
     if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
         nr = msi_desc->msi.nvec;
@@ -592,38 +693,33 @@ static int msi_msg_to_remap_entry(
         return -EFAULT;
     }
 
+    /* Get the IRTE's bind relationship with guest from the live IRTE. */
     GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index,
                      iremap_entries, iremap_entry);
-
-    memcpy(&new_ire, iremap_entry, sizeof(struct iremap_entry));
-
-    /* Set interrupt remapping table entry */
-    new_ire.remap.fpd = 0;
-    new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
-    new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
-    new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
-    /* Hardware require RH = 1 for LPR delivery mode */
-    new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
-    new_ire.remap.avail = 0;
-    new_ire.remap.res_1 = 0;
-    new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
-                           MSI_DATA_VECTOR_MASK;
-    new_ire.remap.res_2 = 0;
-    if ( x2apic_enabled )
-        new_ire.remap.dst = msg->dest32;
+    if ( !iremap_entry->remap.im )
+    {
+        gvec = 0;
+        pi_desc = NULL;
+    }
     else
-        new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
-                             & 0xff) << 8;
+    {
+        gvec = iremap_entry->post.vector;
+        pi_desc = (void *)((((u64)iremap_entry->post.pda_h) << PDA_LOW_BIT ) +
+                           iremap_entry->post.pda_l);
+    }
+    unmap_vtd_domain_page(iremap_entries);
 
-    if ( pdev )
-        set_msi_source_id(pdev, &new_ire);
-    else
-        set_hpet_source_id(msi_desc->hpet_id, &new_ire);
-    new_ire.remap.res_3 = 0;
-    new_ire.remap.res_4 = 0;
-    new_ire.remap.p = 1;    /* finally, set present bit */
+    /*
+     * Actually we can just suppress the update when IRTE is already in posted
+     * format. After a msi gets bound to a guest interrupt, changes to the msi
+     * message have no effect to the IRTE.
+     */
+    update_irte_for_msi_common(iommu, pdev, msi_desc, msg, pi_desc, gvec);
 
     /* now construct new MSI/MSI-X rte entry */
+    if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
+        nr = msi_desc->msi.nvec;
+
     remap_rte = (struct msi_msg_remap_entry *)msg;
     remap_rte->address_lo.dontcare = 0;
     i = index;
@@ -637,11 +733,6 @@ static int msi_msg_to_remap_entry(
     remap_rte->address_hi = 0;
     remap_rte->data = index - i;
 
-    memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
-    iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
-    iommu_flush_iec_index(iommu, 0, index);
-
-    unmap_vtd_domain_page(iremap_entries);
     spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
     return 0;
 }
@@ -902,42 +993,6 @@ void iommu_disable_x2apic_IR(void)
         disable_qinval(drhd->iommu);
 }
 
-static void setup_posted_irte(
-    struct iremap_entry *new_ire, const struct iremap_entry *old_ire,
-    const struct pi_desc *pi_desc, const uint8_t gvec)
-{
-    memset(new_ire, 0, sizeof(*new_ire));
-
-    /*
-     * 'im' filed decides whether the irte is in posted format (with value 1)
-     * or remapped format (with value 0), if the old irte is in remapped format,
-     * we copy things from remapped part in 'struct iremap_entry', otherwise,
-     * we copy from posted part.
-     */
-    if ( !old_ire->remap.im )
-    {
-        new_ire->post.p = old_ire->remap.p;
-        new_ire->post.fpd = old_ire->remap.fpd;
-        new_ire->post.sid = old_ire->remap.sid;
-        new_ire->post.sq = old_ire->remap.sq;
-        new_ire->post.svt = old_ire->remap.svt;
-    }
-    else
-    {
-        new_ire->post.p = old_ire->post.p;
-        new_ire->post.fpd = old_ire->post.fpd;
-        new_ire->post.sid = old_ire->post.sid;
-        new_ire->post.sq = old_ire->post.sq;
-        new_ire->post.svt = old_ire->post.svt;
-        new_ire->post.urg = old_ire->post.urg;
-    }
-
-    new_ire->post.im = 1;
-    new_ire->post.vector = gvec;
-    new_ire->post.pda_l = virt_to_maddr(pi_desc) >> (32 - PDA_LOW_BIT);
-    new_ire->post.pda_h = virt_to_maddr(pi_desc) >> 32;
-}
-
 /*
  * This function is used to update the IRTE for posted-interrupt
  * when guest changes MSI/MSI-X information.
@@ -947,16 +1002,13 @@ int pi_update_irte(const struct vcpu *v, const struct pirq *pirq,
 {
     struct irq_desc *desc;
     const struct msi_desc *msi_desc;
-    int remap_index;
     int rc = 0;
     const struct pci_dev *pci_dev;
     const struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     struct ir_ctrl *ir_ctrl;
-    struct iremap_entry *iremap_entries = NULL, *p = NULL;
-    struct iremap_entry new_ire, old_ire;
+    unsigned long flags;
     const struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
-    __uint128_t ret;
 
     desc = pirq_spin_lock_irq_desc(pirq, NULL);
     if ( !desc )
@@ -976,8 +1028,6 @@ int pi_update_irte(const struct vcpu *v, const struct pirq *pirq,
         goto unlock_out;
     }
 
-    remap_index = msi_desc->remap_index;
-
     spin_unlock_irq(&desc->lock);
 
     ASSERT(pcidevs_locked());
@@ -996,31 +1046,11 @@ int pi_update_irte(const struct vcpu *v, const struct pirq *pirq,
     if ( !ir_ctrl )
         return -ENODEV;
 
-    spin_lock_irq(&ir_ctrl->iremap_lock);
-
-    GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, remap_index, iremap_entries, p);
-
-    old_ire = *p;
-
-    /* Setup/Update interrupt remapping table entry. */
-    setup_posted_irte(&new_ire, &old_ire, pi_desc, gvec);
-    ret = cmpxchg16b(p, &old_ire, &new_ire);
-
-    /*
-     * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
-     * and the hardware cannot update the IRTE behind us, so the return value
-     * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
-     */
-    ASSERT(ret == old_ire.val);
-
-    iommu_flush_cache_entry(p, sizeof(*p));
-    iommu_flush_iec_index(iommu, 0, remap_index);
-
-    unmap_vtd_domain_page(iremap_entries);
-
-    spin_unlock_irq(&ir_ctrl->iremap_lock);
-
-    return 0;
+    spin_lock_irqsave(&ir_ctrl->iremap_lock, flags);
+    rc = update_irte_for_msi_common(iommu, pci_dev, msi_desc, NULL, pi_desc,
+                                    gvec);
+    spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
+    return rc;
 
 unlock_out:
     spin_unlock_irq(&desc->lock);
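
Reading aid, not part of the patch: the comment above update_irte_for_msi_common() describes which combinations of @msg, @pi_desc and @gvec are accepted and which IRTE format each one selects. The standalone sketch below mirrors only those argument checks, so it can be compiled outside Xen. The names choose_irte_format(), enum irte_format, struct msi_msg_sketch and struct pi_desc_sketch are hypothetical stand-ins, not Xen definitions, and no IRTE is read or written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's struct msi_msg and struct pi_desc. */
struct msi_msg_sketch { uint32_t address_lo; uint32_t data; };
struct pi_desc_sketch { uint64_t pir[4]; };

/* Hypothetical label for the two IRTE formats the patch distinguishes. */
enum irte_format { IRTE_REMAPPED, IRTE_POSTED };

/*
 * Mirrors the argument checks at the top of update_irte_for_msi_common().
 * Returns 0 and stores the selected format in *fmt, or -EINVAL when the
 * argument combination is rejected.
 */
static int choose_irte_format(const struct msi_msg_sketch *msg,
                              const struct pi_desc_sketch *pi_desc,
                              uint8_t gvec, enum irte_format *fmt)
{
    /* pi_desc and gvec must be supplied together or not at all. */
    if ( (!pi_desc && gvec) || (pi_desc && !gvec) )
        return -EINVAL;

    /* The remapped format is built from the MSI message, so it needs one. */
    if ( !pi_desc && !gvec && !msg )
        return -EINVAL;

    /* A posted IRTE ignores msg; a remapped IRTE ignores pi_desc/gvec. */
    *fmt = pi_desc ? IRTE_POSTED : IRTE_REMAPPED;
    return 0;
}
```

In terms of the two callers in the patch, pi_update_irte() corresponds to the (msg == NULL, pi_desc, gvec) case, while msi_msg_to_remap_entry() passes whatever pi_desc/gvec pair it recovered from the live IRTE together with the new msg.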