From patchwork Sat Jun 22 00:19:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dey, Megha" X-Patchwork-Id: 11010903 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BC7AE6C5 for ; Fri, 21 Jun 2019 23:58:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF1232870D for ; Fri, 21 Jun 2019 23:58:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A35C728BAD; Fri, 21 Jun 2019 23:58:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 33A7D2870D for ; Fri, 21 Jun 2019 23:58:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726045AbfFUX6G (ORCPT ); Fri, 21 Jun 2019 19:58:06 -0400 Received: from mga09.intel.com ([134.134.136.24]:48779 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726134AbfFUX5W (ORCPT ); Fri, 21 Jun 2019 19:57:22 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Jun 2019 16:57:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,402,1557212400"; d="scan'208";a="359022882" Received: from megha-z97x-ud7-th.sc.intel.com ([143.183.85.162]) by fmsmga005.fm.intel.com with ESMTP; 21 Jun 2019 16:57:21 -0700 From: Megha Dey To: bhelgaas@google.com, 
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, marc.zyngier@arm.com, ashok.raj@intel.com, jacob.jun.pan@linux.intel.com, megha.dey@intel.com, Megha Dey Subject: [RFC V1 RESEND 1/6] PCI/MSI: New structures/macros for dynamic MSI-X allocation Date: Fri, 21 Jun 2019 17:19:33 -0700 Message-Id: <1561162778-12669-2-git-send-email-megha.dey@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> References: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is a preparatory patch to introduce the dynamic allocation of MSI-X vectors. In this patch, we add new structure members and macros which will be consumed by the API that will dynamically allocate MSI-X vectors. Cc: Jacob Pan Cc: Ashok Raj Signed-off-by: Megha Dey --- include/linux/device.h | 3 +++ include/linux/msi.h | 9 +++++++++ include/linux/pci.h | 6 ++++++ 3 files changed, 18 insertions(+) diff --git a/include/linux/device.h b/include/linux/device.h index 848fc71..99d4951 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -967,6 +967,7 @@ struct dev_links_info { * device. * @dma_coherent: this particular device is dma coherent, even if the * architecture supports non-coherent devices. + * @grp_first_desc: Pointer to the first msi_desc in every MSI-X group * * At the lowest level, every device in a Linux system is represented by an * instance of struct device. 
The device structure contains the information @@ -1062,6 +1063,8 @@ struct device { defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) bool dma_coherent:1; #endif + /* For dynamic MSI-X allocation */ + struct msi_desc *grp_first_desc; }; static inline struct device *kobj_to_dev(struct kobject *kobj) diff --git a/include/linux/msi.h b/include/linux/msi.h index d48e919..91273cd 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -74,6 +74,7 @@ struct ti_sci_inta_msi_desc { * @default_irq:[PCI MSI/X] The default pre-assigned non-MSI irq * @mask_pos: [PCI MSI] Mask register position * @mask_base: [PCI MSI-X] Mask register base address + * @group_id: [PCI MSI-X] group to which this descriptor belongs * @platform: [platform] Platform device specific msi descriptor data * @fsl_mc: [fsl-mc] FSL MC device specific msi descriptor data * @inta: [INTA] TISCI based INTA specific msi descriptor data @@ -107,6 +108,7 @@ struct msi_desc { u8 mask_pos; void __iomem *mask_base; }; + unsigned int group_id; }; /* @@ -131,6 +133,10 @@ struct msi_desc { list_for_each_entry((desc), dev_to_msi_list((dev)), list) #define for_each_msi_entry_safe(desc, tmp, dev) \ list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list) +/* Iterate through MSI entries of device dev starting from a given desc */ +#define for_each_msi_entry_from(desc, dev) \ + desc = (*dev).grp_first_desc; \ + list_for_each_entry_from((desc), dev_to_msi_list((dev)), list) \ #ifdef CONFIG_IRQ_MSI_IOMMU static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc) @@ -159,6 +165,9 @@ static inline void msi_desc_set_iommu_cookie(struct msi_desc *desc, #define first_pci_msi_entry(pdev) first_msi_entry(&(pdev)->dev) #define for_each_pci_msi_entry(desc, pdev) \ for_each_msi_entry((desc), &(pdev)->dev) +/* Iterate through PCI-MSI entries of pdev starting from a given desc */ +#define for_each_pci_msi_entry_from(desc, pdev) \ + for_each_msi_entry_from((desc), &(pdev)->dev) struct pci_dev 
*msi_desc_to_pci_dev(struct msi_desc *desc); void *msi_desc_to_pci_sysdata(struct msi_desc *desc); diff --git a/include/linux/pci.h b/include/linux/pci.h index dd436da..b9a1d41 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -465,6 +465,12 @@ struct pci_dev { char *driver_override; /* Driver name to force a match */ unsigned long priv_flags; /* Private flags for the PCI driver */ + /* For dynamic MSI-X allocation */ + unsigned int num_msix; /* Number of MSI-X vectors supported */ + void __iomem *base; /* Holds base address of MSI-X table */ + struct idr *grp_idr; /* IDR to assign group to MSI-X vecs */ + unsigned long *entry; /* Bitmap to represent MSI-X entries */ + bool one_shot; /* If true, oneshot MSI-X allocation */ }; static inline struct pci_dev *pci_physfn(struct pci_dev *dev) From patchwork Sat Jun 22 00:19:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dey, Megha" X-Patchwork-Id: 11010901 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 919F16C5 for ; Fri, 21 Jun 2019 23:58:06 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 822D62870D for ; Fri, 21 Jun 2019 23:58:06 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 766E928B81; Fri, 21 Jun 2019 23:58:06 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 485E928BB7 for ; Fri, 21 Jun 2019 23:58:05 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726299AbfFUX5X (ORCPT ); Fri, 21 Jun 2019 19:57:23 -0400
Received: from mga09.intel.com ([134.134.136.24]:48779 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1726285AbfFUX5X (ORCPT ); Fri, 21 Jun 2019 19:57:23 -0400
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
	by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Jun 2019 16:57:21 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.63,402,1557212400"; d="scan'208";a="359022886"
Received: from megha-z97x-ud7-th.sc.intel.com ([143.183.85.162])
	by fmsmga005.fm.intel.com with ESMTP; 21 Jun 2019 16:57:21 -0700
From: Megha Dey
To: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, marc.zyngier@arm.com, ashok.raj@intel.com, jacob.jun.pan@linux.intel.com, megha.dey@intel.com, Megha Dey
Subject: [RFC V1 RESEND 2/6] PCI/MSI: Dynamic allocation of MSI-X vectors by group
Date: Fri, 21 Jun 2019 17:19:34 -0700
Message-Id: <1561162778-12669-3-git-send-email-megha.dey@linux.intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com>
References: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Currently, MSI-X vector enabling and allocation for a PCIe device is
static, i.e. a device driver gets only one chance to enable a specific
number of MSI-X vectors, usually during device probe. Also, drivers
often reserve more vectors than they need in anticipation of future use,
which unnecessarily ties up resources that could have been made
available to other devices.
Lastly, there is no way for drivers to reserve more vectors once MSI-X
has been enabled for that device.

Hence, a dynamic MSI-X kernel infrastructure can benefit drivers by
deferring MSI-X allocation to the post-probe phase, where actual demand
information is available.

This patch enables the dynamic allocation of MSI-X vectors even after
MSI-X is enabled for a PCIe device by introducing a new API:
pci_alloc_irq_vectors_dyn(). This API can be called multiple times by
the driver. The MSI-X vectors allocated by each call are associated with
a group ID. If the existing static allocation is used, a default group
ID of -1 is assigned. The existing pci_alloc_irq_vectors() and the new
pci_alloc_irq_vectors_dyn() API cannot be used alongside each other.

Finally, to obtain the Linux IRQ number associated with any vector in a
group, a new API, pci_irq_vector_group(), has been introduced.

Cc: Jacob Pan
Cc: Ashok Raj
Signed-off-by: Megha Dey
---
 drivers/pci/msi.c   | 203 +++++++++++++++++++++++++++++++++++++++++++++-------
 drivers/pci/probe.c |   8 +++
 include/linux/pci.h |  37 ++++++++++
 kernel/irq/msi.c    |   8 +--
 4 files changed, 226 insertions(+), 30 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index e039b74..73ad9bf 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -102,7 +102,7 @@ int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	if (type == PCI_CAP_ID_MSI && nvec > 1)
 		return 1;
 
-	for_each_pci_msi_entry(entry, dev) {
+	for_each_pci_msi_entry_from(entry, dev) {
 		ret = arch_setup_msi_irq(dev, entry);
 		if (ret < 0)
 			return ret;
@@ -468,7 +468,7 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 	int i;
 
 	/* Determine how many msi entries we have */
-	for_each_pci_msi_entry(entry, pdev)
+	for_each_pci_msi_entry_from(entry, pdev)
 		num_msi += entry->nvec_used;
 	if (!num_msi)
 		return 0;
@@ -477,7 +477,7 @@ static int populate_msi_sysfs(struct pci_dev *pdev)
 	msi_attrs = kcalloc(num_msi + 1, sizeof(void *), GFP_KERNEL);
 	if 
(!msi_attrs) return -ENOMEM; - for_each_pci_msi_entry(entry, pdev) { + for_each_pci_msi_entry_from(entry, pdev) { for (i = 0; i < entry->nvec_used; i++) { msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL); if (!msi_dev_attr) @@ -506,7 +506,11 @@ static int populate_msi_sysfs(struct pci_dev *pdev) goto error_irq_group; msi_irq_groups[0] = msi_irq_group; - ret = sysfs_create_groups(&pdev->dev.kobj, msi_irq_groups); + if (!pdev->msix_enabled) + ret = sysfs_create_group(&pdev->dev.kobj, msi_irq_group); + else + ret = sysfs_merge_group(&pdev->dev.kobj, msi_irq_group); + if (ret) goto error_irq_groups; pdev->msi_irq_groups = msi_irq_groups; @@ -574,7 +578,7 @@ static int msi_verify_entries(struct pci_dev *dev) { struct msi_desc *entry; - for_each_pci_msi_entry(entry, dev) { + for_each_pci_msi_entry_from(entry, dev) { if (!dev->no_64bit_msi || !entry->msg.address_hi) continue; pci_err(dev, "Device has broken 64-bit MSI but arch" @@ -615,6 +619,9 @@ static int msi_capability_init(struct pci_dev *dev, int nvec, list_add_tail(&entry->list, dev_to_msi_list(&dev->dev)); + dev->dev.grp_first_desc = list_last_entry + (dev_to_msi_list(&dev->dev), struct msi_desc, list); + /* Configure MSI capability structure */ ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI); if (ret) { @@ -669,7 +676,7 @@ static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries) static int msix_setup_entries(struct pci_dev *dev, void __iomem *base, struct msix_entry *entries, int nvec, - struct irq_affinity *affd) + struct irq_affinity *affd, int group) { struct irq_affinity_desc *curmsk, *masks = NULL; struct msi_desc *entry; @@ -698,8 +705,20 @@ static int msix_setup_entries(struct pci_dev *dev, void __iomem *base, entry->msi_attrib.entry_nr = i; entry->msi_attrib.default_irq = dev->irq; entry->mask_base = base; + entry->group_id = group; list_add_tail(&entry->list, dev_to_msi_list(&dev->dev)); + + /* + * Save the pointer to the first msi_desc entry of every + * MSI-X 
group. This pointer is used by other functions + * as the starting point to iterate through each of the + * entries in that particular group. + */ + if (!i) + dev->dev.grp_first_desc = list_last_entry + (dev_to_msi_list(&dev->dev), struct msi_desc, list); + if (masks) curmsk++; } @@ -715,7 +734,7 @@ static void msix_program_entries(struct pci_dev *dev, struct msi_desc *entry; int i = 0; - for_each_pci_msi_entry(entry, dev) { + for_each_pci_msi_entry_from(entry, dev) { if (entries) entries[i++].vector = entry->irq; entry->masked = readl(pci_msix_desc_addr(entry) + @@ -730,28 +749,31 @@ static void msix_program_entries(struct pci_dev *dev, * @entries: pointer to an array of struct msix_entry entries * @nvec: number of @entries * @affd: Optional pointer to enable automatic affinity assignement + * @group: The Group ID to be allocated to the msi-x vectors * * Setup the MSI-X capability structure of device function with a * single MSI-X irq. A return of zero indicates the successful setup of * requested MSI-X entries with allocated irqs or non-zero for otherwise. 
**/ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, - int nvec, struct irq_affinity *affd) + int nvec, struct irq_affinity *affd, int group) { int ret; u16 control; - void __iomem *base; /* Ensure MSI-X is disabled while it is set up */ pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0); pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control); + /* Request & Map MSI-X table region */ - base = msix_map_region(dev, msix_table_size(control)); - if (!base) - return -ENOMEM; + if (!dev->msix_enabled) { + dev->base = msix_map_region(dev, msix_table_size(control)); + if (!dev->base) + return -ENOMEM; + } - ret = msix_setup_entries(dev, base, entries, nvec, affd); + ret = msix_setup_entries(dev, dev->base, entries, nvec, affd, group); if (ret) return ret; @@ -784,6 +806,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0); pcibios_free_irq(dev); + return 0; out_avail: @@ -795,7 +818,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries, struct msi_desc *entry; int avail = 0; - for_each_pci_msi_entry(entry, dev) { + for_each_pci_msi_entry_from(entry, dev) { if (entry->irq != 0) avail++; } @@ -932,7 +955,8 @@ int pci_msix_vec_count(struct pci_dev *dev) EXPORT_SYMBOL(pci_msix_vec_count); static int __pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, - int nvec, struct irq_affinity *affd) + int nvec, struct irq_affinity *affd, + bool one_shot, int group) { int nr_entries; int i, j; @@ -963,7 +987,7 @@ static int __pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, pci_info(dev, "can't enable MSI-X (MSI IRQ already assigned)\n"); return -EINVAL; } - return msix_capability_init(dev, entries, nvec, affd); + return msix_capability_init(dev, entries, nvec, affd, group); } static void pci_msix_shutdown(struct pci_dev *dev) @@ -1079,16 +1103,14 @@ EXPORT_SYMBOL(pci_enable_msi); 
static int __pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec, - int maxvec, struct irq_affinity *affd) + int maxvec, struct irq_affinity *affd, + bool one_shot, int group) { int rc, nvec = maxvec; if (maxvec < minvec) return -ERANGE; - if (WARN_ON_ONCE(dev->msix_enabled)) - return -EINVAL; - for (;;) { if (affd) { nvec = irq_calc_affinity_vectors(minvec, nvec, affd); @@ -1096,7 +1118,8 @@ static int __pci_enable_msix_range(struct pci_dev *dev, return -ENOSPC; } - rc = __pci_enable_msix(dev, entries, nvec, affd); + rc = __pci_enable_msix(dev, entries, nvec, affd, one_shot, + group); if (rc == 0) return nvec; @@ -1127,7 +1150,8 @@ static int __pci_enable_msix_range(struct pci_dev *dev, int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries, int minvec, int maxvec) { - return __pci_enable_msix_range(dev, entries, minvec, maxvec, NULL); + return __pci_enable_msix_range(dev, entries, minvec, maxvec, NULL, + false, -1); } EXPORT_SYMBOL(pci_enable_msix_range); @@ -1153,9 +1177,49 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs, unsigned int max_vecs, unsigned int flags, struct irq_affinity *affd) { + int group = -1; + + dev->one_shot = true; + + return pci_alloc_irq_vectors_affinity_dyn(dev, min_vecs, max_vecs, + flags, NULL, &group, dev->one_shot); +} +EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity); + +/** + * pci_alloc_irq_vectors_affinity_dyn - allocate multiple IRQs for a device + * dynamically. Can be called multiple times. + * @dev: PCI device to operate on + * @min_vecs: minimum number of vectors required (must be >= 1) + * @max_vecs: maximum (desired) number of vectors + * @flags: flags or quirks for the allocation + * @affd: optional description of the affinity requirements + * @group_id: group ID assigned to vectors allocated + * @one_shot: true if dynamic MSI-X allocation is disabled, else false + * + * Allocate up to @max_vecs interrupt vectors for @dev, using MSI-X. 
Return + * the number of vectors allocated (which might be smaller than @max_vecs) + * if successful, or a negative error code on error. If less than @min_vecs + * interrupt vectors are available for @dev the function will fail with -ENOSPC. + * Assign a unique group ID to the set of vectors being allocated. + * + * To get the Linux IRQ number used for a vector that can be passed to + * request_irq() use the pci_irq_vector_group() helper. + */ +int pci_alloc_irq_vectors_affinity_dyn(struct pci_dev *dev, + unsigned int min_vecs, + unsigned int max_vecs, + unsigned int flags, + struct irq_affinity *affd, + int *group_id, bool one_shot) +{ + struct irq_affinity msi_default_affd = {0}; - int msix_vecs = -ENOSPC; + int msix_vecs = -ENOSPC, i; int msi_vecs = -ENOSPC; + struct msix_entry *entries = NULL; + struct msi_desc *entry; + int p = 0; if (flags & PCI_IRQ_AFFINITY) { if (!affd) @@ -1166,8 +1230,54 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs, } if (flags & PCI_IRQ_MSIX) { - msix_vecs = __pci_enable_msix_range(dev, NULL, min_vecs, - max_vecs, affd); + if (dev->msix_enabled) { + if (one_shot) { + goto err_alloc; + } else { + for_each_pci_msi_entry(entry, dev) { + if (entry->group_id != -1) + p = 1; + } + if (!p) + goto err_alloc; + } + } else { + dev->num_msix = pci_msix_vec_count(dev); + dev->entry = kcalloc(BITS_TO_LONGS(dev->num_msix), + sizeof(unsigned long), GFP_KERNEL); + if (!dev->entry) + return -ENOMEM; + } + + entries = kcalloc(max_vecs, sizeof(struct msix_entry), + GFP_KERNEL); + if (entries == NULL) + return -ENOMEM; + + for (i = 0; i < max_vecs; i++) { + entries[i].entry = find_first_zero_bit( + dev->entry, dev->num_msix); + if (entries[i].entry == dev->num_msix) + return -ENOSPC; + set_bit(entries[i].entry, dev->entry); + } + + if (!one_shot) { + /* Assign a unique group ID */ + *group_id = idr_alloc(dev->grp_idr, NULL, + 0, dev->num_msix, GFP_KERNEL); + if (*group_id < 0) { + if (*group_id == -ENOSPC) + pci_err(dev, 
"No free group IDs\n"); + return *group_id; + } + } + + msix_vecs = __pci_enable_msix_range(dev, entries, min_vecs, + max_vecs, affd, one_shot, *group_id); + + kfree(entries); + if (msix_vecs > 0) return msix_vecs; } @@ -1197,8 +1307,12 @@ int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs, if (msix_vecs == -ENOSPC) return -ENOSPC; return msi_vecs; + +err_alloc: + WARN_ON_ONCE(dev->msix_enabled); + return -EINVAL; } -EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity); +EXPORT_SYMBOL(pci_alloc_irq_vectors_affinity_dyn); /** * pci_free_irq_vectors - free previously allocated IRQs for a device @@ -1248,6 +1362,43 @@ int pci_irq_vector(struct pci_dev *dev, unsigned int nr) EXPORT_SYMBOL(pci_irq_vector); /** + * pci_irq_vector_group - return the IRQ number of a device vector associated + * with a group + * @dev: PCI device to operate on + * @nr: device-relative interrupt vector index (0-based) + * @group_id: group from which IRQ number should be returned + */ +int pci_irq_vector_group(struct pci_dev *dev, unsigned int nr, + unsigned int group_id) +{ + if (dev->msix_enabled) { + struct msi_desc *entry; + int i = 0, grp_present = 0; + + for_each_pci_msi_entry(entry, dev) { + if (entry->group_id == group_id) { + grp_present = 1; + if (i == nr) + return entry->irq; + i++; + } + } + + if (!grp_present) { + pci_err(dev, "Group %d not present\n", group_id); + return -EINVAL; + } + + pci_err(dev, "Interrupt vector index %d does not exist in " + "group %d\n", nr, group_id); + } + + pci_err(dev, "MSI-X not enabled\n"); + return -EINVAL; +} +EXPORT_SYMBOL(pci_irq_vector_group); + +/** * pci_irq_get_affinity - return the affinity of a particular msi vector * @dev: PCI device to operate on * @nr: device-relative interrupt vector index (0-based). 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 0e8e2c1..491c1cf 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2426,6 +2426,14 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus) if (!dev) return NULL; + /* For dynamic MSI-x */ + dev->grp_idr = kzalloc(sizeof(struct idr), GFP_KERNEL); + if (!dev->grp_idr) + return NULL; + + /* Initialise the IDR structures */ + idr_init(dev->grp_idr); + INIT_LIST_HEAD(&dev->bus_list); dev->dev.type = &pci_dev_type; dev->bus = pci_bus_get(bus); diff --git a/include/linux/pci.h b/include/linux/pci.h index b9a1d41..c56462c 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1411,9 +1411,17 @@ static inline int pci_enable_msix_exact(struct pci_dev *dev, int pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs, unsigned int max_vecs, unsigned int flags, struct irq_affinity *affd); +int pci_alloc_irq_vectors_affinity_dyn(struct pci_dev *dev, + unsigned int min_vecs, + unsigned int max_vecs, + unsigned int flags, + struct irq_affinity *affd, + int *group_id, bool one_shot); void pci_free_irq_vectors(struct pci_dev *dev); int pci_irq_vector(struct pci_dev *dev, unsigned int nr); +int pci_irq_vector_group(struct pci_dev *dev, unsigned int nr, + unsigned int group_id); const struct cpumask *pci_irq_get_affinity(struct pci_dev *pdev, int vec); int pci_irq_get_node(struct pci_dev *pdev, int vec); @@ -1443,6 +1451,17 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs, return -ENOSPC; } +static inline int +pci_alloc_irq_vectors_affinity_dyn(struct pci_dev *dev, unsigned int min_vecs, + unsigned int max_vecs, unsigned int flags, + struct irq_affinity *aff_desc, + int *group_id, bool one_shot) +{ + if ((flags & PCI_IRQ_LEGACY) && min_vecs == 1 && dev->irq) + return 1; + return -ENOSPC; +} + static inline void pci_free_irq_vectors(struct pci_dev *dev) { } @@ -1453,6 +1472,15 @@ static inline int pci_irq_vector(struct pci_dev *dev, unsigned int nr) return 
-EINVAL; return dev->irq; } + +static inline int pci_irq_vector_group(struct pci_dev *dev, unsigned int nr, + unsigned int group_id) +{ + if (WARN_ON_ONCE(nr > 0)) + return -EINVAL; + return dev->irq; +} + static inline const struct cpumask *pci_irq_get_affinity(struct pci_dev *pdev, int vec) { @@ -1473,6 +1501,15 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs, NULL); } +static inline int +pci_alloc_irq_vectors_dyn(struct pci_dev *dev, unsigned int min_vecs, + unsigned int max_vecs, unsigned int flags, + int *group_id) +{ + return pci_alloc_irq_vectors_affinity_dyn(dev, min_vecs, max_vecs, + flags, NULL, group_id, false); +} + /** * pci_irqd_intx_xlate() - Translate PCI INTx value to an IRQ domain hwirq * @d: the INTx IRQ domain diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index ad26fbc..5cfa931 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -411,7 +411,7 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, if (ret) return ret; - for_each_msi_entry(desc, dev) { + for_each_msi_entry_from(desc, dev) { ops->set_desc(&arg, desc); virq = __irq_domain_alloc_irqs(domain, -1, desc->nvec_used, @@ -437,7 +437,7 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, can_reserve = msi_check_reservation_mode(domain, info, dev); - for_each_msi_entry(desc, dev) { + for_each_msi_entry_from(desc, dev) { virq = desc->irq; if (desc->nvec_used == 1) dev_dbg(dev, "irq %d for MSI\n", virq); @@ -465,7 +465,7 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, * so request_irq() will assign the final vector. 
*/ if (can_reserve) { - for_each_msi_entry(desc, dev) { + for_each_msi_entry_from(desc, dev) { irq_data = irq_domain_get_irq_data(domain, desc->irq); irqd_clr_activated(irq_data); } @@ -473,7 +473,7 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, return 0; cleanup: - for_each_msi_entry(desc, dev) { + for_each_msi_entry_from(desc, dev) { struct irq_data *irqd; if (desc->irq == virq) From patchwork Sat Jun 22 00:19:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dey, Megha" X-Patchwork-Id: 11010869 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 275186C5 for ; Fri, 21 Jun 2019 23:58:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1A62E2870D for ; Fri, 21 Jun 2019 23:58:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0E69D28BAF; Fri, 21 Jun 2019 23:58:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9BE742870D for ; Fri, 21 Jun 2019 23:58:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726417AbfFUX5k (ORCPT ); Fri, 21 Jun 2019 19:57:40 -0400 Received: from mga09.intel.com ([134.134.136.24]:48779 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726293AbfFUX5Y (ORCPT ); Fri, 21 Jun 2019 19:57:24 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from 
fmsmga005.fm.intel.com ([10.253.24.32])
	by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Jun 2019 16:57:21 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.63,402,1557212400"; d="scan'208";a="359022888"
Received: from megha-z97x-ud7-th.sc.intel.com ([143.183.85.162])
	by fmsmga005.fm.intel.com with ESMTP; 21 Jun 2019 16:57:21 -0700
From: Megha Dey
To: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, marc.zyngier@arm.com, ashok.raj@intel.com, jacob.jun.pan@linux.intel.com, megha.dey@intel.com, Megha Dey
Subject: [RFC V1 RESEND 3/6] x86: Introduce the dynamic teardown function
Date: Fri, 21 Jun 2019 17:19:35 -0700
Message-Id: <1561162778-12669-4-git-send-email-megha.dey@linux.intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com>
References: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pci@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

This is a preparatory patch for disabling the MSI-X vectors that belong
to a particular group. It introduces an x86-specific mechanism to tear
down the IRQ vectors belonging to a given group.
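
For illustration only, here is a minimal user-space sketch of the
group-filtered teardown described above. It is not part of the patch:
struct mock_msi_desc, mock_teardown_irq() and mock_teardown_group() are
hypothetical stand-ins for struct msi_desc, arch_teardown_msi_irq() and
default_teardown_msi_irqs_grp(), assumed here so that the loop can be
compiled outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct msi_desc. */
struct mock_msi_desc {
	unsigned int irq;       /* first Linux IRQ number; 0 means not allocated */
	unsigned int nvec_used; /* number of consecutive IRQs used by this entry */
	int group_id;           /* -1 for the static (non-grouped) allocation */
};

/* Hypothetical stand-in for arch_teardown_msi_irq(): only records the IRQ. */
#define MAX_TORN_DOWN 64
static unsigned int torn_down[MAX_TORN_DOWN];
static size_t torn_down_count;

static void mock_teardown_irq(unsigned int irq)
{
	if (torn_down_count < MAX_TORN_DOWN)
		torn_down[torn_down_count++] = irq;
}

/*
 * Walk all descriptors and tear down every IRQ of each allocated entry
 * whose group matches group_id. Entries from other groups and entries
 * without an IRQ are skipped. Returns the number of IRQs torn down.
 */
static size_t mock_teardown_group(const struct mock_msi_desc *descs,
				  size_t n, int group_id)
{
	size_t before = torn_down_count;

	assert(descs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct mock_msi_desc *d = &descs[i];

		if (d->group_id != group_id || d->irq == 0)
			continue;
		for (unsigned int j = 0; j < d->nvec_used; j++)
			mock_teardown_irq(d->irq + j);
	}
	return torn_down_count - before;
}
```

The sketch filters on the group ID while walking the complete list, so
the descriptors of other groups are left untouched.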
Cc: Jacob Pan Cc: Ashok Raj Signed-off-by: Megha Dey --- arch/x86/include/asm/x86_init.h | 1 + arch/x86/kernel/x86_init.c | 6 ++++++ drivers/pci/msi.c | 18 ++++++++++++++++++ include/linux/msi.h | 2 ++ 4 files changed, 27 insertions(+) diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h index b85a7c5..50f26a0 100644 --- a/arch/x86/include/asm/x86_init.h +++ b/arch/x86/include/asm/x86_init.h @@ -283,6 +283,7 @@ struct pci_dev; struct x86_msi_ops { int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type); void (*teardown_msi_irq)(unsigned int irq); + void (*teardown_msi_irqs_grp)(struct pci_dev *dev, int group_id); void (*teardown_msi_irqs)(struct pci_dev *dev); void (*restore_msi_irqs)(struct pci_dev *dev); }; diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c index 50a2b49..794e7d4 100644 --- a/arch/x86/kernel/x86_init.c +++ b/arch/x86/kernel/x86_init.c @@ -127,6 +127,7 @@ EXPORT_SYMBOL_GPL(x86_platform); struct x86_msi_ops x86_msi __ro_after_init = { .setup_msi_irqs = native_setup_msi_irqs, .teardown_msi_irq = native_teardown_msi_irq, + .teardown_msi_irqs_grp = default_teardown_msi_irqs_grp, .teardown_msi_irqs = default_teardown_msi_irqs, .restore_msi_irqs = default_restore_msi_irqs, }; @@ -142,6 +143,11 @@ void arch_teardown_msi_irqs(struct pci_dev *dev) x86_msi.teardown_msi_irqs(dev); } +void arch_teardown_msi_irqs_grp(struct pci_dev *dev, int group_id) +{ + x86_msi.teardown_msi_irqs_grp(dev, group_id); +} + void arch_teardown_msi_irq(unsigned int irq) { x86_msi.teardown_msi_irq(irq); diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 73ad9bf..fd7fa6e 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -133,6 +133,24 @@ void __weak arch_teardown_msi_irqs(struct pci_dev *dev) return default_teardown_msi_irqs(dev); } +void default_teardown_msi_irqs_grp(struct pci_dev *dev, int group_id) +{ + int i; + struct msi_desc *entry; + + for_each_pci_msi_entry(entry, dev) { + if (entry->group_id == group_id && 
entry->irq) { + for (i = 0; i < entry->nvec_used; i++) + arch_teardown_msi_irq(entry->irq + i); + } + } +} + +void __weak arch_teardown_msi_irqs_grp(struct pci_dev *dev, int group_id) +{ + return default_teardown_msi_irqs_grp(dev, group_id); +} + static void default_restore_msi_irq(struct pci_dev *dev, int irq) { struct msi_desc *entry; diff --git a/include/linux/msi.h b/include/linux/msi.h index 91273cd..e61ba24 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -202,9 +202,11 @@ int arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc); void arch_teardown_msi_irq(unsigned int irq); int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type); void arch_teardown_msi_irqs(struct pci_dev *dev); +void arch_teardown_msi_irqs_grp(struct pci_dev *dev, int group_id); void arch_restore_msi_irqs(struct pci_dev *dev); void default_teardown_msi_irqs(struct pci_dev *dev); +void default_teardown_msi_irqs_grp(struct pci_dev *dev, int group_id); void default_restore_msi_irqs(struct pci_dev *dev); struct msi_controller { From patchwork Sat Jun 22 00:19:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dey, Megha" X-Patchwork-Id: 11010837 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 167026C5 for ; Fri, 21 Jun 2019 23:57:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 090C328BAF for ; Fri, 21 Jun 2019 23:57:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F0DBA28BAD; Fri, 21 Jun 2019 23:57:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, 
RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 87D6928BAD for ; Fri, 21 Jun 2019 23:57:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726311AbfFUX5Z (ORCPT ); Fri, 21 Jun 2019 19:57:25 -0400 Received: from mga09.intel.com ([134.134.136.24]:48780 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726296AbfFUX5Y (ORCPT ); Fri, 21 Jun 2019 19:57:24 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Jun 2019 16:57:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,402,1557212400"; d="scan'208";a="359022891" Received: from megha-z97x-ud7-th.sc.intel.com ([143.183.85.162]) by fmsmga005.fm.intel.com with ESMTP; 21 Jun 2019 16:57:21 -0700 From: Megha Dey To: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, marc.zyngier@arm.com, ashok.raj@intel.com, jacob.jun.pan@linux.intel.com, megha.dey@intel.com, Megha Dey Subject: [RFC V1 RESEND 4/6] PCI/MSI: Introduce new structure to manage MSI-x entries Date: Fri, 21 Jun 2019 17:19:36 -0700 Message-Id: <1561162778-12669-5-git-send-email-megha.dey@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> References: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This is a preparatory patch to introduce disabling of MSI-X vectors belonging to a particular group. 
In this patch, we introduce a new structure msix_sysfs, which manages sysfs entries for dynamically allocated MSI-X vectors belonging to a particular group. Cc: Jacob Pan Cc: Ashok Raj Signed-off-by: Megha Dey --- drivers/pci/msi.c | 12 +++++++++++- drivers/pci/probe.c | 1 + include/linux/pci.h | 9 +++++++++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index fd7fa6e..e947243 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -479,10 +479,11 @@ static int populate_msi_sysfs(struct pci_dev *pdev) struct device_attribute *msi_dev_attr; struct attribute_group *msi_irq_group; const struct attribute_group **msi_irq_groups; + struct msix_sysfs *msix_sysfs_entry; struct msi_desc *entry; int ret = -ENOMEM; int num_msi = 0; - int count = 0; + int count = 0, group = -1; int i; /* Determine how many msi entries we have */ @@ -509,6 +510,8 @@ static int populate_msi_sysfs(struct pci_dev *pdev) goto error_attrs; msi_dev_attr->attr.mode = S_IRUGO; msi_dev_attr->show = msi_mode_show; + if (!i) + group = entry->group_id; ++count; } } @@ -524,6 +527,13 @@ static int populate_msi_sysfs(struct pci_dev *pdev) goto error_irq_group; msi_irq_groups[0] = msi_irq_group; + msix_sysfs_entry = kzalloc(sizeof(*msix_sysfs_entry) * 2, GFP_KERNEL); + msix_sysfs_entry->msi_irq_group = msi_irq_group; + msix_sysfs_entry->group_id = group; + msix_sysfs_entry->vecs_in_grp = count; + INIT_LIST_HEAD(&msix_sysfs_entry->list); + list_add_tail(&msix_sysfs_entry->list, &pdev->msix_sysfs); + if (!pdev->msix_enabled) ret = sysfs_create_group(&pdev->dev.kobj, msi_irq_group); else diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 491c1cf..bb20ef6 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2435,6 +2435,7 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus) idr_init(dev->grp_idr); INIT_LIST_HEAD(&dev->bus_list); + INIT_LIST_HEAD(&dev->msix_sysfs); dev->dev.type = &pci_dev_type; dev->bus = pci_bus_get(bus); diff --git 
a/include/linux/pci.h b/include/linux/pci.h index c56462c..73385c0 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -471,6 +471,7 @@ struct pci_dev { struct idr *grp_idr; /* IDR to assign group to MSI-X vecs */ unsigned long *entry; /* Bitmap to represent MSI-X entries */ bool one_shot; /* If true, oneshot MSI-X allocation */ + struct list_head msix_sysfs; /* sysfs entries for MSI-X group */ }; static inline struct pci_dev *pci_physfn(struct pci_dev *dev) @@ -1390,6 +1391,14 @@ struct msix_entry { u16 entry; /* Driver uses to specify entry, OS writes */ }; +/* Manage sysfs entries for dynamically allocated MSI-X vectors */ +struct msix_sysfs { + struct attribute_group *msi_irq_group; + struct list_head list; + int group_id; + int vecs_in_grp; +}; + #ifdef CONFIG_PCI_MSI int pci_msi_vec_count(struct pci_dev *dev); void pci_disable_msi(struct pci_dev *dev); From patchwork Sat Jun 22 00:19:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dey, Megha" X-Patchwork-Id: 11010873 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 97D2976 for ; Fri, 21 Jun 2019 23:58:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8A6022870D for ; Fri, 21 Jun 2019 23:58:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7E4CE28BB3; Fri, 21 Jun 2019 23:58:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BA4332870D 
for ; Fri, 21 Jun 2019 23:58:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726237AbfFUX5j (ORCPT ); Fri, 21 Jun 2019 19:57:39 -0400 Received: from mga09.intel.com ([134.134.136.24]:48779 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726290AbfFUX5Y (ORCPT ); Fri, 21 Jun 2019 19:57:24 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Jun 2019 16:57:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.63,402,1557212400"; d="scan'208";a="359022894" Received: from megha-z97x-ud7-th.sc.intel.com ([143.183.85.162]) by fmsmga005.fm.intel.com with ESMTP; 21 Jun 2019 16:57:21 -0700 From: Megha Dey To: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, marc.zyngier@arm.com, ashok.raj@intel.com, jacob.jun.pan@linux.intel.com, megha.dey@intel.com, Megha Dey Subject: [RFC V1 RESEND 5/6] PCI/MSI: Free MSI-X resources by group Date: Fri, 21 Jun 2019 17:19:37 -0700 Message-Id: <1561162778-12669-6-git-send-email-megha.dey@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> References: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently, the pci_free_irq_vectors() frees all the allocated resources associated with a PCIe device when the device is being shut down. With the introduction of dynamic allocation of MSI-X vectors by group ID, there should exist an API which can free the resources allocated only to a particular group, which can be called even if the device is not being shut down. The pci_free_irq_vectors_grp() function provides this type of interface. 
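As a rough illustration of the per-group free described above, here is a minimal userspace sketch. It is a model only, not kernel code: struct vec_desc, vec_add(), vec_free_group() and vec_count() are hypothetical names made up for this example, and the singly linked list merely stands in for the per-device descriptor list that the patch walks with for_each_pci_msi_entry(). The point it shows is that only descriptors whose group_id matches are unlinked and freed, and the number freed is returned to the caller.

```c
/*
 * Userspace model only. Every name here is a hypothetical stand-in;
 * none of this is the kernel's msi_desc or PCI/MSI API.
 */
#include <assert.h>
#include <stdlib.h>

struct vec_desc {
	int group_id;		/* group this vector was allocated in */
	int entry_nr;		/* index of the vector within the table */
	struct vec_desc *next;
};

/* Prepend one descriptor to the list and return the new head. */
static struct vec_desc *vec_add(struct vec_desc *head, int group_id,
				int entry_nr)
{
	struct vec_desc *d = malloc(sizeof(*d));

	if (!d)
		abort();
	d->group_id = group_id;
	d->entry_nr = entry_nr;
	d->next = head;
	return d;
}

/*
 * Free only the descriptors that belong to group_id and leave every
 * other descriptor on the list. Returns the number of descriptors freed.
 */
static int vec_free_group(struct vec_desc **head, int group_id)
{
	struct vec_desc **pp = head;
	int freed = 0;

	assert(head);
	while (*pp) {
		struct vec_desc *d = *pp;

		if (d->group_id == group_id) {
			*pp = d->next;	/* unlink before freeing */
			free(d);
			freed++;
		} else {
			pp = &d->next;
		}
	}
	return freed;
}

/* Count the descriptors still on the list. */
static int vec_count(const struct vec_desc *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}
```

With three descriptors added under group 1 and two under group 2, vec_free_group(&head, 1) returns 3 and vec_count(head) then reports 2, so the second group is still usable after the first one is released.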
The existing pci_free_irq_vectors() can be called along side this API. Cc: Jacob Pan Cc: Ashok Raj Signed-off-by: Megha Dey --- drivers/pci/msi.c | 130 ++++++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/msi.h | 2 + include/linux/pci.h | 9 ++++ kernel/irq/msi.c | 26 +++++++++++ 4 files changed, 167 insertions(+) diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index e947243..342e267 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -53,9 +53,23 @@ static void pci_msi_teardown_msi_irqs(struct pci_dev *dev) else arch_teardown_msi_irqs(dev); } + +static void pci_msi_teardown_msi_irqs_grp(struct pci_dev *dev, int group_id) +{ + struct irq_domain *domain; + + domain = dev_get_msi_domain(&dev->dev); + + if (domain && irq_domain_is_hierarchy(domain)) + msi_domain_free_irqs_grp(domain, &dev->dev, group_id); + else + arch_teardown_msi_irqs_grp(dev, group_id); +} + #else #define pci_msi_setup_msi_irqs arch_setup_msi_irqs #define pci_msi_teardown_msi_irqs arch_teardown_msi_irqs +#define pci_msi_teardown_msi_irqs_grp default_teardown_msi_irqs_grp #endif /* Arch hooks */ @@ -373,6 +387,7 @@ static void free_msi_irqs(struct pci_dev *dev) list_for_each_entry_safe(entry, tmp, msi_list, list) { if (entry->msi_attrib.is_msix) { + clear_bit(entry->msi_attrib.entry_nr, dev->entry); if (list_is_last(&entry->list, msi_list)) iounmap(entry->mask_base); } @@ -381,6 +396,8 @@ static void free_msi_irqs(struct pci_dev *dev) free_msi_entry(entry); } + idr_destroy(dev->grp_idr); + if (dev->msi_irq_groups) { sysfs_remove_groups(&dev->dev.kobj, dev->msi_irq_groups); msi_attrs = dev->msi_irq_groups[0]->attrs; @@ -398,6 +415,60 @@ static void free_msi_irqs(struct pci_dev *dev) } } +static const char msix_sysfs_grp[] = "msi_irqs"; + +static int free_msi_irqs_grp(struct pci_dev *dev, int group_id) +{ + struct list_head *msi_list = dev_to_msi_list(&dev->dev); + struct msi_desc *entry, *tmp; + struct attribute **msi_attrs; + struct device_attribute *dev_attr; + int i; + long 
vec; + struct msix_sysfs *msix_sysfs_entry, *tmp_msix; + struct list_head *pci_msix = &dev->msix_sysfs; + int num_vec = 0; + + for_each_pci_msi_entry(entry, dev) { + if (entry->group_id == group_id && entry->irq) + for (i = 0; i < entry->nvec_used; i++) + BUG_ON(irq_has_action(entry->irq + i)); + } + + pci_msi_teardown_msi_irqs_grp(dev, group_id); + + list_for_each_entry_safe(entry, tmp, msi_list, list) { + if (entry->group_id == group_id) { + clear_bit(entry->msi_attrib.entry_nr, dev->entry); + list_del(&entry->list); + free_msi_entry(entry); + } + } + + list_for_each_entry_safe(msix_sysfs_entry, tmp_msix, pci_msix, list) { + if (msix_sysfs_entry->group_id == group_id) { + msi_attrs = msix_sysfs_entry->msi_irq_group->attrs; + for (i = 0; i < msix_sysfs_entry->vecs_in_grp; i++) { + if (!i) + num_vec = msix_sysfs_entry->vecs_in_grp; + dev_attr = container_of(msi_attrs[i], + struct device_attribute, attr); + sysfs_remove_file_from_group(&dev->dev.kobj, + &dev_attr->attr, msix_sysfs_grp); + if (kstrtol(dev_attr->attr.name, 10, &vec)) + return -EINVAL; + kfree(dev_attr->attr.name); + kfree(dev_attr); + } + msix_sysfs_entry->msi_irq_group = NULL; + list_del(&msix_sysfs_entry->list); + idr_remove(dev->grp_idr, group_id); + kfree(msix_sysfs_entry); + } + } + return num_vec; +} + static void pci_intx_for_msi(struct pci_dev *dev, int enable) { if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG)) @@ -1052,6 +1123,45 @@ void pci_disable_msix(struct pci_dev *dev) } EXPORT_SYMBOL(pci_disable_msix); +static void pci_msix_shutdown_grp(struct pci_dev *dev, int group_id) +{ + struct msi_desc *entry; + int grp_present = 0; + + if (pci_dev_is_disconnected(dev)) { + dev->msix_enabled = 0; + return; + } + + /* Return the device with MSI-X masked as initial states */ + for_each_pci_msi_entry(entry, dev) { + if (entry->group_id == group_id) { + /* Keep cached states to be restored */ + __pci_msix_desc_mask_irq(entry, 1); + grp_present = 1; + } + } + + if (!grp_present) { + 
pci_err(dev, "Group to be disabled not present\n"); + return; + } +} + +int pci_disable_msix_grp(struct pci_dev *dev, int group_id) +{ + int num_vecs; + + if (!pci_msi_enable || !dev) + return -EINVAL; + + pci_msix_shutdown_grp(dev, group_id); + num_vecs = free_msi_irqs_grp(dev, group_id); + + return num_vecs; +} +EXPORT_SYMBOL(pci_disable_msix_grp); + void pci_no_msi(void) { pci_msi_enable = 0; @@ -1356,6 +1466,26 @@ void pci_free_irq_vectors(struct pci_dev *dev) EXPORT_SYMBOL(pci_free_irq_vectors); /** + * pci_free_irq_vectors_grp - free previously allocated IRQs for a + * device associated with a group + * @dev: PCI device to operate on + * @group_id: group to be freed + * + * Undoes the allocations and enabling in pci_alloc_irq_vectors_dyn(). + * Can only be called for MSI-X vectors. + */ +int pci_free_irq_vectors_grp(struct pci_dev *dev, int group_id) +{ + if (group_id < 0) { + pci_err(dev, "Group ID should be >= 0\n"); + return -EINVAL; + } + + return pci_disable_msix_grp(dev, group_id); +} +EXPORT_SYMBOL(pci_free_irq_vectors_grp); + +/** * pci_irq_vector - return Linux IRQ number of a device vector * @dev: PCI device to operate on * @nr: device-relative interrupt vector index (0-based). 
diff --git a/include/linux/msi.h b/include/linux/msi.h index e61ba24..78929ad 100644 --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -333,6 +333,8 @@ struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode, int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, int nvec); void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev); +void msi_domain_free_irqs_grp(struct irq_domain *domain, struct device *dev, + int group_id); struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain); struct irq_domain *platform_msi_create_irq_domain(struct fwnode_handle *fwnode, diff --git a/include/linux/pci.h b/include/linux/pci.h index 73385c0..944e539 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1404,6 +1404,7 @@ int pci_msi_vec_count(struct pci_dev *dev); void pci_disable_msi(struct pci_dev *dev); int pci_msix_vec_count(struct pci_dev *dev); void pci_disable_msix(struct pci_dev *dev); +int pci_disable_msix_grp(struct pci_dev *dev, int group_id); void pci_restore_msi_state(struct pci_dev *dev); int pci_msi_enabled(void); int pci_enable_msi(struct pci_dev *dev); @@ -1428,6 +1429,7 @@ int pci_alloc_irq_vectors_affinity_dyn(struct pci_dev *dev, int *group_id, bool one_shot); void pci_free_irq_vectors(struct pci_dev *dev); +int pci_free_irq_vectors_grp(struct pci_dev *dev, int group_id); int pci_irq_vector(struct pci_dev *dev, unsigned int nr); int pci_irq_vector_group(struct pci_dev *dev, unsigned int nr, unsigned int group_id); @@ -1439,6 +1441,8 @@ static inline int pci_msi_vec_count(struct pci_dev *dev) { return -ENOSYS; } static inline void pci_disable_msi(struct pci_dev *dev) { } static inline int pci_msix_vec_count(struct pci_dev *dev) { return -ENOSYS; } static inline void pci_disable_msix(struct pci_dev *dev) { } +static inline int pci_disable_msix_grp(struct pci_dev *dev, int group_id) + { return -ENOSYS; } static inline void pci_restore_msi_state(struct pci_dev *dev) { } static inline int 
pci_msi_enabled(void) { return 0; } static inline int pci_enable_msi(struct pci_dev *dev) @@ -1475,6 +1479,11 @@ static inline void pci_free_irq_vectors(struct pci_dev *dev) { } +static inline int pci_free_irq_vectors_grp(struct pci_dev *dev, int group_id) +{ + return 0; +} + static inline int pci_irq_vector(struct pci_dev *dev, unsigned int nr) { if (WARN_ON_ONCE(nr > 0)) diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c index 5cfa931..d73a5dc 100644 --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -511,6 +511,32 @@ void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev) } /** + * msi_domain_free_irqs_grp - Free interrupts belonging to a group from + * a MSI interrupt @domain associated to @dev + * @domain: The domain managing the interrupts + * @dev: Pointer to device struct of the device for which the interrupt + * should be freed + * @group_id: The group ID to be freed + */ +void msi_domain_free_irqs_grp(struct irq_domain *domain, struct device *dev, + int group_id) +{ + struct msi_desc *desc; + + for_each_msi_entry(desc, dev) { + /* + * We might have failed to allocate an MSI early + * enough that there is no IRQ associated to this + * entry. If that's the case, don't do anything. 
+ */ + if (desc->group_id == group_id && desc->irq) { + irq_domain_free_irqs(desc->irq, desc->nvec_used); + desc->irq = 0; + } + } +} + +/** * msi_get_domain_info - Get the MSI interrupt domain info for @domain * @domain: The interrupt domain to retrieve data from * From patchwork Sat Jun 22 00:19:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dey, Megha" X-Patchwork-Id: 11010855 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3C086186E for ; Fri, 21 Jun 2019 23:57:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2D3B72870D for ; Fri, 21 Jun 2019 23:57:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2174B28B81; Fri, 21 Jun 2019 23:57:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B64A228BAF for ; Fri, 21 Jun 2019 23:57:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726412AbfFUX5j (ORCPT ); Fri, 21 Jun 2019 19:57:39 -0400 Received: from mga09.intel.com ([134.134.136.24]:48780 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726308AbfFUX5Y (ORCPT ); Fri, 21 Jun 2019 19:57:24 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Jun 2019 16:57:22 -0700 X-ExtLoop1: 1 
X-IronPort-AV: E=Sophos;i="5.63,402,1557212400"; d="scan'208";a="359022898" Received: from megha-z97x-ud7-th.sc.intel.com ([143.183.85.162]) by fmsmga005.fm.intel.com with ESMTP; 21 Jun 2019 16:57:21 -0700 From: Megha Dey To: bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, marc.zyngier@arm.com, ashok.raj@intel.com, jacob.jun.pan@linux.intel.com, megha.dey@intel.com, Megha Dey Subject: [RFC V1 RESEND 6/6] Documentation: PCI/MSI: Document dynamic MSI-X infrastructure Date: Fri, 21 Jun 2019 17:19:38 -0700 Message-Id: <1561162778-12669-7-git-send-email-megha.dey@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> References: <1561162778-12669-1-git-send-email-megha.dey@linux.intel.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add Documentation for the newly introduced dynamic allocation and deallocation of MSI-X vectors. Cc: Jacob Pan Cc: Ashok Raj Signed-off-by: Megha Dey --- Documentation/PCI/MSI-HOWTO.txt | 38 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.txt index 618e13d..5f6daf4 100644 --- a/Documentation/PCI/MSI-HOWTO.txt +++ b/Documentation/PCI/MSI-HOWTO.txt @@ -156,6 +156,44 @@ the driver can specify that only MSI or MSI-X is acceptable: if (nvec < 0) goto out_err; +4.2.1 Dynamic MSI-X Allocation: + +The pci_alloc_irq_vectors() API is a one-shot method to allocate MSI resources +i.e. they cannot be called multiple times. In order to allocate MSI-X vectors +post probe phase, multiple times, use the following API: + + int pci_alloc_irq_vectors_dyn(struct pci_dev *dev, unsigned int min_vecs, + unsigned int max_vecs, unsigned int flags, int *group_id); + +This API allocates up to max_vecs interrupt vectors for a PCI device. 
It returns +the number of vectors allocated or a negative error. If the device has a +requirement for a minimum number of vectors, the driver can pass a min_vecs +argument set to this limit, and the PCI core will return -ENOSPC if it can't +meet the minimum number of vectors. This API is only to be used for MSI-X vectors. + +A group ID pointer is passed which gets populated by this function. A unique +group_id will be associated with all the MSI-X vectors allocated each time this +function is called: + + int group_id; + nvec = pci_alloc_irq_vectors_dyn(pdev, minvecs, maxvecs, + flags | PCI_IRQ_MSIX, &group_id); + if (nvec < 0) + goto out_err; + +To get the Linux IRQ numbers to pass to request_irq() and free_irq(), use the +following function: + + int pci_irq_vector_group(struct pci_dev *dev, unsigned int nr, unsigned int group_id); + +In order to free the MSI-X resources associated with a particular group, use +the following function: + + int pci_free_irq_vectors_grp(struct pci_dev *dev, int group_id); + +For example, to free the group allocated with pci_alloc_irq_vectors_dyn(): + nvec = pci_free_irq_vectors_grp(pdev, group_id); + 4.3 Legacy APIs The following old APIs to enable and disable MSI or MSI-X interrupts should