From patchwork Tue Feb 9 13:34:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 12078137 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.6 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9E47C433DB for ; Tue, 9 Feb 2021 13:35:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6B3EC64ED7 for ; Tue, 9 Feb 2021 13:35:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230262AbhBINf5 (ORCPT ); Tue, 9 Feb 2021 08:35:57 -0500 Received: from mail.kernel.org ([198.145.29.99]:43990 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230208AbhBINfs (ORCPT ); Tue, 9 Feb 2021 08:35:48 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id CC1C564EDA; Tue, 9 Feb 2021 13:35:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1612877704; bh=JBhprM4DIDADxeTeqD6MW59O69WywX5uOcYWV/jADd4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=V80fYHx/7+xncRsX3aBxC0HO69IxgnZq/34phC6vPKqXjm/D1blfRpwEtDd11fsHl kmwMooMYbNuyxcPc19n3rVQmCLlLhv49koepXgCt8uKUiiBwMjDkPkojFSqfpnm6OL eji/gn1C6MhRMFyC4W5PYcvocHHNmS0D53mkIH28D+KIJlOVoO/wbIRdXltJlFK5Jk 9LWFbBAy4cxmQCGkb5iAk3Irt/5peUqog/an+RuQlbx772XuId8uLywyH78/BSg06y 4ImlLL/6dKDOBJOLV4QalXYBRjzbu3GQRneF2tZUsMWjnD0rVPAYf5JMyJi/ImzNq7 1Qc+VMRGo46EQ== From: Leon Romanovsky To: Bjorn Helgaas , Saeed Mahameed Cc: Leon Romanovsky , Jason Gunthorpe , Alexander Duyck , Jakub Kicinski , linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, netdev@vger.kernel.org, Don Dutile , Alex Williamson , "David S . Miller" Subject: [PATCH mlx5-next v6 1/4] PCI: Add sysfs callback to allow MSI-X table size change of SR-IOV VFs Date: Tue, 9 Feb 2021 15:34:42 +0200 Message-Id: <20210209133445.700225-2-leon@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210209133445.700225-1-leon@kernel.org> References: <20210209133445.700225-1-leon@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org From: Leon Romanovsky Extend PCI sysfs interface with a new callback that allows configuration of the number of MSI-X vectors for specific SR-IOV VF. This is needed to optimize the performance of VFs devices by allocating the number of vectors based on the administrator knowledge of the intended use of the VF. This function is applicable for SR-IOV VF because such devices allocate their MSI-X table before they will run on the VMs and HW can't guess the right number of vectors, so some devices allocate them statically and equally. 1) The newly added /sys/bus/pci/devices/.../sriov_vf_msix_count file will be seen for the VFs and it is writable as long as a driver is not bound to the VF. The values accepted are: * > 0 - this will be number reported by the Table Size in the VF's MSI-X Message Control register * < 0 - not valid * = 0 - will reset to the device default value 2) In order to make management easy, provide new read-only sysfs file that returns a total number of possible to configure MSI-X vectors. cat /sys/bus/pci/devices/.../sriov_vf_total_msix = 0 - feature is not supported > 0 - total number of MSI-X vectors available for distribution among the VFs Signed-off-by: Leon Romanovsky --- Documentation/ABI/testing/sysfs-bus-pci | 28 +++++ drivers/pci/iov.c | 153 ++++++++++++++++++++++++ include/linux/pci.h | 12 ++ 3 files changed, 193 insertions(+) -- 2.29.2 diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci index 25c9c39770c6..7dadc3610959 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci +++ b/Documentation/ABI/testing/sysfs-bus-pci @@ -375,3 +375,31 @@ Description: The value comes from the PCI kernel device state and can be one of: "unknown", "error", "D0", D1", "D2", "D3hot", "D3cold". The file is read only. + +What: /sys/bus/pci/devices/.../sriov_vf_total_msix +Date: January 2021 +Contact: Leon Romanovsky +Description: + This file is associated with the SR-IOV PFs. + It contains the total number of MSI-X vectors available for + assignment to all VFs associated with this PF. It may be zero + if the device doesn't support this functionality. + +What: /sys/bus/pci/devices/.../sriov_vf_msix_count +Date: January 2021 +Contact: Leon Romanovsky +Description: + This file is associated with the SR-IOV VFs. + It allows configuration of the number of MSI-X vectors for + the VF. This is needed to optimize performance of newly bound + devices by allocating the number of vectors based on the + administrator knowledge of targeted VM. + + The values accepted are: + * > 0 - this will be number reported by the VF's MSI-X + capability + * < 0 - not valid + * = 0 - will reset to the device default value + + The file is writable if the PF is bound to a driver that + implements ->sriov_set_msix_vec_count(). diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index 4afd4ee4f7f0..c0554aa6b90a 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -31,6 +31,7 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id) return (dev->devfn + dev->sriov->offset + dev->sriov->stride * vf_id) & 0xff; } +EXPORT_SYMBOL_GPL(pci_iov_virtfn_devfn); /* * Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may @@ -157,6 +158,158 @@ int pci_iov_sysfs_link(struct pci_dev *dev, return rc; } +#ifdef CONFIG_PCI_MSI +static ssize_t sriov_vf_msix_count_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct pci_dev *vf_dev = to_pci_dev(dev); + struct pci_dev *pdev = pci_physfn(vf_dev); + int val, ret; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if (val < 0) + return -EINVAL; + + device_lock(&pdev->dev); + if (!pdev->driver || !pdev->driver->sriov_set_msix_vec_count) { + ret = -EOPNOTSUPP; + goto err_pdev; + } + + device_lock(&vf_dev->dev); + if (vf_dev->driver) { + /* + * Driver already probed this VF and configured itself + * based on previously configured (or default) MSI-X vector + * count. It is too late to change this field for this + * specific VF. + */ + ret = -EBUSY; + goto err_dev; + } + + ret = pdev->driver->sriov_set_msix_vec_count(vf_dev, val); + +err_dev: + device_unlock(&vf_dev->dev); +err_pdev: + device_unlock(&pdev->dev); + return ret ? : count; +} +static DEVICE_ATTR_WO(sriov_vf_msix_count); + +static ssize_t sriov_vf_total_msix_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct pci_dev *pdev = to_pci_dev(dev); + u32 vf_total_msix; + + device_lock(dev); + if (!pdev->driver || !pdev->driver->sriov_get_vf_total_msix) { + device_unlock(dev); + return -EOPNOTSUPP; + } + vf_total_msix = pdev->driver->sriov_get_vf_total_msix(pdev); + device_unlock(dev); + + return sysfs_emit(buf, "%u\n", vf_total_msix); +} +static DEVICE_ATTR_RO(sriov_vf_total_msix); +#endif + +static const struct attribute *sriov_pf_dev_attrs[] = { +#ifdef CONFIG_PCI_MSI + &dev_attr_sriov_vf_total_msix.attr, +#endif + NULL, +}; + +static const struct attribute *sriov_vf_dev_attrs[] = { +#ifdef CONFIG_PCI_MSI + &dev_attr_sriov_vf_msix_count.attr, +#endif + NULL, +}; + +/* + * The PF can change the specific properties of associated VFs. Such + * functionality is usually known after PF probed and PCI sysfs files + * were already created. + * + * The function below is driven by such PF. It adds sysfs files to already + * existing PF/VF sysfs device hierarchies. + */ +int pci_enable_vf_overlay(struct pci_dev *dev) +{ + struct pci_dev *virtfn; + int id, ret; + + if (!dev->is_physfn || !dev->sriov->num_VFs) + return 0; + + ret = sysfs_create_files(&dev->dev.kobj, sriov_pf_dev_attrs); + if (ret) + return ret; + + for (id = 0; id < dev->sriov->num_VFs; id++) { + virtfn = pci_get_domain_bus_and_slot( + pci_domain_nr(dev->bus), pci_iov_virtfn_bus(dev, id), + pci_iov_virtfn_devfn(dev, id)); + + if (!virtfn) + continue; + + ret = sysfs_create_files(&virtfn->dev.kobj, + sriov_vf_dev_attrs); + if (ret) + goto out; + } + return 0; + +out: + while (id--) { + virtfn = pci_get_domain_bus_and_slot( + pci_domain_nr(dev->bus), pci_iov_virtfn_bus(dev, id), + pci_iov_virtfn_devfn(dev, id)); + + if (!virtfn) + continue; + + sysfs_remove_files(&virtfn->dev.kobj, sriov_vf_dev_attrs); + } + sysfs_remove_files(&dev->dev.kobj, sriov_pf_dev_attrs); + return ret; +} +EXPORT_SYMBOL_GPL(pci_enable_vf_overlay); + +void pci_disable_vf_overlay(struct pci_dev *dev) +{ + struct pci_dev *virtfn; + int id; + + if (!dev->is_physfn || !dev->sriov->num_VFs) + return; + + id = dev->sriov->num_VFs; + while (id--) { + virtfn = pci_get_domain_bus_and_slot( + pci_domain_nr(dev->bus), pci_iov_virtfn_bus(dev, id), + pci_iov_virtfn_devfn(dev, id)); + + if (!virtfn) + continue; + + sysfs_remove_files(&virtfn->dev.kobj, sriov_vf_dev_attrs); + } + sysfs_remove_files(&dev->dev.kobj, sriov_pf_dev_attrs); +} +EXPORT_SYMBOL_GPL(pci_disable_vf_overlay); + int pci_iov_add_virtfn(struct pci_dev *dev, int id) { int i; diff --git a/include/linux/pci.h b/include/linux/pci.h index b32126d26997..732611937574 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -856,6 +856,11 @@ struct module; * e.g. drivers/net/e100.c. * @sriov_configure: Optional driver callback to allow configuration of * number of VFs to enable via sysfs "sriov_numvfs" file. + * @sriov_set_msix_vec_count: Driver callback to change number of MSI-X vectors + * to configure via sysfs "sriov_vf_msix_count" entry. This will + * change MSI-X Table Size in their Message Control registers. + * @sriov_get_vf_total_msix: Total number of MSI-X veectors to distribute + * to the VFs * @err_handler: See Documentation/PCI/pci-error-recovery.rst * @groups: Sysfs attribute groups. * @driver: Driver model structure. @@ -871,6 +876,8 @@ struct pci_driver { int (*resume)(struct pci_dev *dev); /* Device woken up */ void (*shutdown)(struct pci_dev *dev); int (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */ + int (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */ + u32 (*sriov_get_vf_total_msix)(struct pci_dev *pf); const struct pci_error_handlers *err_handler; const struct attribute_group **groups; struct device_driver driver; @@ -2059,6 +2066,9 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar); int pci_iov_virtfn_bus(struct pci_dev *dev, int id); int pci_iov_virtfn_devfn(struct pci_dev *dev, int id); +int pci_enable_vf_overlay(struct pci_dev *dev); +void pci_disable_vf_overlay(struct pci_dev *dev); + int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn); void pci_disable_sriov(struct pci_dev *dev); @@ -2100,6 +2110,8 @@ static inline int pci_iov_add_virtfn(struct pci_dev *dev, int id) } static inline void pci_iov_remove_virtfn(struct pci_dev *dev, int id) { } +static inline int pci_enable_vf_overlay(struct pci_dev *dev) { return 0; } +static inline void pci_disable_vf_overlay(struct pci_dev *dev) { } static inline void pci_disable_sriov(struct pci_dev *dev) { } static inline int pci_num_vf(struct pci_dev *dev) { return 0; } static inline int pci_vfs_assigned(struct pci_dev *dev) From patchwork Tue Feb 9 13:34:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 12078131 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.6 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B748EC433E0 for ; Tue, 9 Feb 2021 13:35:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 71E7264ED4 for ; Tue, 9 Feb 2021 13:35:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229754AbhBINfk (ORCPT ); Tue, 9 Feb 2021 08:35:40 -0500 Received: from mail.kernel.org ([198.145.29.99]:43812 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229772AbhBINff (ORCPT ); Tue, 9 Feb 2021 08:35:35 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 8AEC964ED4; Tue, 9 Feb 2021 13:34:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1612877694; bh=9u4XcR+je2ELj6DNgC77h551MxruUIms4e+LHa1FoSY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CrJmH1upy+82Oe/cFOe/qvRbN4r6clUtQYV5P/5f4m6appd/nhEfVErQbHc/Yucp2 O3C6k0PzxBV8a+mkEoeZvpoKTq6Islr0M3PL4epPe9bXylStAehtCXX08XZbDJbNEs SPMekFtGjI17J+r+ohqCfD83e0wsOftNMBMFc/H/j67Mvk9kTv5/YeH4t9je5JbCpK 456IbBZNIGUO/ihSJ3W3m+j07CYXVUc1FBehz7OcGcB9ge/Q3pzgioNpNgpke2g0ft ld63Ofji1xsqLg+K563dcT7plwUCld1ZN9KaUMzDqSyApl0K5K/5Mlj/nj41pxk55H 3NulUtlZUp2ag== From: Leon Romanovsky To: Bjorn Helgaas , Saeed Mahameed Cc: Leon Romanovsky , Jason Gunthorpe , Alexander Duyck , Jakub Kicinski , linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, netdev@vger.kernel.org, Don Dutile , Alex Williamson , "David S . Miller" Subject: [PATCH mlx5-next v6 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Date: Tue, 9 Feb 2021 15:34:43 +0200 Message-Id: <20210209133445.700225-3-leon@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210209133445.700225-1-leon@kernel.org> References: <20210209133445.700225-1-leon@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org From: Leon Romanovsky These new fields declare the number of MSI-X vectors that is possible to allocate on the VF through PF configuration. Value must be in range defined by min_dynamic_vf_msix_table_size and max_dynamic_vf_msix_table_size. The driver should continue to query its MSI-X table through PCI configuration header. Signed-off-by: Leon Romanovsky --- include/linux/mlx5/mlx5_ifc.h | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 77051bd5c1cf..ffe2c7231ae4 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1677,7 +1677,16 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_6e0[0x10]; u8 sf_base_id[0x10]; - u8 reserved_at_700[0x80]; + u8 reserved_at_700[0x8]; + u8 num_total_dynamic_vf_msix[0x18]; + u8 reserved_at_720[0x14]; + u8 dynamic_msix_table_size[0xc]; + u8 reserved_at_740[0xc]; + u8 min_dynamic_vf_msix_table_size[0x4]; + u8 reserved_at_750[0x4]; + u8 max_dynamic_vf_msix_table_size[0xc]; + + u8 reserved_at_760[0x20]; u8 vhca_tunnel_commands[0x40]; u8 reserved_at_7c0[0x40]; }; From patchwork Tue Feb 9 13:34:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 12078133 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.6 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50B42C433E6 for ; Tue, 9 Feb 2021 13:35:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0FC3664ED4 for ; Tue, 9 Feb 2021 13:35:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230169AbhBINfm (ORCPT ); Tue, 9 Feb 2021 08:35:42 -0500 Received: from mail.kernel.org ([198.145.29.99]:43872 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231347AbhBINfj (ORCPT ); Tue, 9 Feb 2021 08:35:39 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id E831664ED5; Tue, 9 Feb 2021 13:34:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1612877697; bh=9lDu3DmFqXRm5A+A6HYfkgkN6t0kzXRQ3a3T6gFHNm0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jDnxPuIasKzBOxdlDI8GWMKGPtau38aFJy4k7tTrt8cejUeeJSXEql7j0NL+2HBYB qcyxDKEE4JGaI0RHXDDd3unUbrA9R9V1JpBjuZ3tq+tZ7BEpniy3fUs7GKnmPUu065 rZWNUUB27mmemlvZpHRrbleXhOylEjIKcRGsevRmzWKVAO9sQjYGrTWD7cTzCVBG6L gFk/ANtlBcuToWik2p/KCStV/sFru3vTxROd4N8BtOMh/xQh8H/ekOtCI8GK3xQRlt OWo71PUfFmHBNp+JuMdvMk9QshBpzn3Q1N0Ij0eTL0P+5fGc5KTE97VDQ/3TNqQdGa GzwMkl0zkn3iw== From: Leon Romanovsky To: Bjorn Helgaas , Saeed Mahameed Cc: Leon Romanovsky , Jason Gunthorpe , Alexander Duyck , Jakub Kicinski , linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, netdev@vger.kernel.org, Don Dutile , Alex Williamson , "David S . Miller" Subject: [PATCH mlx5-next v6 3/4] net/mlx5: Dynamically assign MSI-X vectors count Date: Tue, 9 Feb 2021 15:34:44 +0200 Message-Id: <20210209133445.700225-4-leon@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210209133445.700225-1-leon@kernel.org> References: <20210209133445.700225-1-leon@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org From: Leon Romanovsky The number of MSI-X vectors is PCI property visible through lspci that field is read-only and configured by the device. The mlx5 devices work in two possible modes: static assignment or dynamic. Static assignment means that all newly created VFs have a stable number of MSI-X vectors. Such partitioning doesn't utilize the newly created VF because it is unknown to the device the future load and configuration where that VF will be used. The second mode is a dynamic assignment and provided to overcome the inefficiency in the spread of such MSI-X vectors. It allows the kernel to convey a new number of vectors based on the administrator's knowledge. For example: if VF is bound to a powerful VM with many CPUs, the administrator will configure a larger number of MSI-X vectors than in static mode. It will utilize the mlx5 device efficiently. The assignment is performed before the driver is probed on that VF. Due to the internal to mlx5 implementation, the device provides more dynamic vectors than static vectors. In this patch, we enable this feature, and such change immediately increases the amount of MSI-X vectors for the system with 2 VFs from 12 vectors, to be 32 vectors per-VF. Before this patch: [root@server ~]# lspci -vs 0000:08:00.2 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] .... Capabilities: [9c] MSI-X: Enable- Count=12 Masked- After this patch: [root@server ~]# lspci -vs 0000:08:00.2 08:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] .... Capabilities: [9c] MSI-X: Enable- Count=32 Masked- Signed-off-by: Leon Romanovsky --- .../net/ethernet/mellanox/mlx5/core/main.c | 4 ++ .../ethernet/mellanox/mlx5/core/mlx5_core.h | 5 ++ .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 72 +++++++++++++++++++ .../net/ethernet/mellanox/mlx5/core/sriov.c | 13 +++- 4 files changed, 92 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index ca6f2fc39ea0..79cfcc844156 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -567,6 +567,10 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx) if (MLX5_CAP_GEN_MAX(dev, mkey_by_name)) MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1); + if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix)) + MLX5_SET(cmd_hca_cap, set_hca_cap, num_total_dynamic_vf_msix, + MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix)); + return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 0a0302ce7144..5babb4434a87 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -172,6 +172,11 @@ int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx, struct notifier_block *nb); int mlx5_irq_detach_nb(struct mlx5_irq_table *irq_table, int vecidx, struct notifier_block *nb); + +int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int devfn, + int msix_vec_count); +int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs); + struct cpumask * mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx); struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index 6fd974920394..715c801ae865 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -55,6 +55,78 @@ static struct mlx5_irq *mlx5_irq_get(struct mlx5_core_dev *dev, int vecidx) return &irq_table->irq[vecidx]; } +/** + * mlx5_get_default_msix_vec_count - Get default number of MSI-X vectors + * to be ssigned to each VF. + * @dev: PF to work on + * @num_vfs: Number of enabled VFs/ + */ +int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs) +{ + int num_vf_msix, min_msix, max_msix; + + num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix); + if (!num_vf_msix) + return 0; + + min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size); + max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size); + + /* Limit maximum number of MSI-X vectors to leave some of them free + * in the pool and ready to be assigned by the users without need to + * resize other Vfs. + */ + return max(min(num_vf_msix / num_vfs, max_msix / 2), min_msix); +} + +/** + * mlx5_set_msix_vec_count - Set dynamically allocated MSI-X to the VF + * @dev: PF to work on + * @function_id: Internal PCI VF function IDd + * @msix_vec_count: Number of MSI-X vectors to set + */ +int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int function_id, + int msix_vec_count) +{ + int sz = MLX5_ST_SZ_BYTES(set_hca_cap_in); + int num_vf_msix, min_msix, max_msix; + void *hca_cap, *cap; + int ret; + + num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix); + if (!num_vf_msix) + return 0; + + if (!MLX5_CAP_GEN(dev, vport_group_manager) || !mlx5_core_is_pf(dev)) + return -EOPNOTSUPP; + + min_msix = MLX5_CAP_GEN(dev, min_dynamic_vf_msix_table_size); + max_msix = MLX5_CAP_GEN(dev, max_dynamic_vf_msix_table_size); + + if (msix_vec_count < min_msix) + return -EINVAL; + + if (msix_vec_count > max_msix) + return -EOVERFLOW; + + hca_cap = kzalloc(sz, GFP_KERNEL); + if (!hca_cap) + return -ENOMEM; + + cap = MLX5_ADDR_OF(set_hca_cap_in, hca_cap, capability); + MLX5_SET(cmd_hca_cap, cap, dynamic_msix_table_size, msix_vec_count); + + MLX5_SET(set_hca_cap_in, hca_cap, opcode, MLX5_CMD_OP_SET_HCA_CAP); + MLX5_SET(set_hca_cap_in, hca_cap, other_function, 1); + MLX5_SET(set_hca_cap_in, hca_cap, function_id, function_id); + + MLX5_SET(set_hca_cap_in, hca_cap, op_mod, + MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1); + ret = mlx5_cmd_exec_in(dev, set_hca_cap, hca_cap); + kfree(hca_cap); + return ret; +} + int mlx5_irq_attach_nb(struct mlx5_irq_table *irq_table, int vecidx, struct notifier_block *nb) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c index 3094d20297a9..f0ec86a1c8a6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c @@ -71,8 +71,7 @@ static int sriov_restore_guids(struct mlx5_core_dev *dev, int vf) static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs) { struct mlx5_core_sriov *sriov = &dev->priv.sriov; - int err; - int vf; + int err, vf, num_msix_count; if (!MLX5_ESWITCH_MANAGER(dev)) goto enable_vfs_hca; @@ -85,12 +84,22 @@ static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs) } enable_vfs_hca: + num_msix_count = mlx5_get_default_msix_vec_count(dev, num_vfs); for (vf = 0; vf < num_vfs; vf++) { err = mlx5_core_enable_hca(dev, vf + 1); if (err) { mlx5_core_warn(dev, "failed to enable VF %d (%d)\n", vf, err); continue; } + + err = mlx5_set_msix_vec_count(dev, vf + 1, num_msix_count); + if (err) { + mlx5_core_warn(dev, + "failed to set MSI-X vector counts VF %d, err %d\n", + vf, err); + continue; + } + sriov->vfs_ctx[vf].enabled = 1; if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_IB) { err = sriov_restore_guids(dev, vf); From patchwork Tue Feb 9 13:34:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 12078135 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.6 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74462C433DB for ; Tue, 9 Feb 2021 13:35:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2D2B164ED5 for ; Tue, 9 Feb 2021 13:35:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230185AbhBINfp (ORCPT ); Tue, 9 Feb 2021 08:35:45 -0500 Received: from mail.kernel.org ([198.145.29.99]:43930 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231366AbhBINfm (ORCPT ); Tue, 9 Feb 2021 08:35:42 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 58C3664ED3; Tue, 9 Feb 2021 13:35:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1612877701; bh=5UrSrnNscVpnQ6QiFa+BRdHDIDUqHAl3HAvZrkWu1bI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=M1dJbvjGVX4F19Qfsn9BWW9HObeyv44PioLJqp2/uFDMTJWghqkfhM8UGdbPsjt+D FcwEMVgaaO2BNlSm1oYddBH7RkEyYeVGfl8woRJeJVpz3l2d4LnbBeAClWBCO5p4br EEuutMh901fGS2a+s6nfLNNp+ZYlNIp6Cs+oX47zHCmSsiXZXZPGmt9pxCassMi5jM OYYEfNUuvPWl/cc8yGmnik9Bgu+nMHNqGU4HwZ0fx0lklm6kH/MbnYn8BMO36F97mp oB8YQlO+WQs4CqNM9N346TajHxSGJGkHnvP/8SNvxGyBzLD7QarC4/sKrfqXuRhZy1 W155lIOxxwirw== From: Leon Romanovsky To: Bjorn Helgaas , Saeed Mahameed Cc: Leon Romanovsky , Jason Gunthorpe , Alexander Duyck , Jakub Kicinski , linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, netdev@vger.kernel.org, Don Dutile , Alex Williamson , "David S . Miller" Subject: [PATCH mlx5-next v6 4/4] net/mlx5: Allow to the users to configure number of MSI-X vectors Date: Tue, 9 Feb 2021 15:34:45 +0200 Message-Id: <20210209133445.700225-5-leon@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210209133445.700225-1-leon@kernel.org> References: <20210209133445.700225-1-leon@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org From: Leon Romanovsky Implement ability to configure MSI-X for the SR-IOV VFs. Signed-off-by: Leon Romanovsky --- .../net/ethernet/mellanox/mlx5/core/main.c | 13 ++++++ .../ethernet/mellanox/mlx5/core/mlx5_core.h | 22 +++++++++ .../net/ethernet/mellanox/mlx5/core/sriov.c | 45 +++++++++++++++++++ 3 files changed, 80 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 79cfcc844156..db59c51e148e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1395,6 +1395,14 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id) goto err_load_one; } + err = mlx5_enable_vf_overlay(dev); + if (err) { + mlx5_core_err(dev, + "mlx5_enable_vf_overlay failed with error code %d\n", + err); + goto err_vf_overlay; + } + err = mlx5_crdump_enable(dev); if (err) dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code %d\n", err); @@ -1403,6 +1411,8 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *id) devlink_reload_enable(devlink); return 0; +err_vf_overlay: + mlx5_unload_one(dev, true); err_load_one: mlx5_pci_close(dev); pci_init_err: @@ -1423,6 +1433,7 @@ static void remove_one(struct pci_dev *pdev) devlink_reload_disable(devlink); mlx5_crdump_disable(dev); mlx5_drain_health_wq(dev); + mlx5_disable_vf_overlay(dev); mlx5_unload_one(dev, true); mlx5_pci_close(dev); mlx5_mdev_uninit(dev); @@ -1650,6 +1661,8 @@ static struct pci_driver mlx5_core_driver = { .shutdown = shutdown, .err_handler = &mlx5_err_handler, .sriov_configure = mlx5_core_sriov_configure, + .sriov_get_vf_total_msix = mlx5_sriov_get_vf_total_msix, + .sriov_set_msix_vec_count = mlx5_core_sriov_set_msix_vec_count, }; static void mlx5_core_verify_params(void) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 5babb4434a87..e5c2fa558f77 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -138,6 +138,7 @@ void mlx5_sriov_cleanup(struct mlx5_core_dev *dev); int mlx5_sriov_attach(struct mlx5_core_dev *dev); void mlx5_sriov_detach(struct mlx5_core_dev *dev); int mlx5_core_sriov_configure(struct pci_dev *dev, int num_vfs); +int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count); int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id); int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id); int mlx5_create_scheduling_element_cmd(struct mlx5_core_dev *dev, u8 hierarchy, @@ -264,4 +265,25 @@ void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state); void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup); int mlx5_load_one(struct mlx5_core_dev *dev, bool boot); + +static inline int mlx5_enable_vf_overlay(struct mlx5_core_dev *dev) +{ + if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix)) + return pci_enable_vf_overlay(dev->pdev); + + return 0; +} + +static inline void mlx5_disable_vf_overlay(struct mlx5_core_dev *dev) +{ + if (MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix)) + pci_disable_vf_overlay(dev->pdev); +} + +static inline u32 mlx5_sriov_get_vf_total_msix(struct pci_dev *pdev) +{ + struct mlx5_core_dev *dev = pci_get_drvdata(pdev); + + return MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix); +} #endif /* __MLX5_CORE_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c index f0ec86a1c8a6..446bfdfce4a2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c @@ -156,6 +156,15 @@ static int mlx5_sriov_enable(struct pci_dev *pdev, int num_vfs) if (err) { mlx5_core_warn(dev, "pci_enable_sriov failed : %d\n", err); mlx5_device_disable_sriov(dev, num_vfs, true); + return err; + } + + err = mlx5_enable_vf_overlay(dev); + if (err) { + mlx5_core_warn(dev, "mlx5_enable_vf_overlay failed : %d\n", + err); + pci_disable_sriov(pdev); + mlx5_device_disable_sriov(dev, num_vfs, true); } return err; } @@ -165,6 +174,7 @@ static void mlx5_sriov_disable(struct pci_dev *pdev) struct mlx5_core_dev *dev = pci_get_drvdata(pdev); int num_vfs = pci_num_vf(dev->pdev); + mlx5_disable_vf_overlay(dev); pci_disable_sriov(pdev); mlx5_device_disable_sriov(dev, num_vfs, true); } @@ -187,6 +197,41 @@ int mlx5_core_sriov_configure(struct pci_dev *pdev, int num_vfs) return err ? err : num_vfs; } +int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count) +{ + struct pci_dev *pf = pci_physfn(vf); + struct mlx5_core_sriov *sriov; + struct mlx5_core_dev *dev; + int num_vf_msix, id; + + dev = pci_get_drvdata(pf); + num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix); + if (!num_vf_msix) + return -EOPNOTSUPP; + + if (!msix_vec_count) + msix_vec_count = + mlx5_get_default_msix_vec_count(dev, pci_num_vf(pf)); + + sriov = &dev->priv.sriov; + + /* Reversed translation of PCI VF function number to the internal + * function_id, which exists in the name of virtfn symlink. + */ + for (id = 0; id < pci_num_vf(pf); id++) { + if (!sriov->vfs_ctx[id].enabled) + continue; + + if (vf->devfn == pci_iov_virtfn_devfn(pf, id)) + break; + } + + if (id == pci_num_vf(pf) || !sriov->vfs_ctx[id].enabled) + return -EINVAL; + + return mlx5_set_msix_vec_count(dev, id + 1, msix_vec_count); +} + int mlx5_sriov_attach(struct mlx5_core_dev *dev) { if (!mlx5_core_is_pf(dev) || !pci_num_vf(dev->pdev))