From patchwork Thu Jun 15 13:33:13 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sagi Grimberg X-Patchwork-Id: 9788887 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A07BA60384 for ; Thu, 15 Jun 2017 13:33:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 92F8225D9E for ; Thu, 15 Jun 2017 13:33:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 87AA928613; Thu, 15 Jun 2017 13:33:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI, T_DKIM_INVALID, URIBL_BLACK autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E51DE25D9E for ; Thu, 15 Jun 2017 13:33:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752427AbdFONde (ORCPT ); Thu, 15 Jun 2017 09:33:34 -0400 Received: from bombadil.infradead.org ([65.50.211.133]:41704 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752445AbdFONde (ORCPT ); Thu, 15 Jun 2017 09:33:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=References:In-Reply-To:Message-Id: Date:Subject:Cc:To:From:Sender:Reply-To:MIME-Version:Content-Type: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=vuJxq/cyCNdBAek4kvN3v6fs3eXKSQmd1QhXiApJRjM=; b=d8VJc2F/i28zBXBdLe3m/vrZF f+sKNABXPTIHf3Cur87O9ozQOc35JGA5IO+JWhBto1Zdt+ciLEe4Kzln7b01xur3d8CxgkKRO7/Aj HeuCsLh0HNoprG/mqO5yn9ovr0ZEWVpSEIr9CduqHzbXrECC3UDtImRqYLbFElZ+qEsX90Q9jmwZZ 17VHOiTcMapzClELdtX234HoMGkXcktKfAFFy9i1XY9xS6M+AfEx3rOKqI6lkiWjQ7diDP05Mb5JW oCxQuak7DddgYSU7rZeSb2YuC2BrnctpsowOI+jEHe1sqY0MgKSl++pbbKKfKiPxPWeJp2fNRUcm8 Mi/X9hz1Q==; Received: from bzq-82-81-101-184.red.bezeqint.net ([82.81.101.184] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtpsa (Exim 4.87 #1 (Red Hat Linux)) id 1dLUtv-0005Yw-DY; Thu, 15 Jun 2017 13:33:31 +0000 From: Sagi Grimberg To: Doug Ledford , linux-rdma@vger.kernel.org, Saeed Mahameed Cc: Christoph Hellwig , Leon Romanovsky , linux-nvme@lists.infradead.org Subject: [PATCH v5 for-4.13 6/7] block: Add rdma affinity based queue mapping helper Date: Thu, 15 Jun 2017 16:33:13 +0300 Message-Id: <1497533594-11579-7-git-send-email-sagi@grimberg.me> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1497533594-11579-1-git-send-email-sagi@grimberg.me> References: <1497533594-11579-1-git-send-email-sagi@grimberg.me> Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Like pci and virtio, we add a rdma helper for affinity spreading. This achieves optimal mq affinity assignments according to the underlying rdma device affinity maps. Reviewed-by: Jens Axboe Reviewed-by: Christoph Hellwig Reviewed-by: Max Gurtovoy Signed-off-by: Sagi Grimberg --- block/Kconfig | 5 +++++ block/Makefile | 1 + block/blk-mq-rdma.c | 54 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/blk-mq-rdma.h | 10 +++++++++ 4 files changed, 70 insertions(+) create mode 100644 block/blk-mq-rdma.c create mode 100644 include/linux/blk-mq-rdma.h diff --git a/block/Kconfig b/block/Kconfig index 89cd28f8d051..3ab42bbb06d5 100644 --- a/block/Kconfig +++ b/block/Kconfig @@ -206,4 +206,9 @@ config BLK_MQ_VIRTIO depends on BLOCK && VIRTIO default y +config BLK_MQ_RDMA + bool + depends on BLOCK && INFINIBAND + default y + source block/Kconfig.iosched diff --git a/block/Makefile b/block/Makefile index 2b281cf258a0..9396ebc85d24 100644 --- a/block/Makefile +++ b/block/Makefile @@ -29,6 +29,7 @@ obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o obj-$(CONFIG_BLK_MQ_VIRTIO) += blk-mq-virtio.o +obj-$(CONFIG_BLK_MQ_RDMA) += blk-mq-rdma.o obj-$(CONFIG_BLK_DEV_ZONED) += blk-zoned.o obj-$(CONFIG_BLK_WBT) += blk-wbt.o obj-$(CONFIG_BLK_DEBUG_FS) += blk-mq-debugfs.o diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c new file mode 100644 index 000000000000..7dc07b43858b --- /dev/null +++ b/block/blk-mq-rdma.c @@ -0,0 +1,54 @@ +/* + * Copyright (c) 2017 Sagi Grimberg. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ +#include +#include +#include + +/** + * blk_mq_rdma_map_queues - provide a default queue mapping for rdma device + * @set: tagset to provide the mapping for + * @dev: rdma device associated with @set. + * @first_vec: first interrupt vectors to use for queues (usually 0) + * + * This function assumes the rdma device @dev has at least as many available + * interrupt vetors as @set has queues. It will then query it's affinity mask + * and built queue mapping that maps a queue to the CPUs that have irq affinity + * for the corresponding vector. + * + * In case either the driver passed a @dev with less vectors than + * @set->nr_hw_queues, or @dev does not provide an affinity mask for a + * vector, we fallback to the naive mapping. + */ +int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set, + struct ib_device *dev, int first_vec) +{ + const struct cpumask *mask; + unsigned int queue, cpu; + + if (set->nr_hw_queues > dev->num_comp_vectors) + goto fallback; + + for (queue = 0; queue < set->nr_hw_queues; queue++) { + mask = ib_get_vector_affinity(dev, first_vec + queue); + if (!mask) + goto fallback; + + for_each_cpu(cpu, mask) + set->mq_map[cpu] = queue; + } + + return 0; +fallback: + return blk_mq_map_queues(set); +} +EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues); diff --git a/include/linux/blk-mq-rdma.h b/include/linux/blk-mq-rdma.h new file mode 100644 index 000000000000..b4ade198007d --- /dev/null +++ b/include/linux/blk-mq-rdma.h @@ -0,0 +1,10 @@ +#ifndef _LINUX_BLK_MQ_RDMA_H +#define _LINUX_BLK_MQ_RDMA_H + +struct blk_mq_tag_set; +struct ib_device; + +int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set, + struct ib_device *dev, int first_vec); + +#endif /* _LINUX_BLK_MQ_RDMA_H */