From patchwork Tue Oct 18 21:42:17 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Bates X-Patchwork-Id: 9383215 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 06CBA600CA for ; Tue, 18 Oct 2016 22:31:45 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 02920297F2 for ; Tue, 18 Oct 2016 22:31:45 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EB240297F6; Tue, 18 Oct 2016 22:31:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DAD38297F2 for ; Tue, 18 Oct 2016 22:31:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934199AbcJRWbn (ORCPT ); Tue, 18 Oct 2016 18:31:43 -0400 Received: from gateway23.websitewelcome.com ([192.185.48.71]:49983 "EHLO gateway23.websitewelcome.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932771AbcJRWbm (ORCPT ); Tue, 18 Oct 2016 18:31:42 -0400 X-Greylist: delayed 1500 seconds by postgrey-1.27 at vger.kernel.org; Tue, 18 Oct 2016 18:31:42 EDT Received: from cm1.websitewelcome.com (cm.websitewelcome.com [192.185.0.102]) by gateway23.websitewelcome.com (Postfix) with ESMTP id A2A53D4536935 for ; Tue, 18 Oct 2016 16:42:31 -0500 (CDT) Received: from estate.websitewelcome.com ([192.185.83.90]) by cm1.websitewelcome.com with id x9iW1t00d1wvuag019iXVb; Tue, 18 Oct 2016 16:42:31 -0500 Received: from lambic.deltatee.com ([207.54.116.65]:59202 helo=cgy1-donard.priv.deltatee.com) by estate.websitewelcome.com with esmtpsa (TLSv1.2:ECDHE-RSA-AES128-SHA256:128) (Exim 4.87) (envelope-from ) id 1bwc9V-0005Jd-Su; Tue, 18 Oct 2016 16:42:30 -0500 From: Stephen Bates To: linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-rdma@vger.kernel.org, linux-block@vger.kernel.org, linux-mm@kvack.org Cc: dan.j.williams@intel.com, ross.zwisler@linux.intel.com, willy@linux.intel.com, jgunthorpe@obsidianresearch.com, haggaie@mellanox.com, hch@infradead.org, axboe@fb.com, corbet@lwn.net, jim.macdonald@everspin.com, sbates@raithin.com, logang@deltatee.com, Stephen Bates Subject: [PATCH 3/3] iopmem : Add documentation for iopmem driver Date: Tue, 18 Oct 2016 15:42:17 -0600 Message-Id: <1476826937-20665-4-git-send-email-sbates@raithlin.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1476826937-20665-1-git-send-email-sbates@raithlin.com> References: <1476826937-20665-1-git-send-email-sbates@raithlin.com> X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - estate.websitewelcome.com X-AntiAbuse: Original Domain - vger.kernel.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - raithlin.com X-BWhitelist: no X-Source-IP: 207.54.116.65 X-Exim-ID: 1bwc9V-0005Jd-Su X-Source: X-Source-Args: X-Source-Dir: X-Source-Sender: lambic.deltatee.com (cgy1-donard.priv.deltatee.com) [207.54.116.65]:59202 X-Source-Auth: sbates@raithlin.com X-Email-Count: 61 X-Source-Cap: cmFpdGhsaW47c2NvdHQ7ZXN0YXRlLndlYnNpdGV3ZWxjb21lLmNvbQ== Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add documentation for the iopmem PCIe device driver. Signed-off-by: Stephen Bates Signed-off-by: Logan Gunthorpe --- Documentation/blockdev/00-INDEX | 2 ++ Documentation/blockdev/iopmem.txt | 62 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) create mode 100644 Documentation/blockdev/iopmem.txt -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/Documentation/blockdev/00-INDEX b/Documentation/blockdev/00-INDEX index c08df56..913e500 100644 --- a/Documentation/blockdev/00-INDEX +++ b/Documentation/blockdev/00-INDEX @@ -8,6 +8,8 @@ cpqarray.txt - info on using Compaq's SMART2 Intelligent Disk Array Controllers. floppy.txt - notes and driver options for the floppy disk driver. +iopmem.txt + - info on the iopmem block driver. mflash.txt - info on mGine m(g)flash driver for linux. nbd.txt diff --git a/Documentation/blockdev/iopmem.txt b/Documentation/blockdev/iopmem.txt new file mode 100644 index 0000000..ba805b8 --- /dev/null +++ b/Documentation/blockdev/iopmem.txt @@ -0,0 +1,62 @@ +IOPMEM Block Driver +=================== + +Logan Gunthorpe and Stephen Bates - October 2016 + +Introduction +------------ + +The iopmem module creates a DAX capable block device from a BAR on a PCIe +device. iopmem leverages heavily from the pmem driver although it utilizes IO +memory rather than system memory as its backing store. + +Usage +----- + +To include the iopmem module in your kernel please set CONFIG_BLK_DEV_IOPMEM +to either y or m. A block device will be created for each PCIe attached device +that matches the vendor and device ID as specified in the module. Currently an +unallocated PMC PCIe ID is used as the default. Alternatively this driver can +be bound to any aribtary PCIe function using the sysfs bind entry. + +The main purpose for an iopmem block device is expected to be for peer-2-peer +PCIe transfers. We DO NOT RECCOMEND accessing a iopmem device using the local +CPU unless you are doing one of the three following things: + +1. Creating a DAX capable filesystem on the iopmem device. +2. Creating some files on the DAX capable filesystem. +3. Interogating the files on said filesystem to obtain pointers that can be + passed to other PCIe devices for p2p DMA operations. + +Issues +------ + +1. Address Translation. Suggestions have been made that in certain +architectures and topologies the dma_addr_t passed to the DMA master +in a peer-2-peer transfer will not correctly route to the IO memory +intended. However in our testing to date we have not seen this to be +an issue, even in systems with IOMMUs and PCIe switches. It is our +understanding that an IOMMU only maps system memory and would not +interfere with device memory regions. (It certainly has no opportunity +to do so if the transfer gets routed through a switch). + +2. Memory Segment Spacing. This patch has the same limitations that +ZONE_DEVICE does in that memory regions must be spaces at least +SECTION_SIZE bytes part. On x86 this is 128MB and there are cases where +BARs can be placed closer together than this. Thus ZONE_DEVICE would not +be usable on neighboring BARs. For our purposes, this is not an issue as +we'd only be looking at enabling a single BAR in a given PCIe device. +More exotic use cases may have problems with this. + +3. Coherency Issues. When IOMEM is written from both the CPU and a PCIe +peer there is potential for coherency issues and for writes to occur out +of order. This is something that users of this feature need to be +cognizant of and may necessitate the use of CONFIG_EXPERT. Though really, +this isn't much different than the existing situation with RDMA: if +userspace sets up an MR for remote use, they need to be careful about +using that memory region themselves. + +4. Architecture. Currently this patch is applicable only to x86 +architectures. The same is true for much of the code pertaining to +PMEM and ZONE_DEVICE. It is hoped that the work will be extended to other +ARCH over time.