From: Tony Battersby
Subject: [PATCH v4 0/9] mpt3sas and dmapool scalability
Date: Mon, 12 Nov 2018 10:40:57 -0500
To: Matthew Wilcox, Christoph Hellwig, Marek Szyprowski,
    "iommu@lists.linux-foundation.org", linux-mm@kvack.org
Cc: "linux-scsi@vger.kernel.org"
Message-ID: <88395080-efc1-4e7b-f813-bb90c86d0745@cybernetics.com>
X-Patchwork-Id: 10678857

I posted v3 on August 7.  Nobody acked or merged the patches, and then
I got too busy with other stuff to repost until now.

The only change since v3:
*) Dropped patch #10 (the mpt3sas patch) since the mpt3sas maintainers
didn't show any interest.

I believe these patches are ready for merging.

---

drivers/scsi/mpt3sas is running into a scalability problem with the
kernel's DMA pool implementation.  With an LSI/Broadcom SAS 9300-8i
12Gb/s HBA and max_sgl_entries=256, during modprobe, mpt3sas does the
equivalent of:

chain_dma_pool = dma_pool_create(size = 128);
for (i = 0; i < 373959; i++) {
    dma_addr[i] = dma_pool_alloc(chain_dma_pool);
}

And at rmmod, system shutdown, or system reboot, mpt3sas does the
equivalent of:

for (i = 0; i < 373959; i++) {
    dma_pool_free(chain_dma_pool, dma_addr[i]);
}
dma_pool_destroy(chain_dma_pool);

With this usage, both dma_pool_alloc() and dma_pool_free() exhibit
O(n^2) complexity, although dma_pool_free() is much worse due to
implementation details.  On my system, the dma_pool_free() loop above
takes about 9 seconds to run.
Note that the problem was even worse before commit 74522a92bbf0
("scsi: mpt3sas: Optimize I/O memory consumption in driver."), where
the dma_pool_free() loop could take ~30 seconds.

mpt3sas also has some other DMA pools, but chain_dma_pool is the only
one with so many allocations:

cat /sys/devices/pci0000:80/0000:80:07.0/0000:85:00.0/pools
(manually cleaned up column alignment)
poolinfo - 0.1
reply_post_free_array pool  1      21     192     1
reply_free pool             1      1      41728   1
reply pool                  1      1      1335296 1
sense pool                  1      1      970272  1
chain pool                  373959 386048 128     12064
reply_post_free pool        12     12     166528  12

The patches in this series improve the scalability of the DMA pool
implementation, which significantly reduces the running time of the
DMA alloc/free loops.  With the patches applied, "modprobe mpt3sas",
"rmmod mpt3sas", and system shutdown/reboot with mpt3sas loaded are
significantly faster.  Here are some benchmarks (of DMA alloc/free
only, not the entire modprobe/rmmod):

dma_pool_create() + dma_pool_alloc() loop, size = 128, count = 373959
  original:         350 ms (  1x)
  dmapool patches:   17 ms ( 21x)

dma_pool_free() loop + dma_pool_destroy(), size = 128, count = 373959
  original:        8901 ms (  1x)
  dmapool patches:   15 ms (618x)