From patchwork Thu Aug 2 19:58:40 2018
X-Patchwork-Submitter: Tony Battersby
X-Patchwork-Id: 10554101
From: Tony Battersby <tonyb@cybernetics.com>
Subject: [PATCH v2 4/9] dmapool: improve scalability of dma_pool_alloc
To: Matthew Wilcox, Christoph Hellwig, Marek Szyprowski, Sathya Prakash,
 Chaitra P B, Suganath Prabu Subramani, iommu@lists.linux-foundation.org,
 linux-mm, linux-scsi, MPT-FusionLinux.pdl@broadcom.com
Message-ID: <1dbe6204-17fc-efd9-2381-48186cae2b94@cybernetics.com>
Date: Thu, 2 Aug 2018 15:58:40 -0400

dma_pool_alloc() scales poorly when allocating a large number of pages
because it does a linear scan of all previously-allocated pages before
allocating a new one.

Improve its scalability by maintaining a separate list of pages that have
free blocks ready to (re)allocate.  In big O notation, this improves the
algorithm from O(n^2) to O(n).
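To make the new bookkeeping concrete, here is a minimal userspace C sketch
of the same two-list idea (an illustration only, not the kernel code:
toy_pool/toy_page and the sizes are invented, and locking, alignment, DMA
mapping, and error handling are all omitted).  As in the patch, a page sits
on the 'available' list while its in-band free-block list is non-empty and
moves to the 'full' list when it runs out, so the allocation fast path
never scans pages:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_SIZE	64
#define BLOCKS_PER_PAGE	4
#define PAGE_BYTES	(BLOCK_SIZE * BLOCKS_PER_PAGE)

struct toy_page {
	struct toy_page *prev, *next;	/* linkage on 'avail' or 'full' */
	char *vaddr;			/* the page's memory */
	int offset;			/* offset of the first free block */
	int in_use;			/* blocks allocated from this page */
};

struct toy_pool {
	struct toy_page *avail;		/* pages with >= 1 free block */
	struct toy_page *full;		/* pages with no free blocks */
};

static void list_push(struct toy_page **head, struct toy_page *pg)
{
	pg->prev = NULL;
	pg->next = *head;
	if (*head)
		(*head)->prev = pg;
	*head = pg;
}

static void list_unlink(struct toy_page **head, struct toy_page *pg)
{
	if (pg->prev)
		pg->prev->next = pg->next;
	else
		*head = pg->next;
	if (pg->next)
		pg->next->prev = pg->prev;
}

static void *toy_alloc(struct toy_pool *pool)
{
	struct toy_page *pg = pool->avail;	/* O(1): no page scan */
	char *block;
	int i;

	if (!pg) {			/* no partial page: make a new one */
		pg = calloc(1, sizeof(*pg));
		pg->vaddr = malloc(PAGE_BYTES);
		/* chain blocks through an in-band free list, as dmapool does */
		for (i = 0; i < BLOCKS_PER_PAGE; i++)
			*(int *)(pg->vaddr + i * BLOCK_SIZE) = (i + 1) * BLOCK_SIZE;
		list_push(&pool->avail, pg);
	}
	block = pg->vaddr + pg->offset;
	pg->offset = *(int *)block;	/* pop the free-block list */
	pg->in_use++;
	if (pg->offset >= PAGE_BYTES) {	/* page is now full: avail -> full */
		list_unlink(&pool->avail, pg);
		list_push(&pool->full, pg);
	}
	return block;
}

static struct toy_page *find_page(struct toy_page *head, void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	struct toy_page *pg;

	for (pg = head; pg; pg = pg->next)
		if (p >= (uintptr_t)pg->vaddr &&
		    p < (uintptr_t)pg->vaddr + PAGE_BYTES)
			return pg;
	return NULL;
}

static void toy_free(struct toy_pool *pool, void *block)
{
	struct toy_page *pg = find_page(pool->avail, block);

	if (!pg) {			/* page was full: full -> avail */
		pg = find_page(pool->full, block);
		list_unlink(&pool->full, pg);
		list_push(&pool->avail, pg);
	}
	*(int *)block = pg->offset;	/* push block back on the free list */
	pg->offset = (int)((char *)block - pg->vaddr);
	pg->in_use--;
}

int main(void)
{
	struct toy_pool pool = { NULL, NULL };
	void *blk[BLOCKS_PER_PAGE + 1];
	int i;

	for (i = 0; i < BLOCKS_PER_PAGE + 1; i++)	/* fills page 1, starts page 2 */
		blk[i] = toy_alloc(&pool);
	toy_free(&pool, blk[0]);	/* page 1 moves back to 'avail' */
	printf("head of avail list has %d blocks in use\n", pool.avail->in_use);
	return 0;
}

Running it prints "head of avail list has 3 blocks in use": the page that
just had a block freed has migrated back from 'full' to 'available', using
the same page->offset test as the patch below.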
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---

Changes since v1:

*) In v1, there was one (original) list for all pages and one (new) list
   for pages with free blocks.  In v2, there is one list for pages with
   free blocks and one list for pages without free blocks, and pages are
   moved back and forth between the two lists.  This avoids bloating
   struct dma_page with extra list pointers, which is important so that
   a later patch can move its fields into struct page.

*) Use list_first_entry_or_null instead of !list_empty/list_first_entry.

Note that pool_find_page() will be removed entirely by a later patch, so
the extra code there won't stay for long.

--- linux/mm/dmapool.c.orig	2018-08-02 10:01:26.000000000 -0400
+++ linux/mm/dmapool.c	2018-08-02 10:03:46.000000000 -0400
@@ -15,11 +15,16 @@
  * Many older drivers still have their own code to do this.
  *
  * The current design of this allocator is fairly simple.  The pool is
- * represented by the 'struct dma_pool' which keeps a doubly-linked list of
- * allocated pages.  Each page in the page_list is split into blocks of at
- * least 'size' bytes.  Free blocks are tracked in an unsorted singly-linked
- * list of free blocks within the page.  Used blocks aren't tracked, but we
- * keep a count of how many are currently allocated from each page.
+ * represented by the 'struct dma_pool'.  Each allocated page is split into
+ * blocks of at least 'size' bytes.  Free blocks are tracked in an unsorted
+ * singly-linked list of free blocks within the page.  Used blocks aren't
+ * tracked, but we keep a count of how many are currently allocated from each
+ * page.
+ *
+ * The pool keeps two doubly-linked lists of allocated pages.  The 'available'
+ * list tracks pages that have one or more free blocks, and the 'full' list
+ * tracks pages that have no free blocks.  Pages are moved from one list to
+ * the other as their blocks are allocated and freed.
  */
 
 #include
@@ -43,7 +48,10 @@
 #endif
 
 struct dma_pool {		/* the pool */
-	struct list_head page_list;
+#define POOL_FULL_IDX  0
+#define POOL_AVAIL_IDX 1
+#define POOL_N_LISTS   2
+	struct list_head page_list[POOL_N_LISTS];
 	spinlock_t lock;
 	size_t size;
 	struct device *dev;
@@ -54,7 +62,7 @@ struct dma_pool {		/* the pool */
 };
 
 struct dma_page {		/* cacheable header for 'allocation' bytes */
-	struct list_head page_list;
+	struct list_head dma_list;
 	void *vaddr;
 	dma_addr_t dma;
 	unsigned int in_use;
@@ -70,7 +78,6 @@ show_pools(struct device *dev, struct de
 	unsigned temp;
 	unsigned size;
 	char *next;
-	struct dma_page *page;
 	struct dma_pool *pool;
 
 	next = buf;
@@ -84,11 +91,18 @@ show_pools(struct device *dev, struct de
 	list_for_each_entry(pool, &dev->dma_pools, pools) {
 		unsigned pages = 0;
 		unsigned blocks = 0;
+		int list_idx;
 
 		spin_lock_irq(&pool->lock);
-		list_for_each_entry(page, &pool->page_list, page_list) {
-			pages++;
-			blocks += page->in_use;
+		for (list_idx = 0; list_idx < POOL_N_LISTS; list_idx++) {
+			struct dma_page *page;
+
+			list_for_each_entry(page,
+					    &pool->page_list[list_idx],
+					    dma_list) {
+				pages++;
+				blocks += page->in_use;
+			}
 		}
 		spin_unlock_irq(&pool->lock);
 
@@ -163,7 +177,8 @@ struct dma_pool *dma_pool_create(const c
 
 	retval->dev = dev;
 
-	INIT_LIST_HEAD(&retval->page_list);
+	INIT_LIST_HEAD(&retval->page_list[0]);
+	INIT_LIST_HEAD(&retval->page_list[1]);
 	spin_lock_init(&retval->lock);
 	retval->size = size;
 	retval->boundary = boundary;
@@ -252,7 +267,7 @@ static void pool_free_page(struct dma_po
 	void *vaddr = page->vaddr;
 	dma_addr_t dma = page->dma;
 
-	list_del(&page->page_list);
+	list_del(&page->dma_list);
 
 	if (is_page_busy(page)) {
 		dev_err(pool->dev,
@@ -278,8 +293,8 @@ static void pool_free_page(struct dma_po
  */
 void dma_pool_destroy(struct dma_pool *pool)
 {
-	struct dma_page *page;
 	bool empty = false;
+	int list_idx;
 
 	if (unlikely(!pool))
 		return;
@@ -294,10 +309,15 @@ void dma_pool_destroy(struct dma_pool *p
 			device_remove_file(pool->dev, &dev_attr_pools);
 	mutex_unlock(&pools_reg_lock);
 
-	while ((page = list_first_entry_or_null(&pool->page_list,
-						struct dma_page,
-						page_list))) {
-		pool_free_page(pool, page);
+	for (list_idx = 0; list_idx < POOL_N_LISTS; list_idx++) {
+		struct dma_page *page;
+
+		while ((page = list_first_entry_or_null(
+				&pool->page_list[list_idx],
+				struct dma_page,
+				dma_list))) {
+			pool_free_page(pool, page);
+		}
 	}
 
 	kfree(pool);
@@ -325,10 +345,11 @@ void *dma_pool_alloc(struct dma_pool *po
 	might_sleep_if(gfpflags_allow_blocking(mem_flags));
 
 	spin_lock_irqsave(&pool->lock, flags);
-	list_for_each_entry(page, &pool->page_list, page_list) {
-		if (page->offset < pool->allocation)
-			goto ready;
-	}
+	page = list_first_entry_or_null(&pool->page_list[POOL_AVAIL_IDX],
+					struct dma_page,
+					dma_list);
+	if (page)
+		goto ready;
 
 	/* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
 	spin_unlock_irqrestore(&pool->lock, flags);
@@ -339,11 +360,16 @@ void *dma_pool_alloc(struct dma_pool *po
 
 	spin_lock_irqsave(&pool->lock, flags);
 
-	list_add(&page->page_list, &pool->page_list);
+	list_add(&page->dma_list, &pool->page_list[POOL_AVAIL_IDX]);
  ready:
 	page->in_use++;
 	offset = page->offset;
 	page->offset = *(int *)(page->vaddr + offset);
+	if (page->offset >= pool->allocation) {
+		/* Move page from the "available" list to the "full" list. */
+		list_del(&page->dma_list);
+		list_add(&page->dma_list, &pool->page_list[POOL_FULL_IDX]);
+	}
 	retval = offset + page->vaddr;
 	*handle = offset + page->dma;
 #ifdef	DMAPOOL_DEBUG
@@ -381,13 +407,19 @@ EXPORT_SYMBOL(dma_pool_alloc);
 
 static struct dma_page *pool_find_page(struct dma_pool *pool, dma_addr_t dma)
 {
-	struct dma_page *page;
+	int list_idx;
+
+	for (list_idx = 0; list_idx < POOL_N_LISTS; list_idx++) {
+		struct dma_page *page;
 
-	list_for_each_entry(page, &pool->page_list, page_list) {
-		if (dma < page->dma)
-			continue;
-		if ((dma - page->dma) < pool->allocation)
-			return page;
+		list_for_each_entry(page,
+				    &pool->page_list[list_idx],
+				    dma_list) {
+			if (dma < page->dma)
+				continue;
+			if ((dma - page->dma) < pool->allocation)
+				return page;
+		}
 	}
 	return NULL;
 }
@@ -444,6 +476,11 @@ void dma_pool_free(struct dma_pool *pool
 #endif
 
 	page->in_use--;
+	if (page->offset >= pool->allocation) {
+		/* Move page from the "full" list to the "available" list. */
+		list_del(&page->dma_list);
+		list_add(&page->dma_list, &pool->page_list[POOL_AVAIL_IDX]);
+	}
 	*(int *)vaddr = page->offset;
 	page->offset = offset;
 	/*
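For reference, the dma_pool_* calls whose fast path this patch optimizes
are used roughly as in the sketch below (hedged: mydev_setup, the device
pointer, and the 64/8/0 size parameters are placeholders; only the
dma_pool_* functions are the real API):

#include <linux/dmapool.h>
#include <linux/device.h>

static int mydev_setup(struct device *dev)
{
	struct dma_pool *pool;
	dma_addr_t dma;
	void *vaddr;

	/* 64-byte blocks, 8-byte alignment, no boundary restriction */
	pool = dma_pool_create("mydev-desc", dev, 64, 8, 0);
	if (!pool)
		return -ENOMEM;

	/*
	 * With this patch, the fast path grabs the first page on the
	 * 'available' list instead of scanning every allocated page.
	 */
	vaddr = dma_pool_alloc(pool, GFP_KERNEL, &dma);
	if (vaddr) {
		/* ... program 'dma' into the device, access 'vaddr' from the CPU ... */
		dma_pool_free(pool, vaddr, dma);
	}

	dma_pool_destroy(pool);
	return 0;
}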