From patchwork Mon Nov 12 15:43:25 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tony Battersby <tonyb@cybernetics.com>
X-Patchwork-Id: 10678869
From: Tony Battersby <tonyb@cybernetics.com>
Subject: [PATCH v4 4/9] dmapool: improve scalability of dma_pool_alloc
To: Matthew Wilcox, Christoph Hellwig, Marek Szyprowski,
 iommu@lists.linux-foundation.org, linux-mm@kvack.org
X-ASG-Orig-Subj: [PATCH v4 4/9] dmapool:
 improve scalability of dma_pool_alloc
Cc: "linux-scsi@vger.kernel.org"
Date: Mon, 12 Nov 2018 10:43:25 -0500

dma_pool_alloc() scales poorly when allocating a large number of pages
because it does a linear scan of all previously-allocated pages before
allocating a new one.  Improve its scalability by maintaining a separate
list of pages that have free blocks ready to (re)allocate.  In big O
notation, this improves the algorithm from O(n^2) to O(n).

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Matthew Wilcox
---
--- linux/mm/dmapool.c.orig	2018-08-03 16:16:49.000000000 -0400
+++ linux/mm/dmapool.c	2018-08-03 16:45:33.000000000 -0400
@@ -15,11 +15,16 @@
  * Many older drivers still have their own code to do this.
  *
  * The current design of this allocator is fairly simple.  The pool is
- * represented by the 'struct dma_pool' which keeps a doubly-linked list of
- * allocated pages.  Each page in the page_list is split into blocks of at
- * least 'size' bytes.  Free blocks are tracked in an unsorted singly-linked
- * list of free blocks within the page.  Used blocks aren't tracked, but we
- * keep a count of how many are currently allocated from each page.
+ * represented by the 'struct dma_pool'.  Each allocated page is split into
+ * blocks of at least 'size' bytes.  Free blocks are tracked in an unsorted
+ * singly-linked list of free blocks within the page.  Used blocks aren't
+ * tracked, but we keep a count of how many are currently allocated from each
+ * page.
+ *
+ * The pool keeps two doubly-linked lists of allocated pages.  The 'available'
+ * list tracks pages that have one or more free blocks, and the 'full' list
+ * tracks pages that have no free blocks.  Pages are moved from one list to
+ * the other as their blocks are allocated and freed.
  */
 
 #include
@@ -43,7 +48,10 @@
 #endif
 
 struct dma_pool {		/* the pool */
-	struct list_head page_list;
+#define POOL_FULL_IDX	0
+#define POOL_AVAIL_IDX	1
+#define POOL_MAX_IDX	2
+	struct list_head page_list[POOL_MAX_IDX];
 	spinlock_t lock;
 	size_t size;
 	struct device *dev;
@@ -54,7 +62,7 @@ struct dma_pool {		/* the pool */
 };
 
 struct dma_page {		/* cacheable header for 'allocation' bytes */
-	struct list_head page_list;
+	struct list_head dma_list;
 	void *vaddr;
 	dma_addr_t dma;
 	unsigned int in_use;
@@ -70,7 +78,6 @@ show_pools(struct device *dev, struct de
 	unsigned temp;
 	unsigned size;
 	char *next;
-	struct dma_page *page;
 	struct dma_pool *pool;
 
 	next = buf;
@@ -84,11 +91,18 @@ show_pools(struct device *dev, struct de
 	list_for_each_entry(pool, &dev->dma_pools, pools) {
 		unsigned pages = 0;
 		unsigned blocks = 0;
+		int list_idx;
 
 		spin_lock_irq(&pool->lock);
-		list_for_each_entry(page, &pool->page_list, page_list) {
-			pages++;
-			blocks += page->in_use;
+		for (list_idx = 0; list_idx < POOL_MAX_IDX; list_idx++) {
+			struct dma_page *page;
+
+			list_for_each_entry(page,
+					    &pool->page_list[list_idx],
+					    dma_list) {
+				pages++;
+				blocks += page->in_use;
+			}
 		}
 		spin_unlock_irq(&pool->lock);
@@ -163,7 +177,8 @@ struct dma_pool *dma_pool_create(const c
 	retval->dev = dev;
 
-	INIT_LIST_HEAD(&retval->page_list);
+	INIT_LIST_HEAD(&retval->page_list[POOL_FULL_IDX]);
+	INIT_LIST_HEAD(&retval->page_list[POOL_AVAIL_IDX]);
 	spin_lock_init(&retval->lock);
 	retval->size = size;
 	retval->boundary = boundary;
@@ -252,7 +267,7 @@ static void pool_free_page(struct dma_po
 	void *vaddr = page->vaddr;
 	dma_addr_t dma = page->dma;
 
-	list_del(&page->page_list);
+	list_del(&page->dma_list);
 	if (is_page_busy(page)) {
 		dev_err(pool->dev,
@@ -278,8 +293,8 @@ static void pool_free_page(struct dma_po
  */
 void dma_pool_destroy(struct dma_pool *pool)
 {
-	struct dma_page *page;
 	bool empty = false;
+	int list_idx;
 
 	if (unlikely(!pool))
 		return;
@@ -294,10 +309,15 @@ void dma_pool_destroy(struct dma_pool *p
 		device_remove_file(pool->dev, &dev_attr_pools);
 	mutex_unlock(&pools_reg_lock);
 
-	while ((page = list_first_entry_or_null(&pool->page_list,
-						struct dma_page,
-						page_list))) {
-		pool_free_page(pool, page);
+	for (list_idx = 0; list_idx < POOL_MAX_IDX; list_idx++) {
+		struct dma_page *page;
+
+		while ((page = list_first_entry_or_null(
+				&pool->page_list[list_idx],
+				struct dma_page,
+				dma_list))) {
+			pool_free_page(pool, page);
+		}
 	}
 
 	kfree(pool);
@@ -325,10 +345,11 @@ void *dma_pool_alloc(struct dma_pool *po
 	might_sleep_if(gfpflags_allow_blocking(mem_flags));
 
 	spin_lock_irqsave(&pool->lock, flags);
-	list_for_each_entry(page, &pool->page_list, page_list) {
-		if (page->offset < pool->allocation)
-			goto ready;
-	}
+	page = list_first_entry_or_null(&pool->page_list[POOL_AVAIL_IDX],
+					struct dma_page,
+					dma_list);
+	if (page)
+		goto ready;
 
 	/* pool_alloc_page() might sleep, so temporarily drop &pool->lock */
 	spin_unlock_irqrestore(&pool->lock, flags);
@@ -339,11 +360,15 @@ void *dma_pool_alloc(struct dma_pool *po
 
 	spin_lock_irqsave(&pool->lock, flags);
 
-	list_add(&page->page_list, &pool->page_list);
+	list_add(&page->dma_list, &pool->page_list[POOL_AVAIL_IDX]);
  ready:
 	page->in_use++;
 	offset = page->offset;
 	page->offset = *(int *)(page->vaddr + offset);
+	if (page->offset >= pool->allocation)
+		/* Move page from the "available" list to the "full" list. */
+		list_move_tail(&page->dma_list,
+			       &pool->page_list[POOL_FULL_IDX]);
 	retval = offset + page->vaddr;
 	*handle = offset + page->dma;
 #ifdef	DMAPOOL_DEBUG
@@ -381,13 +406,19 @@ EXPORT_SYMBOL(dma_pool_alloc);
 static struct dma_page *pool_find_page(struct dma_pool *pool, dma_addr_t dma)
 {
-	struct dma_page *page;
+	int list_idx;
 
-	list_for_each_entry(page, &pool->page_list, page_list) {
-		if (dma < page->dma)
-			continue;
-		if ((dma - page->dma) < pool->allocation)
-			return page;
+	for (list_idx = 0; list_idx < POOL_MAX_IDX; list_idx++) {
+		struct dma_page *page;
+
+		list_for_each_entry(page,
+				    &pool->page_list[list_idx],
+				    dma_list) {
+			if (dma < page->dma)
+				continue;
+			if ((dma - page->dma) < pool->allocation)
+				return page;
+		}
 	}
 	return NULL;
 }
@@ -444,6 +475,9 @@ void dma_pool_free(struct dma_pool *pool
 #endif
 
 	page->in_use--;
+	if (page->offset >= pool->allocation)
+		/* Move page from the "full" list to the "available" list. */
+		list_move(&page->dma_list, &pool->page_list[POOL_AVAIL_IDX]);
 	*(int *)vaddr = page->offset;
 	page->offset = offset;
 	/*