From patchwork Tue Nov 6 05:28:33 2018
X-Patchwork-Submitter: Aaron Lu
X-Patchwork-Id: 10669663
Date: Tue, 6 Nov 2018 13:28:33 +0800
From: Aaron Lu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: Andrew Morton,
    Paweł Staszewski, Jesper Dangaard Brouer, Eric Dumazet, Tariq Toukan,
    Ilias Apalodimas, Yoel Caspersen, Mel Gorman, Saeed Mahameed,
    Michal Hocko, Vlastimil Babka, Dave Hansen, Alexander Duyck
Subject: [PATCH v2 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()
Message-ID: <20181106052833.GC6203@intel.com>
References: <20181105085820.6341-1-aaron.lu@intel.com>
In-Reply-To: <20181105085820.6341-1-aaron.lu@intel.com>

page_frag_free() calls __free_pages_ok() to free the page back to
Buddy. This is OK for high order pages, but for order-0 pages it misses
the opportunity to use Per-Cpu-Pages (PCP) and can cause zone lock
contention when called frequently.

Paweł Staszewski recently shared his results on 'how Linux kernel
handles normal traffic'[1], and from the perf data Jesper Dangaard
Brouer found that the lock contention comes from the page allocator:

  mlx5e_poll_tx_cq
  |
   --16.34%--napi_consume_skb
             |
             |--12.65%--__free_pages_ok
             |          |
             |           --11.86%--free_one_page
             |                     |
             |                     |--10.10%--queued_spin_lock_slowpath
             |                     |
             |                      --0.65%--_raw_spin_lock
             |
             |--1.55%--page_frag_free
             |
              --1.44%--skb_release_data

Jesper explained how it happened: the mlx5 driver's RX-page recycle
mechanism is not effective in this workload, so pages have to go
through the page allocator. The lock contention happens during the
mlx5 DMA TX completion cycle.
And the page allocator cannot keep up at these speeds.[2]

I thought that __free_pages_ok() was mostly freeing high order pages
and that this was lock contention for high order pages, but Jesper
explained in detail that __free_pages_ok() here is actually freeing
order-0 pages, because mlx5 is using order-0 pages to satisfy its page
pool allocation requests.[3]

The free path as pointed out by Jesper is:
skb_free_head()
  -> skb_free_frag()
    -> page_frag_free()
And the pages being freed on this path are order-0 pages.

Fix this by doing the same thing as __page_frag_cache_drain(): send
the page being freed to PCP if it is an order-0 page, or directly to
Buddy if it is a high order page.

With this change, Paweł hasn't noticed lock contention yet in his
workload, and Jesper has noticed a 7% performance improvement using a
micro benchmark, with the lock contention gone. Ilias' test on a 'low'
speed 1Gbit interface on a cortex-a53 shows a ~11% performance boost
when testing with 64-byte packets, and __free_pages_ok() disappeared
from perf top.

[1]: https://www.spinics.net/lists/netdev/msg531362.html
[2]: https://www.spinics.net/lists/netdev/msg531421.html
[3]: https://www.spinics.net/lists/netdev/msg531556.html

Reported-by: Paweł Staszewski
Analysed-by: Jesper Dangaard Brouer
Acked-by: Vlastimil Babka
Acked-by: Mel Gorman
Acked-by: Jesper Dangaard Brouer
Acked-by: Ilias Apalodimas
Tested-by: Ilias Apalodimas
Acked-by: Alexander Duyck
Signed-off-by: Aaron Lu
Acked-by: Tariq Toukan
---
v2: only changelog changes:
- remove the duplicated skb_free_frag() as pointed out by Jesper;
- add Ilias' test result;
- add people's ack/test tag.
 mm/page_alloc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ae31839874b8..91a9a6af41a2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4555,8 +4555,14 @@ void page_frag_free(void *addr)
 {
 	struct page *page = virt_to_head_page(addr);
 
-	if (unlikely(put_page_testzero(page)))
-		__free_pages_ok(page, compound_order(page));
+	if (unlikely(put_page_testzero(page))) {
+		unsigned int order = compound_order(page);
+
+		if (order == 0)
+			free_unref_page(page);
+		else
+			__free_pages_ok(page, order);
+	}
 }
 EXPORT_SYMBOL(page_frag_free);