From patchwork Tue Nov 20 01:45:44 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Aaron Lu
X-Patchwork-Id: 10689757
Date: Tue, 20 Nov 2018 09:45:44 +0800
From: Aaron Lu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: Andrew Morton, Paweł Staszewski, Jesper Dangaard Brouer,
        Eric Dumazet, Tariq Toukan, Ilias Apalodimas, Yoel Caspersen,
        Mel Gorman, Saeed Mahameed, Michal Hocko, Vlastimil Babka,
        Dave Hansen, Alexander Duyck, Ian Kumlien
Subject: [PATCH v2 RESEND update 1/2] mm/page_alloc: free order-0 pages through PCP in page_frag_free()
Message-ID: <20181120014544.GB10657@intel.com>
References: <20181119134834.17765-1-aaron.lu@intel.com> <20181119134834.17765-2-aaron.lu@intel.com>
In-Reply-To: <20181119134834.17765-2-aaron.lu@intel.com>

page_frag_free() calls __free_pages_ok() to free the page back to
Buddy. This is OK for high order pages, but for order-0 pages, it
misses the optimization opportunity of using Per-Cpu-Pages and can
cause zone lock contention when called frequently.

Paweł Staszewski recently shared his result of 'how Linux kernel
handles normal traffic'[1] and from perf data, Jesper Dangaard Brouer
found the lock contention comes from the page allocator:

  mlx5e_poll_tx_cq
  |
   --16.34%--napi_consume_skb
             |
             |--12.65%--__free_pages_ok
             |          |
             |           --11.86%--free_one_page
             |                     |
             |                     |--10.10%--queued_spin_lock_slowpath
             |                     |
             |                      --0.65%--_raw_spin_lock
             |
             |--1.55%--page_frag_free
             |
              --1.44%--skb_release_data

Jesper explained how it happened: the mlx5 driver's RX-page recycle
mechanism is not effective in this workload and pages have to go
through the page allocator. The lock contention happens during the
mlx5 DMA TX completion cycle.
And the page allocator cannot keep up at these speeds.[2]

I thought that __free_pages_ok() was mostly freeing high order pages
and that this was lock contention for high order pages, but Jesper
explained in detail that __free_pages_ok() here is actually freeing
order-0 pages because mlx5 is using order-0 pages to satisfy its page
pool allocation request.[3]

The free path as pointed out by Jesper is:
skb_free_head()
  -> skb_free_frag()
    -> page_frag_free()
And the pages being freed on this path are order-0 pages.

Fix this by doing similar things as in __page_frag_cache_drain() -
send the page being freed to PCP if it's an order-0 page, or
directly to Buddy if it is a high order page.

With this change, Paweł hasn't noticed lock contention yet in his
workload, and Jesper has noticed a 7% performance improvement using
a micro benchmark, with lock contention gone. Ilias' test on a 'low'
speed 1Gbit interface on a cortex-a53 shows a ~11% performance boost
when testing with 64byte packets, and __free_pages_ok() disappeared
from perf top.

[1]: https://www.spinics.net/lists/netdev/msg531362.html
[2]: https://www.spinics.net/lists/netdev/msg531421.html
[3]: https://www.spinics.net/lists/netdev/msg531556.html

Reported-by: Paweł Staszewski
Analysed-by: Jesper Dangaard Brouer
Acked-by: Vlastimil Babka
Acked-by: Mel Gorman
Acked-by: Jesper Dangaard Brouer
Acked-by: Ilias Apalodimas
Tested-by: Ilias Apalodimas
Acked-by: Alexander Duyck
Acked-by: Tariq Toukan
Signed-off-by: Aaron Lu
Acked-by: Pankaj gupta
---
update: fix Tariq's email tag.
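
For intuition on why routing order-0 frees through PCP should relieve
the zone lock, here is a toy userspace model. It is only a sketch under
stated assumptions: every toy_* name and the batch size are hypothetical
and are not kernel APIs or kernel defaults. It counts how often a
"zone lock" would be taken when each page goes straight to Buddy versus
when order-0 pages are parked on a per-CPU list and handed back in
batches.

```c
#include <assert.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_PCP_BATCH 31	/* hypothetical batch size, not a kernel default */

struct toy_zone {
	unsigned long lock_acquisitions;	/* stands in for zone lock traffic */
	unsigned long pages_freed;
};

struct toy_pcp {
	unsigned int count;	/* order-0 pages parked on this CPU's list */
};

/* Direct free to "Buddy": one lock acquisition per call. */
static void toy_free_to_buddy(struct toy_zone *z, unsigned long nr_pages)
{
	z->lock_acquisitions++;
	z->pages_freed += nr_pages;
}

/* Hand all parked pages back under a single lock acquisition. */
static void toy_pcp_drain(struct toy_zone *z, struct toy_pcp *pcp)
{
	if (pcp->count) {
		toy_free_to_buddy(z, pcp->count);
		pcp->count = 0;
	}
}

/* Same dispatch shape as the patch: order-0 to PCP, the rest to Buddy. */
static void toy_page_frag_free(struct toy_zone *z, struct toy_pcp *pcp,
			       unsigned int order, int use_pcp)
{
	if (order == 0 && use_pcp) {
		if (++pcp->count >= TOY_PCP_BATCH)
			toy_pcp_drain(z, pcp);
	} else {
		toy_free_to_buddy(z, 1UL << order);
	}
}

/* Lock acquisitions needed to free nr order-0 pages in this model. */
static unsigned long toy_lock_cost(unsigned int nr, int use_pcp)
{
	struct toy_zone z = { 0, 0 };
	struct toy_pcp pcp = { 0 };
	unsigned int i;

	for (i = 0; i < nr; i++)
		toy_page_frag_free(&z, &pcp, 0, use_pcp);
	toy_pcp_drain(&z, &pcp);	/* flush whatever is still parked */
	assert(z.pages_freed == nr);
	return z.lock_acquisitions;
}
```

In this model, toy_lock_cost(1000, 0) takes the lock once per page,
while toy_lock_cost(1000, 1) takes it once per batch plus a final
drain. The real PCP logic is more involved; the model only illustrates
the batching effect.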
 mm/page_alloc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 421c5b652708..8f8c6b33b637 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4677,8 +4677,14 @@ void page_frag_free(void *addr)
 {
 	struct page *page = virt_to_head_page(addr);
 
-	if (unlikely(put_page_testzero(page)))
-		__free_pages_ok(page, compound_order(page));
+	if (unlikely(put_page_testzero(page))) {
+		unsigned int order = compound_order(page);
+
+		if (order == 0)
+			free_unref_page(page);
+		else
+			__free_pages_ok(page, order);
+	}
 }
 EXPORT_SYMBOL(page_frag_free);