From patchwork Mon Aug 22 00:17:34 2022
Date: Mon, 22 Aug 2022 00:17:34 +0000
Message-Id: <20220822001737.4120417-1-shakeelb@google.com>
Subject: [PATCH 0/3] memcg: optimize charge codepath
From: Shakeel Butt <shakeelb@google.com>
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song
Cc: Michal Koutný, Eric Dumazet, Soheil Hassas Yeganeh, Feng Tang,
    Oliver Sang, Andrew Morton, lkp@lists.01.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    Shakeel Butt

Recently the Linux networking stack moved from a very old per-socket
pre-charge cache to per-cpu caching, to avoid pre-charge fragmentation
and unwarranted OOMs. One impact of this change is that for network
traffic workloads the memcg charging codepath can become a bottleneck.
The kernel test robot has also reported this regression. This patch
series tries to improve memcg charging for such workloads.

The series implements three optimizations; a short illustrative sketch
of each follows the list:

(A) Reduce atomic ops in the page counter update path.
(B) Change the layout of struct page_counter to eliminate false
    sharing between usage and high.
(C) Increase the memcg charge batch to 64.
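A minimal sketch of the idea behind (A), modeled on
propagate_protected_usage() in mm/page_counter.c (the exact upstream
diff is in patch 1 and may differ): only fall back to the atomic
xchg/add when the protected usage actually changed, which spares two
atomic RMW ops per update in the common case where memory.min and
memory.low are unused.

static void propagate_protected_usage(struct page_counter *c,
				      unsigned long usage)
{
	unsigned long protected, old_protected;
	long delta;

	if (!c->parent)
		return;

	protected = min(usage, READ_ONCE(c->min));
	old_protected = atomic_long_read(&c->min_usage);
	/* A plain atomic read is much cheaper than an unconditional
	 * xchg; skip the RMW ops when nothing changed.
	 */
	if (protected != old_protected) {
		old_protected = atomic_long_xchg(&c->min_usage, protected);
		delta = protected - old_protected;
		if (delta)
			atomic_long_add(delta, &c->parent->children_min_usage);
	}

	/* ...the same pattern applies to the low/children_low_usage pair... */
}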
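For (B), a hedged sketch of one way to lay out struct page_counter
(field names follow include/linux/page_counter.h; the explicit padding
idiom is illustrative, not necessarily the upstream layout, and the
kernel has its own cacheline alignment helpers for this):

struct page_counter {
	/* 'usage' is dirtied by every charging CPU; keeping it on its
	 * own cache line stops the read-mostly fields below, which are
	 * checked on every charge, from being invalidated by charge
	 * traffic on other CPUs.
	 */
	atomic_long_t usage;
	char _pad[64 - sizeof(atomic_long_t)];	/* assume 64-byte cache lines */

	/* read-mostly fields checked in the charge path */
	unsigned long high;
	unsigned long max;
	unsigned long min;
	unsigned long low;

	struct page_counter *parent;
	/* ... remaining fields ... */
};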
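For (C), the change is essentially a one-liner in
include/linux/memcontrol.h (the value was 32U before this series): a
larger per-cpu charge stock means the charge path hits the shared
page_counter atomics roughly half as often.

/* Sketch of optimization (C): with 4K pages, a batch of 64 stocks at
 * most 256KiB per CPU before draining to the shared counters.
 */
#define MEMCG_CHARGE_BATCH 64U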
To evaluate the impact of these optimizations, we ran the following
workload on a 72-CPU machine in the root memcg, and then compared it
with a scenario where the same workload runs in a three-level cgroup
hierarchy, with memory.min and memory.low set up appropriately at the
top level:

  $ netserver -6
  # 36 instances of netperf with the following params
  $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K

Results (average throughput of netperf):

  1. root memcg           21694.8
  2. 6.0-rc1              10482.7 (-51.6%)
  3. 6.0-rc1 + (A)        14542.5 (-32.9%)
  4. 6.0-rc1 + (B)        12413.7 (-42.7%)
  5. 6.0-rc1 + (C)        17063.7 (-21.3%)
  6. 6.0-rc1 + (A+B+C)    20120.3 (-7.2%)

With all three optimizations, the memcg overhead of this workload has
been reduced from 51.6% to just 7.2%.

Shakeel Butt (3):
  mm: page_counter: remove unneeded atomic ops for low/min
  mm: page_counter: rearrange struct page_counter fields
  memcg: increase MEMCG_CHARGE_BATCH to 64

 include/linux/memcontrol.h   |  7 ++++---
 include/linux/page_counter.h | 34 +++++++++++++++++++++++-----------
 mm/page_counter.c            | 13 ++++++-------
 3 files changed, 33 insertions(+), 21 deletions(-)