From patchwork Mon Aug 28 23:33:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13368407
Date: Mon, 28 Aug 2023 23:33:14 +0000
Message-ID: <20230828233319.340712-1-yosryahmed@google.com>
Subject: [PATCH v2 0/4] memcg: non-unified flushing for userspace stats
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
 Muchun Song, Ivan Babrou, Tejun Heo, Michal Koutný, Waiman Long,
 linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, Yosry Ahmed

Most memcg flushing contexts use "unified" flushing, where only one
flusher is allowed at a time (others skip), and all flushers need to
flush the entire tree.
This works well with high concurrency, which mostly comes from
in-kernel flushers (e.g. reclaim, refault, ...). For userspace reads,
unified flushing leads to non-deterministic stats staleness and reading
cost. This series clarifies and documents the differences between
unified and non-unified flushing (patches 1 & 2), then opts userspace
reads out of unified flushing (patch 4).

This patch series is a follow-up to the discussion in [1]. That patch
proposed that userspace reads wait for ongoing unified flushers to
complete before returning. There were concerns about the latency that
this introduces to userspace reads, especially with ongoing reports of
expensive stat reads even with unified flushing. Hence, this series
takes a different approach, opting userspace reads out of unified
flushing completely. The cost of userspace reads is now deterministic
and depends on the size of the subtree being read. This should fix both
the *sometimes* expensive reads (due to flushing the entire tree) and
the occasional staleness (due to skipping flushing).

I attempted to remove unified flushing completely, but noticed that
in-kernel flushers with high concurrency (e.g. hundreds of concurrent
reclaimers) still need it. This sort of concurrency is not expected
from userspace reads. More details about testing and some numbers are
in the last patch's changelog.

v1 -> v2:
- Added patch 3 to help unified stats with non-unified root flushes, as
  suggested by Michal Koutný.
- Updated the last patch's changelog after discussions with Michal
  Hocko, Shakeel Butt, and Waiman Long.

Yosry Ahmed (4):
  mm: memcg: properly name and document unified stats flushing
  mm: memcg: add a helper for non-unified stats flushing
  mm: memcg: let non-unified root stats flushes help unified flushes
  mm: memcg: use non-unified stats flushing for userspace reads

 include/linux/memcontrol.h |  8 ++--
 mm/memcontrol.c            | 83 ++++++++++++++++++++++++++------------
 mm/vmscan.c                |  2 +-
 mm/workingset.c            |  4 +-
 4 files changed, 65 insertions(+), 32 deletions(-)
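
For readers less familiar with the two flushing modes described in the
cover letter, the difference can be illustrated with a small userspace
toy model. This is only a sketch, not code from the series: struct node,
flush_subtree(), flush_unified(), flush_non_unified() and
unified_flush_ongoing are all hypothetical names, and the locking that a
real non-unified flusher would need is omitted because the model is
single-threaded.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical toy model of a memcg tree node; not a kernel structure. */
struct node {
	struct node *child;	/* first child, or NULL */
	struct node *sibling;	/* next sibling, or NULL */
	long pending;		/* updates not yet folded into stat */
	long stat;		/* value a reader would see */
};

/* Set while a unified flush is in progress (assumed name). */
static atomic_flag unified_flush_ongoing = ATOMIC_FLAG_INIT;

/*
 * Fold pending updates into stat for every node in the subtree rooted
 * at n. Returns the number of nodes visited, which stands in for the
 * cost of the flush.
 */
size_t flush_subtree(struct node *n)
{
	size_t visited = 1;

	for (struct node *c = n->child; c; c = c->sibling)
		visited += flush_subtree(c);

	n->stat += n->pending;
	n->pending = 0;
	return visited;
}

/*
 * Unified flushing: only one flusher at a time, always from the root.
 * A caller that finds a flush already in progress skips and returns 0,
 * so the stats it reads afterwards may be stale.
 */
size_t flush_unified(struct node *root)
{
	size_t visited;

	if (atomic_flag_test_and_set(&unified_flush_ongoing))
		return 0;

	visited = flush_subtree(root);
	atomic_flag_clear(&unified_flush_ongoing);
	return visited;
}

/*
 * Non-unified flushing: flush only the subtree being read and never
 * skip. The cost is bounded by the size of that subtree. Locking is
 * deliberately left out of this single-threaded sketch.
 */
size_t flush_non_unified(struct node *memcg)
{
	return flush_subtree(memcg);
}
```

In this model, a reader of a small subtree pays only for that subtree
and always sees its pending updates folded in, whereas a unified caller
either pays for the whole tree or skips entirely, which is the source
of the non-deterministic cost and staleness mentioned above.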