From patchwork Mon Jun 24 17:42:18 2019
X-Patchwork-Submitter: Waiman Long <longman@redhat.com>
X-Patchwork-Id: 11013869
From: Waiman Long <longman@redhat.com>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
    Andrew Morton, Alexander Viro, Jonathan Corbet, Luis Chamberlain,
    Kees Cook, Johannes Weiner, Michal Hocko, Vladimir Davydov
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, Roman Gushchin, Shakeel Butt,
    Andrea Arcangeli, Waiman Long
Subject: [PATCH 1/2] mm, memcontrol: Add memcg_iterate_all()
Date: Mon, 24 Jun 2019 13:42:18 -0400
Message-Id: <20190624174219.25513-2-longman@redhat.com>
In-Reply-To: <20190624174219.25513-1-longman@redhat.com>
References: <20190624174219.25513-1-longman@redhat.com>

Add a memcg_iterate_all() function for iterating all the available
memory cgroups and calling the given callback function for each of
the memory cgroups.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/memcontrol.h |  3 +++
 mm/memcontrol.c            | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 1dcb763bb610..0e31418e5a47 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1268,6 +1268,9 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep);
 void memcg_kmem_put_cache(struct kmem_cache *cachep);
 
+extern void memcg_iterate_all(void (*callback)(struct mem_cgroup *memcg,
+					       void *arg), void *arg);
+
 #ifdef CONFIG_MEMCG_KMEM
 int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
 void __memcg_kmem_uncharge(struct page *page, int order);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ba9138a4a1de..c1c4706f7696 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -443,6 +443,19 @@ static int memcg_alloc_shrinker_maps(struct mem_cgroup *memcg)
 static void memcg_free_shrinker_maps(struct mem_cgroup *memcg) { }
 #endif /* CONFIG_MEMCG_KMEM */
 
+/*
+ * Iterate all the memory cgroups and call the given callback function
+ * for each of the memory cgroups.
+ */
+void memcg_iterate_all(void (*callback)(struct mem_cgroup *memcg, void *arg),
+		       void *arg)
+{
+	struct mem_cgroup *memcg;
+
+	for_each_mem_cgroup(memcg)
+		callback(memcg, arg);
+}
+
 /**
  * mem_cgroup_css_from_page - css of the memcg associated with a page
  * @page: page of interest

From patchwork Mon Jun 24 17:42:19 2019
X-Patchwork-Submitter: Waiman Long <longman@redhat.com>
X-Patchwork-Id: 11013873
From: Waiman Long <longman@redhat.com>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
    Andrew Morton, Alexander Viro, Jonathan Corbet, Luis Chamberlain,
    Kees Cook, Johannes Weiner, Michal Hocko, Vladimir Davydov
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, Roman Gushchin, Shakeel Butt,
    Andrea Arcangeli, Waiman Long
Subject: [PATCH 2/2] mm, slab: Extend vm/drop_caches to shrink kmem slabs
Date: Mon, 24 Jun 2019 13:42:19 -0400
Message-Id: <20190624174219.25513-3-longman@redhat.com>
In-Reply-To: <20190624174219.25513-1-longman@redhat.com>
References: <20190624174219.25513-1-longman@redhat.com>
With the slub memory allocator, the numbers of active slab objects
reported in /proc/slabinfo are not real because they include objects
that are held by the per-cpu slab structures whether they are actually
used or not. The problem gets worse the more CPUs a system has. For
instance, looking at the reported number of active task_struct objects,
one will wonder where all the missing tasks have gone.

I know it is hard and costly to get a real count of active objects. So
I am not advocating for that. Instead, this patch extends the
/proc/sys/vm/drop_caches sysctl parameter by using a new bit (bit 3) to
shrink all the kmem slabs, which will flush out all the slabs in the
per-cpu structures and give a more accurate view of how much memory is
really used up by the active slab objects. This is a costly operation,
of course, but it gives a way to have a clearer picture of the actual
number of slab objects used, if the need arises.

The upper range of the drop_caches sysctl parameter is increased to 15
to allow all possible combinations of the lowest 4 bits.

On a 2-socket 64-core 256-thread ARM64 system with 64k page size, after
a parallel kernel build, the amounts of memory occupied by slabs before
and after echoing to drop_caches were:

 # grep task_struct /proc/slabinfo
 task_struct    48376  48434   4288   61    4 : tunables  0  0  0 : slabdata    794    794      0
 # grep "^S[lRU]" /proc/meminfo
 Slab:            3419072 kB
 SReclaimable:     354688 kB
 SUnreclaim:      3064384 kB

 # echo 3 > /proc/sys/vm/drop_caches
 # grep "^S[lRU]" /proc/meminfo
 Slab:            3351680 kB
 SReclaimable:     316096 kB
 SUnreclaim:      3035584 kB

 # echo 8 > /proc/sys/vm/drop_caches
 # grep "^S[lRU]" /proc/meminfo
 Slab:            1008192 kB
 SReclaimable:     126912 kB
 SUnreclaim:       881280 kB
 # grep task_struct /proc/slabinfo
 task_struct     2601   6588   4288   61    4 : tunables  0  0  0 : slabdata    108    108      0

Shrinking the slabs saves more than 2GB of memory in this case.
This new feature certainly fulfills the promise of dropping caches.
Unlike counting objects in the per-node caches done by /proc/slabinfo,
which is rather lightweight, iterating all the per-cpu caches and
shrinking them is much more heavyweight. For this particular instance,
the time taken to shrink all the root caches was about 30.2ms. There
were 73 memory cgroups, and the longest time taken for shrinking the
largest one was about 16.4ms. The total shrinking time was about 101ms.

Because of the potentially long time needed to shrink all the caches,
the slab_mutex was taken multiple times - once for all the root caches
and once for each memory cgroup. This is to reduce the slab_mutex hold
time to minimize impact to other running applications that may need to
acquire the mutex.

The slab shrinking feature is only available when CONFIG_MEMCG_KMEM is
defined, as the code needs to access slab_root_caches to iterate all
the root caches.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/sysctl/vm.txt | 11 ++++++++--
 fs/drop_caches.c            |  4 ++++
 include/linux/slab.h        |  1 +
 kernel/sysctl.c             |  4 ++--
 mm/slab_common.c            | 44 +++++++++++++++++++++++++++++++++++++
 5 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 749322060f10..b643ac8968d2 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -207,8 +207,8 @@ Setting this to zero disables periodic writeback altogether.
 drop_caches
 
 Writing to this will cause the kernel to drop clean caches, as well as
-reclaimable slab objects like dentries and inodes. Once dropped, their
-memory becomes free.
+reclaimable slab objects like dentries and inodes. It can also be used
+to shrink the slabs. Once dropped, their memory becomes free.
 
 To free pagecache:
 	echo 1 > /proc/sys/vm/drop_caches
@@ -216,6 +216,8 @@ To free reclaimable slab objects (includes dentries and inodes):
 	echo 2 > /proc/sys/vm/drop_caches
 To free slab objects and pagecache:
 	echo 3 > /proc/sys/vm/drop_caches
+To shrink the slabs:
+	echo 8 > /proc/sys/vm/drop_caches
 
 This is a non-destructive operation and will not free any dirty objects.
 To increase the number of objects freed by this operation, the user may run
@@ -223,6 +225,11 @@ To increase the number of objects freed by this operation, the user may run
 number of dirty objects on the system and create more candidates to be
 dropped.
 
+Shrinking the slabs can reduce the memory footprint used by the slabs.
+It also makes the number of active objects reported in /proc/slabinfo
+more representative of the actual number of objects used for the slub
+memory allocator.
+
 This file is not a means to control the growth of the various kernel
 caches (inodes, dentries, pagecache, etc...)  These objects are
 automatically reclaimed by the kernel when memory is needed elsewhere
 on the system.
diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index d31b6c72b476..633b99e25dab 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 
 /* A global variable is a bit ugly, but it keeps the code simple */
@@ -65,6 +66,9 @@ int drop_caches_sysctl_handler(struct ctl_table *table, int write,
 			drop_slab();
 			count_vm_event(DROP_SLAB);
 		}
+		if (sysctl_drop_caches & 8) {
+			kmem_cache_shrink_all();
+		}
 		if (!stfu) {
 			pr_info("%s (%d): drop_caches: %d\n",
 				current->comm, task_pid_nr(current),
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 9449b19c5f10..f7c1626b2aa6 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -149,6 +149,7 @@ struct kmem_cache *kmem_cache_create_usercopy(const char *name,
 			void (*ctor)(void *));
 void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
+void kmem_cache_shrink_all(void);
 
 void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
 void memcg_deactivate_kmem_caches(struct mem_cgroup *);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 1beca96fb625..feeb867dabd7 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -129,7 +129,7 @@ static int __maybe_unused neg_one = -1;
 static int zero;
 static int __maybe_unused one = 1;
 static int __maybe_unused two = 2;
-static int __maybe_unused four = 4;
+static int __maybe_unused fifteen = 15;
 static unsigned long zero_ul;
 static unsigned long one_ul = 1;
 static unsigned long long_max = LONG_MAX;
@@ -1455,7 +1455,7 @@ static struct ctl_table vm_table[] = {
 		.mode		= 0644,
 		.proc_handler	= drop_caches_sysctl_handler,
 		.extra1		= &one,
-		.extra2		= &four,
+		.extra2		= &fifteen,
 	},
 #ifdef CONFIG_COMPACTION
 	{
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 58251ba63e4a..b3c5b64f9bfb 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -956,6 +956,50 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
+#ifdef CONFIG_MEMCG_KMEM
+static void kmem_cache_shrink_memcg(struct mem_cgroup *memcg,
+				    void __maybe_unused *arg)
+{
+	struct kmem_cache *s;
+
+	if (memcg == root_mem_cgroup)
+		return;
+	mutex_lock(&slab_mutex);
+	list_for_each_entry(s, &memcg->kmem_caches,
+			    memcg_params.kmem_caches_node) {
+		kmem_cache_shrink(s);
+	}
+	mutex_unlock(&slab_mutex);
+	cond_resched();
+}
+
+/*
+ * Shrink all the kmem caches.
+ *
+ * If there are a large number of memory cgroups outstanding, it may take
+ * a while to shrink all of them. So we may need to release the lock, call
+ * cond_resched() and reacquire the lock from time to time.
+ */
+void kmem_cache_shrink_all(void)
+{
+	struct kmem_cache *s;
+
+	/* Shrink all the root caches */
+	mutex_lock(&slab_mutex);
+	list_for_each_entry(s, &slab_root_caches, root_caches_node)
+		kmem_cache_shrink(s);
+	mutex_unlock(&slab_mutex);
+	cond_resched();
+
+	/*
+	 * Flush each of the memcg individually
+	 */
+	memcg_iterate_all(kmem_cache_shrink_memcg, NULL);
+}
+#else
+void kmem_cache_shrink_all(void) { }
+#endif
+
 bool slab_is_available(void)
 {
 	return slab_state >= UP;