From patchwork Sat Mar 5 04:28:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 12770239 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7EB99C433EF for ; Sat, 5 Mar 2022 04:29:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0EF008D0006; Fri, 4 Mar 2022 23:29:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 09E1C8D0001; Fri, 4 Mar 2022 23:29:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ED0428D0006; Fri, 4 Mar 2022 23:28:59 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (relay.hostedemail.com [64.99.140.27]) by kanga.kvack.org (Postfix) with ESMTP id DF74C8D0001 for ; Fri, 4 Mar 2022 23:28:59 -0500 (EST) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay11.hostedemail.com (Postfix) with ESMTP id C05CF805CE for ; Sat, 5 Mar 2022 04:28:59 +0000 (UTC) X-FDA: 79209052398.13.7FC103E Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf23.hostedemail.com (Postfix) with ESMTP id 26A2D140002 for ; Sat, 5 Mar 2022 04:28:59 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1FEA0B82C01; Sat, 5 Mar 2022 04:28:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BFBB9C004E1; Sat, 5 Mar 2022 04:28:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1646454535; bh=9nAY9ICZcjkx5xMNMZYev2N5Y6/PT5o6FlYrHhEaFhE=; h=Date:To:From:In-Reply-To:Subject:From; b=MC6frjbH3JPxTPKvW+FkTcmZUKPOzY2vx9MSwh1O/EA8w6nBvH0mfPCRJF2bHWmjn m3Sl/ryudMAJT/jmqZbZBEx2pLzQE2zRWp/l5RbdlP+6+tZ8pojGItz0IHXSOzpWdl xAWXePdWQd3b/dAUpzUUtt7ur0ilB3dPwd0uRDw0= Date: Fri, 04 Mar 2022 20:28:55 -0800 To: willy@infradead.org,vbabka@suse.cz,sumit.semwal@linaro.org,sashal@kernel.org,pcc@google.com,mhocko@suse.com,legion@kernel.org,kirill.shutemov@linux.intel.com,keescook@chromium.org,hannes@cmpxchg.org,gorcunov@gmail.com,ebiederm@xmission.com,david@redhat.com,dave@stgolabs.net,dave.hansen@intel.com,chris.hyser@oracle.com,ccross@google.com,caoxiaofeng@yulong.com,brauner@kernel.org,surenb@google.com,akpm@linux-foundation.org,patches@lists.linux.dev,linux-mm@kvack.org,mm-commits@vger.kernel.org,torvalds@linux-foundation.org,akpm@linux-foundation.org From: Andrew Morton In-Reply-To: <20220304202822.d47f8084928321c83070d7d7@linux-foundation.org> Subject: [patch 3/8] mm: prevent vm_area_struct::anon_name refcount saturation Message-Id: <20220305042855.BFBB9C004E1@smtp.kernel.org> X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 26A2D140002 X-Stat-Signature: yh79f9rz6zqh3dapuud9bejiemrfxhf7 Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=MC6frjbH; spf=pass (imf23.hostedemail.com: domain of akpm@linux-foundation.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org; dmarc=none X-HE-Tag: 1646454539-704151 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Suren Baghdasaryan Subject: mm: prevent vm_area_struct::anon_name refcount saturation A deep process chain with many vmas could grow really high. With default sysctl_max_map_count (64k) and default pid_max (32k) the max number of vmas in the system is 2147450880 and the refcounter has headroom of 1073774592 before it reaches REFCOUNT_SATURATED (3221225472). Therefore it's unlikely that an anonymous name refcounter will overflow with these defaults. Currently the max for pid_max is PID_MAX_LIMIT (4194304) and for sysctl_max_map_count it's INT_MAX (2147483647). In this configuration anon_vma_name refcount overflow becomes theoretically possible (that still require heavy sharing of that anon_vma_name between processes). kref refcounting interface used in anon_vma_name structure will detect a counter overflow when it reaches REFCOUNT_SATURATED value but will only generate a warning and freeze the ref counter. This would lead to the refcounted object never being freed. A determined attacker could leak memory like that but it would be rather expensive and inefficient way to do so. To ensure anon_vma_name refcount does not overflow, stop anon_vma_name sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still leaves INT_MAX/2 (1073741823) values before the counter reaches REFCOUNT_SATURATED. This should provide enough headroom for raising the refcounts temporarily. Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com Signed-off-by: Suren Baghdasaryan Suggested-by: Michal Hocko Acked-by: Michal Hocko Cc: Alexey Gladkov Cc: Chris Hyser Cc: Christian Brauner Cc: Colin Cross Cc: Cyrill Gorcunov Cc: Dave Hansen Cc: David Hildenbrand Cc: Davidlohr Bueso Cc: "Eric W. Biederman" Cc: Johannes Weiner Cc: Kees Cook Cc: "Kirill A. Shutemov" Cc: Matthew Wilcox Cc: Peter Collingbourne Cc: Sasha Levin Cc: Sumit Semwal Cc: Vlastimil Babka Cc: Xiaofeng Cao Signed-off-by: Andrew Morton --- include/linux/mm_inline.h | 18 ++++++++++++++---- mm/madvise.c | 3 +-- 2 files changed, 15 insertions(+), 6 deletions(-) --- a/include/linux/mm_inline.h~mm-prevent-vm_area_struct-anon_name-refcount-saturation +++ a/include/linux/mm_inline.h @@ -161,15 +161,25 @@ static inline void anon_vma_name_put(str kref_put(&anon_name->kref, anon_vma_name_free); } +static inline +struct anon_vma_name *anon_vma_name_reuse(struct anon_vma_name *anon_name) +{ + /* Prevent anon_name refcount saturation early on */ + if (kref_read(&anon_name->kref) < REFCOUNT_MAX) { + anon_vma_name_get(anon_name); + return anon_name; + + } + return anon_vma_name_alloc(anon_name->name); +} + static inline void dup_anon_vma_name(struct vm_area_struct *orig_vma, struct vm_area_struct *new_vma) { struct anon_vma_name *anon_name = anon_vma_name(orig_vma); - if (anon_name) { - anon_vma_name_get(anon_name); - new_vma->anon_name = anon_name; - } + if (anon_name) + new_vma->anon_name = anon_vma_name_reuse(anon_name); } static inline void free_anon_vma_name(struct vm_area_struct *vma) --- a/mm/madvise.c~mm-prevent-vm_area_struct-anon_name-refcount-saturation +++ a/mm/madvise.c @@ -113,8 +113,7 @@ static int replace_anon_vma_name(struct if (anon_vma_name_eq(orig_name, anon_name)) return 0; - anon_vma_name_get(anon_name); - vma->anon_name = anon_name; + vma->anon_name = anon_vma_name_reuse(anon_name); anon_vma_name_put(orig_name); return 0;