From patchwork Fri Aug 21 21:20:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roman Gushchin X-Patchwork-Id: 11730587 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8784F138C for ; Fri, 21 Aug 2020 21:21:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 46C95207DE for ; Fri, 21 Aug 2020 21:21:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=fb.com header.i=@fb.com header.b="quax3u38" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 46C95207DE Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=fb.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 523A58D0086; Fri, 21 Aug 2020 17:21:33 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 5037A8D0002; Fri, 21 Aug 2020 17:21:33 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 40EF68D0086; Fri, 21 Aug 2020 17:21:33 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0153.hostedemail.com [216.40.44.153]) by kanga.kvack.org (Postfix) with ESMTP id 2BE8C8D0002 for ; Fri, 21 Aug 2020 17:21:33 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id D3793181AEF10 for ; Fri, 21 Aug 2020 21:21:32 +0000 (UTC) X-FDA: 77175847224.16.help46_06150002703c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin16.hostedemail.com (Postfix) with ESMTP id 9CFA0100603E4 for ; Fri, 21 Aug 2020 21:21:32 +0000 (UTC) X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,prvs=55027216e2=guro@fb.com,,RULES_HIT:30054:30064:30070:30090,0,RBL:67.231.145.42:@fb.com:.lbl8.mailshell.net-64.10.201.10 62.18.0.100;04yrbfq4o3yuem6j89e9kjgnii8c8opotp4ndn75tx3ptg3adk1padmrhnerpab.6s8jm351twsc1td9gu3rncdm4z1xt6447tnj4dht8x4wsb6ttdg1ezqahwiucah.1-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: help46_06150002703c X-Filterd-Recvd-Size: 10050 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Fri, 21 Aug 2020 21:21:31 +0000 (UTC) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 07LLFANe003042 for ; Fri, 21 Aug 2020 14:21:30 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : mime-version : content-transfer-encoding : content-type; s=facebook; bh=D55i0VrjpMlrOb2OUAwT58YrxQDIA+sSmCrKN0Zevgc=; b=quax3u387BGmcNVtc7EwsMtu09ga+J/wX8TIo13DP2Rv+tz2guPwcePJ6p6HZAN2WTXc GtcLvqph7jaLzZZ40UEg8kPlfWcV6McDN/bB0ICqpu3hSSBuRpXowQbNFEtZoHXZmbTw ohXSj3xNPKALEi2R2F4bLCWcSBVTFt1v05k= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 332kkn0vsh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 21 Aug 2020 14:21:30 -0700 Received: from intmgw001.41.prn1.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::f) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1979.3; Fri, 21 Aug 2020 14:21:29 -0700 Received: by devvm1096.prn0.facebook.com (Postfix, from userid 111017) id C3781347CCB3; Fri, 21 Aug 2020 14:21:17 -0700 (PDT) Smtp-Origin-Hostprefix: devvm From: Roman Gushchin Smtp-Origin-Hostname: devvm1096.prn0.facebook.com To: Andrew Morton CC: Johannes Weiner , Michal Hocko , , , , Roman Gushchin , Shakeel Butt , Dan Schatzberg Smtp-Origin-Cluster: prn0c01 Subject: [PATCH] mm: rework remote memcg charging API to support nesting Date: Fri, 21 Aug 2020 14:20:56 -0700 Message-ID: <20200821212056.3769116-1-guro@fb.com> X-Mailer: git-send-email 2.24.1 MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235,18.0.687 definitions=2020-08-21_10:2020-08-21,2020-08-21 signatures=0 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 suspectscore=0 phishscore=0 adultscore=0 priorityscore=1501 mlxscore=0 spamscore=0 impostorscore=0 malwarescore=0 bulkscore=0 mlxlogscore=999 lowpriorityscore=0 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2008210197 X-FB-Internal: deliver X-Rspamd-Queue-Id: 9CFA0100603E4 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam02 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Currently the remote memcg charging API consists of two functions: memalloc_use_memcg() and memalloc_unuse_memcg(), which set and clear the memcg value, which overwrites the memcg of the current task. memalloc_use_memcg(target_memcg); <...> memalloc_unuse_memcg(); It works perfectly for allocations performed from a normal context, however an attempt to call it from an interrupt context or just nest two remote charging blocks will lead to an incorrect accounting. On exit from the inner block the active memcg will be cleared instead of being restored. memalloc_use_memcg(target_memcg); memalloc_use_memcg(target_memcg_2); <...> memalloc_unuse_memcg(); Error: allocation here are charged to the memcg of the current process instead of target_memcg. memalloc_unuse_memcg(); This patch extends the remote charging API by switching to a single function: struct mem_cgroup *set_active_memcg(struct mem_cgroup *memcg), which sets the new value and returns the old one. So a remote charging block will look like: old_memcg = set_active_memcg(target_memcg); <...> set_active_memcg(old_memcg); This patch is heavily based on the patch by Johannes Weiner, which can be found here: https://lkml.org/lkml/2020/5/28/806 . Signed-off-by: Roman Gushchin Cc: Johannes Weiner Cc: Shakeel Butt Cc: Dan Schatzberg Reviewed-by: Shakeel Butt --- fs/buffer.c | 6 +++--- fs/notify/fanotify/fanotify.c | 5 +++-- fs/notify/inotify/inotify_fsnotify.c | 5 +++-- include/linux/sched/mm.h | 30 ++++++++++------------------ mm/memcontrol.c | 6 +++--- 5 files changed, 22 insertions(+), 30 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index ac0310d24657..f1f2fb1b2432 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -842,13 +842,13 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size, struct buffer_head *bh, *head; gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT; long offset; - struct mem_cgroup *memcg; + struct mem_cgroup *memcg, *old_memcg; if (retry) gfp |= __GFP_NOFAIL; memcg = get_mem_cgroup_from_page(page); - memalloc_use_memcg(memcg); + old_memcg = set_active_memcg(memcg); head = NULL; offset = PAGE_SIZE; @@ -867,7 +867,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size, set_bh_page(bh, page, offset); } out: - memalloc_unuse_memcg(); + set_active_memcg(old_memcg); mem_cgroup_put(memcg); return head; /* diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c index c942910a8649..9167884a61ec 100644 --- a/fs/notify/fanotify/fanotify.c +++ b/fs/notify/fanotify/fanotify.c @@ -531,6 +531,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group, struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir); const struct path *path = fsnotify_data_path(data, data_type); unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS); + struct mem_cgroup *old_memcg; struct inode *child = NULL; bool name_event = false; @@ -580,7 +581,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group, gfp |= __GFP_RETRY_MAYFAIL; /* Whoever is interested in the event, pays for the allocation. */ - memalloc_use_memcg(group->memcg); + old_memcg = set_active_memcg(group->memcg); if (fanotify_is_perm_event(mask)) { event = fanotify_alloc_perm_event(path, gfp); @@ -608,7 +609,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group, event->pid = get_pid(task_tgid(current)); out: - memalloc_unuse_memcg(); + set_active_memcg(old_memcg); return event; } diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c index a65cf8c9f600..9ddcbadc98e2 100644 --- a/fs/notify/inotify/inotify_fsnotify.c +++ b/fs/notify/inotify/inotify_fsnotify.c @@ -66,6 +66,7 @@ static int inotify_one_event(struct fsnotify_group *group, u32 mask, int ret; int len = 0; int alloc_len = sizeof(struct inotify_event_info); + struct mem_cgroup *old_memcg; if ((inode_mark->mask & FS_EXCL_UNLINK) && path && d_unlinked(path->dentry)) @@ -87,9 +88,9 @@ static int inotify_one_event(struct fsnotify_group *group, u32 mask, * trigger OOM killer in the target monitoring memcg as it may have * security repercussion. */ - memalloc_use_memcg(group->memcg); + old_memcg = set_active_memcg(group->memcg); event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL); - memalloc_unuse_memcg(); + set_active_memcg(old_memcg); if (unlikely(!event)) { /* diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index f889e332912f..4c69a4349ac1 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -305,38 +305,28 @@ static inline void memalloc_nocma_restore(unsigned int flags) #ifdef CONFIG_MEMCG /** - * memalloc_use_memcg - Starts the remote memcg charging scope. + * set_active_memcg - Starts the remote memcg charging scope. * @memcg: memcg to charge. * * This function marks the beginning of the remote memcg charging scope. All the * __GFP_ACCOUNT allocations till the end of the scope will be charged to the * given memcg. * - * NOTE: This function is not nesting safe. + * NOTE: This function can nest. Users must save the return value and + * reset the previous value after their own charging scope is over. */ -static inline void memalloc_use_memcg(struct mem_cgroup *memcg) +static inline struct mem_cgroup * +set_active_memcg(struct mem_cgroup *memcg) { - WARN_ON_ONCE(current->active_memcg); + struct mem_cgroup *old = current->active_memcg; current->active_memcg = memcg; -} - -/** - * memalloc_unuse_memcg - Ends the remote memcg charging scope. - * - * This function marks the end of the remote memcg charging scope started by - * memalloc_use_memcg(). - */ -static inline void memalloc_unuse_memcg(void) -{ - current->active_memcg = NULL; + return old; } #else -static inline void memalloc_use_memcg(struct mem_cgroup *memcg) -{ -} - -static inline void memalloc_unuse_memcg(void) +static inline struct mem_cgroup * +set_active_memcg(struct mem_cgroup *memcg) { + return NULL; } #endif diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 130093bdf74b..ac5290a3f3f1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5271,12 +5271,12 @@ static struct cgroup_subsys_state * __ref mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css) { struct mem_cgroup *parent = mem_cgroup_from_css(parent_css); - struct mem_cgroup *memcg; + struct mem_cgroup *memcg, *old_memcg; long error = -ENOMEM; - memalloc_use_memcg(parent); + old_memcg = set_active_memcg(parent); memcg = mem_cgroup_alloc(); - memalloc_unuse_memcg(); + set_active_memcg(old_memcg); if (IS_ERR(memcg)) return ERR_CAST(memcg);