From patchwork Tue Mar 10 05:16:06 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 11428497
Date: Mon, 9 Mar 2020 22:16:06 -0700
In-Reply-To: <20200310051606.33121-1-shakeelb@google.com>
Message-Id: <20200310051606.33121-2-shakeelb@google.com>
References: <20200310051606.33121-1-shakeelb@google.com>
X-Mailer: git-send-email 2.25.1.481.gfbce0eb801-goog
Subject: [PATCH v4 2/2] net: memcg: late association of sock to memcg
From: Shakeel Butt
To: Eric Dumazet, Roman Gushchin
Cc: Johannes Weiner, Michal Hocko, Greg Thelen, Andrew Morton,
 "David S . Miller", Alexey Kuznetsov, Hideaki YOSHIFUJI,
 netdev@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, Shakeel Butt

If a TCP socket is allocated in IRQ context, or cloned in IRQ context
from an unassociated socket (i.e. one not associated with a memcg), it
remains unassociated for its whole life. Almost half of the TCP sockets
created on the system are created in IRQ context, so the memory used by
such sockets is not accounted to any memcg.
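
For illustration only (not part of the patch): a minimal user-space
sketch of the association rule as it behaved before this change. The
types model_memcg, model_sock and model_ctx are hypothetical stand-ins
for the kernel's struct mem_cgroup, struct sock, in_interrupt() and
current; reference counting and the root-memcg test are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct model_memcg { const char *name; };
struct model_sock { struct model_memcg *sk_memcg; };

/* Caller's execution context, replacing in_interrupt() and current. */
struct model_ctx {
	bool in_irq;			/* true when running in IRQ context */
	struct model_memcg *task_memcg;	/* memcg of the (interrupted) task */
};

/* Pre-patch association rule, reduced to its three branches. */
static void model_sk_alloc(struct model_sock *sk, const struct model_ctx *ctx)
{
	if (sk->sk_memcg)	/* clone: keep the parent's memcg */
		return;
	if (ctx->in_irq)	/* never borrow the interrupted task's memcg */
		return;
	sk->sk_memcg = ctx->task_memcg;
}

/* Clone the way the old path did: copy the parent, then try to associate. */
static struct model_sock model_clone(const struct model_sock *parent,
				     const struct model_ctx *ctx)
{
	struct model_sock child;

	assert(parent && ctx);
	child = *parent;
	model_sk_alloc(&child, ctx);
	return child;
}
```

In this model, cloning an unassociated parent with in_irq set yields a
child whose sk_memcg is still NULL, and nothing revisits it afterwards.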

This issue is more widespread in cgroup v1, where network memory
accounting is opt-in, but it can also happen in cgroup v2 if the source
socket for the cloning was created in the root memcg.

To fix the issue, associate the socket with a memcg at accept() time,
in process context, and then force-charge the memory buffer already
used and reserved by the socket.

Signed-off-by: Shakeel Butt
Reviewed-by: Eric Dumazet
Reviewed-by: Roman Gushchin
---
Changes since v3:
- Moved the memcg association completely to accept time.

Changes since v2:
- Additional check for charging.
- Release the sock after charging.

Changes since v1:
- Added sk->sk_rmem_alloc to initial charging.
- Added synchronization to get memory usage and set sk_memcg race-free.

 mm/memcontrol.c                 | 14 --------------
 net/core/sock.c                 |  5 ++++-
 net/ipv4/inet_connection_sock.c | 20 ++++++++++++++++++++
 3 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 06a889b0538b..351603c6c1c9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6737,20 +6737,6 @@ void mem_cgroup_sk_alloc(struct sock *sk)
 	if (!mem_cgroup_sockets_enabled)
 		return;
 
-	/*
-	 * Socket cloning can throw us here with sk_memcg already
-	 * filled. It won't however, necessarily happen from
-	 * process context. So the test for root memcg given
-	 * the current task's memcg won't help us in this case.
-	 *
-	 * Respecting the original socket's memcg is a better
-	 * decision in this case.
-	 */
-	if (sk->sk_memcg) {
-		css_get(&sk->sk_memcg->css);
-		return;
-	}
-
 	/* Do not associate the sock with unrelated interrupted task's memcg. */
 	if (in_interrupt())
 		return;
diff --git a/net/core/sock.c b/net/core/sock.c
index e4af4dbc1c9e..0fc8937a7ff4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1832,7 +1832,10 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 	atomic_set(&newsk->sk_zckey, 0);
 
 	sock_reset_flag(newsk, SOCK_DONE);
-	mem_cgroup_sk_alloc(newsk);
+
+	/* sk->sk_memcg will be populated at accept() time */
+	newsk->sk_memcg = NULL;
+
 	cgroup_sk_alloc(&newsk->sk_cgrp_data);
 
 	rcu_read_lock();
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index a4db79b1b643..65a3b2565102 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -482,6 +482,26 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
 		}
 		spin_unlock_bh(&queue->fastopenq.lock);
 	}
+
+	if (mem_cgroup_sockets_enabled) {
+		int amt;
+
+		/* atomically get the memory usage, set and charge the
+		 * newsk->sk_memcg.
+		 */
+		lock_sock(newsk);
+
+		/* The socket has not been accepted yet, no need to look at
+		 * newsk->sk_wmem_queued.
+		 */
+		amt = sk_mem_pages(newsk->sk_forward_alloc +
+				   atomic_read(&newsk->sk_rmem_alloc));
+		mem_cgroup_sk_alloc(newsk);
+		if (newsk->sk_memcg && amt)
+			mem_cgroup_charge_skmem(newsk->sk_memcg, amt);
+
+		release_sock(newsk);
+	}
 out:
 	release_sock(sk);
 	if (req)