From patchwork Thu Aug 11 12:41:57 2022
X-Patchwork-Submitter: Abel Wu
X-Patchwork-Id: 12941422
From: Abel Wu
To: Andrew Morton, Vlastimil Babka, Michal Hocko, Mel Gorman, Muchun Song
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Abel Wu
Subject: [PATCH v2] mm/mempolicy: fix lock contention on mems_allowed
Date: Thu, 11 Aug 2022 20:41:57 +0800
Message-Id: <20220811124157.74888-1-wuyun.abel@bytedance.com>

The mems_allowed field can be modified by other tasks, so it isn't safe to
access it with alloc_lock unlocked even in the current process context.

Say there are two tasks: A from cpusetA is performing set_mempolicy(2),
and B is changing cpusetA's cpuset.mems:

  A (set_mempolicy)             B (echo xx > cpuset.mems)
  -------------------------------------------------------
  pol = mpol_new();
                                update_tasks_nodemask(cpusetA) {
                                  foreach t in cpusetA {
                                    cpuset_change_task_nodemask(t) {
  mpol_set_nodemask(pol) {
                                      task_lock(t); // t could be A
    new = f(A->mems_allowed);
                                      update t->mems_allowed;
    pol.create(pol, new);
                                      task_unlock(t);
  }
                                    }
                                  }
                                }
  task_lock(A);
  A->mempolicy = pol;
  task_unlock(A);

In this case A's pol->nodes is computed from the old mems_allowed, and
could be inconsistent with A's new mems_allowed.

The situation is different when replacing vmas' policy: pol->nodes can
go stale only when current_cpuset_is_being_rebound():

  A (mbind)                     B (echo xx > cpuset.mems)
  -------------------------------------------------------
  pol = mpol_new();
  mmap_write_lock(A->mm);
                                cpuset_being_rebound = cpusetA;
                                update_tasks_nodemask(cpusetA) {
                                  foreach t in cpusetA {
                                    cpuset_change_task_nodemask(t) {
  mpol_set_nodemask(pol) {
                                      task_lock(t); // t could be A
    mask = f(A->mems_allowed);
                                      update t->mems_allowed;
    pol.create(pol, mask);
                                      task_unlock(t);
  }
                                    }
  foreach v in A->mm {
    if (cpuset_being_rebound == cpusetA)
      pol.rebind(pol, cpuset.mems);
    v->vma_policy = pol;
  }
  mmap_write_unlock(A->mm);
                                    mmap_write_lock(t->mm);
                                    mpol_rebind_mm(t->mm);
                                    mmap_write_unlock(t->mm);
                                  }
                                }
                                cpuset_being_rebound = NULL;

In this case cpuset.mems, which has already been updated, is what is
finally used to calculate pol->nodes, rather than A->mems_allowed.
So it is OK to call mpol_set_nodemask() with alloc_lock unlocked when
doing mbind(2).

Fixes: 78b132e9bae9 ("mm/mempolicy: remove or narrow the lock on current")
Signed-off-by: Abel Wu
Reviewed-by: Muchun Song
Reviewed-by: Wei Yang
---
 mm/mempolicy.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d39b01fd52fe..61e4e6f5cfe8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -855,12 +855,14 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 		goto out;
 	}
 
+	task_lock(current);
 	ret = mpol_set_nodemask(new, nodes, scratch);
 	if (ret) {
+		task_unlock(current);
 		mpol_put(new);
 		goto out;
 	}
-	task_lock(current);
+
 	old = current->mempolicy;
 	current->mempolicy = new;
 	if (new && new->mode == MPOL_INTERLEAVE)
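
The locking order this patch establishes in do_set_mempolicy() can be
illustrated with a small user-space sketch. Everything below is a
hypothetical stand-in and not kernel code: struct toy_task,
toy_set_policy() and toy_change_mems_allowed() are invented names, a
pthread mutex plays the role of alloc_lock, and node masks are reduced to
plain 64-bit bitmasks with rebinding simplified to an intersection. The
point it tries to show is that the policy's nodes are derived from
mems_allowed and published inside one critical section, so a concurrent
mems_allowed update is ordered either entirely before or entirely after.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the relevant task_struct fields. */
struct toy_task {
	pthread_mutex_t alloc_lock;	/* plays the role of task_lock()/task_unlock() */
	uint64_t mems_allowed;		/* bitmask of nodes the task may use */
	uint64_t policy_nodes;		/* nodes of the installed policy, 0 if none */
};

#define TOY_TASK_INIT(mask) { PTHREAD_MUTEX_INITIALIZER, (mask), 0 }

/*
 * Stand-in for the set_mempolicy(2) path after the fix: the lock is taken
 * before mems_allowed is read and is held until the policy is installed.
 */
static int toy_set_policy(struct toy_task *t, uint64_t requested)
{
	uint64_t nodes;

	pthread_mutex_lock(&t->alloc_lock);
	nodes = requested & t->mems_allowed;
	if (nodes == 0) {
		/* Error path: drop the lock before bailing out. */
		pthread_mutex_unlock(&t->alloc_lock);
		return -1;
	}
	t->policy_nodes = nodes;
	/* The installed policy never names a node outside mems_allowed. */
	assert((t->policy_nodes & ~t->mems_allowed) == 0);
	pthread_mutex_unlock(&t->alloc_lock);
	return 0;
}

/*
 * Stand-in for the cpuset writer: updates mems_allowed and rebinds the
 * installed policy under the same lock (simplified to an intersection).
 */
static void toy_change_mems_allowed(struct toy_task *t, uint64_t new_mask)
{
	pthread_mutex_lock(&t->alloc_lock);
	t->mems_allowed = new_mask;
	t->policy_nodes &= new_mask;
	pthread_mutex_unlock(&t->alloc_lock);
}
```

With the pre-patch ordering (compute the nodes first, take the lock only
to install the pointer), a writer could run between the two steps and the
installed policy would miss the rebind, which is the first interleaving
described in the changelog.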