From patchwork Thu Aug 9 23:36:00 2018
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10562131
From: Yang Shi
To: mhocko@kernel.org, willy@infradead.org, ldufour@linux.vnet.ibm.com, kirill@shutemov.name, vbabka@suse.cz, akpm@linux-foundation.org, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org
Cc:
 yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC v7 PATCH 1/4] mm: refactor do_munmap() to extract the common part
Date: Fri, 10 Aug 2018 07:36:00 +0800
Message-Id: <1533857763-43527-2-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com>

Introduce three new helper functions:
  * addr_ok()
  * munmap_lookup_vma()
  * munlock_vmas()

They will be used by do_munmap() and by the new do_munmap() variant,
added in a later patch, that zaps large mappings early.

There is no functional change, just code refactoring.

Reviewed-by: Laurent Dufour
Signed-off-by: Yang Shi
Acked-by: Vlastimil Babka
---
 mm/mmap.c | 100 ++++++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 71 insertions(+), 29 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 17bbf4d..2a6898b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2681,35 +2681,40 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
 	return __split_vma(mm, vma, addr, new_below);
 }
 
-/* Munmap is split into 2 main parts -- this part which finds
- * what needs doing, and the areas themselves, which do the
- * work. This now handles partial unmappings.
- * Jeremy Fitzhardinge
- */
-int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
-	      struct list_head *uf)
+static inline bool addr_ok(unsigned long start, size_t len)
 {
-	unsigned long end;
-	struct vm_area_struct *vma, *prev, *last;
-
 	if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE-start)
-		return -EINVAL;
+		return false;
 
-	len = PAGE_ALIGN(len);
-	if (len == 0)
-		return -EINVAL;
+	if (PAGE_ALIGN(len) == 0)
+		return false;
+
+	return true;
+}
+
+/*
+ * munmap_lookup_vma: find the first overlap vma and split overlap vmas.
+ * @mm: mm_struct
+ * @start: start address
+ * @end: end address
+ *
+ * returns the pointer to vma, NULL or err ptr when spilt_vma returns error.
+ */
+static struct vm_area_struct *munmap_lookup_vma(struct mm_struct *mm,
+			unsigned long start, unsigned long end)
+{
+	struct vm_area_struct *vma, *prev, *last;
 
 	/* Find the first overlapping VMA */
 	vma = find_vma(mm, start);
 	if (!vma)
-		return 0;
-	prev = vma->vm_prev;
-	/* we have start < vma->vm_end */
+		return NULL;
+	/* we have start < vma->vm_end */
 
 	/* if it doesn't overlap, we have nothing.. */
-	end = start + len;
 	if (vma->vm_start >= end)
-		return 0;
+		return NULL;
+	prev = vma->vm_prev;
 
 	/*
 	 * If we need to split any vma, do it now to save pain later.
@@ -2727,11 +2732,11 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 		 * its limit temporarily, to help free resources as expected.
 		 */
 		if (end < vma->vm_end && mm->map_count >= sysctl_max_map_count)
-			return -ENOMEM;
+			return ERR_PTR(-ENOMEM);
 
 		error = __split_vma(mm, vma, start, 0);
 		if (error)
-			return error;
+			return ERR_PTR(error);
 		prev = vma;
 	}
 
@@ -2740,10 +2745,53 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 	if (last && end > last->vm_start) {
 		int error = __split_vma(mm, last, end, 1);
 		if (error)
-			return error;
+			return ERR_PTR(error);
 	}
 	vma = prev ? prev->vm_next : mm->mmap;
 
+	return vma;
+}
+
+static inline void munlock_vmas(struct vm_area_struct *vma,
+				unsigned long end)
+{
+	struct mm_struct *mm = vma->vm_mm;
+
+	while (vma && vma->vm_start < end) {
+		if (vma->vm_flags & VM_LOCKED) {
+			mm->locked_vm -= vma_pages(vma);
+			munlock_vma_pages_all(vma);
+		}
+		vma = vma->vm_next;
+	}
+}
+
+/* Munmap is split into 2 main parts -- this part which finds
+ * what needs doing, and the areas themselves, which do the
+ * work. This now handles partial unmappings.
+ * Jeremy Fitzhardinge
+ */
+int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
+	      struct list_head *uf)
+{
+	unsigned long end;
+	struct vm_area_struct *vma, *prev;
+
+	if (!addr_ok(start, len))
+		return -EINVAL;
+
+	len = PAGE_ALIGN(len);
+
+	end = start + len;
+
+	vma = munmap_lookup_vma(mm, start, end);
+	if (!vma)
+		return 0;
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
+	prev = vma->vm_prev;
+
 	if (unlikely(uf)) {
 		/*
 		 * If userfaultfd_unmap_prep returns an error the vmas
@@ -2764,13 +2812,7 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 	 */
 	if (mm->locked_vm) {
 		struct vm_area_struct *tmp = vma;
-		while (tmp && tmp->vm_start < end) {
-			if (tmp->vm_flags & VM_LOCKED) {
-				mm->locked_vm -= vma_pages(tmp);
-				munlock_vma_pages_all(tmp);
-			}
-			tmp = tmp->vm_next;
-		}
+		munlock_vmas(tmp, end);
 	}
 
 	/*

From patchwork Thu Aug 9 23:36:01 2018
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10562139
Received: by mail.wl.linuxfoundation.org
 (Postfix, from userid 486) id 288862BB3A; Thu, 9 Aug 2018 23:36:59 +0000 (UTC)
From: Yang Shi
To: mhocko@kernel.org, willy@infradead.org, ldufour@linux.vnet.ibm.com, kirill@shutemov.name, vbabka@suse.cz, akpm@linux-foundation.org, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC v7 PATCH 2/4] mm: mmap: zap pages with read mmap_sem in munmap
Date: Fri, 10 Aug 2018 07:36:01 +0800
Message-Id: <1533857763-43527-3-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com>

When running
some mmap/munmap scalability tests with large memory (i.e. > 300GB), the
hung task issue below may happen occasionally.

INFO: task ps:14018 blocked for more than 120 seconds.
       Tainted: G            E 4.9.79-009.ali3000.alios7.x86_64 #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 ps              D    0 14018      1 0x00000004
  ffff885582f84000 ffff885e8682f000 ffff880972943000 ffff885ebf499bc0
  ffff8828ee120000 ffffc900349bfca8 ffffffff817154d0 0000000000000040
  00ffffff812f872a ffff885ebf499bc0 024000d000948300 ffff880972943000
 Call Trace:
  [] ? __schedule+0x250/0x730
  [] schedule+0x36/0x80
  [] rwsem_down_read_failed+0xf0/0x150
  [] call_rwsem_down_read_failed+0x18/0x30
  [] down_read+0x20/0x40
  [] proc_pid_cmdline_read+0xd9/0x4e0
  [] ? do_filp_open+0xa5/0x100
  [] __vfs_read+0x37/0x150
  [] ? security_file_permission+0x9b/0xc0
  [] vfs_read+0x96/0x130
  [] SyS_read+0x55/0xc0
  [] entry_SYSCALL_64_fastpath+0x1a/0xc5

This is because munmap holds mmap_sem exclusively from the very beginning
all the way to the end, and doesn't release it in the middle. When
unmapping a large mapping, it may take a long time (~18 seconds to unmap
a 320GB mapping with every single page mapped on an idle machine).

Zapping pages is the most time consuming part. According to the
suggestion from Michal Hocko [1], zapping pages can be done while holding
read mmap_sem, like what MADV_DONTNEED does. Write mmap_sem is then
re-acquired to clean up the vmas.

But some parts may need write mmap_sem, for example, vma splitting. So,
the design is as follows:
        acquire write mmap_sem
        lookup vmas (find and split vmas)
        deal with special mappings
        detach vmas
        downgrade_write
        zap pages
        free page tables
        release mmap_sem

VM events that take read mmap_sem (page faults, gup, etc.) may come in
during page zapping, but since the vmas have already been detached they
will not be able to find a valid vma, and will just get SIGSEGV or
-EFAULT as expected.

If a vma has VM_HUGETLB | VM_PFNMAP or a uprobe, it is considered a
special mapping. For the sake of safety and bisectability, special
mappings will be handled in a separate patch by falling back to regular
do_munmap() with mmap_sem held exclusively. This is because it may not be
safe to update vm_flags with read mmap_sem held, although it sounds safe
in this specific case since the vmas have been detached.

With the "detach vmas first" approach we don't have to re-acquire
mmap_sem to clean up the vmas. This avoids a race window in which the
address space might change and cause a regression, since
downgrade_write() doesn't release the lock; it simply downgrades it to a
read lock.

And, since the lock acquire/release cost is kept to the minimum and is
almost the same as before, the optimization can be extended to mappings
of any size without incurring a significant penalty for small mappings.

For the time being, just do this in the munmap syscall path. Other
vm_munmap() or do_munmap() call sites (e.g. mmap, mremap, etc.) remain
intact due to some implementation difficulties: they acquire write
mmap_sem at the very beginning and hold it until the end, and do_munmap()
might be called in the middle. The optimized do_munmap, however, needs to
be called without mmap_sem held so that it can do the optimization. So,
if we want to do a similar optimization for the mmap/mremap path, I'm
afraid we would have to redesign them. mremap might be called on a very
large area depending on the use case; optimizing it will be considered in
the future.

With the patches, the exclusive mmap_sem hold time when munmapping an
80GB address space on a machine with 32 cores of E5-2680 @ 2.70GHz
dropped from seconds to the microsecond level.

  munmap_test-15002 [008]   594.380138: funcgraph_entry: |  vm_munmap_zap_rlock() {
  munmap_test-15002 [008]   594.380146: funcgraph_entry:      !2485684 us |    unmap_region();
  munmap_test-15002 [008]   596.865836: funcgraph_exit:       !2485692 us |  }

Here the execution time of unmap_region() is used to estimate the time
spent holding read mmap_sem; the remaining time (2485692 us - 2485684 us
= 8 us in this trace) is spent holding the lock exclusively.
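
For readers who want a rough user-space view of the cost discussed above,
here is a minimal sketch. It is not the munmap_test program from the
trace; now_us() and time_munmap_us() are hypothetical names made up for
illustration. It measures only the total wall-clock duration of a single
munmap() call on a fully touched anonymous mapping. It cannot tell how
that time splits between read and exclusive mmap_sem hold times; that
still needs function_graph tracing as in the output above.

```c
/* Hypothetical user-space timing sketch; not the author's munmap_test. */
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in microseconds. */
static int64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Map len bytes of anonymous memory, touch every page so that the
 * mapping is fully populated, then time the munmap() call alone.
 * Returns the elapsed microseconds, or -1 if mmap() or munmap() fails.
 */
int64_t time_munmap_us(size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int64_t t0, t1;
	size_t off;

	assert(page > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fault in every page; an untouched mapping has nothing to zap. */
	for (off = 0; off < len; off += (size_t)page)
		p[off] = 1;

	t0 = now_us();
	if (munmap(p, len) != 0)
		return -1;
	t1 = now_us();

	return t1 - t0;
}
```

Calling time_munmap_us((size_t)1 << 30) from a small main() and printing
the result gives the munmap() latency for a 1GB mapping. Larger sizes
need enough free memory, because every page is touched before unmapping.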

[1] https://lwn.net/Articles/753269/

Suggested-by: Michal Hocko
Suggested-by: Kirill A. Shutemov
Cc: Matthew Wilcox
Cc: Laurent Dufour
Cc: Andrew Morton
Signed-off-by: Yang Shi
---
 mm/mmap.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2a6898b..2234d5a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2766,6 +2766,73 @@ static inline void munlock_vmas(struct vm_area_struct *vma,
 	}
 }
 
+/*
+ * Zap pages with read mmap_sem held
+ *
+ * uf is the list for userfaultfd
+ */
+static int do_munmap_zap_rlock(struct mm_struct *mm, unsigned long start,
+			       size_t len, struct list_head *uf)
+{
+	unsigned long end;
+	struct vm_area_struct *start_vma, *prev, *vma;
+	int ret = 0;
+
+	if (!addr_ok(start, len))
+		return -EINVAL;
+
+	len = PAGE_ALIGN(len);
+
+	end = start + len;
+
+	/*
+	 * Need write mmap_sem to split vmas and detach vmas
+	 * splitting vma up-front to save PITA to clean if it is failed
+	 */
+	if (down_write_killable(&mm->mmap_sem))
+		return -EINTR;
+
+	start_vma = munmap_lookup_vma(mm, start, end);
+	if (!start_vma)
+		goto out;
+	if (IS_ERR(start_vma)) {
+		ret = PTR_ERR(start_vma);
+		goto out;
+	}
+
+	prev = start_vma->vm_prev;
+
+	if (unlikely(uf)) {
+		ret = userfaultfd_unmap_prep(start_vma, start, end, uf);
+		if (ret)
+			goto out;
+	}
+
+	/* Handle mlocked vmas */
+	if (mm->locked_vm) {
+		vma = start_vma;
+		munlock_vmas(vma, end);
+	}
+
+	/* Detach vmas from rbtree */
+	detach_vmas_to_be_unmapped(mm, start_vma, prev, end);
+
+	downgrade_write(&mm->mmap_sem);
+
+	/* Zap mappings with read mmap_sem */
+	unmap_region(mm, start_vma, prev, start, end);
+
+	arch_unmap(mm, start_vma, start, end);
+	remove_vma_list(mm, start_vma);
+	up_read(&mm->mmap_sem);
+
+	return 0;
+
+out:
+	up_write(&mm->mmap_sem);
+	return ret;
+}
+
 /* Munmap is split into 2 main parts -- this part which finds
  * what needs doing, and the areas themselves, which do the
  * work. This now handles partial unmappings.
@@ -2829,6 +2896,17 @@ int do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
 	return 0;
 }
 
+static int vm_munmap_zap_rlock(unsigned long start, size_t len)
+{
+	int ret;
+	struct mm_struct *mm = current->mm;
+	LIST_HEAD(uf);
+
+	ret = do_munmap_zap_rlock(mm, start, len, &uf);
+	userfaultfd_unmap_complete(mm, &uf);
+	return ret;
+}
+
 int vm_munmap(unsigned long start, size_t len)
 {
 	int ret;
@@ -2848,10 +2926,9 @@ int vm_munmap(unsigned long start, size_t len)
 SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len)
 {
 	profile_munmap(addr);
-	return vm_munmap(addr, len);
+	return vm_munmap_zap_rlock(addr, len);
 }
 
-
 /*
  * Emulation of deprecated remap_file_pages() syscall.
  */

From patchwork Thu Aug 9 23:36:02 2018
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10562135
From: Yang Shi
To: mhocko@kernel.org, willy@infradead.org, ldufour@linux.vnet.ibm.com, kirill@shutemov.name, vbabka@suse.cz, akpm@linux-foundation.org, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC v7 PATCH 3/4] uprobes: make vma_has_uprobes non-static
Date: Fri, 10 Aug 2018 07:36:02 +0800
Message-Id: <1533857763-43527-4-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com>

vma_has_uprobes() will be used in the following patch to check whether a
vma can be unmapped while holding read mmap_sem, but it is static. So,
make it non-static so that it can be used outside uprobes.

Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Signed-off-by: Yang Shi
---
 include/linux/uprobes.h | 7 +++++++
 kernel/events/uprobes.c | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index 0a294e9..caeb26b 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -149,6 +149,8 @@ struct uprobes_state {
 extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
 					 void *src, unsigned long len);
+extern bool vma_has_uprobes(struct vm_area_struct *vma, unsigned long start,
+			    unsigned long end);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
@@ -203,5 +205,10 @@ static inline void uprobe_copy_process(struct task_struct *t, unsigned long flag
 static inline void uprobe_clear_state(struct mm_struct *mm)
 {
 }
+static inline bool vma_has_uprobes(struct vm_area_struct *vma, unsigned long start,
+				   unsigned long end)
+{
+	return false;
+}
 #endif /* !CONFIG_UPROBES */
 #endif /* _LINUX_UPROBES_H */
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ccc579a..4880c46 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1095,7 +1095,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
 	return 0;
 }
 
-static bool
+bool
 vma_has_uprobes(struct vm_area_struct *vma, unsigned long start, unsigned long end)
 {
 	loff_t min, max;

From patchwork Thu Aug 9 23:36:03 2018
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 10562137
Received: from
mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 89006289DD for ; Thu, 9 Aug 2018 23:36:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7B5842BB3A; Thu, 9 Aug 2018 23:36:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE,UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D8AC4289DD for ; Thu, 9 Aug 2018 23:36:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EE3976B000D; Thu, 9 Aug 2018 19:36:54 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E91776B0010; Thu, 9 Aug 2018 19:36:54 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D82CC6B0266; Thu, 9 Aug 2018 19:36:54 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pl0-f71.google.com (mail-pl0-f71.google.com [209.85.160.71]) by kanga.kvack.org (Postfix) with ESMTP id 97A746B000D for ; Thu, 9 Aug 2018 19:36:54 -0400 (EDT) Received: by mail-pl0-f71.google.com with SMTP id j1-v6so4553675pld.23 for ; Thu, 09 Aug 2018 16:36:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=SieF3IXAtGzjYc0c9H6/bJPFf5KkOkiCZmqBBgLq4E8=; b=MynHFXDxYrhZmtD3ckvhbQ6DGam/9/GFEa9tfCYEA8d7YYpdOz+DLx32ESvcGtq7vS luUAlzyGlczsgDtWG9XfEaZ2ltwAzydQ3IitnfjND9u+Bq5mRjinPqB5orcajzSqk1YD hXqS3MurnnTF6DxV8nj0brLo5Ay3I+vrtsBMu6GgI/OlVLPPJTXcEbZthtBIMN3Rm6op 
zfdovRedU4pMnuowtB35v2iCyEQKkhoTfOfC1RBNgQ+W/7yufPFtYGh2yhnKFq3l4UfH x6zBWTy0OLPZb7xix1Q6CbZGVqZ0OrMZemtqovSGzrRPWnrvRMldWWNWsOBFfAAEH3Ix 8XiQ== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com X-Gm-Message-State: AOUpUlF2LLs8WZO9H+GR6V2dkVmxi7mQLZJaOkMx9uJpEwdPfCrhadPQ Zd5ni/1V8ghn5bv2mp8RxBSDFgckXUXRMXdQDUL5e+SDaV7RMbosxRFTFNMZbCrE0zqo7vca3ZT BB3fySwfSXnxiQa87xYzS6Wr695ohmQO6WB9lHNl1dvQMQtkbG+XHpBOVeMxndeY1/Q== X-Received: by 2002:a62:6746:: with SMTP id b67-v6mr4360832pfc.243.1533857814287; Thu, 09 Aug 2018 16:36:54 -0700 (PDT) X-Google-Smtp-Source: AA+uWPwO7VPYiZjkeIqNz3CNQpGYOdlzKLZ5/gBHNeahWgJXFZcvHwfrYLYcYrJIZNXyjYYOYJNN X-Received: by 2002:a62:6746:: with SMTP id b67-v6mr4360797pfc.243.1533857813195; Thu, 09 Aug 2018 16:36:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1533857813; cv=none; d=google.com; s=arc-20160816; b=hpn7BNIKiFQJZDoPta68bf32UEtxlzFOD+8kgPNgbfgwdRBPGCXl6fCKMjm+O8sslQ TpxnSwqUDQ24TAUxzMyeoFKbFGlG5VidGktCUhXSoGRz7vofmKQ5vLdn+3YipeOwbC1x 639BRzGFedQZtkRLHSlXVEbjSCsA5lnn4CFVvkh9yoWoIobPdLe7fUGOgeSkLLO2EHPg Z3TNBu5dcYDSwINI8lo4XrvsB1j2P5G/gtgfAFYa4EPMrtsmfZReA+e1tAMnNBalqqTz ZNjgZwZZ3fPhih70dZwL2VN5ThxsCabf3H6EdDUWJHPBfXcTIOE4kBk77k1Nv2AwCLjN cGjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from :arc-authentication-results; bh=SieF3IXAtGzjYc0c9H6/bJPFf5KkOkiCZmqBBgLq4E8=; b=OVBoGrSRDOfub4Godt5yZHLdRhbD7ktCjk9Pfx/sgGSB2e6dyN8IJEBpwFS7Zj+tn5 eJkkSaFcpA6WO3iI+yQkOUh+xHb0Gi4e1Y/JjyTCyTwBdBMGBz8mFiYGjvhPSKsvwTA/ YXqTDajUSdgN9GX8a2e66mm8evgg/8TWDmV3RO1vULwo70PjuAwoP5iEKxPJUcMOBeBC gF+2ZzHRX/aOr30F0e6GmSFenk9HHf+q+xmMG0VoalDiA+6UeMKTcFimJoaqP/TF9+KV sR/KU77fXkGkQ1bxcHuvSlyWxTsNrkexMX6Y8reQRhPGrjG28M5Kha4CJVdPlWXda86I RxzQ== 
ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out4436.biz.mail.alibaba.com (out4436.biz.mail.alibaba.com. [47.88.44.36]) by mx.google.com with ESMTPS id g27-v6si8502766pgm.208.2018.08.09.16.36.51 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Aug 2018 16:36:53 -0700 (PDT) Received-SPF: pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) client-ip=47.88.44.36; Authentication-Results: mx.google.com; spf=pass (google.com: domain of yang.shi@linux.alibaba.com designates 47.88.44.36 as permitted sender) smtp.mailfrom=yang.shi@linux.alibaba.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R121e4;CH=green;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01422;MF=yang.shi@linux.alibaba.com;NM=1;PH=DS;RN=15;SR=0;TI=SMTPD_---0T6MWc.W_1533857777; Received: from e19h19392.et15sqa.tbsite.net(mailfrom:yang.shi@linux.alibaba.com fp:SMTPD_---0T6MWc.W_1533857777) by smtp.aliyun-inc.com(127.0.0.1); Fri, 10 Aug 2018 07:36:27 +0800 From: Yang Shi To: mhocko@kernel.org, willy@infradead.org, ldufour@linux.vnet.ibm.com, kirill@shutemov.name, vbabka@suse.cz, akpm@linux-foundation.org, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [RFC v7 PATCH 4/4] mm: unmap special vmas with regular do_munmap() Date: Fri, 10 Aug 2018 07:36:03 +0800 Message-Id: <1533857763-43527-5-git-send-email-yang.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com> References: 
<1533857763-43527-1-git-send-email-yang.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP

Vmas that have VM_HUGETLB or VM_PFNMAP set, or that have uprobes, must
be unmapped with mmap_sem held for write, since unmapping them may
update vm_flags. It may therefore not be safe to handle these special
mappings with mmap_sem held only for read. Handle such mappings with
the regular do_munmap() call.

Michal suggested making this a separate patch so that the series is
safer and easier to bisect.

Cc: Michal Hocko
Signed-off-by: Yang Shi
---
 mm/mmap.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 2234d5a..06cb83c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2766,6 +2766,16 @@ static inline void munlock_vmas(struct vm_area_struct *vma,
 	}
 }
 
+static inline bool can_zap_with_rlock(struct vm_area_struct *vma)
+{
+	if ((vma->vm_file &&
+	     vma_has_uprobes(vma, vma->vm_start, vma->vm_end)) ||
+	    (vma->vm_flags & (VM_HUGETLB | VM_PFNMAP)))
+		return false;
+
+	return true;
+}
+
 /*
  * Zap pages with read mmap_sem held
  *
@@ -2808,6 +2818,17 @@ static int do_munmap_zap_rlock(struct mm_struct *mm, unsigned long start,
 		goto out;
 	}
 
+	/*
+	 * Vmas that have VM_HUGETLB or VM_PFNMAP set, or that have uprobes,
+	 * must be unmapped with mmap_sem held for write since unmapping
+	 * them may update vm_flags. Handle such mappings with the regular
+	 * do_munmap() call.
+	 */
+	for (vma = start_vma; vma && vma->vm_start < end; vma = vma->vm_next) {
+		if (!can_zap_with_rlock(vma))
+			goto regular_path;
+	}
+
 	/* Handle mlocked vmas */
 	if (mm->locked_vm) {
 		vma = start_vma;
@@ -2828,6 +2849,9 @@ static int do_munmap_zap_rlock(struct mm_struct *mm, unsigned long start,
 
 	return 0;
 
+regular_path:
+	ret = do_munmap(mm, start, len, uf);
+
 out:
 	up_write(&mm->mmap_sem);
 	return ret;
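
For illustration only, here is a minimal userspace sketch of the check that can_zap_with_rlock() is described as performing, assuming the intent is to reject any vma that has either special flag set or that is file-backed with uprobes. The DEMO_* flag values, struct demo_vma and demo_can_zap_with_rlock() are hypothetical stand-ins and not kernel definitions. The mask is tested with bitwise AND, because OR-ing vm_flags with a non-zero mask always yields a non-zero (true) value.

```c
#include <stdbool.h>

/* Illustrative stand-in flag bits; these are NOT the kernel's values. */
#define DEMO_VM_PFNMAP  0x1UL
#define DEMO_VM_HUGETLB 0x2UL
#define DEMO_VM_READ    0x4UL

/* Hypothetical, reduced model of a vma: only what the check looks at. */
struct demo_vma {
	unsigned long vm_flags;
	bool has_file;     /* models vma->vm_file != NULL */
	bool has_uprobes;  /* models vma_has_uprobes() returning true */
};

/*
 * Return true when the mapping could be zapped with the lock held for
 * read, i.e. it is neither a special mapping nor a file mapping with
 * uprobes.
 */
static bool demo_can_zap_with_rlock(const struct demo_vma *vma)
{
	/* File-backed mapping with uprobes: needs the write lock. */
	if (vma->has_file && vma->has_uprobes)
		return false;

	/* AND selects only the special bits; other bits are ignored. */
	if (vma->vm_flags & (DEMO_VM_HUGETLB | DEMO_VM_PFNMAP))
		return false;

	return true;
}
```

With this model, a vma whose flags are only DEMO_VM_READ passes the check, while one that also carries DEMO_VM_PFNMAP or DEMO_VM_HUGETLB is sent to the regular path.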