From patchwork Thu Apr 4 02:00:22 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884735
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 01/25] mm: migrate: Change migrate_mode to support combination migration modes.
Date: Wed, 3 Apr 2019 19:00:22 -0700
Message-Id: <20190404020046.32741-2-zi.yan@sent.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

No functional change. This prepares for the following patches, which add
parallel and concurrent page migration modes in conjunction with the
existing ones.

Signed-off-by: Zi Yan
---
 fs/aio.c                     | 10 +++++-----
 fs/f2fs/data.c               |  4 ++--
 fs/hugetlbfs/inode.c         |  2 +-
 fs/iomap.c                   |  2 +-
 fs/ubifs/file.c              |  2 +-
 include/linux/migrate_mode.h |  2 ++
 mm/balloon_compaction.c      |  2 +-
 mm/compaction.c              | 22 +++++++++++-----------
 mm/migrate.c                 | 18 +++++++++---------
 mm/zsmalloc.c                |  2 +-
 10 files changed, 34 insertions(+), 32 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 38b741a..0a88dfd 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -389,7 +389,7 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 	 * happen under the ctx->completion_lock. That does not work with the
 	 * migration workflow of MIGRATE_SYNC_NO_COPY.
 	 */
-	if (mode == MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) == MIGRATE_SYNC_NO_COPY)
 		return -EINVAL;
 
 	rc = 0;
@@ -1300,10 +1300,10 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr,
  * Create an aio_context capable of receiving at least nr_events.
  * ctxp must not point to an aio_context that already exists, and
  * must be initialized to 0 prior to the call.  On successful
- * creation of the aio_context, *ctxp is filled in with the resulting
+ * creation of the aio_context, *ctxp is filled in with the resulting
  * handle.  May fail with -EINVAL if *ctxp is not initialized,
- * if the specified nr_events exceeds internal limits.  May fail
- * with -EAGAIN if the specified nr_events exceeds the user's limit
+ * if the specified nr_events exceeds internal limits.  May fail
+ * with -EAGAIN if the specified nr_events exceeds the user's limit
  * of available events.  May fail with -ENOMEM if insufficient kernel
  * resources are available.  May fail with -EFAULT if an invalid
  * pointer is passed for ctxp.  Will fail with -ENOSYS if not
@@ -1373,7 +1373,7 @@ COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p)
 #endif
 
 /* sys_io_destroy:
- *	Destroy the aio_context specified.  May cancel any outstanding
+ *	Destroy the aio_context specified.  May cancel any outstanding
 *	AIOs and block on completion.  Will fail with -ENOSYS if not
 *	implemented.  May fail with -EINVAL if the context pointed to
 *	is invalid.
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 97279441..e7f0e3a 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2792,7 +2792,7 @@ int f2fs_migrate_page(struct address_space *mapping,
 
 	/* migrating an atomic written page is safe with the inmem_lock hold */
 	if (atomic_written) {
-		if (mode != MIGRATE_SYNC)
+		if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC)
 			return -EBUSY;
 		if (!mutex_trylock(&fi->inmem_lock))
 			return -EAGAIN;
@@ -2825,7 +2825,7 @@ int f2fs_migrate_page(struct address_space *mapping,
 		f2fs_clear_page_private(page);
 	}
 
-	if (mode != MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
 		migrate_page_copy(newpage, page);
 	else
 		migrate_page_states(newpage, page);
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index ec32fec..04ba8bb 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -885,7 +885,7 @@ static int hugetlbfs_migrate_page(struct address_space *mapping,
 		set_page_private(page, 0);
 	}
 
-	if (mode != MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
 		migrate_page_copy(newpage, page);
 	else
 		migrate_page_states(newpage, page);
diff --git a/fs/iomap.c b/fs/iomap.c
index abdd18e..8ee3f9f 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -584,7 +584,7 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage,
 		SetPagePrivate(newpage);
 	}
 
-	if (mode != MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
 		migrate_page_copy(newpage, page);
 	else
 		migrate_page_states(newpage, page);
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 5d2ffb1..2bb8788 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1490,7 +1490,7 @@ static int ubifs_migrate_page(struct address_space *mapping,
 		SetPagePrivate(newpage);
 	}
 
-	if (mode != MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
 		migrate_page_copy(newpage, page);
 	else
 		migrate_page_states(newpage, page);
diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index 883c992..59d75fc 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -17,6 +17,8 @@ enum migrate_mode {
 	MIGRATE_SYNC_LIGHT,
 	MIGRATE_SYNC,
 	MIGRATE_SYNC_NO_COPY,
+
+	MIGRATE_MODE_MASK = 3,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index ef858d5..5acb55f 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -158,7 +158,7 @@ int balloon_page_migrate(struct address_space *mapping,
 	 * is unlikely to be use with ballon pages. See include/linux/hmm.h for
 	 * user of the MIGRATE_SYNC_NO_COPY mode.
 	 */
-	if (mode == MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) == MIGRATE_SYNC_NO_COPY)
 		return -EINVAL;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
diff --git a/mm/compaction.c b/mm/compaction.c
index f171a83..bfcbe08 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -408,7 +408,7 @@ static void update_cached_migrate(struct compact_control *cc, unsigned long pfn)
 	if (pfn > zone->compact_cached_migrate_pfn[0])
 		zone->compact_cached_migrate_pfn[0] = pfn;
-	if (cc->mode != MIGRATE_ASYNC &&
+	if ((cc->mode & MIGRATE_MODE_MASK) != MIGRATE_ASYNC &&
 	    pfn > zone->compact_cached_migrate_pfn[1])
 		zone->compact_cached_migrate_pfn[1] = pfn;
 }
@@ -475,7 +475,7 @@ static bool compact_lock_irqsave(spinlock_t *lock, unsigned long *flags,
 						struct compact_control *cc)
 {
 	/* Track if the lock is contended in async mode */
-	if (cc->mode == MIGRATE_ASYNC && !cc->contended) {
+	if (((cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC) && !cc->contended) {
 		if (spin_trylock_irqsave(lock, *flags))
 			return true;
 
@@ -792,7 +792,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	 */
 	while (unlikely(too_many_isolated(pgdat))) {
 		/* async migration should just abort */
-		if (cc->mode == MIGRATE_ASYNC)
+		if ((cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)
 			return 0;
 
 		congestion_wait(BLK_RW_ASYNC, HZ/10);
@@ -803,7 +803,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 	cond_resched();
 
-	if (cc->direct_compaction && (cc->mode == MIGRATE_ASYNC)) {
+	if (cc->direct_compaction && ((cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)) {
 		skip_on_failure = true;
 		next_skip_pfn = block_end_pfn(low_pfn, cc->order);
 	}
@@ -1117,7 +1117,7 @@ static bool suitable_migration_source(struct compact_control *cc,
 	if (pageblock_skip_persistent(page))
 		return false;
 
-	if ((cc->mode != MIGRATE_ASYNC) || !cc->direct_compaction)
+	if (((cc->mode & MIGRATE_MODE_MASK) != MIGRATE_ASYNC) || !cc->direct_compaction)
 		return true;
 
 	block_mt = get_pageblock_migratetype(page);
@@ -1216,7 +1216,7 @@ fast_isolate_around(struct compact_control *cc, unsigned long pfn, unsigned long
 		return;
 
 	/* Minimise scanning during async compaction */
-	if (cc->direct_compaction && cc->mode == MIGRATE_ASYNC)
+	if (cc->direct_compaction && (cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)
 		return;
 
 	/* Pageblock boundaries */
@@ -1448,7 +1448,7 @@ static void isolate_freepages(struct compact_control *cc)
 	block_end_pfn = min(block_start_pfn + pageblock_nr_pages,
 						zone_end_pfn(zone));
 	low_pfn = pageblock_end_pfn(cc->migrate_pfn);
-	stride = cc->mode == MIGRATE_ASYNC ? COMPACT_CLUSTER_MAX : 1;
+	stride = (cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC ? COMPACT_CLUSTER_MAX : 1;
 
 	/*
 	 * Isolate free pages until enough are available to migrate the
@@ -1734,7 +1734,7 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 	struct page *page;
 	const isolate_mode_t isolate_mode =
 		(sysctl_compact_unevictable_allowed ? ISOLATE_UNEVICTABLE : 0) |
-		(cc->mode != MIGRATE_SYNC ? ISOLATE_ASYNC_MIGRATE : 0);
+		(((cc->mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC) ? ISOLATE_ASYNC_MIGRATE : 0);
 	bool fast_find_block;
 
 	/*
@@ -1907,7 +1907,7 @@ static enum compact_result __compact_finished(struct compact_control *cc)
 		 * to sync compaction, as async compaction operates
 		 * on pageblocks of the same migratetype.
 		 */
-		if (cc->mode == MIGRATE_ASYNC ||
+		if ((cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC ||
 		    IS_ALIGNED(cc->migrate_pfn, pageblock_nr_pages)) {
 			return COMPACT_SUCCESS;
 
@@ -2063,7 +2063,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 	unsigned long start_pfn = cc->zone->zone_start_pfn;
 	unsigned long end_pfn = zone_end_pfn(cc->zone);
 	unsigned long last_migrated_pfn;
-	const bool sync = cc->mode != MIGRATE_ASYNC;
+	const bool sync = (cc->mode & MIGRATE_MODE_MASK) != MIGRATE_ASYNC;
 	bool update_cached;
 
 	cc->migratetype = gfpflags_to_migratetype(cc->gfp_mask);
@@ -2195,7 +2195,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc)
 			 * order-aligned block, so skip the rest of it.
 			 */
 			if (cc->direct_compaction &&
-			    (cc->mode == MIGRATE_ASYNC)) {
+			    ((cc->mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)) {
 				cc->migrate_pfn = block_end_pfn(
 						cc->migrate_pfn - 1, cc->order);
 				/* Draining pcplists is useless in this case */
diff --git a/mm/migrate.c b/mm/migrate.c
index ac6f493..c161c03 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -691,7 +691,7 @@ int migrate_page(struct address_space *mapping,
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 
-	if (mode != MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
 		migrate_page_copy(newpage, page);
 	else
 		migrate_page_states(newpage, page);
@@ -707,7 +707,7 @@ static bool buffer_migrate_lock_buffers(struct buffer_head *head,
 	struct buffer_head *bh = head;
 
 	/* Simple case, sync compaction */
-	if (mode != MIGRATE_ASYNC) {
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_ASYNC) {
 		do {
 			lock_buffer(bh);
 			bh = bh->b_this_page;
@@ -804,7 +804,7 @@ static int __buffer_migrate_page(struct address_space *mapping,
 
 	SetPagePrivate(newpage);
 
-	if (mode != MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
 		migrate_page_copy(newpage, page);
 	else
 		migrate_page_states(newpage, page);
@@ -895,7 +895,7 @@ static int fallback_migrate_page(struct address_space *mapping,
 {
 	if (PageDirty(page)) {
 		/* Only writeback pages in full synchronous migration */
-		switch (mode) {
+		switch (mode & MIGRATE_MODE_MASK) {
 		case MIGRATE_SYNC:
 		case MIGRATE_SYNC_NO_COPY:
 			break;
@@ -911,7 +911,7 @@ static int fallback_migrate_page(struct address_space *mapping,
 	 */
 	if (page_has_private(page) &&
 	    !try_to_release_page(page, GFP_KERNEL))
-		return mode == MIGRATE_SYNC ? -EAGAIN : -EBUSY;
+		return (mode & MIGRATE_MODE_MASK) == MIGRATE_SYNC ? -EAGAIN : -EBUSY;
 
 	return migrate_page(mapping, newpage, page, mode);
 }
@@ -1009,7 +1009,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	bool is_lru = !__PageMovable(page);
 
 	if (!trylock_page(page)) {
-		if (!force || mode == MIGRATE_ASYNC)
+		if (!force || ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC))
 			goto out;
 
 		/*
@@ -1038,7 +1038,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		 * the retry loop is too short and in the sync-light case,
 		 * the overhead of stalling is too much
 		 */
-		switch (mode) {
+		switch (mode & MIGRATE_MODE_MASK) {
 		case MIGRATE_SYNC:
 		case MIGRATE_SYNC_NO_COPY:
 			break;
@@ -1303,9 +1303,9 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 		return -ENOMEM;
 
 	if (!trylock_page(hpage)) {
-		if (!force)
+		if (!force || ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC))
 			goto out;
-		switch (mode) {
+		switch (mode & MIGRATE_MODE_MASK) {
 		case MIGRATE_SYNC:
 		case MIGRATE_SYNC_NO_COPY:
 			break;
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 0787d33..018bb51 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1981,7 +1981,7 @@ static int zs_page_migrate(struct address_space *mapping, struct page *newpage,
 	 * happen under the zs lock, which does not work with
 	 * MIGRATE_SYNC_NO_COPY workflow.
 	 */
-	if (mode == MIGRATE_SYNC_NO_COPY)
+	if ((mode & MIGRATE_MODE_MASK) == MIGRATE_SYNC_NO_COPY)
 		return -EINVAL;
 
 	VM_BUG_ON_PAGE(!PageMovable(page), page);
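The convention this patch introduces is that the low bits of a
migrate_mode value hold one of the four base modes, while bits above
MIGRATE_MODE_MASK are reserved for flags. A minimal user-space sketch,
not part of the patch, where MIGRATE_SOME_FLAG is hypothetical and
stands in for the flag bits that later patches in the series define:

#include <stdio.h>

enum migrate_mode {
	MIGRATE_ASYNC,			/* base modes occupy values 0..3 */
	MIGRATE_SYNC_LIGHT,
	MIGRATE_SYNC,
	MIGRATE_SYNC_NO_COPY,
	MIGRATE_MODE_MASK = 3,		/* low two bits select the base mode */
	MIGRATE_SOME_FLAG = 1 << 4,	/* hypothetical flag bit above the mask */
};

int main(void)
{
	enum migrate_mode mode = MIGRATE_SYNC | MIGRATE_SOME_FLAG;

	/* base-mode comparisons must mask off the flag bits ... */
	printf("base is MIGRATE_SYNC: %d\n",
	       (mode & MIGRATE_MODE_MASK) == MIGRATE_SYNC);
	/* ... while flag bits are tested directly */
	printf("flag set: %d\n", !!(mode & MIGRATE_SOME_FLAG));
	return 0;
}

Because the four base modes fit in the low two bits, masking with
MIGRATE_MODE_MASK (3) recovers the base mode no matter which flag bits
are set, which is why every comparison in the diff above gains the mask.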
From patchwork Thu Apr 4 02:00:23 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884737

From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 02/25] mm: migrate: Add mode parameter to support future page copy routines.
Date: Wed, 3 Apr 2019 19:00:23 -0700
Message-Id: <20190404020046.32741-3-zi.yan@sent.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

MIGRATE_SINGLETHREAD is added as the default behavior.
migrate_page_copy() and copy_huge_page() are changed to take a
migrate_mode parameter.

Signed-off-by: Zi Yan
---
 fs/aio.c                     |  2 +-
 fs/f2fs/data.c               |  2 +-
 fs/hugetlbfs/inode.c         |  2 +-
 fs/iomap.c                   |  2 +-
 fs/ubifs/file.c              |  2 +-
 include/linux/migrate.h      |  6 ++++--
 include/linux/migrate_mode.h |  3 +++
 mm/migrate.c                 | 14 ++++++++------
 8 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 0a88dfd..986d21e 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -437,7 +437,7 @@ static int aio_migratepage(struct address_space *mapping, struct page *new,
 	 * events from being lost.
 	 */
 	spin_lock_irqsave(&ctx->completion_lock, flags);
-	migrate_page_copy(new, old);
+	migrate_page_copy(new, old, MIGRATE_SINGLETHREAD);
 	BUG_ON(ctx->ring_pages[idx] != old);
 	ctx->ring_pages[idx] = new;
 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index e7f0e3a..6a419a9 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2826,7 +2826,7 @@ int f2fs_migrate_page(struct address_space *mapping,
 	}
 
 	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		migrate_page_copy(newpage, page, MIGRATE_SINGLETHREAD);
 	else
 		migrate_page_states(newpage, page);
 
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 04ba8bb..03dfa49 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -886,7 +886,7 @@ static int hugetlbfs_migrate_page(struct address_space *mapping,
 	}
 
 	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		migrate_page_copy(newpage, page, MIGRATE_SINGLETHREAD);
 	else
 		migrate_page_states(newpage, page);
 
diff --git a/fs/iomap.c b/fs/iomap.c
index 8ee3f9f..a6e0456 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -585,7 +585,7 @@ iomap_migrate_page(struct address_space *mapping, struct page *newpage,
 	}
 
 	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		migrate_page_copy(newpage, page, MIGRATE_SINGLETHREAD);
 	else
 		migrate_page_states(newpage, page);
 	return MIGRATEPAGE_SUCCESS;
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 2bb8788..3a3dbbd 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1491,7 +1491,7 @@ static int ubifs_migrate_page(struct address_space *mapping,
 	}
 
 	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		migrate_page_copy(newpage, page, MIGRATE_SINGLETHREAD);
 	else
 		migrate_page_states(newpage, page);
 	return MIGRATEPAGE_SUCCESS;
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index e13d9bf..5218a07 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -73,7 +73,8 @@ extern void putback_movable_page(struct page *page);
 extern int migrate_prep(void);
 extern int migrate_prep_local(void);
 extern void migrate_page_states(struct page *newpage, struct page *page);
-extern void migrate_page_copy(struct page *newpage, struct page *page);
+extern void migrate_page_copy(struct page *newpage, struct page *page,
+			      enum migrate_mode mode);
 extern int migrate_huge_page_move_mapping(struct address_space *mapping,
 				  struct page *newpage, struct page *page);
 extern int migrate_page_move_mapping(struct address_space *mapping,
@@ -97,7 +98,8 @@ static inline void migrate_page_states(struct page *newpage, struct page *page)
 }
 
 static inline void migrate_page_copy(struct page *newpage,
-				     struct page *page) {}
+				     struct page *page,
+				     enum migrate_mode mode) {}
 
 static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
 				  struct page *newpage, struct page *page)
diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index 59d75fc..da44940 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -11,6 +11,8 @@
  *	with the CPU. Instead, page copy happens outside the migratepage()
  *	callback and is likely using a DMA engine. See migrate_vma() and HMM
  *	(mm/hmm.c) for users of this mode.
+ * MIGRATE_SINGLETHREAD uses a single thread to move pages, it is the default
+ *	behavior
  */
 enum migrate_mode {
 	MIGRATE_ASYNC,
@@ -19,6 +21,7 @@ enum migrate_mode {
 	MIGRATE_SYNC_NO_COPY,
 
 	MIGRATE_MODE_MASK = 3,
+	MIGRATE_SINGLETHREAD = 0,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/mm/migrate.c b/mm/migrate.c
index c161c03..2b2653e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -567,7 +567,8 @@ static void __copy_gigantic_page(struct page *dst, struct page *src,
 	}
 }
 
-static void copy_huge_page(struct page *dst, struct page *src)
+static void copy_huge_page(struct page *dst, struct page *src,
+			   enum migrate_mode mode)
 {
 	int i;
 	int nr_pages;
@@ -657,10 +658,11 @@ void migrate_page_states(struct page *newpage, struct page *page)
 }
 EXPORT_SYMBOL(migrate_page_states);
 
-void migrate_page_copy(struct page *newpage, struct page *page)
+void migrate_page_copy(struct page *newpage, struct page *page,
+		       enum migrate_mode mode)
 {
 	if (PageHuge(page) || PageTransHuge(page))
-		copy_huge_page(newpage, page);
+		copy_huge_page(newpage, page, mode);
 	else
 		copy_highpage(newpage, page);
 
@@ -692,7 +694,7 @@ int migrate_page(struct address_space *mapping,
 		return rc;
 
 	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		migrate_page_copy(newpage, page, mode);
 	else
 		migrate_page_states(newpage, page);
 	return MIGRATEPAGE_SUCCESS;
@@ -805,7 +807,7 @@ static int __buffer_migrate_page(struct address_space *mapping,
 	SetPagePrivate(newpage);
 
 	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
-		migrate_page_copy(newpage, page);
+		migrate_page_copy(newpage, page, MIGRATE_SINGLETHREAD);
 	else
 		migrate_page_states(newpage, page);
 
@@ -2024,7 +2026,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	new_page->index = page->index;
 	/* flush the cache before copying using the kernel virtual address */
 	flush_cache_range(vma, start, start + HPAGE_PMD_SIZE);
-	migrate_page_copy(new_page, page);
+	migrate_page_copy(new_page, page, MIGRATE_SINGLETHREAD);
 	WARN_ON(PageLRU(new_page));
 
 	/* Recheck the target PMD */
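After this change, a filesystem's migratepage callback chooses between
copying data now and copying states only, and the copy helper carries
the mode. A kernel-style sketch, not taken from the patch: the function
name example_migrate_page() is hypothetical and the page-cache mapping
move is elided. Since MIGRATE_SINGLETHREAD is 0, callers that pass it
keep the existing single-threaded copy behavior:

static int example_migrate_page(struct address_space *mapping,
		struct page *newpage, struct page *page,
		enum migrate_mode mode)
{
	/* ... move the page-cache mapping to newpage here (elided) ... */

	if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC_NO_COPY)
		/* copy the data now, in the existing single-threaded way */
		migrate_page_copy(newpage, page, MIGRATE_SINGLETHREAD);
	else
		/* data is copied later, e.g. by a DMA engine (HMM-style) */
		migrate_page_states(newpage, page);
	return MIGRATEPAGE_SUCCESS;
}

Defining MIGRATE_SINGLETHREAD as 0 also means OR-ing it into any mode
value is a no-op, so existing callers need no behavioral audit.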
From patchwork Thu Apr 4 02:00:24 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884739

From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 03/25] mm: migrate: Add a multi-threaded page migration function.
Date: Wed, 3 Apr 2019 19:00:24 -0700
Message-Id: <20190404020046.32741-4-zi.yan@sent.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

A copy_page_multithread() function is added to migrate huge pages in a
multi-threaded way, which provides higher throughput than a
single-threaded copy. Internally, copy_page_multithread() splits a huge
page into chunks, distributes the chunks across multiple CPUs, and
sends them as jobs to system_highpri_wq.

Signed-off-by: Zi Yan
---
 include/linux/highmem.h |   2 +
 mm/Makefile             |   2 +
 mm/copy_page.c          | 128 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 132 insertions(+)
 create mode 100644 mm/copy_page.c

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index ea5cdbd8c..0f50dc5 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -276,4 +276,6 @@ static inline void copy_highpage(struct page *to, struct page *from)
 
 #endif
 
+int copy_page_multithread(struct page *to, struct page *from, int nr_pages);
+
 #endif /* _LINUX_HIGHMEM_H */
diff --git a/mm/Makefile b/mm/Makefile
index d210cc9..fa02a9f 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -44,6 +44,8 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 obj-y += init-mm.o
 obj-y += memblock.o
 
+obj-y += copy_page.o
+
 ifdef CONFIG_MMU
 	obj-$(CONFIG_ADVISE_SYSCALLS)	+= madvise.o
 endif
diff --git a/mm/copy_page.c b/mm/copy_page.c
new file mode 100644
index 0000000..9cf849c
--- /dev/null
+++ b/mm/copy_page.c
@@ -0,0 +1,128 @@
+/*
+ * Enhanced page copy routine.
+ *
+ * Copyright 2019 by NVIDIA.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Authors: Zi Yan
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/highmem.h>
+#include <linux/workqueue.h>
+#include <linux/slab.h>
+
+
+const unsigned int limit_mt_num = 4;
+
+/* ======================== multi-threaded copy page ======================== */
+
+struct copy_item {
+	char *to;
+	char *from;
+	unsigned long chunk_size;
+};
+
+struct copy_page_info {
+	struct work_struct copy_page_work;
+	unsigned long num_items;
+	struct copy_item item_list[0];
+};
+
+static void copy_page_routine(char *vto, char *vfrom,
+	unsigned long chunk_size)
+{
+	memcpy(vto, vfrom, chunk_size);
+}
+
+static void copy_page_work_queue_thread(struct work_struct *work)
+{
+	struct copy_page_info *my_work = (struct copy_page_info *)work;
+	int i;
+
+	for (i = 0; i < my_work->num_items; ++i)
+		copy_page_routine(my_work->item_list[i].to,
+				  my_work->item_list[i].from,
+				  my_work->item_list[i].chunk_size);
+}
+
+int copy_page_multithread(struct page *to, struct page *from, int nr_pages)
+{
+	unsigned int total_mt_num = limit_mt_num;
+	int to_node = page_to_nid(to);
+	int i;
+	struct copy_page_info *work_items[NR_CPUS] = {0};
+	char *vto, *vfrom;
+	unsigned long chunk_size;
+	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
+	int cpu_id_list[NR_CPUS] = {0};
+	int cpu;
+	int err = 0;
+
+	total_mt_num = min_t(unsigned int, total_mt_num,
+			     cpumask_weight(per_node_cpumask));
+	if (total_mt_num > 1)
+		total_mt_num = (total_mt_num / 2) * 2;
+
+	if (total_mt_num > num_online_cpus() || total_mt_num <= 1)
+		return -ENODEV;
+
+	for (cpu = 0; cpu < total_mt_num; ++cpu) {
+		work_items[cpu] = kzalloc(sizeof(struct copy_page_info)
+					  + sizeof(struct copy_item), GFP_KERNEL);
+		if (!work_items[cpu]) {
+			err = -ENOMEM;
+			goto free_work_items;
+		}
+	}
+
+	i = 0;
+	for_each_cpu(cpu, per_node_cpumask) {
+		if (i >= total_mt_num)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	vfrom = kmap(from);
+	vto = kmap(to);
+	chunk_size = PAGE_SIZE*nr_pages / total_mt_num;
+
+	for (i = 0; i < total_mt_num; ++i) {
+		INIT_WORK((struct work_struct *)work_items[i],
+			  copy_page_work_queue_thread);
+
+		work_items[i]->num_items = 1;
+		work_items[i]->item_list[0].to = vto + i * chunk_size;
+		work_items[i]->item_list[0].from = vfrom + i * chunk_size;
+		work_items[i]->item_list[0].chunk_size = chunk_size;
+
+		queue_work_on(cpu_id_list[i],
+			      system_highpri_wq,
+			      (struct work_struct *)work_items[i]);
+	}
+
+	/* Wait until it finishes */
+	for (i = 0; i < total_mt_num; ++i)
+		flush_work((struct work_struct *)work_items[i]);
+
+	kunmap(to);
+	kunmap(from);
+
+free_work_items:
+	for (cpu = 0; cpu < total_mt_num; ++cpu)
+		if (work_items[cpu])
+			kfree(work_items[cpu]);
+
+	return err;
+}
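The chunking arithmetic is easiest to see with concrete numbers. A
minimal user-space sketch, assuming a 2MB transparent huge page (512
base pages of 4KB) and the default limit_mt_num of 4 workers, and
assuming the copy size divides evenly among the workers, as it does for
power-of-two huge pages and an even worker count:

#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	unsigned long nr_pages = 512;	/* one 2MB THP */
	unsigned int total_mt_num = 4;	/* limit_mt_num workers */
	unsigned long chunk_size = PAGE_SIZE * nr_pages / total_mt_num;
	unsigned int i;

	/* mirror of the per-worker item setup in copy_page_multithread() */
	for (i = 0; i < total_mt_num; ++i)
		printf("worker %u copies bytes [%lu, %lu)\n",
		       i, i * chunk_size, (i + 1) * chunk_size);
	return 0;
}

Each worker memcpy()s a contiguous 512KB chunk on a CPU of the
destination node, so the copy uses the memory bandwidth of several
cores instead of one, which is where the throughput gain comes from.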
From patchwork Thu Apr 4 02:00:25 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884741

From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 04/25] mm: migrate: Add copy_page_multithread into migrate_pages.
Date: Wed, 3 Apr 2019 19:00:25 -0700 Message-Id: <20190404020046.32741-5-zi.yan@sent.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com> References: <20190404020046.32741-1-zi.yan@sent.com> Reply-To: ziy@nvidia.com MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Zi Yan An option is added to move_pages() syscall to use multi-threaded page migration. Signed-off-by: Zi Yan --- include/linux/migrate_mode.h | 1 + include/uapi/linux/mempolicy.h | 2 ++ mm/migrate.c | 29 +++++++++++++++++++---------- 3 files changed, 22 insertions(+), 10 deletions(-) diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h index da44940..5bc8a77 100644 --- a/include/linux/migrate_mode.h +++ b/include/linux/migrate_mode.h @@ -22,6 +22,7 @@ enum migrate_mode { MIGRATE_MODE_MASK = 3, MIGRATE_SINGLETHREAD = 0, + MIGRATE_MT = 1<<4, }; #endif /* MIGRATE_MODE_H_INCLUDED */ diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h index 3354774..890269b 100644 --- a/include/uapi/linux/mempolicy.h +++ b/include/uapi/linux/mempolicy.h @@ -48,6 +48,8 @@ enum { #define MPOL_MF_LAZY (1<<3) /* Modifies '_MOVE: lazy migrate on fault */ #define MPOL_MF_INTERNAL (1<<4) /* Internal flags start here */ +#define MPOL_MF_MOVE_MT (1<<6) /* Use multi-threaded page copy routine */ + #define MPOL_MF_VALID (MPOL_MF_STRICT | \ MPOL_MF_MOVE | \ MPOL_MF_MOVE_ALL) diff --git a/mm/migrate.c b/mm/migrate.c index 2b2653e..dd6ccbe 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -572,6 +572,7 @@ static void copy_huge_page(struct page *dst, struct page *src, { int i; int nr_pages; + int rc = -EFAULT; if (PageHuge(src)) { /* hugetlbfs page */ @@ -588,10 +589,14 @@ static void copy_huge_page(struct page *dst, struct page *src, nr_pages = hpage_nr_pages(src); } - for (i = 0; i < nr_pages; i++) { - cond_resched(); - copy_highpage(dst + i, src + i); - } + if (mode & MIGRATE_MT) + rc = copy_page_multithread(dst, src, nr_pages); + + if (rc) + for (i = 0; i < nr_pages; i++) { + cond_resched(); + copy_highpage(dst + i, src + i); + } } /* @@ -1500,7 +1505,7 @@ static int store_status(int __user *status, int start, int value, int nr) } static int do_move_pages_to_node(struct mm_struct *mm, - struct list_head *pagelist, int node) + struct list_head *pagelist, int node, bool migrate_mt) { int err; @@ -1508,7 +1513,8 @@ static int do_move_pages_to_node(struct mm_struct *mm, return 0; err = migrate_pages(pagelist, alloc_new_node_page, NULL, node, - MIGRATE_SYNC, MR_SYSCALL); + MIGRATE_SYNC | (migrate_mt ? 
From patchwork Thu Apr 4 02:00:26 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884743
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 05/25] mm: migrate: Add vm.accel_page_copy sysctl to control page copy acceleration.
Date: Wed, 3 Apr 2019 19:00:26 -0700
Message-Id: <20190404020046.32741-6-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Since base page migration did not gain any speedup from multi-threaded
methods, we only accelerate the huge page case.

Signed-off-by: Zi Yan
---
 kernel/sysctl.c | 11 +++++++++++
 mm/migrate.c    |  6 ++++++
 2 files changed, 17 insertions(+)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index e5da394..3d8490e 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -101,6 +101,8 @@
 
 #if defined(CONFIG_SYSCTL)
 
+extern int accel_page_copy;
+
 /* External variables not in a header file. */
 extern int suid_dumpable;
 #ifdef CONFIG_COREDUMP
@@ -1430,6 +1432,15 @@ static struct ctl_table vm_table[] = {
                .extra2         = &one,
        },
 #endif
+       {
+               .procname       = "accel_page_copy",
+               .data           = &accel_page_copy,
+               .maxlen         = sizeof(accel_page_copy),
+               .mode           = 0644,
+               .proc_handler   = proc_dointvec,
+               .extra1         = &zero,
+               .extra2         = &one,
+       },
        {
                .procname       = "hugetlb_shm_group",
                .data           = &sysctl_hugetlb_shm_group,
diff --git a/mm/migrate.c b/mm/migrate.c
index dd6ccbe..8a344e2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -55,6 +55,8 @@
 
 #include "internal.h"
 
+int accel_page_copy = 1;
+
 /*
  * migrate_prep() needs to be called before we start compiling a list of pages
  * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
@@ -589,6 +591,10 @@ static void copy_huge_page(struct page *dst, struct page *src,
                nr_pages = hpage_nr_pages(src);
        }
 
+       /* Try to accelerate page migration if it is not specified in mode */
+       if (accel_page_copy)
+               mode |= MIGRATE_MT;
+
        if (mode & MIGRATE_MT)
                rc = copy_page_multithread(dst, src, nr_pages);
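Because the entry above is wired into vm_table, the knob surfaces as
/proc/sys/vm/accel_page_copy at run time. A minimal sketch of switching the
acceleration off, assuming a kernel with this series applied:

#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/sys/vm/accel_page_copy", "w");

        if (!f) {
                perror("accel_page_copy");
                return 1;
        }
        fputs("0", f);  /* 0: fall back to the single-threaded copy loop */
        return fclose(f) ? 1 : 0;
}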
From patchwork Thu Apr 4 02:00:27 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884745
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 06/25] mm: migrate: Make the number of copy threads adjustable via sysctl.
Date: Wed, 3 Apr 2019 19:00:27 -0700
Message-Id: <20190404020046.32741-7-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Signed-off-by: Zi Yan
---
 kernel/sysctl.c | 9 +++++++++
 mm/copy_page.c  | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 3d8490e..0eae0b8 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -102,6 +102,7 @@
 #if defined(CONFIG_SYSCTL)
 
 extern int accel_page_copy;
+extern unsigned int limit_mt_num;
 
 /* External variables not in a header file. */
 extern int suid_dumpable;
@@ -1441,6 +1442,14 @@ static struct ctl_table vm_table[] = {
                .extra1         = &zero,
                .extra2         = &one,
        },
+       {
+               .procname       = "limit_mt_num",
+               .data           = &limit_mt_num,
+               .maxlen         = sizeof(limit_mt_num),
+               .mode           = 0644,
+               .proc_handler   = proc_dointvec,
+               .extra1         = &zero,
+       },
        {
                .procname       = "hugetlb_shm_group",
                .data           = &sysctl_hugetlb_shm_group,
diff --git a/mm/copy_page.c b/mm/copy_page.c
index 9cf849c..6665e3d 100644
--- a/mm/copy_page.c
+++ b/mm/copy_page.c
@@ -23,7 +23,7 @@
 #include 
 
-const unsigned int limit_mt_num = 4;
+unsigned int limit_mt_num = 4;
 
 /* ======================== multi-threaded copy page ======================== */
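To make the knob concrete: with the default limit_mt_num of 4, a 2MB x86-64
THP would be split into four 512KB chunks, one per copy thread. The program
below is a standalone illustration of that split, not the kernel's actual
code (copy_page_multithread() is added earlier in this series and is not
shown here):

#include <stdio.h>

#define HPAGE_SIZE (2UL * 1024 * 1024)  /* x86-64 THP size */

int main(void)
{
        unsigned int limit_mt_num = 4;  /* default set by this patch */
        unsigned long chunk = HPAGE_SIZE / limit_mt_num;
        unsigned int i;

        for (i = 0; i < limit_mt_num; i++)
                printf("thread %u copies bytes [%lu, %lu)\n",
                       i, i * chunk, (i + 1) * chunk);
        return 0;
}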
From patchwork Thu Apr 4 02:00:28 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884747
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 07/25] mm: migrate: Add copy_page_dma to use DMA Engine to copy pages.
Date: Wed, 3 Apr 2019 19:00:28 -0700
Message-Id: <20190404020046.32741-8-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

vm.use_all_dma_chans will grab all usable DMA channels.
vm.limit_dma_chans will limit how many DMA channels are in use.

Signed-off-by: Zi Yan
---
 include/linux/highmem.h      |   1 +
 include/linux/sched/sysctl.h |   3 +
 kernel/sysctl.c              |  19 +++
 mm/copy_page.c               | 291 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 314 insertions(+)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 0f50dc5..119bb39 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -277,5 +277,6 @@ static inline void copy_highpage(struct page *to, struct page *from)
 #endif
 
 int copy_page_multithread(struct page *to, struct page *from, int nr_pages);
+int copy_page_dma(struct page *to, struct page *from, int nr_pages);
 
 #endif /* _LINUX_HIGHMEM_H */
diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 99ce6d7..ce11241 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -90,4 +90,7 @@ extern int sched_energy_aware_handler(struct ctl_table *table, int write,
                                 loff_t *ppos);
 #endif
 
+extern int sysctl_dma_page_migration(struct ctl_table *table, int write,
+                                void __user *buffer, size_t *lenp,
+                                loff_t *ppos);
 #endif /* _LINUX_SCHED_SYSCTL_H */
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 0eae0b8..b8712eb 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -103,6 +103,8 @@
 
 extern int accel_page_copy;
 extern unsigned int limit_mt_num;
+extern int use_all_dma_chans;
+extern int limit_dma_chans;
 
 /* External variables not in a header file. */
 extern int suid_dumpable;
@@ -1451,6 +1453,23 @@ static struct ctl_table vm_table[] = {
                .extra1         = &zero,
        },
        {
+               .procname       = "use_all_dma_chans",
+               .data           = &use_all_dma_chans,
+               .maxlen         = sizeof(use_all_dma_chans),
+               .mode           = 0644,
+               .proc_handler   = sysctl_dma_page_migration,
+               .extra1         = &zero,
+               .extra2         = &one,
+       },
+       {
+               .procname       = "limit_dma_chans",
+               .data           = &limit_dma_chans,
+               .maxlen         = sizeof(limit_dma_chans),
+               .mode           = 0644,
+               .proc_handler   = proc_dointvec,
+               .extra1         = &zero,
+       },
+       {
                .procname       = "hugetlb_shm_group",
                .data           = &sysctl_hugetlb_shm_group,
                .maxlen         = sizeof(gid_t),
diff --git a/mm/copy_page.c b/mm/copy_page.c
index 6665e3d..5e7a797 100644
--- a/mm/copy_page.c
+++ b/mm/copy_page.c
@@ -126,3 +126,294 @@ int copy_page_multithread(struct page *to, struct page *from, int nr_pages)
 
        return err;
 }
+
+/* ======================== DMA copy page ======================== */
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
+
+#define NUM_AVAIL_DMA_CHAN 16
+
+int use_all_dma_chans = 0;
+int limit_dma_chans = NUM_AVAIL_DMA_CHAN;
+
+struct dma_chan *copy_chan[NUM_AVAIL_DMA_CHAN] = {0};
+struct dma_device *copy_dev[NUM_AVAIL_DMA_CHAN] = {0};
+
+#ifdef CONFIG_PROC_SYSCTL
+extern int proc_dointvec_minmax(struct ctl_table *table, int write,
+               void __user *buffer, size_t *lenp, loff_t *ppos);
+int sysctl_dma_page_migration(struct ctl_table *table, int write,
+                       void __user *buffer, size_t *lenp,
+                       loff_t *ppos)
+{
+       int err = 0;
+       int use_all_dma_chans_prior_val = use_all_dma_chans;
+       dma_cap_mask_t copy_mask;
+
+       if (write && !capable(CAP_SYS_ADMIN))
+               return -EPERM;
+
+       err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+       if (err < 0)
+               return err;
+       if (write) {
+               /* Grab all DMA channels  */
+               if (use_all_dma_chans_prior_val == 0 && use_all_dma_chans == 1) {
+                       int i;
+
+                       dma_cap_zero(copy_mask);
+                       dma_cap_set(DMA_MEMCPY, copy_mask);
+
+                       dmaengine_get();
+                       for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+                               if (!copy_chan[i]) {
+                                       copy_chan[i] = dma_request_channel(copy_mask, NULL, NULL);
+                               }
+                               if (!copy_chan[i]) {
+                                       pr_err("%s: cannot grab channel: %d\n", __func__, i);
+                                       continue;
+                               }
+
+                               copy_dev[i] = copy_chan[i]->device;
+
+                               if (!copy_dev[i]) {
+                                       pr_err("%s: no device: %d\n", __func__, i);
+                                       continue;
+                               }
+                       }
+
+               }
+               /* Release all DMA channels  */
+               else if (use_all_dma_chans_prior_val == 1 && use_all_dma_chans == 0) {
+                       int i;
+
+                       for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+                               if (copy_chan[i]) {
+                                       dma_release_channel(copy_chan[i]);
+                                       copy_chan[i] = NULL;
+                                       copy_dev[i] = NULL;
+                               }
+                       }
+
+                       dmaengine_put();
+               }
+
+               if (err)
+                       use_all_dma_chans = use_all_dma_chans_prior_val;
+       }
+       return err;
+}
+#endif
+
+static int copy_page_dma_once(struct page *to, struct page *from, int nr_pages)
+{
+       static struct dma_chan *copy_chan = NULL;
+       struct dma_device *device = NULL;
+       struct dma_async_tx_descriptor *tx = NULL;
+       dma_cookie_t cookie;
+       enum dma_ctrl_flags flags = 0;
+       struct dmaengine_unmap_data *unmap = NULL;
+       dma_cap_mask_t mask;
+       int ret_val = 0;
+
+       dma_cap_zero(mask);
+       dma_cap_set(DMA_MEMCPY, mask);
+
+       dmaengine_get();
+
+       copy_chan = dma_request_channel(mask, NULL, NULL);
+
+       if (!copy_chan) {
+               pr_err("%s: cannot get a channel\n", __func__);
+               ret_val = -1;
+               goto no_chan;
+       }
+
+       device = copy_chan->device;
+
+       if (!device) {
+               pr_err("%s: cannot get a device\n", __func__);
+               ret_val = -2;
+               goto release;
+       }
+
+       unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOWAIT);
+
+       if (!unmap) {
+               pr_err("%s: cannot get unmap data\n", __func__);
+               ret_val = -3;
+               goto release;
+       }
+
+       unmap->to_cnt = 1;
+       unmap->addr[0] = dma_map_page(device->dev, from, 0, PAGE_SIZE*nr_pages,
+                         DMA_TO_DEVICE);
+       unmap->from_cnt = 1;
+       unmap->addr[1] = dma_map_page(device->dev, to, 0, PAGE_SIZE*nr_pages,
+                         DMA_FROM_DEVICE);
+       unmap->len = PAGE_SIZE*nr_pages;
+
+       tx = device->device_prep_dma_memcpy(copy_chan,
+                               unmap->addr[1],
+                               unmap->addr[0], unmap->len,
+                               flags);
+
+       if (!tx) {
+               pr_err("%s: null tx descriptor\n", __func__);
+               ret_val = -4;
+               goto unmap_dma;
+       }
+
+       cookie = tx->tx_submit(tx);
+
+       if (dma_submit_error(cookie)) {
+               pr_err("%s: submission error\n", __func__);
+               ret_val = -5;
+               goto unmap_dma;
+       }
+
+       if (dma_sync_wait(copy_chan, cookie) != DMA_COMPLETE) {
+               pr_err("%s: dma does not complete properly\n", __func__);
+               ret_val = -6;
+       }
+
+unmap_dma:
+       dmaengine_unmap_put(unmap);
+release:
+       if (copy_chan) {
+               dma_release_channel(copy_chan);
+       }
+no_chan:
+       dmaengine_put();
+
+       return ret_val;
+}
+
+static int copy_page_dma_always(struct page *to, struct page *from, int nr_pages)
+{
+       struct dma_async_tx_descriptor *tx[NUM_AVAIL_DMA_CHAN] = {0};
+       dma_cookie_t cookie[NUM_AVAIL_DMA_CHAN];
+       enum dma_ctrl_flags flags[NUM_AVAIL_DMA_CHAN] = {0};
+       struct dmaengine_unmap_data *unmap[NUM_AVAIL_DMA_CHAN] = {0};
+       int ret_val = 0;
+       int total_available_chans = NUM_AVAIL_DMA_CHAN;
+       int i;
+       size_t page_offset;
+
+       for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+               if (!copy_chan[i]) {
+                       total_available_chans = i;
+               }
+       }
+       if (total_available_chans != NUM_AVAIL_DMA_CHAN) {
+               pr_err("%d channels are missing", NUM_AVAIL_DMA_CHAN - total_available_chans);
+       }
+
+       total_available_chans = min_t(int, total_available_chans, limit_dma_chans);
+
+       /* round down to closest 2^x value  */
+       total_available_chans = 1<<ilog2(total_available_chans);
+
+       for (i = 0; i < total_available_chans; ++i) {
+               unmap[i] = dmaengine_get_unmap_data(copy_dev[i]->dev, 2, GFP_NOWAIT);
+               if (!unmap[i]) {
+                       pr_err("%s: no unmap data at chan %d\n", __func__, i);
+                       ret_val = -3;
+                       goto unmap_dma;
+               }
+       }
+
+       for (i = 0; i < total_available_chans; ++i) {
+               if (nr_pages == 1) {
+                       page_offset = PAGE_SIZE / total_available_chans;
+
+                       unmap[i]->to_cnt = 1;
+                       unmap[i]->addr[0] = dma_map_page(copy_dev[i]->dev, from, page_offset*i,
+                                                 page_offset,
+                                                 DMA_TO_DEVICE);
+                       unmap[i]->from_cnt = 1;
+                       unmap[i]->addr[1] = dma_map_page(copy_dev[i]->dev, to, page_offset*i,
+                                                 page_offset,
+                                                 DMA_FROM_DEVICE);
+                       unmap[i]->len = page_offset;
+               } else {
+                       page_offset = nr_pages / total_available_chans;
+
+                       unmap[i]->to_cnt = 1;
+                       unmap[i]->addr[0] = dma_map_page(copy_dev[i]->dev,
+                                                 from + page_offset*i,
+                                                 0,
+                                                 PAGE_SIZE*page_offset,
+                                                 DMA_TO_DEVICE);
+                       unmap[i]->from_cnt = 1;
+                       unmap[i]->addr[1] = dma_map_page(copy_dev[i]->dev,
+                                                 to + page_offset*i,
+                                                 0,
+                                                 PAGE_SIZE*page_offset,
+                                                 DMA_FROM_DEVICE);
+                       unmap[i]->len = PAGE_SIZE*page_offset;
+               }
+       }
+
+       for (i = 0; i < total_available_chans; ++i) {
+               tx[i] = copy_dev[i]->device_prep_dma_memcpy(copy_chan[i],
+                                       unmap[i]->addr[1],
+                                       unmap[i]->addr[0],
+                                       unmap[i]->len,
+                                       flags[i]);
+               if (!tx[i]) {
+                       pr_err("%s: no tx descriptor at chan %d\n", __func__, i);
+                       ret_val = -4;
+                       goto unmap_dma;
+               }
+       }
+
+       for (i = 0; i < total_available_chans; ++i) {
+               cookie[i] = tx[i]->tx_submit(tx[i]);
+
+               if (dma_submit_error(cookie[i])) {
+                       pr_err("%s: submission error at chan %d\n", __func__, i);
+                       ret_val = -5;
+                       goto unmap_dma;
+               }
+
+               dma_async_issue_pending(copy_chan[i]);
+       }
+
+       for (i = 0; i < total_available_chans; ++i) {
+               if (dma_sync_wait(copy_chan[i], cookie[i]) != DMA_COMPLETE) {
+                       ret_val = -6;
+                       pr_err("%s: dma does not complete at chan %d\n", __func__, i);
+               }
+       }
+
+unmap_dma:
+       for (i = 0; i < total_available_chans; ++i) {
+               if (unmap[i])
+                       dmaengine_unmap_put(unmap[i]);
+       }
+
+       return ret_val;
+}
+
+int copy_page_dma(struct page *to, struct page *from, int nr_pages)
+{
+       BUG_ON(hpage_nr_pages(from) != nr_pages);
+       BUG_ON(hpage_nr_pages(to) != nr_pages);
+
+       if (!use_all_dma_chans) {
+               return copy_page_dma_once(to, from, nr_pages);
+       }
+
+       return copy_page_dma_always(to, from, nr_pages);
+}
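Operationally, the two knobs surface under /proc/sys/vm/ (assuming a kernel
with this series applied). Below is a sketch of pre-grabbing the channels
before a migration-heavy phase and capping the fan-out; note that the
handler above additionally requires CAP_SYS_ADMIN for writes to
use_all_dma_chans:

#include <stdio.h>

static int write_vm_knob(const char *name, const char *val)
{
        char path[128];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/sys/vm/%s", name);
        f = fopen(path, "w");
        if (!f) {
                perror(path);
                return -1;
        }
        fputs(val, f);
        return fclose(f);
}

int main(void)
{
        /* grab up to NUM_AVAIL_DMA_CHAN channels once, up front */
        write_vm_knob("use_all_dma_chans", "1");
        /* but let each page copy fan out over at most 4 of them */
        write_vm_knob("limit_dma_chans", "4");
        return 0;
}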
From patchwork Thu Apr 4 02:00:29 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884749
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 08/25] mm: migrate: Add copy_page_dma into migrate_page_copy.
Date: Wed, 3 Apr 2019 19:00:29 -0700
Message-Id: <20190404020046.32741-9-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Fall back to copy_highpage() when it fails.

Signed-off-by: Zi Yan
---
 include/linux/migrate_mode.h   |  1 +
 include/uapi/linux/mempolicy.h |  1 +
 mm/migrate.c                   | 31 +++++++++++++++++++++----------
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index 5bc8a77..4f7f5557 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -23,6 +23,7 @@ enum migrate_mode {
        MIGRATE_MODE_MASK = 3,
        MIGRATE_SINGLETHREAD = 0,
        MIGRATE_MT = 1<<4,
+       MIGRATE_DMA = 1<<5,
 };
 
 #endif         /* MIGRATE_MODE_H_INCLUDED */
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 890269b..49573a6 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -48,6 +48,7 @@ enum {
 #define MPOL_MF_LAZY    (1<<3) /* Modifies '_MOVE:  lazy migrate on fault */
 #define MPOL_MF_INTERNAL (1<<4) /* Internal flags start here */
 
+#define MPOL_MF_MOVE_DMA (1<<5) /* Use DMA page copy routine */
 #define MPOL_MF_MOVE_MT  (1<<6) /* Use multi-threaded page copy routine */
 
 #define MPOL_MF_VALID  (MPOL_MF_STRICT   |     \
diff --git a/mm/migrate.c b/mm/migrate.c
index 8a344e2..09114d3 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -553,15 +553,21 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
  * specialized.
  */
 static void __copy_gigantic_page(struct page *dst, struct page *src,
-                               int nr_pages)
+                               int nr_pages, enum migrate_mode mode)
 {
        int i;
        struct page *dst_base = dst;
        struct page *src_base = src;
+       int rc = -EFAULT;
 
        for (i = 0; i < nr_pages; ) {
                cond_resched();
-               copy_highpage(dst, src);
+
+               if (mode & MIGRATE_DMA)
+                       rc = copy_page_dma(dst, src, 1);
+
+               if (rc)
+                       copy_highpage(dst, src);
 
                i++;
                dst = mem_map_next(dst, dst_base, i);
@@ -582,7 +588,7 @@ static void copy_huge_page(struct page *dst, struct page *src,
                nr_pages = pages_per_huge_page(h);
 
                if (unlikely(nr_pages > MAX_ORDER_NR_PAGES)) {
-                       __copy_gigantic_page(dst, src, nr_pages);
+                       __copy_gigantic_page(dst, src, nr_pages, mode);
                        return;
                }
        } else {
@@ -597,6 +603,8 @@ static void copy_huge_page(struct page *dst, struct page *src,
 
        if (mode & MIGRATE_MT)
                rc = copy_page_multithread(dst, src, nr_pages);
+       else if (mode & MIGRATE_DMA)
+               rc = copy_page_dma(dst, src, nr_pages);
 
        if (rc)
                for (i = 0; i < nr_pages; i++) {
@@ -674,8 +682,9 @@ void migrate_page_copy(struct page *newpage, struct page *page,
 {
        if (PageHuge(page) || PageTransHuge(page))
                copy_huge_page(newpage, page, mode);
-       else
+       else {
                copy_highpage(newpage, page);
+       }
 
        migrate_page_states(newpage, page);
 }
@@ -1511,7 +1520,8 @@ static int store_status(int __user *status, int start, int value, int nr)
 }
 
 static int do_move_pages_to_node(struct mm_struct *mm,
-               struct list_head *pagelist, int node, bool migrate_mt)
+               struct list_head *pagelist, int node,
+               bool migrate_mt, bool migrate_dma)
 {
        int err;
 
@@ -1519,7 +1529,8 @@ static int do_move_pages_to_node(struct mm_struct *mm,
                return 0;
 
        err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
-               MIGRATE_SYNC | (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD),
+               MIGRATE_SYNC | (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD) |
+               (migrate_dma ? MIGRATE_DMA : MIGRATE_SINGLETHREAD),
                MR_SYSCALL);
        if (err)
                putback_movable_pages(pagelist);
@@ -1642,7 +1653,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                        start = i;
                } else if (node != current_node) {
                        err = do_move_pages_to_node(mm, &pagelist, current_node,
-                               flags & MPOL_MF_MOVE_MT);
+                               flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA);
                        if (err)
                                goto out;
                        err = store_status(status, start, current_node, i - start);
@@ -1666,7 +1677,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                        goto out_flush;
 
                err = do_move_pages_to_node(mm, &pagelist, current_node,
-                               flags & MPOL_MF_MOVE_MT);
+                               flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA);
                if (err)
                        goto out;
                if (i > start) {
@@ -1682,7 +1693,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 
        /* Make sure we do not overwrite the existing error */
        err1 = do_move_pages_to_node(mm, &pagelist, current_node,
-                       flags & MPOL_MF_MOVE_MT);
+                       flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA);
        if (!err1)
                err1 = store_status(status, start, current_node, i - start);
        if (!err)
@@ -1778,7 +1789,7 @@ static int kernel_move_pages(pid_t pid, unsigned long nr_pages,
        nodemask_t task_nodes;
 
        /* Check flags */
-       if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT))
+       if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT|MPOL_MF_MOVE_DMA))
                return -EINVAL;
 
        if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
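To see how the mode word composes after this patch, the standalone sketch
below mirrors the expression in do_move_pages_to_node(). The MT and DMA bit
values come from this series; MIGRATE_SYNC = 2 is an assumption taken from
mainline's enum ordering (it must fit inside MIGRATE_MODE_MASK = 3):

#include <stdio.h>

enum migrate_mode {
        MIGRATE_SYNC = 2,       /* assumed; sits inside MIGRATE_MODE_MASK */
        MIGRATE_MODE_MASK = 3,
        MIGRATE_SINGLETHREAD = 0,
        MIGRATE_MT = 1 << 4,    /* from patch 04 */
        MIGRATE_DMA = 1 << 5,   /* from this patch */
};

static int build_mode(int migrate_mt, int migrate_dma)
{
        return MIGRATE_SYNC |
               (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD) |
               (migrate_dma ? MIGRATE_DMA : MIGRATE_SINGLETHREAD);
}

int main(void)
{
        int mode = build_mode(0, 1);    /* MPOL_MF_MOVE_DMA requested */

        printf("sync bits: %d, MT: %s, DMA: %s\n",
               mode & MIGRATE_MODE_MASK,
               (mode & MIGRATE_MT) ? "on" : "off",
               (mode & MIGRATE_DMA) ? "on" : "off");
        return 0;
}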
From patchwork Thu Apr 4 02:00:30 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884751
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 09/25] mm: migrate: Add copy_page_lists_dma_always to support copying a list of pages.
Date: Wed, 3 Apr 2019 19:00:30 -0700
Message-Id: <20190404020046.32741-10-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan <zi.yan@sent.com>

Add copy_page_lists_dma_always() to copy a list of pages via DMA. The src
and dst page lists must match in page size at each index, and both lists
have the same length.

Signed-off-by: Zi Yan
---
 mm/copy_page.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h  |   4 ++
 2 files changed, 170 insertions(+)

diff --git a/mm/copy_page.c b/mm/copy_page.c
index 5e7a797..84f1c02 100644
--- a/mm/copy_page.c
+++ b/mm/copy_page.c
@@ -417,3 +417,169 @@ int copy_page_dma(struct page *to, struct page *from, int nr_pages)
 
 	return copy_page_dma_always(to, from, nr_pages);
 }
+
+/*
+ * Use DMA to copy a list of pages to a new location.
+ *
+ * Each page is put into an individual DMA channel.
+ */
+int copy_page_lists_dma_always(struct page **to, struct page **from,
+		int nr_items)
+{
+	struct dma_async_tx_descriptor **tx = NULL;
+	dma_cookie_t *cookie = NULL;
+	enum dma_ctrl_flags flags[NUM_AVAIL_DMA_CHAN] = {0};
+	struct dmaengine_unmap_data *unmap[NUM_AVAIL_DMA_CHAN] = {0};
+	int ret_val = 0;
+	int total_available_chans = NUM_AVAIL_DMA_CHAN;
+	int i;
+	int page_idx;
+
+	for (i = 0; i < NUM_AVAIL_DMA_CHAN; ++i) {
+		if (!copy_chan[i])
+			total_available_chans = i;
+	}
+	if (total_available_chans != NUM_AVAIL_DMA_CHAN)
+		pr_err("%d channels are missing\n",
+			NUM_AVAIL_DMA_CHAN - total_available_chans);
+
+	if (limit_dma_chans < total_available_chans)
+		total_available_chans = limit_dma_chans;
+
+	/* round down to closest 2^x value */
+	total_available_chans = 1 << ilog2(total_available_chans);
+
+	tx = kzalloc(sizeof(struct dma_async_tx_descriptor *) * nr_items,
+		GFP_KERNEL);
+	if (!tx) {
+		ret_val = -ENOMEM;
+		goto out;
+	}
+	cookie = kzalloc(sizeof(dma_cookie_t) * nr_items, GFP_KERNEL);
+	if (!cookie) {
+		ret_val = -ENOMEM;
+		goto out_free_tx;
+	}
+
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_items / total_available_chans;
+
+		if (i < (nr_items % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		if (num_xfer_per_dev > 128) {
+			ret_val = -ENOMEM;
+			pr_err("%s: too many pages to be transferred\n",
+				__func__);
+			goto out_free_both;
+		}
+
+		unmap[i] = dmaengine_get_unmap_data(copy_dev[i]->dev,
+				2 * num_xfer_per_dev, GFP_NOWAIT);
+		if (!unmap[i]) {
+			pr_err("%s: no unmap data at chan %d\n", __func__, i);
+			ret_val = -ENODEV;
+			goto unmap_dma;
+		}
+	}
+
+	page_idx = 0;
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_items / total_available_chans;
+		int xfer_idx;
+
+		if (i < (nr_items % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		unmap[i]->to_cnt = num_xfer_per_dev;
+		unmap[i]->from_cnt = num_xfer_per_dev;
+		unmap[i]->len = hpage_nr_pages(from[i]) * PAGE_SIZE;
+
+		for (xfer_idx = 0; xfer_idx < num_xfer_per_dev;
+		     ++xfer_idx, ++page_idx) {
+			size_t page_len = hpage_nr_pages(from[page_idx]) * PAGE_SIZE;
+
+			BUG_ON(page_len != hpage_nr_pages(to[page_idx]) * PAGE_SIZE);
+			BUG_ON(unmap[i]->len != page_len);
+
+			unmap[i]->addr[xfer_idx] =
+				dma_map_page(copy_dev[i]->dev, from[page_idx],
+					0, page_len, DMA_TO_DEVICE);
+
+			unmap[i]->addr[xfer_idx + num_xfer_per_dev] =
+				dma_map_page(copy_dev[i]->dev, to[page_idx],
+					0, page_len, DMA_FROM_DEVICE);
+		}
+	}
+
+	page_idx = 0;
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_items / total_available_chans;
+		int xfer_idx;
+
+		if (i < (nr_items % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		for (xfer_idx = 0; xfer_idx < num_xfer_per_dev;
+		     ++xfer_idx, ++page_idx) {
+
+			tx[page_idx] = copy_dev[i]->device_prep_dma_memcpy(copy_chan[i],
+						unmap[i]->addr[xfer_idx + num_xfer_per_dev],
+						unmap[i]->addr[xfer_idx],
+						unmap[i]->len,
+						flags[i]);
+			if (!tx[page_idx]) {
+				pr_err("%s: no tx descriptor at chan %d xfer %d\n",
+					__func__, i, xfer_idx);
+				ret_val = -ENODEV;
+				goto unmap_dma;
+			}
+
+			cookie[page_idx] = tx[page_idx]->tx_submit(tx[page_idx]);
+
+			if (dma_submit_error(cookie[page_idx])) {
+				pr_err("%s: submission error at chan %d xfer %d\n",
+					__func__, i, xfer_idx);
+				ret_val = -ENODEV;
+				goto unmap_dma;
+			}
+		}
+
+		dma_async_issue_pending(copy_chan[i]);
+	}
+
+	page_idx = 0;
+	for (i = 0; i < total_available_chans; ++i) {
+		int num_xfer_per_dev = nr_items / total_available_chans;
+		int xfer_idx;
+
+		if (i < (nr_items % total_available_chans))
+			num_xfer_per_dev += 1;
+
+		for (xfer_idx = 0; xfer_idx < num_xfer_per_dev;
+		     ++xfer_idx, ++page_idx) {
+
+			if (dma_sync_wait(copy_chan[i], cookie[page_idx]) != DMA_COMPLETE) {
+				ret_val = -6;
+				pr_err("%s: dma does not complete at chan %d, xfer %d\n",
+					__func__, i, xfer_idx);
+			}
+		}
+	}
+
+unmap_dma:
+	for (i = 0; i < total_available_chans; ++i) {
+		if (unmap[i])
+			dmaengine_unmap_put(unmap[i]);
+	}
+
+out_free_both:
+	kfree(cookie);
+out_free_tx:
+	kfree(tx);
+out:
+
+	return ret_val;
+}
diff --git a/mm/internal.h b/mm/internal.h
index 9eeaf2b..cb1a610 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -555,4 +555,8 @@ static inline bool is_migrate_highatomic_page(struct page *page)
 void setup_zone_pageset(struct zone *zone);
 
 extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
+
+extern int copy_page_lists_dma_always(struct page **to,
+		struct page **from, int nr_pages);
+
 #endif /* __MM_INTERNAL_H */
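The per-channel split used above is easier to see outside the kernel. The
following user-space sketch (hypothetical file name, illustration only)
models the distribution logic of copy_page_lists_dma_always(): the channel
count is rounded down to a power of two, each channel gets a base share of
nr_items / chans transfers, and the remainder is handed out one transfer per
channel starting from channel 0.

/* chan_split.c: user-space model of how copy_page_lists_dma_always()
 * distributes nr_items transfers over the available DMA channels.
 * Names mirror the patch; this is an illustration, not kernel code. */
#include <stdio.h>

/* round down to the closest power of two, like 1 << ilog2(x) */
static int round_down_pow2(int x)
{
	int p = 1;

	while (p * 2 <= x)
		p *= 2;
	return p;
}

int main(void)
{
	int nr_items = 13;              /* pages to copy (example value) */
	int total_available_chans = 6;  /* channels found at probe time */
	int i;

	total_available_chans = round_down_pow2(total_available_chans);

	for (i = 0; i < total_available_chans; ++i) {
		/* every channel gets the base share ... */
		int num_xfer_per_dev = nr_items / total_available_chans;

		/* ... and the first nr_items % chans channels get one extra */
		if (i < (nr_items % total_available_chans))
			num_xfer_per_dev += 1;

		printf("chan %d: %d transfers\n", i, num_xfer_per_dev);
	}
	return 0;
}

With the example values, 6 channels round down to 4 and the 13 transfers
split as 4, 3, 3, 3, matching the arithmetic in the patch.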
From patchwork Thu Apr 4 02:00:31 2019
X-Patchwork-Id: 10884753
From: Zi Yan <zi.yan@sent.com>
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 10/25] mm: migrate: copy_page_lists_mt() to copy a page list using multi-threads.
Date: Wed, 3 Apr 2019 19:00:31 -0700
Message-Id: <20190404020046.32741-11-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan <zi.yan@sent.com>

This prepares for migrate_page_concur(), which migrates multiple pages at
the same time.

Signed-off-by: Zi Yan
---
 mm/copy_page.c | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h  |   2 +
 2 files changed, 125 insertions(+)

diff --git a/mm/copy_page.c b/mm/copy_page.c
index 84f1c02..d2fd67e 100644
--- a/mm/copy_page.c
+++ b/mm/copy_page.c
@@ -126,6 +126,129 @@ int copy_page_multithread(struct page *to, struct page *from, int nr_pages)
 
 	return err;
 }
+
+int copy_page_lists_mt(struct page **to, struct page **from, int nr_items)
+{
+	int err = 0;
+	unsigned int total_mt_num = limit_mt_num;
+	int to_node = page_to_nid(*to);
+	int i;
+	struct copy_page_info *work_items[NR_CPUS] = {0};
+	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
+	int cpu_id_list[NR_CPUS] = {0};
+	int cpu;
+	int max_items_per_thread;
+	int item_idx;
+
+	total_mt_num = min_t(unsigned int, total_mt_num,
+			cpumask_weight(per_node_cpumask));
+
+	if (total_mt_num > num_online_cpus())
+		return -ENODEV;
+
+	/* Each thread gets part of each page, if nr_items < total_mt_num */
+	if (nr_items < total_mt_num)
+		max_items_per_thread = nr_items;
+	else
+		max_items_per_thread = (nr_items / total_mt_num) +
+				((nr_items % total_mt_num) ? 1 : 0);
+
+	for (cpu = 0; cpu < total_mt_num; ++cpu) {
+		work_items[cpu] = kzalloc(sizeof(struct copy_page_info) +
+				sizeof(struct copy_item) * max_items_per_thread,
+				GFP_KERNEL);
+		if (!work_items[cpu]) {
+			err = -ENOMEM;
+			goto free_work_items;
+		}
+	}
+
+	i = 0;
+	for_each_cpu(cpu, per_node_cpumask) {
+		if (i >= total_mt_num)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	if (nr_items < total_mt_num) {
+		for (cpu = 0; cpu < total_mt_num; ++cpu) {
+			INIT_WORK((struct work_struct *)work_items[cpu],
+				copy_page_work_queue_thread);
+			work_items[cpu]->num_items = max_items_per_thread;
+		}
+
+		for (item_idx = 0; item_idx < nr_items; ++item_idx) {
+			unsigned long chunk_size =
+				PAGE_SIZE * hpage_nr_pages(from[item_idx]) / total_mt_num;
+			char *vfrom = kmap(from[item_idx]);
+			char *vto = kmap(to[item_idx]);
+
+			VM_BUG_ON(PAGE_SIZE * hpage_nr_pages(from[item_idx]) % total_mt_num);
+			BUG_ON(hpage_nr_pages(to[item_idx]) !=
+				hpage_nr_pages(from[item_idx]));
+
+			for (cpu = 0; cpu < total_mt_num; ++cpu) {
+				work_items[cpu]->item_list[item_idx].to =
+					vto + chunk_size * cpu;
+				work_items[cpu]->item_list[item_idx].from =
+					vfrom + chunk_size * cpu;
+				work_items[cpu]->item_list[item_idx].chunk_size =
+					chunk_size;
+			}
+		}
+
+		for (cpu = 0; cpu < total_mt_num; ++cpu)
+			queue_work_on(cpu_id_list[cpu],
+				system_highpri_wq,
+				(struct work_struct *)work_items[cpu]);
+	} else {
+		item_idx = 0;
+		for (cpu = 0; cpu < total_mt_num; ++cpu) {
+			int num_xfer_per_thread = nr_items / total_mt_num;
+			int per_cpu_item_idx;
+
+			if (cpu < (nr_items % total_mt_num))
+				num_xfer_per_thread += 1;
+
+			INIT_WORK((struct work_struct *)work_items[cpu],
+				copy_page_work_queue_thread);
+
+			work_items[cpu]->num_items = num_xfer_per_thread;
+			for (per_cpu_item_idx = 0; per_cpu_item_idx < work_items[cpu]->num_items;
+					++per_cpu_item_idx, ++item_idx) {
+				work_items[cpu]->item_list[per_cpu_item_idx].to =
+					kmap(to[item_idx]);
+				work_items[cpu]->item_list[per_cpu_item_idx].from =
+					kmap(from[item_idx]);
+				work_items[cpu]->item_list[per_cpu_item_idx].chunk_size =
+					PAGE_SIZE * hpage_nr_pages(from[item_idx]);
+
+				BUG_ON(hpage_nr_pages(to[item_idx]) !=
+					hpage_nr_pages(from[item_idx]));
+			}
+
+			queue_work_on(cpu_id_list[cpu],
+				system_highpri_wq,
+				(struct work_struct *)work_items[cpu]);
+		}
+		if (item_idx != nr_items)
+			pr_err("%s: only %d out of %d pages are transferred\n",
+				__func__, item_idx - 1, nr_items);
+	}
+
+	/* Wait until it finishes */
+	for (i = 0; i < total_mt_num; ++i)
+		flush_work((struct work_struct *)work_items[i]);
+
+	for (i = 0; i < nr_items; ++i) {
+		kunmap(to[i]);
+		kunmap(from[i]);
+	}
+
+free_work_items:
+	for (cpu = 0; cpu < total_mt_num; ++cpu)
+		if (work_items[cpu])
+			kfree(work_items[cpu]);
+
+	return err;
+}
 /* ======================== DMA copy page ======================== */
 #include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
diff --git a/mm/internal.h b/mm/internal.h
index cb1a610..51f5e1b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -558,5 +558,7 @@ extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
 
 extern int copy_page_lists_dma_always(struct page **to,
 		struct page **from, int nr_pages);
+extern int copy_page_lists_mt(struct page **to,
+		struct page **from, int nr_pages);
 
 #endif /* __MM_INTERNAL_H */
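copy_page_lists_mt() picks one of two work-splitting regimes: with fewer
pages than threads, every page is carved into equal chunks and each thread
copies one chunk of every page; with at least as many pages as threads,
whole pages are dealt out to threads, with the remainder spread one page at
a time. A minimal user-space sketch of the regime selection and the
resulting shares, using hypothetical example values:

/* mt_split.c: user-space model of the two work-splitting regimes in
 * copy_page_lists_mt(). Illustration only; names mirror the patch. */
#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	unsigned long total_mt_num = 4; /* worker threads (example) */
	unsigned long nr_items = 2;     /* pages to copy (example) */

	if (nr_items < total_mt_num) {
		/* Fewer pages than threads: split each page into
		 * equal chunks, one chunk per thread. */
		unsigned long chunk_size = PAGE_SIZE / total_mt_num;

		printf("each of %lu pages split into %lu chunks of %lu bytes\n",
			nr_items, total_mt_num, chunk_size);
	} else {
		/* At least as many pages as threads: hand whole pages to
		 * threads, spreading the remainder one page at a time. */
		unsigned long cpu;

		for (cpu = 0; cpu < total_mt_num; ++cpu) {
			unsigned long n = nr_items / total_mt_num;

			if (cpu < nr_items % total_mt_num)
				n += 1;
			printf("thread %lu copies %lu whole pages\n", cpu, n);
		}
	}
	return 0;
}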
From patchwork Thu Apr 4 02:00:32 2019
X-Patchwork-Id: 10884757
From: Zi Yan <zi.yan@sent.com>
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 11/25] mm: migrate: Add concurrent page migration into move_pages syscall.
Date: Wed, 3 Apr 2019 19:00:32 -0700
Message-Id: <20190404020046.32741-12-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan <zi.yan@sent.com>

Concurrent page migration unmaps all pages in a list, copies all of them in
one function call (copy_page_lists*), and finally remaps all new pages.
This differs from the existing page migration process, which migrates one
page at a time.

Only anonymous pages are supported. File-backed pages are still migrated
sequentially, because locking becomes more complicated when the pages in a
list belong to different files, which might cause deadlocks if the per-file
locks are not taken properly.

Signed-off-by: Zi Yan
---
 include/linux/migrate.h        |   6 +
 include/linux/migrate_mode.h   |   1 +
 include/uapi/linux/mempolicy.h |   1 +
 mm/migrate.c                   | 543 ++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 542 insertions(+), 9 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 5218a07..1001a1c 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -67,6 +67,8 @@ extern int migrate_page(struct address_space *mapping,
 		enum migrate_mode mode);
 extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
 		unsigned long private, enum migrate_mode mode, int reason);
+extern int migrate_pages_concur(struct list_head *l, new_page_t new, free_page_t free,
+		unsigned long private, enum migrate_mode mode, int reason);
 extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
 extern void putback_movable_page(struct page *page);
 
@@ -87,6 +89,10 @@ static inline int migrate_pages(struct list_head *l, new_page_t new,
 	free_page_t free, unsigned long private, enum migrate_mode mode,
 	int reason)
 	{ return -ENOSYS; }
+static inline int migrate_pages_concur(struct list_head *l, new_page_t new,
+	free_page_t free, unsigned long private, enum migrate_mode mode,
+	int reason)
+	{ return -ENOSYS; }
 static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
 	{ return -EBUSY; }
 
diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h
index 4f7f5557..68263da 100644
--- a/include/linux/migrate_mode.h
+++ b/include/linux/migrate_mode.h
@@ -24,6 +24,7 @@ enum migrate_mode {
 	MIGRATE_SINGLETHREAD	= 0,
 	MIGRATE_MT		= 1<<4,
 	MIGRATE_DMA		= 1<<5,
+	MIGRATE_CONCUR		= 1<<6,
 };
 
 #endif		/* MIGRATE_MODE_H_INCLUDED */
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 49573a6..eb6560e 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -50,6 +50,7 @@ enum {
 
 #define MPOL_MF_MOVE_DMA (1<<5)	/* Use DMA page copy routine */
 #define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
+#define MPOL_MF_MOVE_CONCUR  (1<<7)	/* Move pages in a batch */
 
 #define MPOL_MF_VALID	(MPOL_MF_STRICT   | 	\
 			 MPOL_MF_MOVE     | 	\
diff --git a/mm/migrate.c b/mm/migrate.c
index 09114d3..ad02797 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -57,6 +57,15 @@
 
 int accel_page_copy = 1;
 
+
+struct page_migration_work_item {
+	struct list_head list;
+	struct page *old_page;
+	struct page *new_page;
+	struct anon_vma *anon_vma;
+	int page_was_mapped;
+};
+
 /*
  * migrate_prep() needs to be called before we start compiling a list of pages
  * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
@@ -1396,6 +1405,509 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	return rc;
 }
 
+static int __unmap_page_concur(struct page *page, struct page *newpage,
+			struct anon_vma **anon_vma,
+			int *page_was_mapped,
+			int force, enum migrate_mode mode)
+{
+	int rc = -EAGAIN;
+	bool is_lru = !__PageMovable(page);
+
+	*anon_vma = NULL;
+	*page_was_mapped = 0;
+
+	if (!trylock_page(page)) {
+		if (!force || ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC))
+			goto out;
+
+		/*
+		 * It's not safe for direct compaction to call lock_page.
+		 * For example, during page readahead pages are added locked
+		 * to the LRU. Later, when the IO completes the pages are
+		 * marked uptodate and unlocked. However, the queueing
+		 * could be merging multiple pages for one bio (e.g.
+		 * mpage_readpages). If an allocation happens for the
+		 * second or third page, the process can end up locking
+		 * the same page twice and deadlocking. Rather than
+		 * trying to be clever about what pages can be locked,
+		 * avoid the use of lock_page for direct compaction
+		 * altogether.
+		 */
+		if (current->flags & PF_MEMALLOC)
+			goto out;
+
+		lock_page(page);
+	}
+
+	/* We are working on page_mapping(page) == NULL */
+	VM_BUG_ON_PAGE(PageWriteback(page), page);
+#if 0
+	if (PageWriteback(page)) {
+		/*
+		 * Only in the case of a full synchronous migration is it
+		 * necessary to wait for PageWriteback. In the async case,
+		 * the retry loop is too short and in the sync-light case,
+		 * the overhead of stalling is too much
+		 */
+		if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC) {
+			rc = -EBUSY;
+			goto out_unlock;
+		}
+		if (!force)
+			goto out_unlock;
+		wait_on_page_writeback(page);
+	}
+#endif
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrate a page.
+	 * This get_anon_vma() delays freeing the anon_vma pointer until the
+	 * end of migration. File cache pages are no problem because of
+	 * page_lock(): file caches may use write_page() or lock_page() in
+	 * migration, so we only need to care about anon pages here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(page) && !PageKsm(page))
+		*anon_vma = page_get_anon_vma(page);
+
+	/*
+	 * Block others from accessing the new page when we get around to
+	 * establishing additional references. We are usually the only one
+	 * holding a reference to newpage at this point. We used to have a BUG
+	 * here if trylock_page(newpage) fails, but would like to allow for
+	 * cases where there might be a race with the previous use of newpage.
+	 * This is much like races on refcount of oldpage: just don't BUG().
+	 */
+	if (unlikely(!trylock_page(newpage)))
+		goto out_unlock;
+
+	if (unlikely(!is_lru)) {
+		/* Just migrate the page and remove it from item list */
+		VM_BUG_ON(1);
+		rc = move_to_new_page(newpage, page, mode);
+		goto out_unlock_both;
+	}
+
+	/*
+	 * Corner case handling:
+	 * 1. When a new swap-cache page is read into, it is added to the LRU
+	 * and treated as swapcache but it has no rmap yet.
+	 * Calling try_to_unmap() against a page->mapping==NULL page will
+	 * trigger a BUG. So handle it here.
+	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * fs-private metadata. The page can be picked up due to memory
+	 * offlining. Everywhere else except page reclaim, the page is
+	 * invisible to the vm, so the page can not be migrated. So try to
+	 * free the metadata, so the page can be freed.
+	 */
+	if (!page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(page), page);
+		if (page_has_private(page)) {
+			try_to_free_buffers(page);
+			goto out_unlock_both;
+		}
+	} else if (page_mapped(page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !*anon_vma,
+			page);
+		try_to_unmap(page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+		*page_was_mapped = 1;
+	}
+
+	return MIGRATEPAGE_SUCCESS;
+
+out_unlock_both:
+	unlock_page(newpage);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (*anon_vma)
+		put_anon_vma(*anon_vma);
+	unlock_page(page);
+out:
+	return rc;
+}
+
+static int unmap_pages_and_get_new_concur(new_page_t get_new_page,
+		free_page_t put_new_page, unsigned long private,
+		struct page_migration_work_item *item,
+		int force,
+		enum migrate_mode mode, enum migrate_reason reason)
+{
+	int rc = MIGRATEPAGE_SUCCESS;
+
+	if (!thp_migration_supported() && PageTransHuge(item->old_page))
+		return -ENOMEM;
+
+	item->new_page = get_new_page(item->old_page, private);
+	if (!item->new_page)
+		return -ENOMEM;
+
+	if (page_count(item->old_page) == 1) {
+		/* page was freed from under us. So we are done. */
+		ClearPageActive(item->old_page);
+		ClearPageUnevictable(item->old_page);
+		if (unlikely(__PageMovable(item->old_page))) {
+			lock_page(item->old_page);
+			if (!PageMovable(item->old_page))
+				__ClearPageIsolated(item->old_page);
+			unlock_page(item->old_page);
+		}
+		if (put_new_page)
+			put_new_page(item->new_page, private);
+		else
+			put_page(item->new_page);
+		item->new_page = NULL;
+		goto out;
+	}
+
+	rc = __unmap_page_concur(item->old_page, item->new_page, &item->anon_vma,
+				&item->page_was_mapped,
+				force, mode);
+	if (rc == MIGRATEPAGE_SUCCESS)
+		return rc;
+
+out:
+	if (rc != -EAGAIN) {
+		list_del(&item->old_page->lru);
+
+		if (likely(!__PageMovable(item->old_page)))
+			mod_node_page_state(page_pgdat(item->old_page), NR_ISOLATED_ANON +
+					page_is_file_cache(item->old_page),
+					-hpage_nr_pages(item->old_page));
+	}
+
+	if (rc == MIGRATEPAGE_SUCCESS) {
+		/* only for pages freed under us */
+		VM_BUG_ON(page_count(item->old_page) != 1);
+		put_page(item->old_page);
+		item->old_page = NULL;
+
+	} else {
+		if (rc != -EAGAIN) {
+			if (likely(!__PageMovable(item->old_page))) {
+				putback_lru_page(item->old_page);
+				goto put_new;
+			}
+
+			lock_page(item->old_page);
+			if (PageMovable(item->old_page))
+				putback_movable_page(item->old_page);
+			else
+				__ClearPageIsolated(item->old_page);
+			unlock_page(item->old_page);
+			put_page(item->old_page);
+		}
+
+		/*
+		 * If migration was not successful and there's a freeing callback, use
+		 * it. Otherwise, putback_lru_page() will drop the reference grabbed
+		 * during isolation.
+		 */
+put_new:
+		if (put_new_page)
+			put_new_page(item->new_page, private);
+		else
+			put_page(item->new_page);
+		item->new_page = NULL;
+	}
+
+	return rc;
+}
+
+static int move_mapping_concurr(struct list_head *unmapped_list_ptr,
+		struct list_head *wip_list_ptr,
+		free_page_t put_new_page, unsigned long private,
+		enum migrate_mode mode)
+{
+	struct page_migration_work_item *iterator, *iterator2;
+	struct address_space *mapping;
+
+	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
+		VM_BUG_ON_PAGE(!PageLocked(iterator->old_page), iterator->old_page);
+		VM_BUG_ON_PAGE(!PageLocked(iterator->new_page), iterator->new_page);
+
+		mapping = page_mapping(iterator->old_page);
+
+		VM_BUG_ON(mapping);
+
+		VM_BUG_ON(PageWriteback(iterator->old_page));
+
+		if (page_count(iterator->old_page) != 1) {
+			list_move(&iterator->list, wip_list_ptr);
+			if (iterator->page_was_mapped)
+				remove_migration_ptes(iterator->old_page,
+					iterator->old_page, false);
+			unlock_page(iterator->new_page);
+			if (iterator->anon_vma)
+				put_anon_vma(iterator->anon_vma);
+			unlock_page(iterator->old_page);
+
+			if (put_new_page)
+				put_new_page(iterator->new_page, private);
+			else
+				put_page(iterator->new_page);
+			iterator->new_page = NULL;
+			continue;
+		}
+
+		iterator->new_page->index = iterator->old_page->index;
+		iterator->new_page->mapping = iterator->old_page->mapping;
+		if (PageSwapBacked(iterator->old_page))
+			SetPageSwapBacked(iterator->new_page);
+	}
+
+	return 0;
+}
+
+static int copy_to_new_pages_concur(struct list_head *unmapped_list_ptr,
+		enum migrate_mode mode)
+{
+	struct page_migration_work_item *iterator;
+	int num_pages = 0, idx = 0;
+	struct page **src_page_list = NULL, **dst_page_list = NULL;
+	unsigned long size = 0;
+	int rc = -EFAULT;
+
+	if (list_empty(unmapped_list_ptr))
+		return 0;
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		++num_pages;
+		size += PAGE_SIZE * hpage_nr_pages(iterator->old_page);
+	}
+
+	src_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
+	if (!src_page_list) {
+		BUG();
+		return -ENOMEM;
+	}
+	dst_page_list = kzalloc(sizeof(struct page *)*num_pages, GFP_KERNEL);
+	if (!dst_page_list) {
+		BUG();
+		return -ENOMEM;
+	}
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		src_page_list[idx] = iterator->old_page;
+		dst_page_list[idx] = iterator->new_page;
+		++idx;
+	}
+
+	BUG_ON(idx != num_pages);
+
+	if (mode & MIGRATE_DMA)
+		rc = copy_page_lists_dma_always(dst_page_list, src_page_list,
+				num_pages);
+	else if (mode & MIGRATE_MT)
+		rc = copy_page_lists_mt(dst_page_list, src_page_list,
+				num_pages);
+
+	if (rc) {
+		list_for_each_entry(iterator, unmapped_list_ptr, list) {
+			if (PageHuge(iterator->old_page) ||
+				PageTransHuge(iterator->old_page))
+				copy_huge_page(iterator->new_page, iterator->old_page, 0);
+			else
+				copy_highpage(iterator->new_page, iterator->old_page);
+		}
+	}
+
+	kfree(src_page_list);
+	kfree(dst_page_list);
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		migrate_page_states(iterator->new_page, iterator->old_page);
+	}
+
+	return 0;
+}
+
+static int remove_migration_ptes_concurr(struct list_head *unmapped_list_ptr)
+{
+	struct page_migration_work_item *iterator, *iterator2;
+
+	list_for_each_entry_safe(iterator, iterator2, unmapped_list_ptr, list) {
+		if (iterator->page_was_mapped)
+			remove_migration_ptes(iterator->old_page, iterator->new_page, false);
+
+		unlock_page(iterator->new_page);
+
+		if (iterator->anon_vma)
+			put_anon_vma(iterator->anon_vma);
+
+		unlock_page(iterator->old_page);
+
+		list_del(&iterator->old_page->lru);
+		mod_node_page_state(page_pgdat(iterator->old_page), NR_ISOLATED_ANON +
+				page_is_file_cache(iterator->old_page),
+				-hpage_nr_pages(iterator->old_page));
+
+		put_page(iterator->old_page);
+		iterator->old_page = NULL;
+
+		if (unlikely(__PageMovable(iterator->new_page)))
+			put_page(iterator->new_page);
+		else
+			putback_lru_page(iterator->new_page);
+		iterator->new_page = NULL;
+	}
+
+	return 0;
+}
+
+int migrate_pages_concur(struct list_head *from, new_page_t get_new_page,
+		free_page_t put_new_page, unsigned long private,
+		enum migrate_mode mode, int reason)
+{
+	int retry = 1;
+	int nr_failed = 0;
+	int nr_succeeded = 0;
+	int pass = 0;
+	struct page *page;
+	int swapwrite = current->flags & PF_SWAPWRITE;
+	int rc;
+	int total_num_pages = 0, idx;
+	struct page_migration_work_item *item_list;
+	struct page_migration_work_item *iterator, *iterator2;
+	int item_list_order = 0;
+
+	LIST_HEAD(wip_list);
+	LIST_HEAD(unmapped_list);
+	LIST_HEAD(serialized_list);
+	LIST_HEAD(failed_list);
+
+	if (!swapwrite)
+		current->flags |= PF_SWAPWRITE;
+
+	list_for_each_entry(page, from, lru)
+		++total_num_pages;
+
+	item_list_order = get_order(total_num_pages *
+		sizeof(struct page_migration_work_item));
+
+	if (item_list_order > MAX_ORDER) {
+		item_list = alloc_pages_exact(total_num_pages *
+			sizeof(struct page_migration_work_item), GFP_ATOMIC);
+		memset(item_list, 0, total_num_pages *
+			sizeof(struct page_migration_work_item));
+	} else {
+		item_list = (struct page_migration_work_item *)__get_free_pages(GFP_ATOMIC,
+						item_list_order);
+		memset(item_list, 0, PAGE_SIZE<<item_list_order);
+	}
+
+	idx = 0;
+	list_for_each_entry(page, from, lru) {
+		item_list[idx].old_page = page;
+		item_list[idx].new_page = NULL;
+		INIT_LIST_HEAD(&item_list[idx].list);
+		list_add_tail(&item_list[idx].list, &wip_list);
+		idx += 1;
+	}
+
+	for (pass = 0; pass < 10 && retry; pass++) {
+		retry = 0;
+
+		/* unmap pages and get new pages for them */
+		list_for_each_entry_safe(iterator, iterator2, &wip_list, list) {
+			if (iterator->new_page) {
+				pr_info("%s: iterator already has a new page?\n", __func__);
+				VM_BUG_ON_PAGE(1, iterator->old_page);
+			}
+
+			/* We do not migrate huge pages, file-backed, or swapcached pages */
+			if (PageHuge(iterator->old_page))
+				rc = -ENODEV;
+			else if ((page_mapping(iterator->old_page) != NULL))
+				rc = -ENODEV;
+			else
+				rc = unmap_pages_and_get_new_concur(get_new_page, put_new_page,
+						private, iterator, pass > 2, mode,
+						reason);
+
+			switch(rc) {
+			case -ENODEV:
+				list_move(&iterator->list, &serialized_list);
+				break;
+			case -ENOMEM:
+				if (PageTransHuge(page))
+					list_move(&iterator->list, &serialized_list);
+				else
+					goto out;
+				break;
+			case -EAGAIN:
+				retry++;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				if (iterator->old_page) {
+					list_move(&iterator->list, &unmapped_list);
+					nr_succeeded++;
+				} else { /* pages are freed under us */
+					list_del(&iterator->list);
+				}
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
+				 * unlike -EAGAIN case, the failed page is
+				 * removed from migration page list and not
+				 * retried in the next outer loop.
+				 */
+				list_move(&iterator->list, &failed_list);
+				nr_failed++;
+				break;
+			}
+		}
+out:
+		if (list_empty(&unmapped_list))
+			continue;
+
+		/* move page->mapping to new page, only -EAGAIN could happen */
+		move_mapping_concurr(&unmapped_list, &wip_list, put_new_page, private, mode);
+		/* copy pages in unmapped_list */
+		copy_to_new_pages_concur(&unmapped_list, mode);
+		/* remove migration pte, if old_page is NULL?, unlock old and new
+		 * pages, put anon_vma, put old and new pages */
+		remove_migration_ptes_concurr(&unmapped_list);
+	}
+	nr_failed += retry;
+	rc = nr_failed;
+
+	if (!list_empty(from))
+		rc = migrate_pages(from, get_new_page, put_new_page,
+				private, mode, reason);
+
+	if (nr_succeeded)
+		count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
+	if (nr_failed)
+		count_vm_events(PGMIGRATE_FAIL, nr_failed);
+	trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);
+
+	if (item_list_order >= MAX_ORDER) {
+		free_pages_exact(item_list, total_num_pages *
+			sizeof(struct page_migration_work_item));
+	} else {
+		free_pages((unsigned long)item_list, item_list_order);
+	}
+
+	if (!swapwrite)
+		current->flags &= ~PF_SWAPWRITE;
+
+	return rc;
+}
+
 /*
  * migrate_pages - migrate the pages specified in a list, to the free pages
  * supplied as the target for the page migration
@@ -1521,17 +2033,25 @@ static int store_status(int __user *status, int start, int value, int nr)
 
 static int do_move_pages_to_node(struct mm_struct *mm,
 		struct list_head *pagelist, int node,
-		bool migrate_mt, bool migrate_dma)
+		bool migrate_mt, bool migrate_dma, bool migrate_concur)
 {
 	int err;
 
 	if (list_empty(pagelist))
 		return 0;
 
-	err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
-			MIGRATE_SYNC | (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD) |
-			(migrate_dma ? MIGRATE_DMA : MIGRATE_SINGLETHREAD),
-			MR_SYSCALL);
+	if (migrate_concur) {
+		err = migrate_pages_concur(pagelist, alloc_new_node_page, NULL, node,
+			MIGRATE_SYNC | (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD) |
+			(migrate_dma ? MIGRATE_DMA : MIGRATE_SINGLETHREAD),
+			MR_SYSCALL);
+	} else {
+		err = migrate_pages(pagelist, alloc_new_node_page, NULL, node,
+			MIGRATE_SYNC | (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD) |
+			(migrate_dma ? MIGRATE_DMA : MIGRATE_SINGLETHREAD),
+			MR_SYSCALL);
+	}
 
 	if (err)
 		putback_movable_pages(pagelist);
 	return err;
@@ -1653,7 +2173,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			start = i;
 		} else if (node != current_node) {
 			err = do_move_pages_to_node(mm, &pagelist, current_node,
-				flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA);
+				flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA,
+				flags & MPOL_MF_MOVE_CONCUR);
 			if (err)
 				goto out;
 			err = store_status(status, start, current_node, i - start);
@@ -1677,7 +2198,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 				goto out_flush;
 
 			err = do_move_pages_to_node(mm, &pagelist, current_node,
-				flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA);
+				flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA,
+				flags & MPOL_MF_MOVE_CONCUR);
 			if (err)
 				goto out;
 			if (i > start) {
@@ -1693,7 +2215,8 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 	/* Make sure we do not overwrite the existing error */
 	err1 = do_move_pages_to_node(mm, &pagelist, current_node,
-		flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA);
+		flags & MPOL_MF_MOVE_MT, flags & MPOL_MF_MOVE_DMA,
+		flags & MPOL_MF_MOVE_CONCUR);
 	if (!err1)
 		err1 = store_status(status, start, current_node, i - start);
 	if (!err)
@@ -1789,7 +2312,9 @@ static int kernel_move_pages(pid_t pid, unsigned long nr_pages,
 	nodemask_t task_nodes;
 
 	/* Check flags */
-	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|MPOL_MF_MOVE_MT|MPOL_MF_MOVE_DMA))
+	if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL|
+		      MPOL_MF_MOVE_DMA|MPOL_MF_MOVE_MT|
+		      MPOL_MF_MOVE_CONCUR))
 		return -EINVAL;
 
 	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
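With the syscall plumbing above, a user-space caller can opt into the
batched path through move_pages(2). A minimal sketch, assuming a kernel
with this series applied and a machine with at least two NUMA nodes; the
two flag definitions are copied from the uapi diff, since stock headers do
not carry them (build with -lnuma):

/* move_concur.c: drive the new batched migration path via move_pages(2).
 * Illustration only; requires this patch series in the running kernel. */
#include <numaif.h>	/* move_pages(), MPOL_MF_MOVE */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MPOL_MF_MOVE_MT     (1 << 6)	/* multi-threaded page copy */
#define MPOL_MF_MOVE_CONCUR (1 << 7)	/* move pages in a batch */

int main(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	enum { NPAGES = 8 };
	void *pages[NPAGES];
	int nodes[NPAGES], status[NPAGES], i;
	char *buf = aligned_alloc(page_size, NPAGES * page_size);

	memset(buf, 0, NPAGES * page_size);	/* fault the pages in */
	for (i = 0; i < NPAGES; i++) {
		pages[i] = buf + i * page_size;
		nodes[i] = 1;			/* target NUMA node */
		status[i] = -1;
	}

	/* pid 0 == self; the pages are unmapped, copied as one batch
	 * (multi-threaded here), and remapped */
	if (move_pages(0, NPAGES, pages, nodes, status,
		       MPOL_MF_MOVE | MPOL_MF_MOVE_MT | MPOL_MF_MOVE_CONCUR))
		perror("move_pages");
	else
		for (i = 0; i < NPAGES; i++)
			printf("page %d -> node %d\n", i, status[i]);

	free(buf);
	return 0;
}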
From patchwork Thu Apr 4 02:00:33 2019
X-Patchwork-Id: 10884759
From: Zi Yan <zi.yan@sent.com>
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 12/25] exchange pages: new page migration mechanism: exchange_pages()
Date: Wed, 3 Apr 2019 19:00:33 -0700
Message-Id: <20190404020046.32741-13-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan <zi.yan@sent.com>

exchange_pages() swaps two in-use pages: it unmaps both pages first,
exchanges their data through a u64 temporary register, and finally remaps
both pages. This saves the overhead of allocating two new pages for two
back-to-back migrate_pages() calls.

Signed-off-by: Zi Yan
---
 include/linux/exchange.h |  23 ++
 include/linux/ksm.h      |   4 +
 mm/Makefile              |   1 +
 mm/exchange.c            | 597 +++++++++++++++++++++++++++++++++++++++++++++++
 mm/ksm.c                 |  35 +++
 5 files changed, 660 insertions(+)
 create mode 100644 include/linux/exchange.h
 create mode 100644 mm/exchange.c

diff --git a/include/linux/exchange.h b/include/linux/exchange.h
new file mode 100644
index 0000000..778068e
--- /dev/null
+++ b/include/linux/exchange.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_EXCHANGE_H
+#define _LINUX_EXCHANGE_H
+
+#include <linux/migrate_mode.h>
+
+struct exchange_page_info {
+	struct page *from_page;
+	struct page *to_page;
+
+	struct anon_vma *from_anon_vma;
+	struct anon_vma *to_anon_vma;
+
+	int from_page_was_mapped;
+	int to_page_was_mapped;
+
+	struct list_head list;
+};
+
+int exchange_pages(struct list_head *exchange_list,
+			enum migrate_mode mode,
+			int reason);
+#endif /* _LINUX_EXCHANGE_H */
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index e48b1e4..170312d 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -55,6 +55,7 @@ void rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 bool reuse_ksm_page(struct page *page,
 			struct vm_area_struct *vma, unsigned long address);
+void ksm_exchange_page(struct page *to_page, struct page *from_page);
 
 #else  /* !CONFIG_KSM */
 
@@ -92,6 +93,9 @@ static inline bool reuse_ksm_page(struct page *page,
 			struct vm_area_struct *vma, unsigned long address)
 {
 	return false;
 }
+static inline void ksm_exchange_page(struct page *to_page,
+				struct page *from_page)
+{
+}
 #endif /* CONFIG_MMU */
 #endif /* !CONFIG_KSM */
diff --git a/mm/Makefile b/mm/Makefile
index fa02a9f..5e6c591 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -45,6 +45,7 @@ obj-y += init-mm.o
 obj-y += memblock.o
 
 obj-y += copy_page.o
+obj-y += exchange.o
 
 ifdef CONFIG_MMU
 	obj-$(CONFIG_ADVISE_SYSCALLS)	+= madvise.o
diff --git a/mm/exchange.c b/mm/exchange.c
new file mode 100644
index 0000000..626bbea
--- /dev/null
+++ b/mm/exchange.c
@@ -0,0 +1,597 @@
+/*
+ * Exchange two in-use pages. Page flags and page->mapping are exchanged
+ * as well. Only anonymous pages are supported.
+ *
+ * Copyright (C) 2016 NVIDIA, Zi Yan
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "internal.h"
+
+/*
+ * Move a list of individual pages
+ */
+struct pages_to_node {
+	unsigned long from_addr;
+	int from_status;
+
+	unsigned long to_addr;
+	int to_status;
+};
+
+struct page_flags {
+	unsigned int page_error:1;
+	unsigned int page_referenced:1;
+	unsigned int page_uptodate:1;
+	unsigned int page_active:1;
+	unsigned int page_unevictable:1;
+	unsigned int page_checked:1;
+	unsigned int page_mappedtodisk:1;
+	unsigned int page_dirty:1;
+	unsigned int page_is_young:1;
+	unsigned int page_is_idle:1;
+	unsigned int page_swapcache:1;
+	unsigned int page_writeback:1;
+	unsigned int page_private:1;
+	unsigned int __pad:3;
+};
+
+static void exchange_page(char *to, char *from)
+{
+	u64 tmp;
+	int i;
+
+	for (i = 0; i < PAGE_SIZE; i += sizeof(tmp)) {
+		tmp = *((u64 *)(from + i));
+		*((u64 *)(from + i)) = *((u64 *)(to + i));
+		*((u64 *)(to + i)) = tmp;
+	}
+}
+
+static inline void exchange_highpage(struct page *to, struct page *from)
+{
+	char *vfrom, *vto;
+
+	vfrom = kmap_atomic(from);
+	vto = kmap_atomic(to);
+	exchange_page(vto, vfrom);
+	kunmap_atomic(vto);
+	kunmap_atomic(vfrom);
+}
+
+static void __exchange_gigantic_page(struct page *dst, struct page *src,
+				int nr_pages)
+{
+	int i;
+	struct page *dst_base = dst;
+	struct page *src_base = src;
+
+	for (i = 0; i < nr_pages; ) {
+		cond_resched();
+		exchange_highpage(dst, src);
+
+		i++;
+		dst = mem_map_next(dst, dst_base, i);
+		src = mem_map_next(src, src_base, i);
+	}
+}
+
+static void exchange_huge_page(struct page *dst, struct page *src)
+{
+	int i;
+	int nr_pages;
+
+	if (PageHuge(src)) {
+		/* hugetlbfs page */
+		struct hstate *h = page_hstate(src);
+
+		nr_pages = pages_per_huge_page(h);
+
+		if (unlikely(nr_pages > MAX_ORDER_NR_PAGES)) {
+			__exchange_gigantic_page(dst, src, nr_pages);
+			return;
+		}
+	} else {
+		/* thp page */
+		BUG_ON(!PageTransHuge(src));
+		nr_pages = hpage_nr_pages(src);
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		cond_resched();
+		exchange_highpage(dst + i, src + i);
+	}
+}
+
+/*
+ * Exchange flags and other per-page state (cpupid, memcg, KSM stable
+ * node) between the two pages.
+ */
+static void exchange_page_flags(struct page *to_page, struct page *from_page)
+{
+	int from_cpupid, to_cpupid;
+	struct page_flags from_page_flags, to_page_flags;
+	struct mem_cgroup *to_memcg = page_memcg(to_page),
+			  *from_memcg = page_memcg(from_page);
+
+	from_cpupid = page_cpupid_xchg_last(from_page, -1);
+
+	from_page_flags.page_error = TestClearPageError(from_page);
+	from_page_flags.page_referenced = TestClearPageReferenced(from_page);
+	from_page_flags.page_uptodate = PageUptodate(from_page);
+	ClearPageUptodate(from_page);
+	from_page_flags.page_active = TestClearPageActive(from_page);
+	from_page_flags.page_unevictable = TestClearPageUnevictable(from_page);
+	from_page_flags.page_checked = PageChecked(from_page);
+	ClearPageChecked(from_page);
+	from_page_flags.page_mappedtodisk = PageMappedToDisk(from_page);
+	ClearPageMappedToDisk(from_page);
+	from_page_flags.page_dirty = PageDirty(from_page);
+	ClearPageDirty(from_page);
+	from_page_flags.page_is_young = test_and_clear_page_young(from_page);
+	from_page_flags.page_is_idle = page_is_idle(from_page);
+	clear_page_idle(from_page);
+	from_page_flags.page_swapcache = PageSwapCache(from_page);
+	from_page_flags.page_private = PagePrivate(from_page);
+	ClearPagePrivate(from_page);
+	from_page_flags.page_writeback = test_clear_page_writeback(from_page);
+
+	to_cpupid = page_cpupid_xchg_last(to_page, -1);
+
+	to_page_flags.page_error = TestClearPageError(to_page);
+	to_page_flags.page_referenced = TestClearPageReferenced(to_page);
+	to_page_flags.page_uptodate = PageUptodate(to_page);
+	ClearPageUptodate(to_page);
+	to_page_flags.page_active = TestClearPageActive(to_page);
+	to_page_flags.page_unevictable = TestClearPageUnevictable(to_page);
+	to_page_flags.page_checked = PageChecked(to_page);
+	ClearPageChecked(to_page);
+	to_page_flags.page_mappedtodisk = PageMappedToDisk(to_page);
+	ClearPageMappedToDisk(to_page);
+	to_page_flags.page_dirty = PageDirty(to_page);
+	ClearPageDirty(to_page);
+	to_page_flags.page_is_young = test_and_clear_page_young(to_page);
+	to_page_flags.page_is_idle = page_is_idle(to_page);
+	clear_page_idle(to_page);
+	to_page_flags.page_swapcache = PageSwapCache(to_page);
+	to_page_flags.page_private = PagePrivate(to_page);
+	ClearPagePrivate(to_page);
+	to_page_flags.page_writeback = test_clear_page_writeback(to_page);
+
+	/* set to_page */
+	if (from_page_flags.page_error)
+		SetPageError(to_page);
+	if (from_page_flags.page_referenced)
+		SetPageReferenced(to_page);
+	if (from_page_flags.page_uptodate)
+		SetPageUptodate(to_page);
+	if (from_page_flags.page_active) {
+		VM_BUG_ON_PAGE(from_page_flags.page_unevictable, from_page);
+		SetPageActive(to_page);
+	} else if (from_page_flags.page_unevictable)
+		SetPageUnevictable(to_page);
+	if (from_page_flags.page_checked)
+		SetPageChecked(to_page);
+	if (from_page_flags.page_mappedtodisk)
+		SetPageMappedToDisk(to_page);
+
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (from_page_flags.page_dirty)
+		SetPageDirty(to_page);
+
+	if (from_page_flags.page_is_young)
+		set_page_young(to_page);
+	if (from_page_flags.page_is_idle)
+		set_page_idle(to_page);
+
+	/* set from_page */
+	if (to_page_flags.page_error)
+		SetPageError(from_page);
+	if (to_page_flags.page_referenced)
+		SetPageReferenced(from_page);
+	if (to_page_flags.page_uptodate)
+		SetPageUptodate(from_page);
+	if (to_page_flags.page_active) {
+		VM_BUG_ON_PAGE(to_page_flags.page_unevictable, from_page);
+		SetPageActive(from_page);
+	} else if (to_page_flags.page_unevictable)
+		SetPageUnevictable(from_page);
+	if (to_page_flags.page_checked)
+		SetPageChecked(from_page);
+	if (to_page_flags.page_mappedtodisk)
+		SetPageMappedToDisk(from_page);
+
+	/* Move dirty on pages not done by migrate_page_move_mapping() */
+	if (to_page_flags.page_dirty)
+		SetPageDirty(from_page);
+
+	if (to_page_flags.page_is_young)
+		set_page_young(from_page);
+	if (to_page_flags.page_is_idle)
+		set_page_idle(from_page);
+
+	/*
+	 * Copy NUMA information to the new page, to prevent over-eager
+	 * future migrations of this same page.
+	 */
+	page_cpupid_xchg_last(to_page, from_cpupid);
+	page_cpupid_xchg_last(from_page, to_cpupid);
+
+	ksm_exchange_page(to_page, from_page);
+	/*
+	 * Please do not reorder this without considering how mm/ksm.c's
+	 * get_ksm_page() depends upon ksm_migrate_page() and PageSwapCache().
+	 */
+	ClearPageSwapCache(to_page);
+	ClearPageSwapCache(from_page);
+	if (from_page_flags.page_swapcache)
+		SetPageSwapCache(to_page);
+	if (to_page_flags.page_swapcache)
+		SetPageSwapCache(from_page);
+
+#ifdef CONFIG_PAGE_OWNER
+	/* exchange page owner */
+	BUG();
+#endif
+	/* exchange mem cgroup */
+	to_page->mem_cgroup = from_memcg;
+	from_page->mem_cgroup = to_memcg;
+}
+
+/*
+ * Replace the page in the mapping.
+ *
+ * The number of remaining references must be:
+ * 1 for anonymous pages without a mapping
+ * 2 for pages with a mapping
+ * 3 for pages with a mapping and PagePrivate/PagePrivate2 set.
+ */
+static int exchange_page_move_mapping(struct address_space *to_mapping,
+			struct address_space *from_mapping,
+			struct page *to_page, struct page *from_page,
+			enum migrate_mode mode,
+			int to_extra_count, int from_extra_count)
+{
+	int to_expected_count = 1 + to_extra_count,
+	    from_expected_count = 1 + from_extra_count;
+	unsigned long from_page_index = page_index(from_page),
+		      to_page_index = page_index(to_page);
+	int to_swapbacked = PageSwapBacked(to_page),
+	    from_swapbacked = PageSwapBacked(from_page);
+	struct address_space *to_mapping_value = to_page->mapping,
+			     *from_mapping_value = from_page->mapping;
+
+	if (!to_mapping) {
+		/* Anonymous page without mapping */
+		if (page_count(to_page) != to_expected_count)
+			return -EAGAIN;
+	}
+
+	if (!from_mapping) {
+		/* Anonymous page without mapping */
+		if (page_count(from_page) != from_expected_count)
+			return -EAGAIN;
+	}
+
+	/*
+	 * Now we know that no one else is looking at the page:
+	 * no turning back from here.
+	 */
+	/* from_page */
+	from_page->index = to_page_index;
+	from_page->mapping = to_mapping_value;
+
+	ClearPageSwapBacked(from_page);
+	if (to_swapbacked)
+		SetPageSwapBacked(from_page);
+
+	/* to_page */
+	to_page->index = from_page_index;
+	to_page->mapping = from_mapping_value;
+
+	ClearPageSwapBacked(to_page);
+	if (from_swapbacked)
+		SetPageSwapBacked(to_page);
+
+	return MIGRATEPAGE_SUCCESS;
+}
+
+static int exchange_from_to_pages(struct page *to_page, struct page *from_page,
+				enum migrate_mode mode)
+{
+	int rc = -EBUSY;
+	struct address_space *to_page_mapping, *from_page_mapping;
+
+	VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
+	VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
+
+	/* use page_mapping(), which handles anon and swapcache pages */
+	to_page_mapping = page_mapping(to_page);
+	from_page_mapping = page_mapping(from_page);
+
+	BUG_ON(from_page_mapping);
+	BUG_ON(to_page_mapping);
+
+	BUG_ON(PageWriteback(from_page));
+	BUG_ON(PageWriteback(to_page));
+
+	/* actual page mapping exchange */
+	rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
+				to_page, from_page, mode, 0, 0);
+	if (rc != MIGRATEPAGE_SUCCESS)
+		return rc;
+
+	rc = -EFAULT;
+
+	/* actual page data exchange */
+	if (PageHuge(from_page) || PageTransHuge(from_page))
+		exchange_huge_page(to_page, from_page);
+	else
+		exchange_highpage(to_page, from_page);
+	rc = 0;
+
+	exchange_page_flags(to_page, from_page);
+
+	return rc;
+}
+
+static int unmap_and_exchange(struct page *from_page, struct page *to_page,
+				enum migrate_mode mode)
+{
+	int rc = -EAGAIN;
+	int from_page_was_mapped = 0, to_page_was_mapped = 0;
+	pgoff_t from_index, to_index;
+	struct anon_vma *from_anon_vma = NULL, *to_anon_vma = NULL;
+
+	/* from_page lock down */
+	if (!trylock_page(from_page)) {
+		if ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)
+			goto out;
+
+		lock_page(from_page);
+	}
+
+	BUG_ON(PageWriteback(from_page));
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrate a page.
+	 * This get_anon_vma() delays freeing the anon_vma pointer until the
+	 * end of migration. File cache pages are no problem because of
+	 * page_lock(): file caches may use write_page() or lock_page() in
+	 * migration, so only anonymous pages need care here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(from_page) && !PageKsm(from_page))
+		from_anon_vma = page_get_anon_vma(from_page);
+
+	/* to_page lock down */
+	if (!trylock_page(to_page)) {
+		if ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)
+			goto out_unlock;
+
+		lock_page(to_page);
+	}
+
+	BUG_ON(PageWriteback(to_page));
+
+	/* Same anon_vma considerations as for from_page above. */
+	if (PageAnon(to_page) && !PageKsm(to_page))
+		to_anon_vma = page_get_anon_vma(to_page);
+
+	from_index = from_page->index;
+	to_index = to_page->index;
+
+	/*
+	 * Corner case handling:
+	 * 1. When a new swap-cache page is read into, it is added to the LRU
+	 * and treated as swapcache but it has no rmap yet.
+	 * Calling try_to_unmap() against a page->mapping==NULL page will
+	 * trigger a BUG. So handle it here.
+	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * fs-private metadata. The page can be picked up due to memory
+	 * offlining. Everywhere else except page reclaim, the page is
+	 * invisible to the vm, so the page can not be migrated. So try to
+	 * free the metadata, so the page can be freed.
+	 */
+	if (!from_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(from_page), from_page);
+		if (page_has_private(from_page)) {
+			try_to_free_buffers(from_page);
+			goto out_unlock_both;
+		}
+	} else if (page_mapped(from_page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(from_page) && !PageKsm(from_page) &&
+					!from_anon_vma, from_page);
+		try_to_unmap(from_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+		from_page_was_mapped = 1;
+	}
+
+	if (!to_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(to_page), to_page);
+		if (page_has_private(to_page)) {
+			try_to_free_buffers(to_page);
+			goto out_unlock_both_remove_from_migration_pte;
+		}
+	} else if (page_mapped(to_page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(to_page) && !PageKsm(to_page) &&
+					!to_anon_vma, to_page);
+		try_to_unmap(to_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+		to_page_was_mapped = 1;
+	}
+
+	if (!page_mapped(from_page) && !page_mapped(to_page))
+		rc = exchange_from_to_pages(to_page, from_page, mode);
+
+	/*
+	 * In remove_migration_ptes(), page_walk_vma() assumes
+	 * from_page and to_page have the same index.
+	 * Thus, we restore old_page->index here.
+	 * Here to_page is the old page.
+	 */
+	if (to_page_was_mapped) {
+		if (rc == MIGRATEPAGE_SUCCESS)
+			swap(to_page->index, to_index);
+
+		remove_migration_ptes(to_page,
+			rc == MIGRATEPAGE_SUCCESS ?
+				from_page : to_page, false);
+
+		if (rc == MIGRATEPAGE_SUCCESS)
+			swap(to_page->index, to_index);
+	}
+
+out_unlock_both_remove_from_migration_pte:
+	if (from_page_was_mapped) {
+		if (rc == MIGRATEPAGE_SUCCESS)
+			swap(from_page->index, from_index);
+
+		remove_migration_ptes(from_page,
+			rc == MIGRATEPAGE_SUCCESS ? to_page : from_page, false);
+
+		if (rc == MIGRATEPAGE_SUCCESS)
+			swap(from_page->index, from_index);
+	}
+
+out_unlock_both:
+	if (to_anon_vma)
+		put_anon_vma(to_anon_vma);
+	unlock_page(to_page);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (from_anon_vma)
+		put_anon_vma(from_anon_vma);
+	unlock_page(from_page);
+out:
+	return rc;
+}
+
+/*
+ * Exchange the pages in the exchange_list.
+ *
+ * The caller should release the exchange_list resources.
+ */
+int exchange_pages(struct list_head *exchange_list,
+			enum migrate_mode mode,
+			int reason)
+{
+	struct exchange_page_info *one_pair, *one_pair2;
+	int failed = 0;
+
+	list_for_each_entry_safe(one_pair, one_pair2, exchange_list, list) {
+		struct page *from_page = one_pair->from_page;
+		struct page *to_page = one_pair->to_page;
+		int rc;
+		int retry = 0;
+
+again:
+		if (page_count(from_page) == 1) {
+			/* page was freed from under us. So we are done */
+			ClearPageActive(from_page);
+			ClearPageUnevictable(from_page);
+
+			put_page(from_page);
+			dec_node_page_state(from_page, NR_ISOLATED_ANON +
+					page_is_file_cache(from_page));
+
+			if (page_count(to_page) == 1) {
+				ClearPageActive(to_page);
+				ClearPageUnevictable(to_page);
+				put_page(to_page);
+			} else
+				goto putback_to_page;
+
+			continue;
+		}
+
+		if (page_count(to_page) == 1) {
+			/* page was freed from under us. So we are done */
+			ClearPageActive(to_page);
+			ClearPageUnevictable(to_page);
+
+			put_page(to_page);
+
+			dec_node_page_state(to_page, NR_ISOLATED_ANON +
+					page_is_file_cache(to_page));
+
+			dec_node_page_state(from_page, NR_ISOLATED_ANON +
+					page_is_file_cache(from_page));
+			putback_lru_page(from_page);
+			continue;
+		}
+
+		/* TODO: compound pages are not supported yet */
+		if (PageCompound(from_page) || page_mapping(from_page)) {
+			++failed;
+			goto putback;
+		}
+
+		rc = unmap_and_exchange(from_page, to_page, mode);
+
+		if (rc == -EAGAIN && retry < 3) {
+			++retry;
+			goto again;
+		}
+
+		if (rc != MIGRATEPAGE_SUCCESS)
+			++failed;
+
+putback:
+		dec_node_page_state(from_page, NR_ISOLATED_ANON +
+				page_is_file_cache(from_page));
+
+		putback_lru_page(from_page);
+putback_to_page:
+		dec_node_page_state(to_page, NR_ISOLATED_ANON +
+				page_is_file_cache(to_page));
+
+		putback_lru_page(to_page);
+	}
+
+	return failed;
+}
diff --git a/mm/ksm.c b/mm/ksm.c
index fc64874..e5b492b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2716,6 +2716,41 @@ void ksm_migrate_page(struct page *newpage, struct page *oldpage)
 		set_page_stable_node(oldpage, NULL);
 	}
 }
+
+void ksm_exchange_page(struct page *to_page, struct page *from_page)
+{
+	struct stable_node *to_stable_node, *from_stable_node;
+
+	VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
+	VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
+
+	to_stable_node = page_stable_node(to_page);
+	from_stable_node = page_stable_node(from_page);
+	if (to_stable_node) {
+		VM_BUG_ON_PAGE(to_stable_node->kpfn != page_to_pfn(from_page),
+				from_page);
+		to_stable_node->kpfn = page_to_pfn(to_page);
+		/*
+		 * newpage->mapping was set in advance; now we need smp_wmb()
+		 * to make sure that the new stable_node->kpfn is visible
+		 * to get_ksm_page() before it can see that oldpage->mapping
+		 * has gone stale (or that PageSwapCache has been cleared).
+		 */
+		smp_wmb();
+	}
+	if (from_stable_node) {
+		VM_BUG_ON_PAGE(from_stable_node->kpfn != page_to_pfn(to_page),
+				to_page);
+		from_stable_node->kpfn = page_to_pfn(from_page);
+		/*
+		 * Same ordering requirement as above: make the new
+		 * stable_node->kpfn visible to get_ksm_page() before the
+		 * old mapping (or PageSwapCache) can be seen as stale.
+		 */
+		smp_wmb();
+	}
+}
 #endif /* CONFIG_MIGRATION */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
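For reference, a minimal caller sketch of the interface added above (not part of this patch; the function name is invented for illustration). It assumes both pages were already isolated from their LRU lists, since exchange_pages() itself drops the NR_ISOLATED_* counts and puts the pages back:

/*
 * Hypothetical usage sketch, not part of this series: exchange one
 * pair of isolated anonymous pages in place.
 */
static int exchange_two_pages_sketch(struct page *hot_page,
				     struct page *cold_page)
{
	struct exchange_page_info one_pair = {
		.from_page = hot_page,	/* e.g. a hot page on a slow node */
		.to_page = cold_page,	/* e.g. a cold page on a fast node */
	};
	LIST_HEAD(exchange_list);

	list_add(&one_pair.list, &exchange_list);

	/* returns the number of pairs that failed to exchange */
	return exchange_pages(&exchange_list, MIGRATE_SYNC, MR_NUMA_MISPLACED);
}

Because the pair is swapped in place, no page is allocated on either node, which is the point of the mechanism.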
From patchwork Thu Apr 4 02:00:34 2019

From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 13/25] exchange pages: add multi-threaded exchange pages.
Date: Wed, 3 Apr 2019 19:00:34 -0700
Message-Id: <20190404020046.32741-14-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Exchange two pages using multiple threads. Exchange two lists of pages
using multiple threads.

Signed-off-by: Zi Yan
---
 mm/Makefile        |   1 +
 mm/exchange.c      |  15 ++--
 mm/exchange_page.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++
 mm/internal.h      |   5 ++
 4 files changed, 245 insertions(+), 5 deletions(-)
 create mode 100644 mm/exchange_page.c

diff --git a/mm/Makefile b/mm/Makefile
index 5e6c591..2f1f1ad 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -46,6 +46,7 @@ obj-y += memblock.o
 
 obj-y += copy_page.o
 obj-y += exchange.o
+obj-y += exchange_page.o
 
 ifdef CONFIG_MMU
 	obj-$(CONFIG_ADVISE_SYSCALLS)	+= madvise.o
diff --git a/mm/exchange.c b/mm/exchange.c
index 626bbea..ce2c899 100644
--- a/mm/exchange.c
+++ b/mm/exchange.c
@@ -345,11 +345,16 @@ static int exchange_from_to_pages(struct page *to_page, struct page *from_page,
 
 	rc = -EFAULT;
 
-	if (PageHuge(from_page) || PageTransHuge(from_page))
-		exchange_huge_page(to_page, from_page);
-	else
-		exchange_highpage(to_page, from_page);
-	rc = 0;
+	if (mode & MIGRATE_MT)
+		rc = exchange_page_mthread(to_page, from_page,
+			hpage_nr_pages(from_page));
+	if (rc) {
+		if (PageHuge(from_page) || PageTransHuge(from_page))
+			exchange_huge_page(to_page, from_page);
+		else
+			exchange_highpage(to_page, from_page);
+		rc = 0;
+	}
 
 	exchange_page_flags(to_page, from_page);
 
diff --git a/mm/exchange_page.c b/mm/exchange_page.c
new file mode 100644
index 0000000..6054697
--- /dev/null
+++ b/mm/exchange_page.c
@@ -0,0 +1,229 @@
+/*
+ * Exchange page copy routine.
+ *
+ * Copyright 2019 by NVIDIA.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Authors: Zi Yan
+ */
+#include
+#include
+#include
+#include
+
+/*
+ * limit_mt_num caps the number of copy threads for a given node on
+ * any architecture; the actual number of copy threads is further
+ * limited by the cpumask weight of the target node.
+ */
+extern unsigned int limit_mt_num;
+
+struct copy_page_info {
+	struct work_struct copy_page_work;
+	char *to;
+	char *from;
+	unsigned long chunk_size;
+};
+
+static void exchange_page_routine(char *to, char *from, unsigned long chunk_size)
+{
+	u64 tmp;
+	int i;
+
+	for (i = 0; i < chunk_size; i += sizeof(tmp)) {
+		tmp = *((u64 *)(from + i));
+		*((u64 *)(from + i)) = *((u64 *)(to + i));
+		*((u64 *)(to + i)) = tmp;
+	}
+}
+
+static void exchange_page_work_queue_thread(struct work_struct *work)
+{
+	struct copy_page_info *my_work = (struct copy_page_info *)work;
+
+	exchange_page_routine(my_work->to,
+			my_work->from,
+			my_work->chunk_size);
+}
+
+int exchange_page_mthread(struct page *to, struct page *from, int nr_pages)
+{
+	int total_mt_num = limit_mt_num;
+	int to_node = page_to_nid(to);
+	int i;
+	struct copy_page_info *work_items;
+	char *vto, *vfrom;
+	unsigned long chunk_size;
+	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
+	int cpu_id_list[32] = {0};
+	int cpu;
+
+	total_mt_num = min_t(unsigned int, total_mt_num,
+			cpumask_weight(per_node_cpumask));
+
+	if (total_mt_num > 1)
+		total_mt_num = (total_mt_num / 2) * 2;
+
+	if (total_mt_num > 32 || total_mt_num < 1)
+		return -ENODEV;
+
+	work_items = kvzalloc(sizeof(struct copy_page_info) * total_mt_num,
+			GFP_KERNEL);
+	if (!work_items)
+		return -ENOMEM;
+
+	i = 0;
+	for_each_cpu(cpu, per_node_cpumask) {
+		if (i >= total_mt_num)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	/* XXX: assume no highmem */
+	vfrom = kmap(from);
+	vto = kmap(to);
+	chunk_size = PAGE_SIZE * nr_pages / total_mt_num;
+
+	for (i = 0; i < total_mt_num; ++i) {
+		INIT_WORK((struct work_struct *)&work_items[i],
+				exchange_page_work_queue_thread);
+
+		work_items[i].to = vto + i * chunk_size;
+		work_items[i].from = vfrom + i * chunk_size;
+		work_items[i].chunk_size = chunk_size;
+
+		queue_work_on(cpu_id_list[i],
+				system_highpri_wq,
+				(struct work_struct *)&work_items[i]);
+	}
+
+	/* Wait until all chunk swaps finish */
+	flush_workqueue(system_highpri_wq);
+
+	kunmap(to);
+	kunmap(from);
+
+	kvfree(work_items);
+
+	return 0;
+}
+
+int exchange_page_lists_mthread(struct page **to, struct page **from, int nr_pages)
+{
+	int err = 0;
+	unsigned int total_mt_num = limit_mt_num;
+	int to_node = page_to_nid(*to);
+	int i;
+	struct copy_page_info *work_items;
+	int nr_pages_per_page = hpage_nr_pages(*from);
+	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
+	int cpu_id_list[32] = {0};
+	int cpu;
+	int item_idx;
+
+	total_mt_num = min_t(unsigned int, total_mt_num,
+			cpumask_weight(per_node_cpumask));
+
+	if (total_mt_num > 32 || total_mt_num < 1)
+		return -ENODEV;
+
+	if (nr_pages < total_mt_num) {
+		int residual_nr_pages = nr_pages - rounddown_pow_of_two(nr_pages);
+
+		if (residual_nr_pages) {
+			for (i = 0; i < residual_nr_pages; ++i) {
+				BUG_ON(hpage_nr_pages(to[i]) !=
+						hpage_nr_pages(from[i]));
+				err = exchange_page_mthread(to[i], from[i],
+						hpage_nr_pages(to[i]));
+				VM_BUG_ON(err);
+			}
+			nr_pages = rounddown_pow_of_two(nr_pages);
+			to = &to[residual_nr_pages];
+			from = &from[residual_nr_pages];
+		}
+
+		work_items = kvzalloc(sizeof(struct copy_page_info) * total_mt_num,
+				GFP_KERNEL);
+	} else
+		work_items = kvzalloc(sizeof(struct copy_page_info) * nr_pages,
+				GFP_KERNEL);
+	if (!work_items)
+		return -ENOMEM;
+
+	i = 0;
+	for_each_cpu(cpu, per_node_cpumask) {
+		if (i >= total_mt_num)
+			break;
+		cpu_id_list[i] = cpu;
+		++i;
+	}
+
+	if (nr_pages < total_mt_num) {
+		for (cpu = 0; cpu < total_mt_num; ++cpu)
+			INIT_WORK((struct work_struct *)&work_items[cpu],
+					exchange_page_work_queue_thread);
+		cpu = 0;
+		for (item_idx = 0; item_idx < nr_pages; ++item_idx) {
+			unsigned long chunk_size = nr_pages * PAGE_SIZE *
+				hpage_nr_pages(from[item_idx]) / total_mt_num;
+			char *vfrom = kmap(from[item_idx]);
+			char *vto = kmap(to[item_idx]);
+
+			VM_BUG_ON(PAGE_SIZE * hpage_nr_pages(from[item_idx]) %
+					total_mt_num);
+			VM_BUG_ON(total_mt_num % nr_pages);
+			BUG_ON(hpage_nr_pages(to[item_idx]) !=
+					hpage_nr_pages(from[item_idx]));
+
+			for (i = 0; i < (total_mt_num / nr_pages); ++cpu, ++i) {
+				work_items[cpu].to = vto + chunk_size * i;
+				work_items[cpu].from = vfrom + chunk_size * i;
+				work_items[cpu].chunk_size = chunk_size;
+			}
+		}
+		if (cpu != total_mt_num)
+			pr_err("%s: only %d out of %d pages are transferred\n",
+				__func__, cpu - 1, total_mt_num);
+
+		for (cpu = 0; cpu < total_mt_num; ++cpu)
+			queue_work_on(cpu_id_list[cpu],
+					system_highpri_wq,
+					(struct work_struct *)&work_items[cpu]);
+	} else {
+		for (i = 0; i < nr_pages; ++i) {
+			int thread_idx = i % total_mt_num;
+
+			INIT_WORK((struct work_struct *)&work_items[i],
+					exchange_page_work_queue_thread);
+
+			/* XXX: assume no highmem */
+			work_items[i].to = kmap(to[i]);
+			work_items[i].from = kmap(from[i]);
+			work_items[i].chunk_size = PAGE_SIZE * hpage_nr_pages(from[i]);
+
+			BUG_ON(hpage_nr_pages(to[i]) != hpage_nr_pages(from[i]));
+
+			queue_work_on(cpu_id_list[thread_idx],
+					system_highpri_wq,
+					(struct work_struct *)&work_items[i]);
+		}
+	}
+
+	/* Wait until all chunk swaps finish */
+	flush_workqueue(system_highpri_wq);
+
+	for (i = 0; i < nr_pages; ++i) {
+		kunmap(to[i]);
+		kunmap(from[i]);
+	}
+
+	kvfree(work_items);
+
+	return err;
+}
diff --git a/mm/internal.h b/mm/internal.h
index 51f5e1b..a039459 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -561,4 +561,9 @@ extern int copy_page_lists_dma_always(struct page **to,
 extern int copy_page_lists_mt(struct page **to,
 		struct page **from, int nr_pages);
 
+extern int exchange_page_mthread(struct page *to, struct page *from,
+		int nr_pages);
+extern int exchange_page_lists_mthread(struct page **to,
+		struct page **from,
+		int nr_pages);
 #endif /* __MM_INTERNAL_H */
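To make the work split in exchange_page_mthread() concrete: for a 2MB THP (nr_pages = 512) and four selected CPUs, chunk_size = 512 * 4KB / 4 = 512KB, so worker i swaps the byte range [i * 512KB, (i + 1) * 512KB). The fan-out/fan-in pattern it relies on is sketched below in distilled form; the struct and function names here are illustrative, not the patch's:

/*
 * Distilled sketch of the pattern used above.  Each work item swaps
 * one contiguous chunk on a designated CPU; the caller queues one item
 * per CPU and then waits for all of them.
 */
struct chunk_swap_work {
	struct work_struct work;
	char *a, *b;		/* kmap()ed bases of the two pages */
	unsigned long len;	/* chunk length in bytes */
};

static void chunk_swap_fn(struct work_struct *work)
{
	struct chunk_swap_work *w =
		container_of(work, struct chunk_swap_work, work);
	unsigned long off;
	u64 tmp;

	/* word-at-a-time swap of one chunk, as in exchange_page_routine() */
	for (off = 0; off < w->len; off += sizeof(tmp)) {
		tmp = *(u64 *)(w->a + off);
		*(u64 *)(w->a + off) = *(u64 *)(w->b + off);
		*(u64 *)(w->b + off) = tmp;
	}
}

Each such item is queued with queue_work_on() onto system_highpri_wq for a CPU of the target node, and the caller synchronizes with flush_workqueue(), exactly as exchange_page_mthread() does above.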
From patchwork Thu Apr 4 02:00:35 2019

From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 14/25] exchange pages: concurrent exchange pages.
Date: Wed, 3 Apr 2019 19:00:35 -0700
Message-Id: <20190404020046.32741-15-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

It unmaps two lists of pages, then exchanges them in
exchange_page_lists_mthread(), and finally remaps both lists of pages.

Signed-off-by: Zi Yan
---
 include/linux/exchange.h |   2 +
 mm/exchange.c            | 397 +++++++++++++++++++++++++++++++++++++++++++++++
 mm/exchange_page.c       |   1 -
 3 files changed, 399 insertions(+), 1 deletion(-)

diff --git a/include/linux/exchange.h b/include/linux/exchange.h
index 778068e..20d2184 100644
--- a/include/linux/exchange.h
+++ b/include/linux/exchange.h
@@ -20,4 +20,6 @@ struct exchange_page_info {
 int exchange_pages(struct list_head *exchange_list,
 		enum migrate_mode mode,
 		int reason);
+int exchange_pages_concur(struct list_head *exchange_list,
+		enum migrate_mode mode, int reason);
 #endif /* _LINUX_EXCHANGE_H */
diff --git a/mm/exchange.c b/mm/exchange.c
index ce2c899..bbada58 100644
--- a/mm/exchange.c
+++ b/mm/exchange.c
@@ -600,3 +600,400 @@ int exchange_pages(struct list_head *exchange_list,
 	}
 	return failed;
 }
+
+static int unmap_pair_pages_concur(struct exchange_page_info *one_pair,
+				int force, enum migrate_mode mode)
+{
+	int rc = -EAGAIN;
+	struct anon_vma *anon_vma_from_page = NULL, *anon_vma_to_page = NULL;
+	struct page *from_page = one_pair->from_page;
+	struct page *to_page = one_pair->to_page;
+
+	/* from_page lock down */
+	if (!trylock_page(from_page)) {
+		if (!force || ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC))
+			goto out;
+
+		lock_page(from_page);
+	}
+
+	BUG_ON(PageWriteback(from_page));
+
+	/*
+	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
+	 * we cannot notice that anon_vma is freed while we migrate a page.
+	 * This get_anon_vma() delays freeing the anon_vma pointer until the
+	 * end of migration.
+	 * File cache pages are no problem because of page_lock():
+	 * file caches may use write_page() or lock_page() in migration, so
+	 * only anonymous pages need care here.
+	 *
+	 * Only page_get_anon_vma() understands the subtleties of
+	 * getting a hold on an anon_vma from outside one of its mms.
+	 * But if we cannot get anon_vma, then we won't need it anyway,
+	 * because that implies that the anon page is no longer mapped
+	 * (and cannot be remapped so long as we hold the page lock).
+	 */
+	if (PageAnon(from_page) && !PageKsm(from_page))
+		one_pair->from_anon_vma = anon_vma_from_page
+					= page_get_anon_vma(from_page);
+
+	/* to_page lock down */
+	if (!trylock_page(to_page)) {
+		if (!force || ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC))
+			goto out_unlock;
+
+		lock_page(to_page);
+	}
+
+	BUG_ON(PageWriteback(to_page));
+
+	/* Same anon_vma considerations as for from_page above. */
+	if (PageAnon(to_page) && !PageKsm(to_page))
+		one_pair->to_anon_vma = anon_vma_to_page =
+					page_get_anon_vma(to_page);
+
+	/*
+	 * Corner case handling:
+	 * 1. When a new swap-cache page is read into, it is added to the LRU
+	 * and treated as swapcache but it has no rmap yet.
+	 * Calling try_to_unmap() against a page->mapping==NULL page will
+	 * trigger a BUG. So handle it here.
+	 * 2. An orphaned page (see truncate_complete_page) might have
+	 * fs-private metadata. The page can be picked up due to memory
+	 * offlining. Everywhere else except page reclaim, the page is
+	 * invisible to the vm, so the page can not be migrated. So try to
+	 * free the metadata, so the page can be freed.
+	 */
+	if (!from_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(from_page), from_page);
+		if (page_has_private(from_page)) {
+			try_to_free_buffers(from_page);
+			goto out_unlock_both;
+		}
+	} else if (page_mapped(from_page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(from_page) && !PageKsm(from_page) &&
+				!anon_vma_from_page, from_page);
+		try_to_unmap(from_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+
+		one_pair->from_page_was_mapped = 1;
+	}
+
+	if (!to_page->mapping) {
+		VM_BUG_ON_PAGE(PageAnon(to_page), to_page);
+		if (page_has_private(to_page)) {
+			try_to_free_buffers(to_page);
+			goto out_unlock_both;
+		}
+	} else if (page_mapped(to_page)) {
+		/* Establish migration ptes */
+		VM_BUG_ON_PAGE(PageAnon(to_page) && !PageKsm(to_page) &&
+				!anon_vma_to_page, to_page);
+		try_to_unmap(to_page,
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+
+		one_pair->to_page_was_mapped = 1;
+	}
+
+	return MIGRATEPAGE_SUCCESS;
+
+out_unlock_both:
+	if (anon_vma_to_page)
+		put_anon_vma(anon_vma_to_page);
+	unlock_page(to_page);
+out_unlock:
+	/* Drop an anon_vma reference if we took one */
+	if (anon_vma_from_page)
+		put_anon_vma(anon_vma_from_page);
+	unlock_page(from_page);
+out:
+	return rc;
+}
+
+static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,
+					struct list_head *exchange_list_ptr,
+					enum migrate_mode mode)
+{
+	int rc = -EBUSY;
+	int nr_failed = 0;
+	struct address_space *to_page_mapping, *from_page_mapping;
+	struct exchange_page_info *one_pair, *one_pair2;
+
+	list_for_each_entry_safe(one_pair, one_pair2, unmapped_list_ptr, list) {
+		struct page *from_page = one_pair->from_page;
+		struct page *to_page = one_pair->to_page;
+
+		VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
+		VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
+
+		/* use page_mapping(), which handles anon and swapcache pages */
+		to_page_mapping = page_mapping(to_page);
+		from_page_mapping = page_mapping(from_page);
+
+		BUG_ON(from_page_mapping);
+		BUG_ON(to_page_mapping);
+
+		BUG_ON(PageWriteback(from_page));
+		BUG_ON(PageWriteback(to_page));
+
+		/* actual page mapping exchange */
+		rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
+					to_page, from_page, mode, 0, 0);
+
+		if (rc) {
+			if (one_pair->from_page_was_mapped)
+				remove_migration_ptes(from_page, from_page, false);
+			if (one_pair->to_page_was_mapped)
+				remove_migration_ptes(to_page, to_page, false);
+
+			if (one_pair->from_anon_vma)
+				put_anon_vma(one_pair->from_anon_vma);
+			unlock_page(from_page);
+
+			if (one_pair->to_anon_vma)
+				put_anon_vma(one_pair->to_anon_vma);
+			unlock_page(to_page);
+
+			mod_node_page_state(page_pgdat(from_page), NR_ISOLATED_ANON +
+					page_is_file_cache(from_page),
+					-hpage_nr_pages(from_page));
+			putback_lru_page(from_page);
+
+			mod_node_page_state(page_pgdat(to_page), NR_ISOLATED_ANON +
+					page_is_file_cache(to_page),
+					-hpage_nr_pages(to_page));
+			putback_lru_page(to_page);
+
+			one_pair->from_page = NULL;
+			one_pair->to_page = NULL;
+
+			list_move(&one_pair->list, exchange_list_ptr);
+			++nr_failed;
+		}
+	}
+
+	return nr_failed;
+}
+
+static int exchange_page_data_concur(struct list_head *unmapped_list_ptr,
+					enum migrate_mode mode)
+{
+	struct exchange_page_info *one_pair;
+	int num_pages = 0, idx = 0;
+	struct page **src_page_list = NULL, **dst_page_list = NULL;
+	unsigned long size = 0;
+	int rc = -EFAULT;
+
+	if (list_empty(unmapped_list_ptr))
+		return 0;
+
+	/* form page list */
+	list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+		++num_pages;
+		size += PAGE_SIZE * hpage_nr_pages(one_pair->from_page);
+	}
+
+	src_page_list = kzalloc(sizeof(struct page *) * num_pages, GFP_KERNEL);
+	if (!src_page_list)
+		return -ENOMEM;
+	dst_page_list = kzalloc(sizeof(struct page *) * num_pages, GFP_KERNEL);
+	if (!dst_page_list)
+		return -ENOMEM;
+
+	list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+		src_page_list[idx] = one_pair->from_page;
+		dst_page_list[idx] = one_pair->to_page;
+		++idx;
+	}
+
+	BUG_ON(idx != num_pages);
+
+	if (mode & MIGRATE_MT)
+		rc = exchange_page_lists_mthread(dst_page_list, src_page_list,
+				num_pages);
+
+	if (rc) {
+		list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+			if (PageHuge(one_pair->from_page) ||
+			    PageTransHuge(one_pair->from_page)) {
+				exchange_huge_page(one_pair->to_page,
+						one_pair->from_page);
+			} else {
+				exchange_highpage(one_pair->to_page,
+						one_pair->from_page);
+			}
+		}
+	}
+
+	kfree(src_page_list);
+	kfree(dst_page_list);
+
+	list_for_each_entry(one_pair, unmapped_list_ptr, list) {
+		exchange_page_flags(one_pair->to_page, one_pair->from_page);
+	}
+
+	return rc;
+}
+
+static int remove_migration_ptes_concur(struct list_head *unmapped_list_ptr)
+{
+	struct exchange_page_info *iterator;
+
+	list_for_each_entry(iterator, unmapped_list_ptr, list) {
+		remove_migration_ptes(iterator->from_page, iterator->to_page, false);
+		remove_migration_ptes(iterator->to_page, iterator->from_page, false);
+
+		if (iterator->from_anon_vma)
+			put_anon_vma(iterator->from_anon_vma);
+		unlock_page(iterator->from_page);
+
+		if (iterator->to_anon_vma)
+			put_anon_vma(iterator->to_anon_vma);
+		unlock_page(iterator->to_page);
+
+		putback_lru_page(iterator->from_page);
+		iterator->from_page = NULL;
+
+		putback_lru_page(iterator->to_page);
+		iterator->to_page = NULL;
+	}
+
+	return 0;
+}
+
+int exchange_pages_concur(struct list_head *exchange_list,
+		enum migrate_mode mode, int reason)
+{
+	struct exchange_page_info *one_pair, *one_pair2;
+	int pass = 0;
+	int retry = 1;
+	int nr_failed = 0;
+	int nr_succeeded = 0;
+	int rc = 0;
+	LIST_HEAD(serialized_list);
+	LIST_HEAD(unmapped_list);
+
+	for (pass = 0; pass < 1 && retry; pass++) {
+		retry = 0;
+
+		/* unmap and get new page for page_mapping(page) == NULL */
+		list_for_each_entry_safe(one_pair, one_pair2, exchange_list, list) {
+			struct page *from_page = one_pair->from_page;
+			struct page *to_page = one_pair->to_page;
+
+			cond_resched();
+
+			if (page_count(from_page) == 1) {
+				/* page was freed from under us. So we are done */
+				ClearPageActive(from_page);
+				ClearPageUnevictable(from_page);
+
+				put_page(from_page);
+				dec_node_page_state(from_page, NR_ISOLATED_ANON +
+						page_is_file_cache(from_page));
+
+				if (page_count(to_page) == 1) {
+					ClearPageActive(to_page);
+					ClearPageUnevictable(to_page);
+					put_page(to_page);
+				} else {
+					mod_node_page_state(page_pgdat(to_page),
+						NR_ISOLATED_ANON +
+						page_is_file_cache(to_page),
+						-hpage_nr_pages(to_page));
+					putback_lru_page(to_page);
+				}
+				list_del(&one_pair->list);
+
+				continue;
+			}
+
+			if (page_count(to_page) == 1) {
+				/* page was freed from under us. So we are done */
+				ClearPageActive(to_page);
+				ClearPageUnevictable(to_page);
+
+				put_page(to_page);
+
+				dec_node_page_state(to_page, NR_ISOLATED_ANON +
+						page_is_file_cache(to_page));
+
+				mod_node_page_state(page_pgdat(from_page),
+					NR_ISOLATED_ANON +
+					page_is_file_cache(from_page),
+					-hpage_nr_pages(from_page));
+				putback_lru_page(from_page);
+
+				list_del(&one_pair->list);
+				continue;
+			}
+
+			/*
+			 * We do not exchange huge pages and file-backed pages
+			 * concurrently.
+			 */
+			if (PageHuge(one_pair->from_page) ||
+			    PageHuge(one_pair->to_page)) {
+				rc = -ENODEV;
+			} else if ((page_mapping(one_pair->from_page) != NULL) ||
+				   (page_mapping(one_pair->to_page) != NULL)) {
+				rc = -ENODEV;
+			} else
+				rc = unmap_pair_pages_concur(one_pair, 1, mode);
+
+			switch (rc) {
+			case -ENODEV:
+				list_move(&one_pair->list, &serialized_list);
+				break;
+			case -ENOMEM:
+				goto out;
+			case -EAGAIN:
+				retry++;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				list_move(&one_pair->list, &unmapped_list);
+				nr_succeeded++;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, -ENOSYS, etc.):
+				 * unlike the -EAGAIN case, the failed page is
+				 * removed from the migration page list and not
+				 * retried in the next outer loop.
+				 */
+				list_move(&one_pair->list, &serialized_list);
+				nr_failed++;
+				break;
+			}
+		}
+
+		/* move page->mapping to new page, only -EAGAIN could happen */
+		exchange_page_mapping_concur(&unmapped_list, exchange_list, mode);
+
+		/* copy pages in unmapped_list */
+		exchange_page_data_concur(&unmapped_list, mode);
+
+		/*
+		 * Remove migration ptes, unlock old and new pages, put
+		 * anon_vma, and put back old and new pages.
+		 */
+		remove_migration_ptes_concur(&unmapped_list);
+	}
+
+	nr_failed += retry;
+	rc = nr_failed;
+
+	exchange_pages(&serialized_list, mode, reason);
+out:
+	list_splice(&unmapped_list, exchange_list);
+	list_splice(&serialized_list, exchange_list);
+
+	return nr_failed ? -EFAULT : 0;
+}
diff --git a/mm/exchange_page.c b/mm/exchange_page.c
index 6054697..5dba0a6 100644
--- a/mm/exchange_page.c
+++ b/mm/exchange_page.c
@@ -126,7 +126,6 @@ int exchange_page_lists_mthread(struct page **to, struct page **from, int nr_pag
 	int to_node = page_to_nid(*to);
 	int i;
 	struct copy_page_info *work_items;
-	int nr_pages_per_page = hpage_nr_pages(*from);
 	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
 	int cpu_id_list[32] = {0};
 	int cpu;
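The concurrent path added above is a batched four-phase pipeline: unmap_pair_pages_concur() locks and unmaps every pair, exchange_page_mapping_concur() swaps page->mapping and page->index, exchange_page_data_concur() swaps the data (multi-threaded when MIGRATE_MT is set), and remove_migration_ptes_concur() remaps and unlocks everything; pairs the pipeline cannot handle (hugetlbfs or file-backed pages) are moved to serialized_list and handed to the serial exchange_pages(). A hypothetical caller sketch (not part of the patch; it assumes the listed pages are already isolated and that MIGRATE_MT is the mode flag introduced earlier in this series):

	/* Sketch only: exchange a batch of isolated page pairs. */
	LIST_HEAD(exchange_list);
	int ret;

	/* ... add struct exchange_page_info entries for isolated pairs ... */

	ret = exchange_pages_concur(&exchange_list,
				MIGRATE_SYNC | MIGRATE_MT,
				MR_NUMA_MISPLACED);
	if (ret)
		/* at least one pair could not be exchanged */
		pr_debug("exchange_pages_concur: some pairs failed\n");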

From patchwork Thu Apr  4 02:00:36 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884765
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A . Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 15/25] exchange pages: exchange anonymous page and
 file-backed page.
Date: Wed,  3 Apr 2019 19:00:36 -0700
Message-Id: <20190404020046.32741-16-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

This is only done in the basic exchange path, because the concurrent
exchange path might need to lock multiple files at the same time, which
could easily cause deadlocks.

Signed-off-by: Zi Yan
---
 mm/exchange.c | 284 ++++++++++++++++++++++++++++++++++++++++++++++------------
 mm/internal.h |   9 ++
 mm/migrate.c  |   6 +-
 3 files changed, 241 insertions(+), 58 deletions(-)

diff --git a/mm/exchange.c b/mm/exchange.c
index bbada58..555a72c 100644
--- a/mm/exchange.c
+++ b/mm/exchange.c
@@ -20,6 +20,8 @@
 #include
 #include
 #include
+#include	/* buffer_migrate_page */
+#include

 #include "internal.h"

@@ -147,8 +149,6 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 	from_page_flags.page_is_idle = page_is_idle(from_page);
 	clear_page_idle(from_page);
 	from_page_flags.page_swapcache = PageSwapCache(from_page);
-	from_page_flags.page_private = PagePrivate(from_page);
-	ClearPagePrivate(from_page);
 	from_page_flags.page_writeback = test_clear_page_writeback(from_page);

@@ -170,8 +170,6 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 	to_page_flags.page_is_idle = page_is_idle(to_page);
 	clear_page_idle(to_page);
 	to_page_flags.page_swapcache = PageSwapCache(to_page);
-	to_page_flags.page_private = PagePrivate(to_page);
-	ClearPagePrivate(to_page);
 	to_page_flags.page_writeback = test_clear_page_writeback(to_page);

 	/* set to_page */
@@ -268,18 +266,22 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 static int exchange_page_move_mapping(struct address_space *to_mapping,
 			struct address_space *from_mapping,
 			struct page *to_page, struct page *from_page,
+			struct buffer_head *to_head, struct buffer_head *from_head,
 			enum migrate_mode mode,
 			int to_extra_count, int from_extra_count)
 {
-	int to_expected_count = 1 + to_extra_count,
-	    from_expected_count = 1 + from_extra_count;
-	unsigned long from_page_index = page_index(from_page),
-		      to_page_index = page_index(to_page);
+	int to_expected_count = expected_page_refs(to_mapping, to_page) + to_extra_count,
+		from_expected_count = expected_page_refs(from_mapping, from_page) + from_extra_count;
+	unsigned long from_page_index = from_page->index;
+	unsigned long to_page_index = to_page->index;
 	int to_swapbacked = PageSwapBacked(to_page),
 		from_swapbacked = PageSwapBacked(from_page);
-	struct address_space *to_mapping_value = to_page->mapping,
-		*from_mapping_value = from_page->mapping;
+	struct address_space *to_mapping_value = to_page->mapping;
+	struct address_space *from_mapping_value = from_page->mapping;
+
+	VM_BUG_ON_PAGE(to_mapping != page_mapping(to_page), to_page);
+	VM_BUG_ON_PAGE(from_mapping != page_mapping(from_page), from_page);
+	VM_BUG_ON(PageCompound(from_page) != PageCompound(to_page));

 	if (!to_mapping) {
 		/* Anonymous page without mapping */
@@ -293,26 +295,125 @@ static int exchange_page_move_mapping(struct address_space *to_mapping,
 		return -EAGAIN;
 	}

-	/*
-	 * Now we know that no one else is looking at the page:
-	 * no turning back from here.
-	 */
-	/* from_page */
-	from_page->index = to_page_index;
-	from_page->mapping = to_mapping_value;
+	/* both are anonymous pages */
+	if (!from_mapping && !to_mapping) {
+		/* from_page */
+		from_page->index = to_page_index;
+		from_page->mapping = to_mapping_value;
+
+		ClearPageSwapBacked(from_page);
+		if (to_swapbacked)
+			SetPageSwapBacked(from_page);
+
+		/* to_page */
+		to_page->index = from_page_index;
+		to_page->mapping = from_mapping_value;
+
+		ClearPageSwapBacked(to_page);
+		if (from_swapbacked)
+			SetPageSwapBacked(to_page);
+	} else if (!from_mapping && to_mapping) {
+		/* from is anonymous, to is file-backed */
+		XA_STATE(to_xas, &to_mapping->i_pages, page_index(to_page));
+		struct zone *from_zone, *to_zone;
+		int dirty;
+
+		from_zone = page_zone(from_page);
+		to_zone = page_zone(to_page);
+
+		xas_lock_irq(&to_xas);
+
+		if (page_count(to_page) != to_expected_count ||
+			xas_load(&to_xas) != to_page) {
+			xas_unlock_irq(&to_xas);
+			return -EAGAIN;
+		}
+
+		if (!page_ref_freeze(to_page, to_expected_count)) {
+			xas_unlock_irq(&to_xas);
+			pr_debug("cannot freeze page count\n");
+			return -EAGAIN;
+		}
+
+		if (!page_ref_freeze(from_page, from_expected_count)) {
+			page_ref_unfreeze(to_page, to_expected_count);
+			xas_unlock_irq(&to_xas);
+
+			return -EAGAIN;
+		}
+		/*
+		 * Now we know that no one else is looking at the page:
+		 * no turning back from here.
+		 */
+		ClearPageSwapBacked(from_page);
+		ClearPageSwapBacked(to_page);
+
+		/* from_page */
+		from_page->index = to_page_index;
+		from_page->mapping = to_mapping_value;
+		/* to_page */
+		to_page->index = from_page_index;
+		to_page->mapping = from_mapping_value;
+
+		if (to_swapbacked)
+			__SetPageSwapBacked(from_page);
+		else
+			VM_BUG_ON_PAGE(PageSwapCache(to_page), to_page);
-	ClearPageSwapBacked(from_page);
-	if (to_swapbacked)
-		SetPageSwapBacked(from_page);
+		if (from_swapbacked)
+			__SetPageSwapBacked(to_page);
+		else
+			VM_BUG_ON_PAGE(PageSwapCache(from_page), from_page);
+
+		dirty = PageDirty(to_page);
-	/* to_page */
-	to_page->index = from_page_index;
-	to_page->mapping = from_mapping_value;
+		xas_store(&to_xas, from_page);
+		if (PageTransHuge(to_page)) {
+			int i;
+			for (i = 1; i < HPAGE_PMD_NR; i++) {
+				xas_next(&to_xas);
+				xas_store(&to_xas, from_page + i);
+			}
+		}
+
+		/* move cache reference */
+		page_ref_unfreeze(to_page, to_expected_count - hpage_nr_pages(to_page));
+		page_ref_unfreeze(from_page, from_expected_count + hpage_nr_pages(from_page));
+
+		xas_unlock(&to_xas);
+
+		/*
+		 * If moved to a different zone then also account
+		 * the page for that zone. Other VM counters will be
+		 * taken care of when we establish references to the
+		 * new page and drop references to the old page.
+		 *
+		 * Note that anonymous pages are accounted for
+		 * via NR_FILE_PAGES and NR_ANON_MAPPED if they
+		 * are mapped to swap space.
+		 */
+		if (to_zone != from_zone) {
+			__dec_node_state(to_zone->zone_pgdat, NR_FILE_PAGES);
+			__inc_node_state(from_zone->zone_pgdat, NR_FILE_PAGES);
+			if (PageSwapBacked(to_page) && !PageSwapCache(to_page)) {
+				__dec_node_state(to_zone->zone_pgdat, NR_SHMEM);
+				__inc_node_state(from_zone->zone_pgdat, NR_SHMEM);
+			}
+			if (dirty && mapping_cap_account_dirty(to_mapping)) {
+				__dec_node_state(to_zone->zone_pgdat, NR_FILE_DIRTY);
+				__dec_zone_state(to_zone, NR_ZONE_WRITE_PENDING);
+				__inc_node_state(from_zone->zone_pgdat, NR_FILE_DIRTY);
+				__inc_zone_state(from_zone, NR_ZONE_WRITE_PENDING);
+			}
+		}
+		local_irq_enable();
-	ClearPageSwapBacked(to_page);
-	if (from_swapbacked)
-		SetPageSwapBacked(to_page);
+	} else {
+		/* from is file-backed, to is anonymous: fold this into the case above */
+		/* both are file-backed */
+		VM_BUG_ON(1);
+	}

 	return MIGRATEPAGE_SUCCESS;
 }
@@ -322,6 +423,7 @@ static int exchange_from_to_pages(struct page *to_page, struct page *from_page,
 {
 	int rc = -EBUSY;
 	struct address_space *to_page_mapping, *from_page_mapping;
+	struct buffer_head *to_head = NULL, *to_bh = NULL;

 	VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
 	VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
@@ -330,15 +432,71 @@ static int exchange_from_to_pages(struct page *to_page, struct page *from_page,
 	to_page_mapping = page_mapping(to_page);
 	from_page_mapping = page_mapping(from_page);

+	/* from_page has to be an anonymous page */
 	BUG_ON(from_page_mapping);
-	BUG_ON(to_page_mapping);
-	BUG_ON(PageWriteback(from_page));
+	/* writeback has to finish */
 	BUG_ON(PageWriteback(to_page));

-	/* actual page mapping exchange */
-	rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
-				to_page, from_page, mode, 0, 0);
+	/* to_page is anonymous */
+	if (!to_page_mapping) {
+exchange_mappings:
+		/* actual page mapping exchange */
+		rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
+					to_page, from_page, NULL, NULL, mode, 0, 0);
+	} else {
+		if (to_page_mapping->a_ops->migratepage == buffer_migrate_page) {
+			if (!page_has_buffers(to_page))
+				goto exchange_mappings;
+
+			to_head = page_buffers(to_page);
+
+			rc = exchange_page_move_mapping(to_page_mapping,
+					from_page_mapping, to_page, from_page,
+					to_head, NULL, mode, 0, 0);
+
+			if (rc != MIGRATEPAGE_SUCCESS)
+				return rc;
+
+			/*
+			 * In the async case, migrate_page_move_mapping locked the buffers
+			 * with an IRQ-safe spinlock held. In the sync case, the buffers
+			 * need to be locked now
+			 */
+			if ((mode & MIGRATE_MODE_MASK) != MIGRATE_ASYNC)
+				BUG_ON(!buffer_migrate_lock_buffers(to_head, mode));
+
+			ClearPagePrivate(to_page);
+			set_page_private(from_page, page_private(to_page));
+			set_page_private(to_page, 0);
+			/* transfer private page count */
+			put_page(to_page);
+			get_page(from_page);
+
+			to_bh = to_head;
+			do {
+				set_bh_page(to_bh, from_page, bh_offset(to_bh));
+				to_bh = to_bh->b_this_page;
+
+			} while (to_bh != to_head);
+
+			SetPagePrivate(from_page);
+
+			to_bh = to_head;
+		} else if (!to_page_mapping->a_ops->migratepage) {
+			/* fallback_migrate_page */
+			if (PageDirty(to_page)) {
+				if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC)
+					return -EBUSY;
+				return writeout(to_page_mapping, to_page);
+			}
+			if (page_has_private(to_page) &&
+				!try_to_release_page(to_page, GFP_KERNEL))
+				return -EAGAIN;
+
+			goto exchange_mappings;
+		}
+	}
 	/* actual page data exchange */
 	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
@@ -356,8 +514,28 @@ static int exchange_from_to_pages(struct page *to_page, struct page *from_page,
 		rc = 0;
 	}

+	/*
+	 * 1.
buffer_migrate_page:
+	 *    the private flag should be transferred from to_page to from_page
+	 *
+	 * 2. anon<->anon, fallback_migrate_page:
+	 *    neither page has the private flag set, or to_page's is cleared.
+	 */
+	VM_BUG_ON(!((page_has_private(from_page) && !page_has_private(to_page)) ||
+				(!page_has_private(from_page) && !page_has_private(to_page))));
+
 	exchange_page_flags(to_page, from_page);

+	if (to_bh) {
+		VM_BUG_ON(to_bh != to_head);
+		do {
+			unlock_buffer(to_bh);
+			put_bh(to_bh);
+			to_bh = to_bh->b_this_page;
+
+		} while (to_bh != to_head);
+	}
+
 	return rc;
 }

@@ -369,34 +547,12 @@ static int unmap_and_exchange(struct page *from_page, struct page *to_page,
 	pgoff_t from_index, to_index;
 	struct anon_vma *from_anon_vma = NULL, *to_anon_vma = NULL;

-	/* from_page lock down */
 	if (!trylock_page(from_page)) {
 		if ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)
 			goto out;
-
 		lock_page(from_page);
 	}

-	BUG_ON(PageWriteback(from_page));
-
-	/*
-	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
-	 * we cannot notice that anon_vma is freed while we migrates a page.
-	 * This get_anon_vma() delays freeing anon_vma pointer until the end
-	 * of migration. File cache pages are no problem because of page_lock()
-	 * File Caches may use write_page() or lock_page() in migration, then,
-	 * just care Anon page here.
-	 *
-	 * Only page_get_anon_vma() understands the subtleties of
-	 * getting a hold on an anon_vma from outside one of its mms.
-	 * But if we cannot get anon_vma, then we won't need it anyway,
-	 * because that implies that the anon page is no longer mapped
-	 * (and cannot be remapped so long as we hold the page lock).
-	 */
-	if (PageAnon(from_page) && !PageKsm(from_page))
-		from_anon_vma = page_get_anon_vma(from_page);
-
-	/* to_page lock down */
 	if (!trylock_page(to_page)) {
 		if ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC)
 			goto out_unlock;

@@ -404,7 +560,22 @@ static int unmap_and_exchange(struct page *from_page, struct page *to_page,
 		lock_page(to_page);
 	}

-	BUG_ON(PageWriteback(to_page));
+	/* from_page is supposed to be an anonymous page */
+	VM_BUG_ON_PAGE(PageWriteback(from_page), from_page);
+
+	if (PageWriteback(to_page)) {
+		/*
+		 * Only in the case of a full synchronous migration is it
+		 * necessary to wait for PageWriteback. In the async case,
+		 * the retry loop is too short and in the sync-light case,
+		 * the overhead of stalling is too much
+		 */
+		if ((mode & MIGRATE_MODE_MASK) != MIGRATE_SYNC) {
+			rc = -EBUSY;
+			goto out_unlock;
+		}
+		wait_on_page_writeback(to_page);
+	}

 	/*
 	 * By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
@@ -420,6 +591,9 @@ static int unmap_and_exchange(struct page *from_page, struct page *to_page,
 	 * because that implies that the anon page is no longer mapped
 	 * (and cannot be remapped so long as we hold the page lock).
 	 */
+	if (PageAnon(from_page) && !PageKsm(from_page))
+		from_anon_vma = page_get_anon_vma(from_page);
+
 	if (PageAnon(to_page) && !PageKsm(to_page))
 		to_anon_vma = page_get_anon_vma(to_page);

@@ -753,7 +927,7 @@ static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,

 		/* actual page mapping exchange */
 		rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
-					to_page, from_page, mode, 0, 0);
+					to_page, from_page, NULL, NULL, mode, 0, 0);

 		if (rc) {
 			if (one_pair->from_page_was_mapped)
diff --git a/mm/internal.h b/mm/internal.h
index a039459..cf63bf6 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -566,4 +566,13 @@ extern int exchange_page_mthread(struct page *to, struct page *from,
 extern int exchange_page_lists_mthread(struct page **to,
 			struct page **from, int nr_pages);

+
+extern int exchange_two_pages(struct page *page1, struct page *page2);
+
+bool buffer_migrate_lock_buffers(struct buffer_head *head,
+			enum migrate_mode mode);
+int writeout(struct address_space *mapping, struct page *page);
+int expected_page_refs(struct address_space *mapping, struct page *page);
+
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/migrate.c b/mm/migrate.c
index ad02797..a0ca817 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -385,7 +385,7 @@ void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
 }
 #endif

-static int expected_page_refs(struct address_space *mapping, struct page *page)
+int expected_page_refs(struct address_space *mapping, struct page *page)
 {
 	int expected_count = 1;

@@ -732,7 +732,7 @@ EXPORT_SYMBOL(migrate_page);

 #ifdef CONFIG_BLOCK
 /* Returns true if all buffers are successfully locked */
-static bool buffer_migrate_lock_buffers(struct buffer_head *head,
+bool buffer_migrate_lock_buffers(struct buffer_head *head,
 			enum migrate_mode mode)
 {
 	struct buffer_head *bh = head;
@@ -880,7 +880,7 @@ int buffer_migrate_page_norefs(struct address_space *mapping,
 /*
  * Writeback a page to clean the dirty state
  */
-static int writeout(struct address_space *mapping, struct page *page)
+int writeout(struct address_space *mapping, struct page *page)
 {
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
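
The deadlock the commit message above alludes to is the classic ABBA problem: exchanging a pair of file-backed pages needs both mappings' locks, so two concurrent exchanges that take them in opposite order can block each other forever. A toy userspace illustration (hypothetical, plain pthreads; the series sidesteps the problem in the kernel by serializing file-backed exchanges rather than ordering the locks):

	#include <pthread.h>

	/* Take two locks in a fixed global order (here: by address) so that
	 * any two threads locking the same pair cannot deadlock. */
	static void lock_pair_ordered(pthread_mutex_t *a, pthread_mutex_t *b)
	{
		if (a == b) {
			pthread_mutex_lock(a);	/* same file twice: one lock */
		} else if (a < b) {
			pthread_mutex_lock(a);
			pthread_mutex_lock(b);
		} else {
			pthread_mutex_lock(b);
			pthread_mutex_lock(a);
		}
	}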

From patchwork Thu Apr  4 02:00:37 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884767
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A . Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 16/25] exchange page: Add THP exchange support.
Date: Wed,  3 Apr 2019 19:00:37 -0700
Message-Id: <20190404020046.32741-17-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Enable exchanging THPs. This also needs to take care of exchanging
PTE-mapped THPs.

Signed-off-by: Zi Yan
---
 include/linux/exchange.h |  2 ++
 mm/exchange.c            | 73 +++++++++++++++++++++++++++++++++++-----------
 mm/migrate.c             |  2 +-
 3 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/include/linux/exchange.h b/include/linux/exchange.h
index 20d2184..8785d08 100644
--- a/include/linux/exchange.h
+++ b/include/linux/exchange.h
@@ -14,6 +14,8 @@ struct exchange_page_info {
 	int from_page_was_mapped;
 	int to_page_was_mapped;

+	pgoff_t from_index, to_index;
+
 	struct list_head list;
 };

diff --git a/mm/exchange.c b/mm/exchange.c
index 555a72c..45c7013 100644
--- a/mm/exchange.c
+++ b/mm/exchange.c
@@ -51,7 +51,8 @@ struct page_flags {
 	unsigned int page_swapcache:1;
 	unsigned int page_writeback:1;
 	unsigned int page_private:1;
-	unsigned int __pad:3;
+	unsigned int page_doublemap:1;
+	unsigned int __pad:2;
 };

@@ -127,20 +128,23 @@ static void exchange_huge_page(struct page *dst, struct page *src)
 static void exchange_page_flags(struct page *to_page, struct page *from_page)
 {
 	int from_cpupid, to_cpupid;
-	struct page_flags from_page_flags, to_page_flags;
+	struct page_flags from_page_flags = {0}, to_page_flags = {0};
 	struct mem_cgroup *to_memcg = page_memcg(to_page),
 					  *from_memcg = page_memcg(from_page);

 	from_cpupid = page_cpupid_xchg_last(from_page, -1);

-	from_page_flags.page_error = TestClearPageError(from_page);
+	from_page_flags.page_error = PageError(from_page);
+	if (from_page_flags.page_error)
+		ClearPageError(from_page);
 	from_page_flags.page_referenced = TestClearPageReferenced(from_page);
 	from_page_flags.page_uptodate = PageUptodate(from_page);
 	ClearPageUptodate(from_page);
 	from_page_flags.page_active = TestClearPageActive(from_page);
 	from_page_flags.page_unevictable = TestClearPageUnevictable(from_page);
 	from_page_flags.page_checked = PageChecked(from_page);
-	ClearPageChecked(from_page);
+	if (from_page_flags.page_checked)
+		ClearPageChecked(from_page);
 	from_page_flags.page_mappedtodisk = PageMappedToDisk(from_page);
 	ClearPageMappedToDisk(from_page);
 	from_page_flags.page_dirty = PageDirty(from_page);
@@ -150,18 +154,22 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 	clear_page_idle(from_page);
 	from_page_flags.page_swapcache = PageSwapCache(from_page);
 	from_page_flags.page_writeback = test_clear_page_writeback(from_page);
+	from_page_flags.page_doublemap = PageDoubleMap(from_page);

 	to_cpupid = page_cpupid_xchg_last(to_page, -1);

-	to_page_flags.page_error = TestClearPageError(to_page);
+	to_page_flags.page_error = PageError(to_page);
+	if (to_page_flags.page_error)
+		ClearPageError(to_page);
 	to_page_flags.page_referenced = TestClearPageReferenced(to_page);
 	to_page_flags.page_uptodate = PageUptodate(to_page);
 	ClearPageUptodate(to_page);
 	to_page_flags.page_active = TestClearPageActive(to_page);
 	to_page_flags.page_unevictable = TestClearPageUnevictable(to_page);
 	to_page_flags.page_checked = PageChecked(to_page);
-	ClearPageChecked(to_page);
+	if (to_page_flags.page_checked)
+		ClearPageChecked(to_page);
 	to_page_flags.page_mappedtodisk = PageMappedToDisk(to_page);
 	ClearPageMappedToDisk(to_page);
 	to_page_flags.page_dirty = PageDirty(to_page);
@@ -171,6 +179,7 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 	clear_page_idle(to_page);
 	to_page_flags.page_swapcache = PageSwapCache(to_page);
 	to_page_flags.page_writeback = test_clear_page_writeback(to_page);
+	to_page_flags.page_doublemap = PageDoubleMap(to_page);

 	/* set to_page */
 	if (from_page_flags.page_error)
@@ -197,6 +206,8 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 		set_page_young(to_page);
 	if (from_page_flags.page_is_idle)
 		set_page_idle(to_page);
+	if (from_page_flags.page_doublemap)
+		SetPageDoubleMap(to_page);

 	/* set from_page */
 	if (to_page_flags.page_error)
@@ -223,6 +234,8 @@ static void exchange_page_flags(struct page *to_page, struct page *from_page)
 		set_page_young(from_page);
 	if (to_page_flags.page_is_idle)
 		set_page_idle(from_page);
+	if (to_page_flags.page_doublemap)
+		SetPageDoubleMap(from_page);

 	/*
 	 * Copy NUMA information to the new page, to prevent over-eager
@@ -599,7 +612,6 @@ static int unmap_and_exchange(struct page *from_page, struct page *to_page,
 	from_index = from_page->index;
 	to_index = to_page->index;

-
 	/*
 	 * Corner case handling:
 	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -673,8 +685,6 @@ static int unmap_and_exchange(struct page *from_page, struct page *to_page,
 		swap(from_page->index, from_index);
 	}

-
-
 out_unlock_both:
 	if (to_anon_vma)
 		put_anon_vma(to_anon_vma);
@@ -689,6 +699,23 @@ static int unmap_and_exchange(struct page *from_page, struct page *to_page,
 	return rc;
 }

+static bool can_be_exchanged(struct page *from, struct page *to)
+{
+	if (PageCompound(from) != PageCompound(to))
+		return false;
+
+	if (PageHuge(from) != PageHuge(to))
+		return false;
+
+	if (PageHuge(from) || PageHuge(to))
+		return false;
+
+	if (compound_order(from) != compound_order(to))
+		return false;
+
+	return true;
+}
+
 /*
  * Exchange pages in the exchange_list
  *
@@ -745,7 +772,8 @@ int exchange_pages(struct list_head *exchange_list,
 		}

 		/* TODO: compound page not supported */
-		if (PageCompound(from_page) || page_mapping(from_page)) {
+		if (!can_be_exchanged(from_page, to_page) ||
+			page_mapping(from_page)) {
 			++failed;
 			goto putback;
 		}
@@ -784,6 +812,8 @@ static int unmap_pair_pages_concur(struct exchange_page_info *one_pair,
 	struct page *from_page = one_pair->from_page;
 	struct page *to_page = one_pair->to_page;

+	one_pair->from_index = from_page->index;
+	one_pair->to_index = to_page->index;
 	/* from_page lock down */
 	if (!trylock_page(from_page)) {
 		if (!force || ((mode & MIGRATE_MODE_MASK) == MIGRATE_ASYNC))
@@ -903,7 +933,6 @@ static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,
 					struct list_head *exchange_list_ptr,
 					enum migrate_mode mode)
 {
-	int rc = -EBUSY;
 	int nr_failed = 0;
 	struct address_space *to_page_mapping, *from_page_mapping;
 	struct exchange_page_info *one_pair, *one_pair2;
@@ -911,6 +940,7 @@ static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,
 	list_for_each_entry_safe(one_pair, one_pair2, unmapped_list_ptr, list) {
 		struct page *from_page = one_pair->from_page;
 		struct page *to_page = one_pair->to_page;
+		int rc = -EBUSY;

 		VM_BUG_ON_PAGE(!PageLocked(from_page), from_page);
 		VM_BUG_ON_PAGE(!PageLocked(to_page), to_page);
@@ -926,8 +956,9 @@ static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,
 		BUG_ON(PageWriteback(to_page));

 		/* actual page mapping exchange */
-		rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
-					to_page, from_page, NULL, NULL, mode, 0, 0);
+		if (!page_mapped(from_page) && !page_mapped(to_page))
+			rc = exchange_page_move_mapping(to_page_mapping, from_page_mapping,
+						to_page, from_page, NULL, NULL, mode, 0, 0);

 		if (rc) {
 			if (one_pair->from_page_was_mapped)
@@ -954,7 +985,7 @@ static int exchange_page_mapping_concur(struct list_head *unmapped_list_ptr,
 			one_pair->from_page = NULL;
 			one_pair->to_page = NULL;

-			list_move(&one_pair->list, exchange_list_ptr);
+			list_del(&one_pair->list);
 			++nr_failed;
 		}
 	}
@@ -1026,8 +1057,18 @@ static int remove_migration_ptes_concur(struct list_head *unmapped_list_ptr)
 	struct exchange_page_info *iterator;

 	list_for_each_entry(iterator, unmapped_list_ptr, list) {
-		remove_migration_ptes(iterator->from_page, iterator->to_page, false);
-		remove_migration_ptes(iterator->to_page, iterator->from_page, false);
+		struct page *from_page = iterator->from_page;
+		struct page *to_page = iterator->to_page;
+
+		swap(from_page->index, iterator->from_index);
+		if (iterator->from_page_was_mapped)
+			remove_migration_ptes(iterator->from_page, iterator->to_page, false);
+		swap(from_page->index, iterator->from_index);
+
+		swap(to_page->index, iterator->to_index);
+		if (iterator->to_page_was_mapped)
+			remove_migration_ptes(iterator->to_page, iterator->from_page, false);
+		swap(to_page->index, iterator->to_index);

 		if (iterator->from_anon_vma)
diff --git a/mm/migrate.c b/mm/migrate.c
index a0ca817..da7af68 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -229,7 +229,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (PageKsm(page))
 			new = page;
 		else
-			new = page - pvmw.page->index +
+			new = page - page->index +
 				linear_page_index(vma, pvmw.address);

 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
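
Before queueing a pair, a caller can apply the same constraints that can_be_exchanged() above enforces. The helper name below is hypothetical, but the checks mirror the patch: no hugetlbfs pages, both pages of the same kind, and the same compound order, so the exchange stays page-for-page on both sides.

	/* Hypothetical pre-check mirroring can_be_exchanged(): a pair is only
	 * exchangeable when both sides have identical geometry. */
	static bool pair_is_exchangeable(struct page *a, struct page *b)
	{
		if (PageHuge(a) || PageHuge(b))		/* hugetlbfs unsupported */
			return false;
		if (PageCompound(a) != PageCompound(b))	/* THP vs. base page */
			return false;
		return compound_order(a) == compound_order(b);
	}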

From patchwork Thu Apr  4 02:00:38 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884769
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A . Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 17/25] exchange page: Add exchange_page() syscall.
Date: Wed,  3 Apr 2019 19:00:38 -0700
Message-Id: <20190404020046.32741-18-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Users can use this syscall to exchange two lists of pages, similar to
the move_pages() syscall.

Signed-off-by: Zi Yan
---
 arch/x86/entry/syscalls/syscall_64.tbl |   1 +
 include/linux/syscalls.h               |   5 +
 mm/exchange.c                          | 346 +++++++++++++++++++++++++++++++++
 3 files changed, 352 insertions(+)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 92ee0b4..863a21e 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -343,6 +343,7 @@
 332	common	statx			__x64_sys_statx
 333	common	io_pgetevents		__x64_sys_io_pgetevents
 334	common	rseq			__x64_sys_rseq
+335	common	exchange_pages		__x64_sys_exchange_pages
 # don't use numbers 387 through 423, add new calls after the last
 # 'common' entry
 424	common	pidfd_send_signal	__x64_sys_pidfd_send_signal
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e446806..2c1eb49 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1203,6 +1203,11 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
 			unsigned long fd, unsigned long pgoff);
 asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);

+asmlinkage long sys_exchange_pages(pid_t pid, unsigned long nr_pages,
+				const void __user * __user *from_pages,
+				const void __user * __user *to_pages,
+				int __user *status,
+				int flags);

 /*
  * Not a real system call, but a placeholder for syscalls which are
diff --git a/mm/exchange.c b/mm/exchange.c
index 45c7013..48e344e 100644
--- a/mm/exchange.c
+++ b/mm/exchange.c
@@ -22,6 +22,7 @@
 #include
 #include	/* buffer_migrate_page */
 #include
+#include

 #include "internal.h"

@@ -1212,3 +1213,348 @@ int exchange_pages_concur(struct list_head *exchange_list,

 	return nr_failed?-EFAULT:0;
 }
+
+static int store_status(int __user *status, int start, int value, int nr)
+{
+	while (nr-- > 0) {
+		if (put_user(value, status + start))
+			return -EFAULT;
+		start++;
+	}
+
+	return 0;
+}
+
+static int do_exchange_page_list(struct mm_struct *mm,
+		struct list_head *from_pagelist, struct list_head *to_pagelist,
+		bool migrate_mt, bool migrate_concur)
+{
+	int err;
+	struct exchange_page_info *one_pair;
+	LIST_HEAD(exchange_page_list);
+
+	while (!list_empty(from_pagelist)) {
+		struct page *from_page, *to_page;
+
+		from_page = list_first_entry_or_null(from_pagelist, struct page, lru);
+		to_page = list_first_entry_or_null(to_pagelist, struct page, lru);
+
+		if (!from_page || !to_page)
+			break;
+
+		one_pair = kzalloc(sizeof(struct exchange_page_info), GFP_ATOMIC);
+		if (!one_pair) {
+			err = -ENOMEM;
+			break;
+		}
+
+		list_del(&from_page->lru);
+		list_del(&to_page->lru);
+
+		one_pair->from_page = from_page;
+		one_pair->to_page = to_page;
+
+		list_add_tail(&one_pair->list, &exchange_page_list);
+	}
+
+	if (migrate_concur)
+		err = exchange_pages_concur(&exchange_page_list,
+			MIGRATE_SYNC | (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD),
+			MR_SYSCALL);
+	else
+		err = exchange_pages(&exchange_page_list,
+			MIGRATE_SYNC | (migrate_mt ?
+			MIGRATE_MT : MIGRATE_SINGLETHREAD),
+			MR_SYSCALL);
+
+	while (!list_empty(&exchange_page_list)) {
+		struct exchange_page_info *one_pair =
+			list_first_entry(&exchange_page_list,
+				struct exchange_page_info, list);
+
+		list_del(&one_pair->list);
+		kfree(one_pair);
+	}
+
+	if (!list_empty(from_pagelist))
+		putback_movable_pages(from_pagelist);
+
+	if (!list_empty(to_pagelist))
+		putback_movable_pages(to_pagelist);
+
+	return err;
+}
+
+static int add_page_for_exchange(struct mm_struct *mm,
+		unsigned long from_addr, unsigned long to_addr,
+		struct list_head *from_pagelist, struct list_head *to_pagelist,
+		bool migrate_all)
+{
+	struct vm_area_struct *from_vma, *to_vma;
+	struct page *from_page, *to_page;
+	LIST_HEAD(err_page_list);
+	unsigned int follflags;
+	int err;
+
+	err = -EFAULT;
+	from_vma = find_vma(mm, from_addr);
+	if (!from_vma || from_addr < from_vma->vm_start ||
+		!vma_migratable(from_vma))
+		goto set_from_status;
+
+	/* FOLL_DUMP to ignore special (like zero) pages */
+	follflags = FOLL_GET | FOLL_DUMP;
+	from_page = follow_page(from_vma, from_addr, follflags);
+
+	err = PTR_ERR(from_page);
+	if (IS_ERR(from_page))
+		goto set_from_status;
+
+	err = -ENOENT;
+	if (!from_page)
+		goto set_from_status;
+
+	err = -EACCES;
+	if (page_mapcount(from_page) > 1 && !migrate_all)
+		goto put_and_set_from_page;
+
+	if (PageHuge(from_page)) {
+		if (PageHead(from_page))
+			if (isolate_huge_page(from_page, &err_page_list)) {
+				err = 0;
+			}
+		goto put_and_set_from_page;
+	} else if (PageTransCompound(from_page)) {
+		if (PageTail(from_page)) {
+			err = -EACCES;
+			goto put_and_set_from_page;
+		}
+	}
+
+	err = isolate_lru_page(from_page);
+	if (!err)
+		mod_node_page_state(page_pgdat(from_page), NR_ISOLATED_ANON +
+			page_is_file_cache(from_page), hpage_nr_pages(from_page));
+put_and_set_from_page:
+	/*
+	 * Either remove the duplicate refcount from
+	 * isolate_lru_page() or drop the page ref if it was
+	 * not isolated: FOLL_GET took one reference via
+	 * get_page() and isolate_lru_page() takes another,
+	 * so one must be dropped here.
+	 */
+	put_page(from_page);
+set_from_status:
+	if (err)
+		goto out;
+
+	/* to pages */
+	err = -EFAULT;
+	to_vma = find_vma(mm, to_addr);
+	if (!to_vma ||
+		to_addr < to_vma->vm_start ||
+		!vma_migratable(to_vma))
+		goto set_to_status;
+
+	/* FOLL_DUMP to ignore special (like zero) pages */
+	to_page = follow_page(to_vma, to_addr, follflags);
+
+	err = PTR_ERR(to_page);
+	if (IS_ERR(to_page))
+		goto set_to_status;
+
+	err = -ENOENT;
+	if (!to_page)
+		goto set_to_status;
+
+	err = -EACCES;
+	if (page_mapcount(to_page) > 1 &&
+		!migrate_all)
+		goto put_and_set_to_page;
+
+	if (PageHuge(to_page)) {
+		if (PageHead(to_page))
+			if (isolate_huge_page(to_page, &err_page_list)) {
+				err = 0;
+			}
+		goto put_and_set_to_page;
+	} else if (PageTransCompound(to_page)) {
+		if (PageTail(to_page)) {
+			err = -EACCES;
+			goto put_and_set_to_page;
+		}
+	}
+
+	err = isolate_lru_page(to_page);
+	if (!err)
+		mod_node_page_state(page_pgdat(to_page), NR_ISOLATED_ANON +
+			page_is_file_cache(to_page), hpage_nr_pages(to_page));
+put_and_set_to_page:
+	/*
+	 * Either remove the duplicate refcount from
+	 * isolate_lru_page() or drop the page ref if it was
+	 * not isolated.
+	 *
+	 * Since FOLL_GET took one reference via get_page() and
+	 * isolate_lru_page() takes another, one must be dropped here.
+	 */
+	put_page(to_page);
+set_to_status:
+	if (!err) {
+		if ((PageHuge(from_page) != PageHuge(to_page)) ||
+			(PageTransHuge(from_page) != PageTransHuge(to_page))) {
+			list_add(&from_page->lru, &err_page_list);
+			list_add(&to_page->lru, &err_page_list);
+		} else {
+			list_add_tail(&from_page->lru, from_pagelist);
+			list_add_tail(&to_page->lru, to_pagelist);
+		}
+	} else
+		list_add(&from_page->lru, &err_page_list);
+out:
+	if (!list_empty(&err_page_list))
+		putback_movable_pages(&err_page_list);
+	return err;
+}
+
+/*
+ * Exchange an array of page addresses with another array of page
+ * addresses and fill the corresponding array of status.
+ */
+static int do_pages_exchange(struct mm_struct *mm, nodemask_t task_nodes,
+		unsigned long nr_pages,
+		const void __user * __user *from_pages,
+		const void __user * __user *to_pages,
+		int __user *status, int flags)
+{
+	LIST_HEAD(from_pagelist);
+	LIST_HEAD(to_pagelist);
+	int start, i;
+	int err = 0, err1;
+
+	migrate_prep();
+
+	down_read(&mm->mmap_sem);
+	for (i = start = 0; i < nr_pages; i++) {
+		const void __user *from_p, *to_p;
+		unsigned long from_addr, to_addr;
+
+		err = -EFAULT;
+		if (get_user(from_p, from_pages + i))
+			goto out_flush;
+		if (get_user(to_p, to_pages + i))
+			goto out_flush;
+
+		from_addr = (unsigned long)from_p;
+		to_addr = (unsigned long)to_p;
+
+		err = -EACCES;
+		/*
+		 * Errors in the page lookup or isolation are not fatal and we simply
+		 * report them via status
+		 */
+		err = add_page_for_exchange(mm, from_addr, to_addr,
+				&from_pagelist, &to_pagelist,
+				flags & MPOL_MF_MOVE_ALL);
+
+		if (!err)
+			continue;
+
+		err = store_status(status, i, err, 1);
+		if (err)
+			goto out_flush;
+
+		err = do_exchange_page_list(mm, &from_pagelist, &to_pagelist,
+				flags & MPOL_MF_MOVE_MT,
+				flags & MPOL_MF_MOVE_CONCUR);
+		if (err)
+			goto out;
+		if (i > start) {
+			err = store_status(status, start, 0, i - start);
+			if (err)
+				goto out;
+		}
+		start = i;
+	}
+out_flush:
+	/* Make sure we do not overwrite the existing error */
+	err1 = do_exchange_page_list(mm, &from_pagelist, &to_pagelist,
+			flags & MPOL_MF_MOVE_MT,
+			flags & MPOL_MF_MOVE_CONCUR);
+	if (!err1)
+		err1 = store_status(status, start, 0, i - start);
+	if (!err)
+		err = err1;
+out:
+	up_read(&mm->mmap_sem);
+	return err;
+}
+
+SYSCALL_DEFINE6(exchange_pages, pid_t, pid, unsigned long, nr_pages,
+		const void __user * __user *, from_pages,
+		const void __user * __user *, to_pages,
+		int __user *, status, int, flags)
+{
+	const struct cred *cred = current_cred(), *tcred;
+	struct task_struct *task;
+	struct mm_struct *mm;
+	int err;
+	nodemask_t task_nodes;
+
+	/* Check flags */
+	if (flags & ~(MPOL_MF_MOVE|
+		      MPOL_MF_MOVE_ALL|
+		      MPOL_MF_MOVE_MT|
+		      MPOL_MF_MOVE_CONCUR))
+		return -EINVAL;
+
+	if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
+		return -EPERM;
+
+	/* Find the mm_struct */
+	rcu_read_lock();
+	task = pid ? find_task_by_vpid(pid) : current;
+	if (!task) {
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+	get_task_struct(task);
+
+	/*
+	 * Check if this process has the right to modify the specified
+	 * process. The right exists if the process has administrative
+	 * capabilities, superuser privileges or the same
+	 * userid as the target process.
+	 */
+	tcred = __task_cred(task);
+	if (!uid_eq(cred->euid, tcred->suid) && !uid_eq(cred->euid, tcred->uid) &&
+	    !uid_eq(cred->uid,  tcred->suid) && !uid_eq(cred->uid,  tcred->uid) &&
+	    !capable(CAP_SYS_NICE)) {
+		rcu_read_unlock();
+		err = -EPERM;
+		goto out;
+	}
+	rcu_read_unlock();
+
+	err = security_task_movememory(task);
+	if (err)
+		goto out;
+
+	task_nodes = cpuset_mems_allowed(task);
+	mm = get_task_mm(task);
+	put_task_struct(task);
+
+	if (!mm)
+		return -EINVAL;
+
+	err = do_pages_exchange(mm, task_nodes, nr_pages, from_pages,
+				to_pages, status, flags);
+
+	mmput(mm);
+
+	return err;
+
+out:
+	put_task_struct(task);
+
+	return err;
+}
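
A minimal userspace sketch of the new syscall follows, assuming the x86_64 syscall number 335 added above and the MPOL_MF_MOVE flag value (1 << 1) from include/uapi/linux/mempolicy.h; as in the kernel code above, pid 0 means the calling process. It exchanges a single pair of anonymous pages and prints the per-page status.

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>

	#define __NR_exchange_pages 335		/* from syscall_64.tbl above */
	#define MPOL_MF_MOVE (1 << 1)		/* uapi mempolicy.h */

	int main(void)
	{
		long psz = sysconf(_SC_PAGESIZE);
		void *a = mmap(NULL, psz, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		void *b = mmap(NULL, psz, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		void *from_pages[1] = { a };
		void *to_pages[1] = { b };
		int status[1] = { -1 };
		long ret;

		/* Fault both pages in so there is something to exchange. */
		memset(a, 0xaa, psz);
		memset(b, 0x55, psz);

		/* pid == 0: operate on the calling process. */
		ret = syscall(__NR_exchange_pages, 0, 1UL, from_pages,
			      to_pages, status, MPOL_MF_MOVE);
		printf("exchange_pages: ret=%ld status[0]=%d\n", ret, status[0]);
		return ret ? EXIT_FAILURE : EXIT_SUCCESS;
	}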
From patchwork Thu Apr 4 02:00:39 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884771
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 18/25] memcg: Add per node memory usage&max stats in memcg.
Date: Wed, 3 Apr 2019 19:00:39 -0700
Message-Id: <20190404020046.32741-19-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

This patch prepares for the following patches to enable memcg-based
NUMA node page migration: we are going to limit the memory usage of
each node on a per-memcg basis.

Signed-off-by: Zi Yan
---
 include/linux/cgroup-defs.h |  1 +
 include/linux/memcontrol.h  | 67 +++++++++++++++++++++++++++++++++++++
 mm/memcontrol.c             | 80 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 1c70803..7e87f5e 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -531,6 +531,7 @@ struct cftype {
 	struct cgroup_subsys *ss;	/* NULL for cgroup core files */
 	struct list_head node;		/* anchored at ss->cfts */
 	struct kernfs_ops *kf_ops;
+	int numa_node_id;
 
 	int (*open)(struct kernfs_open_file *of);
 	void (*release)(struct kernfs_open_file *of);
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 1f3d880..3e40321 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -130,6 +130,7 @@ struct mem_cgroup_per_node {
 	atomic_long_t lruvec_stat[NR_VM_NODE_STAT_ITEMS];
 	unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
+	unsigned long max_nr_base_pages;
 
 	struct mem_cgroup_reclaim_iter iter[DEF_PRIORITY + 1];
@@ -797,6 +798,51 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif
 
+static inline unsigned long lruvec_size_memcg_node(enum lru_list lru,
+		struct mem_cgroup *memcg, int nid)
+{
+	if (nid == MAX_NUMNODES)
+		return 0;
+
+	VM_BUG_ON(lru < 0 || lru >= NR_LRU_LISTS);
+	return mem_cgroup_node_nr_lru_pages(memcg, nid, BIT(lru));
+}
+
+static inline unsigned long active_inactive_size_memcg_node(
+		struct mem_cgroup *memcg, int nid, bool active)
+{
+	unsigned long val = 0;
+	enum lru_list lru;
+
+	for_each_evictable_lru(lru) {
+		if ((active && is_active_lru(lru)) ||
+		    (!active && !is_active_lru(lru)))
+			val += mem_cgroup_node_nr_lru_pages(memcg, nid, BIT(lru));
+	}
+
+	return val;
+}
+
+static inline unsigned long memcg_size_node(struct mem_cgroup *memcg, int nid)
+{
+	unsigned long val = 0;
+	int i;
+
+	if (nid == MAX_NUMNODES)
+		return val;
+
+	for (i = 0; i < NR_LRU_LISTS; i++)
+		val += mem_cgroup_node_nr_lru_pages(memcg, nid, BIT(i));
+
+	return val;
+}
+
+static inline unsigned long memcg_max_size_node(struct mem_cgroup *memcg, int nid)
+{
+	if (nid == MAX_NUMNODES)
+		return 0;
+	return memcg->nodeinfo[nid]->max_nr_base_pages;
+}
+
 #else /* CONFIG_MEMCG */
 
 #define MEM_CGROUP_ID_SHIFT	0
@@ -1123,6 +1169,27 @@ static inline void count_memcg_event_mm(struct mm_struct *mm,
 		enum vm_event_item idx)
 {
 }
+
+static inline unsigned long lruvec_size_memcg_node(enum lru_list lru,
+		struct mem_cgroup *memcg, int nid)
+{
+	return 0;
+}
+
+static inline unsigned long active_inactive_size_memcg_node(
+		struct mem_cgroup *memcg, int nid, bool active)
+{
+	return 0;
+}
+
+static inline unsigned long memcg_size_node(struct mem_cgroup *memcg, int nid)
+{
+	return 0;
+}
+
+static inline unsigned long memcg_max_size_node(struct mem_cgroup *memcg, int nid)
+{
+	return 0;
+}
+
 #endif /* CONFIG_MEMCG */
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 532e0e2..478d216 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4394,6 +4394,7 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
 	pn->usage_in_excess = 0;
 	pn->on_tree = false;
 	pn->memcg = memcg;
+	pn->max_nr_base_pages = PAGE_COUNTER_MAX;
 
 	memcg->nodeinfo[node] = pn;
 	return 0;
@@ -6700,4 +6701,83 @@ static int __init mem_cgroup_swap_init(void)
 }
 subsys_initcall(mem_cgroup_swap_init);
 
+static int memory_per_node_stat_show(struct seq_file *m, void *v)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
+	struct cftype *cur_file = seq_cft(m);
+	int nid = cur_file->numa_node_id;
+	unsigned long val = 0;
+	int i;
+
+	for (i = 0; i < NR_LRU_LISTS; i++)
+		val += mem_cgroup_node_nr_lru_pages(memcg, nid, BIT(i));
+
+	seq_printf(m, "%llu\n", (u64)val * PAGE_SIZE);
+
+	return 0;
+}
+
+static int memory_per_node_max_show(struct seq_file *m, void *v)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
+	struct cftype *cur_file = seq_cft(m);
+	int nid = cur_file->numa_node_id;
+	unsigned long max = READ_ONCE(memcg->nodeinfo[nid]->max_nr_base_pages);
+
+	if (max == PAGE_COUNTER_MAX)
+		seq_puts(m, "max\n");
+	else
+		seq_printf(m, "%llu\n", (u64)max * PAGE_SIZE);
+
+	return 0;
+}
+
+static ssize_t memory_per_node_max_write(struct kernfs_open_file *of,
+		char *buf, size_t nbytes, loff_t off)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+	struct cftype *cur_file = of_cft(of);
+	int nid = cur_file->numa_node_id;
+	unsigned long max;
+	int err;
+
+	buf = strstrip(buf);
+	err = page_counter_memparse(buf, "max", &max);
+	if (err)
+		return err;
+
+	xchg(&memcg->nodeinfo[nid]->max_nr_base_pages, max);
+
+	return nbytes;
+}
+
+static struct cftype memcg_per_node_stats_files[N_MEMORY];
+static struct cftype memcg_per_node_max_files[N_MEMORY];
+
+static int __init mem_cgroup_per_node_init(void)
+{
+	int nid;
+
+	for_each_node_state(nid, N_MEMORY) {
+		snprintf(memcg_per_node_stats_files[nid].name, MAX_CFTYPE_NAME,
+			 "size_at_node:%d", nid);
+		memcg_per_node_stats_files[nid].flags = CFTYPE_NOT_ON_ROOT;
+		memcg_per_node_stats_files[nid].seq_show = memory_per_node_stat_show;
+		memcg_per_node_stats_files[nid].numa_node_id = nid;
+
+		snprintf(memcg_per_node_max_files[nid].name, MAX_CFTYPE_NAME,
+			 "max_at_node:%d", nid);
+		memcg_per_node_max_files[nid].flags = CFTYPE_NOT_ON_ROOT;
+		memcg_per_node_max_files[nid].seq_show = memory_per_node_max_show;
+		memcg_per_node_max_files[nid].write = memory_per_node_max_write;
+		memcg_per_node_max_files[nid].numa_node_id = nid;
+	}
+	WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys,
+				       memcg_per_node_stats_files));
+	WARN_ON(cgroup_add_dfl_cftypes(&memory_cgrp_subsys,
+				       memcg_per_node_max_files));
+	return 0;
+}
+subsys_initcall(mem_cgroup_per_node_init);
+
 #endif /* CONFIG_MEMCG_SWAP */
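[For illustration, a small sketch of how the new per-node files could be used from userspace; it is not part of the patch. The cgroup mount point and group name ("/sys/fs/cgroup/mygroup") are assumptions; the file names come from mem_cgroup_per_node_init() above, and the write is parsed by page_counter_memparse(), which accepts suffixes like "G".]

#include <stdio.h>

int main(void)
{
	char buf[64];
	FILE *f = fopen("/sys/fs/cgroup/mygroup/size_at_node:0", "r");

	if (f) {
		if (fgets(buf, sizeof(buf), f))
			printf("node 0 usage: %s", buf);	/* in bytes */
		fclose(f);
	}

	/* cap this memcg's usage on node 0 at 1 GiB */
	f = fopen("/sys/fs/cgroup/mygroup/max_at_node:0", "w");
	if (f) {
		fputs("1G", f);
		fclose(f);
	}
	return 0;
}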
From patchwork Thu Apr 4 02:00:40 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884773
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 19/25] mempolicy: add MPOL_F_MEMCG flag, enforcing memcg memory limit.
Date: Wed, 3 Apr 2019 19:00:40 -0700
Message-Id: <20190404020046.32741-20-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

When MPOL_F_MEMCG is set and MPOL_PREFERRED is used, we enforce the
per-node memory limit set in the corresponding memcg.

Signed-off-by: Zi Yan
---
 include/uapi/linux/mempolicy.h |  3 ++-
 mm/mempolicy.c                 | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index eb6560e..a9d03e5 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -28,12 +28,13 @@ enum {
 /* Flags for set_mempolicy */
 #define MPOL_F_STATIC_NODES	(1 << 15)
 #define MPOL_F_RELATIVE_NODES	(1 << 14)
+#define MPOL_F_MEMCG		(1 << 13)
 
 /*
  * MPOL_MODE_FLAGS is the union of all possible optional mode flags passed to
  * either set_mempolicy() or mbind().
  */
-#define MPOL_MODE_FLAGS	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)
+#define MPOL_MODE_FLAGS	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_MEMCG)
 
 /* Flags for get_mempolicy */
 #define MPOL_F_NODE	(1<<0)	/* return next IL mode instead of node mask */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index af171cc..0e30049 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2040,6 +2040,42 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 		goto out;
 	}
 
+	if (pol->mode == MPOL_PREFERRED && (pol->flags & MPOL_F_MEMCG)) {
+		struct task_struct *p = current;
+		struct mem_cgroup *memcg = mem_cgroup_from_task(p);
+		int nid = pol->v.preferred_node;
+		unsigned long nr_memcg_node_size;
+		struct mm_struct *mm = get_task_mm(p);
+		unsigned long nr_pages = hugepage ? HPAGE_PMD_NR : 1;
+
+		if (!(memcg && mm)) {
+			if (mm)
+				mmput(mm);
+			goto use_other_policy;
+		}
+
+		/* skip the preferred node if mm_manage is going on */
+		if (test_bit(MMF_MM_MANAGE, &mm->flags)) {
+			nid = next_memory_node(nid);
+			if (nid == MAX_NUMNODES)
+				nid = first_memory_node;
+		}
+		mmput(mm);
+
+		nr_memcg_node_size = memcg_max_size_node(memcg, nid);
+
+		/* fall over to the next memory node once the limit is hit */
+		while (nr_memcg_node_size != ULONG_MAX &&
+		       nr_memcg_node_size <= (memcg_size_node(memcg, nid) + nr_pages)) {
+			if ((nid = next_memory_node(nid)) == MAX_NUMNODES)
+				nid = first_memory_node;
+			nr_memcg_node_size = memcg_max_size_node(memcg, nid);
+		}
+
+		mpol_cond_put(pol);
+		page = __alloc_pages_node(nid, gfp | __GFP_THISNODE, order);
+		goto out;
+	}
+use_other_policy:
 	if (unlikely(IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && hugepage)) {
 		int hpage_node = node;
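[A minimal sketch of how a task could opt in to this behavior, not part of the patch; it assumes libnuma's set_mempolicy() wrapper from <numaif.h> (link with -lnuma) and mirrors the MPOL_F_MEMCG value defined above.]

#include <stdio.h>
#include <numaif.h>		/* set_mempolicy(), MPOL_PREFERRED */

#define MPOL_F_MEMCG (1 << 13)	/* added by this patch */

int main(void)
{
	unsigned long nodemask = 1UL << 0;	/* prefer node 0 */

	/* spill to the next node once this memcg's node-0 limit is hit */
	if (set_mempolicy(MPOL_PREFERRED | MPOL_F_MEMCG,
			  &nodemask, sizeof(nodemask) * 8) != 0)
		perror("set_mempolicy");
	return 0;
}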
From patchwork Thu Apr 4 02:00:41 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884775
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 20/25] memory manage: Add memory manage syscall.
Date: Wed, 3 Apr 2019 19:00:41 -0700
Message-Id: <20190404020046.32741-21-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

This prepares for the following patches to provide a user API to
manipulate pages in two memory nodes with the help of memcg.

missing memcg_max_size_node()

Signed-off-by: Zi Yan
---
 arch/x86/entry/syscalls/syscall_64.tbl |   1 +
 include/linux/sched/coredump.h         |   1 +
 include/linux/syscalls.h               |   5 ++
 include/uapi/linux/mempolicy.h         |   1 +
 mm/Makefile                            |   1 +
 mm/internal.h                          |   2 +
 mm/memory_manage.c                     | 109 +++++++++++++++++++++++++++++++++
 mm/mempolicy.c                         |   2 +-
 8 files changed, 121 insertions(+), 1 deletion(-)
 create mode 100644 mm/memory_manage.c

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 863a21e..fa8def3 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -344,6 +344,7 @@
 333	common	io_pgetevents		__x64_sys_io_pgetevents
 334	common	rseq			__x64_sys_rseq
 335	common	exchange_pages		__x64_sys_exchange_pages
+336	common	mm_manage		__x64_sys_mm_manage
 # don't use numbers 387 through 423, add new calls after the last
 # 'common' entry
 424	common	pidfd_send_signal	__x64_sys_pidfd_send_signal
diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index ecdc654..9aa9d94b 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -73,6 +73,7 @@ static inline int get_dumpable(struct mm_struct *mm)
 #define MMF_OOM_VICTIM		25	/* mm is the oom victim */
 #define MMF_OOM_REAP_QUEUED	26	/* mm was queued for oom_reaper */
 #define MMF_DISABLE_THP_MASK	(1 << MMF_DISABLE_THP)
+#define MMF_MM_MANAGE		27
 
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
 				 MMF_DISABLE_THP_MASK)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 2c1eb49..47d56c5 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1208,6 +1208,11 @@ asmlinkage long sys_exchange_pages(pid_t pid, unsigned long nr_pages,
 				const void __user * __user *to_pages,
 				int __user *status, int flags);
+asmlinkage long sys_mm_manage(pid_t pid, unsigned long nr_pages,
+				unsigned long maxnode,
+				const unsigned long __user *old_nodes,
+				const unsigned long __user *new_nodes,
+				int flags);
 
 /*
  * Not a real system call, but a placeholder for syscalls which are
diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index a9d03e5..4722bb7 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -52,6 +52,7 @@ enum {
 #define MPOL_MF_MOVE_DMA (1<<5)	/* Use DMA page copy routine */
 #define MPOL_MF_MOVE_MT  (1<<6)	/* Use multi-threaded page copy routine */
*/ #define MPOL_MF_MOVE_CONCUR (1<<7) /* Move pages in a batch */ +#define MPOL_MF_EXCHANGE (1<<8) /* Exchange pages */ #define MPOL_MF_VALID (MPOL_MF_STRICT | \ MPOL_MF_MOVE | \ diff --git a/mm/Makefile b/mm/Makefile index 2f1f1ad..5302d79 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -47,6 +47,7 @@ obj-y += memblock.o obj-y += copy_page.o obj-y += exchange.o obj-y += exchange_page.o +obj-y += memory_manage.o ifdef CONFIG_MMU obj-$(CONFIG_ADVISE_SYSCALLS) += madvise.o diff --git a/mm/internal.h b/mm/internal.h index cf63bf6..94feb14 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -574,5 +574,7 @@ bool buffer_migrate_lock_buffers(struct buffer_head *head, int writeout(struct address_space *mapping, struct page *page); int expected_page_refs(struct address_space *mapping, struct page *page); +int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask, + unsigned long maxnode); #endif /* __MM_INTERNAL_H */ diff --git a/mm/memory_manage.c b/mm/memory_manage.c new file mode 100644 index 0000000..b8f3654 --- /dev/null +++ b/mm/memory_manage.c @@ -0,0 +1,109 @@ +/* + * A syscall used to move pages between two nodes. + */ + +#include +#include +#include +#include +#include +#include + +#include "internal.h" + + +SYSCALL_DEFINE6(mm_manage, pid_t, pid, unsigned long, nr_pages, + unsigned long, maxnode, + const unsigned long __user *, slow_nodes, + const unsigned long __user *, fast_nodes, + int, flags) +{ + const struct cred *cred = current_cred(), *tcred; + struct task_struct *task; + struct mm_struct *mm = NULL; + int err; + nodemask_t task_nodes; + nodemask_t *slow; + nodemask_t *fast; + NODEMASK_SCRATCH(scratch); + + if (!scratch) + return -ENOMEM; + + slow = &scratch->mask1; + fast = &scratch->mask2; + + err = get_nodes(slow, slow_nodes, maxnode); + if (err) + goto out; + + err = get_nodes(fast, fast_nodes, maxnode); + if (err) + goto out; + + /* Check flags */ + if (flags & ~(MPOL_MF_MOVE_MT| + MPOL_MF_MOVE_DMA| + MPOL_MF_MOVE_CONCUR| + MPOL_MF_EXCHANGE)) + return -EINVAL; + + /* Find the mm_struct */ + rcu_read_lock(); + task = pid ? find_task_by_vpid(pid) : current; + if (!task) { + rcu_read_unlock(); + err = -ESRCH; + goto out; + } + get_task_struct(task); + + err = -EINVAL; + /* + * Check if this process has the right to modify the specified + * process. The right exists if the process has administrative + * capabilities, superuser privileges or the same + * userid as the target process. + */ + tcred = __task_cred(task); + if (!uid_eq(cred->euid, tcred->suid) && !uid_eq(cred->euid, tcred->uid) && + !uid_eq(cred->uid, tcred->suid) && !uid_eq(cred->uid, tcred->uid) && + !capable(CAP_SYS_NICE)) { + rcu_read_unlock(); + err = -EPERM; + goto out_put; + } + rcu_read_unlock(); + + err = security_task_movememory(task); + if (err) + goto out_put; + + task_nodes = cpuset_mems_allowed(task); + mm = get_task_mm(task); + put_task_struct(task); + + if (!mm) { + err = -EINVAL; + goto out; + } + if (test_bit(MMF_MM_MANAGE, &mm->flags)) { + mmput(mm); + goto out; + } else { + set_bit(MMF_MM_MANAGE, &mm->flags); + } + + + clear_bit(MMF_MM_MANAGE, &mm->flags); + mmput(mm); +out: + NODEMASK_SCRATCH_FREE(scratch); + + return err; + +out_put: + put_task_struct(task); + goto out; + +} \ No newline at end of file diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 0e30049..168d17f8 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -1249,7 +1249,7 @@ static long do_mbind(unsigned long start, unsigned long len, */ /* Copy a node mask from user space. 
  */
-static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
+int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		unsigned long maxnode)
 {
 	unsigned long k;
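[A minimal userspace sketch of invoking the new syscall, not part of the patch; it assumes the syscall number 336 from the table above and mirrors MPOL_MF_MOVE_MT from the uapi header. The node numbers are placeholders.]

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#define __NR_mm_manage	336		/* from syscall_64.tbl above */
#define MPOL_MF_MOVE_MT	(1 << 6)	/* multi-threaded page copy */

int main(void)
{
	unsigned long slow = 1UL << 0;	/* node 0: slow memory */
	unsigned long fast = 1UL << 1;	/* node 1: fast memory */

	/* pid 0: current process; manage up to 1024 pages; maxnode = 64 */
	if (syscall(__NR_mm_manage, 0, 1024UL, 64UL, &slow, &fast,
		    MPOL_MF_MOVE_MT) < 0)
		perror("mm_manage");
	return 0;
}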
From patchwork Thu Apr 4 02:00:42 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884777
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 21/25] mm: move update_lru_sizes() to mm_inline.h for broader use.
Date: Wed, 3 Apr 2019 19:00:42 -0700
Message-Id: <20190404020046.32741-22-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Signed-off-by: Zi Yan
---
 include/linux/mm_inline.h | 21 +++++++++++++++++++++
 mm/vmscan.c               | 25 ++-----------------------
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 04ec454..b9fbd0b 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -44,6 +44,27 @@ static __always_inline void update_lru_size(struct lruvec *lruvec,
 #endif
 }
 
+/*
+ * Update LRU sizes after isolating pages. The LRU size updates must
+ * be complete before mem_cgroup_update_lru_size due to a sanity check.
+ */
+static __always_inline void update_lru_sizes(struct lruvec *lruvec,
+			enum lru_list lru, unsigned long *nr_zone_taken)
+{
+	int zid;
+
+	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+		if (!nr_zone_taken[zid])
+			continue;
+
+		__update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
+#ifdef CONFIG_MEMCG
+		mem_cgroup_update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
+#endif
+	}
+
+}
+
 static __always_inline void add_page_to_lru_list(struct page *page,
 				struct lruvec *lruvec, enum lru_list lru)
 {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a5ad0b3..1d539d6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1593,27 +1593,6 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode)
 }
 
-/*
- * Update LRU sizes after isolating pages. The LRU size updates must
- * be complete before mem_cgroup_update_lru_size due to a santity check.
- */
-static __always_inline void update_lru_sizes(struct lruvec *lruvec,
-			enum lru_list lru, unsigned long *nr_zone_taken)
-{
-	int zid;
-
-	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-		if (!nr_zone_taken[zid])
-			continue;
-
-		__update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
-#ifdef CONFIG_MEMCG
-		mem_cgroup_update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
-#endif
-	}
-
-}
-
 /**
  * pgdat->lru_lock is heavily contended.
  * Some of the functions that
 * shrink the lists perform better by taking out a batch of pages
@@ -1804,7 +1783,7 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
 	return isolated > inactive;
 }
 
-static noinline_for_stack void
+noinline_for_stack void
 putback_inactive_pages(struct lruvec *lruvec, struct list_head *page_list)
 {
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
@@ -2003,7 +1982,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  *
  * Returns the number of pages moved to the given lru.
  */
-static unsigned move_active_pages_to_lru(struct lruvec *lruvec,
+unsigned move_active_pages_to_lru(struct lruvec *lruvec,
 					 struct list_head *list,
 					 struct list_head *pages_to_free,
 					 enum lru_list lru)
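[For context, a minimal kernel-side sketch of the calling pattern the now-shared helper expects; this is a hypothetical caller, not code from this series. Pages are isolated onto a private list, tallied per zone, then the LRU accounting is adjusted in one batched update, as isolate_lru_pages() does.]

#include <linux/mm.h>
#include <linux/mm_inline.h>

static void account_isolated_pages(struct lruvec *lruvec, enum lru_list lru,
				   struct list_head *isolated)
{
	unsigned long nr_zone_taken[MAX_NR_ZONES] = { 0 };
	struct page *page;

	/* pages are assumed to have already been moved onto @isolated */
	list_for_each_entry(page, isolated, lru)
		nr_zone_taken[page_zonenum(page)] += hpage_nr_pages(page);

	/* one batched LRU-size update for all zones */
	update_lru_sizes(lruvec, lru, nr_zone_taken);
}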
From patchwork Thu Apr 4 02:00:43 2019
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 10884779
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton,
 Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta,
 Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 22/25] memory manage: active/inactive page list manipulation in memcg.
Date: Wed, 3 Apr 2019 19:00:43 -0700
Message-Id: <20190404020046.32741-23-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

The syscall allows users to trigger page list scanning, so that pages
are actively moved between the active and inactive lists according to
page references. This is limited to the memcg to which the process
belongs; it does not touch the global LRU lists, i.e. the root memcg.

Signed-off-by: Zi Yan
---
 include/uapi/linux/mempolicy.h |  1 +
 mm/internal.h                  | 93 +++++++++++++++++++++++++++++++++++++++-
 mm/memory_manage.c             | 76 +++++++++++++++++++++++++++++++-
 mm/vmscan.c                    | 90 ++++++++--------------------------------
 4 files changed, 184 insertions(+), 76 deletions(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 4722bb7..dac474a 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -53,6 +53,7 @@ enum {
 #define MPOL_MF_MOVE_MT     (1<<6)	/* Use multi-threaded page copy routine */
 #define MPOL_MF_MOVE_CONCUR (1<<7)	/* Move pages in a batch */
 #define MPOL_MF_EXCHANGE    (1<<8)	/* Exchange pages */
+#define MPOL_MF_SHRINK_LISTS (1<<9)	/* Shrink the active/inactive lists */
 
 #define MPOL_MF_VALID	(MPOL_MF_STRICT |	\
 			 MPOL_MF_MOVE |		\
diff --git a/mm/internal.h b/mm/internal.h
index 94feb14..eec88de 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -564,7 +564,7 @@ extern int copy_page_lists_mt(struct page **to,
 extern int exchange_page_mthread(struct page *to, struct page *from,
 		int nr_pages);
 extern int exchange_page_lists_mthread(struct page **to,
-			struct page **from,
+		struct page **from,
 		int nr_pages);
 
 extern int exchange_two_pages(struct page *page1, struct page *page2);
@@ -577,4 +577,95 @@ int expected_page_refs(struct address_space *mapping, struct page *page);
 int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		unsigned long maxnode);
 
+unsigned move_active_pages_to_lru(struct lruvec *lruvec,
+			struct list_head *list,
+			struct list_head *pages_to_free,
+			enum lru_list lru);
+void putback_inactive_pages(struct lruvec *lruvec, struct list_head *page_list);
+
+struct scan_control {
+	/* How many pages shrink_list() should reclaim */
+	unsigned long nr_to_reclaim;
+
+	/*
+	 * Nodemask of nodes allowed by the caller. If NULL, all nodes
+	 * are scanned.
+	 */
+	nodemask_t *nodemask;
+
+	/*
+	 * The memory cgroup that hit its limit and as a result is the
+	 * primary target of this reclaim invocation.
+	 */
+	struct mem_cgroup *target_mem_cgroup;
+
+	/* Writepage batching in laptop mode; RECLAIM_WRITE */
+	unsigned int may_writepage:1;
+
+	/* Can mapped pages be reclaimed? */
+	unsigned int may_unmap:1;
+
+	/* Can pages be swapped as part of reclaim? */
+	unsigned int may_swap:1;
+
+	/* e.g. boosted watermark reclaim leaves slabs alone */
+	unsigned int may_shrinkslab:1;
+
+	/*
+	 * Cgroups are not reclaimed below their configured memory.low,
+	 * unless we threaten to OOM. If any cgroups are skipped due to
+	 * memory.low and nothing was reclaimed, go back for memory.low.
+	 */
+ */ + unsigned int memcg_low_reclaim:1; + unsigned int memcg_low_skipped:1; + + unsigned int hibernation_mode:1; + + /* One of the zones is ready for compaction */ + unsigned int compaction_ready:1; + + unsigned int isolate_only_huge_page:1; + unsigned int isolate_only_base_page:1; + unsigned int no_reclaim:1; + + /* Allocation order */ + s8 order; + + /* Scan (total_size >> priority) pages at once */ + s8 priority; + + /* The highest zone to isolate pages for reclaim from */ + s8 reclaim_idx; + + /* This context's GFP mask */ + gfp_t gfp_mask; + + /* Incremented by the number of inactive pages that were scanned */ + unsigned long nr_scanned; + + /* Number of pages freed so far during a call to shrink_zones() */ + unsigned long nr_reclaimed; + + struct { + unsigned int dirty; + unsigned int unqueued_dirty; + unsigned int congested; + unsigned int writeback; + unsigned int immediate; + unsigned int file_taken; + unsigned int taken; + } nr; +}; + +unsigned long isolate_lru_pages(unsigned long nr_to_scan, + struct lruvec *lruvec, struct list_head *dst, + unsigned long *nr_scanned, struct scan_control *sc, + enum lru_list lru); +void shrink_active_list(unsigned long nr_to_scan, + struct lruvec *lruvec, + struct scan_control *sc, + enum lru_list lru); +unsigned long shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec, + struct scan_control *sc, enum lru_list lru); + #endif /* __MM_INTERNAL_H */ diff --git a/mm/memory_manage.c b/mm/memory_manage.c index b8f3654..e8dddbf 100644 --- a/mm/memory_manage.c +++ b/mm/memory_manage.c @@ -5,13 +5,79 @@ #include #include #include +#include +#include #include +#include #include +#include #include #include "internal.h" +static unsigned long shrink_lists_node_memcg(pg_data_t *pgdat, + struct mem_cgroup *memcg, unsigned long nr_to_scan) +{ + struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg); + enum lru_list lru; + + for_each_evictable_lru(lru) { + unsigned long nr_to_scan_local = lruvec_size_memcg_node(lru, memcg, + pgdat->node_id) / 2; + struct scan_control sc = {.may_unmap = 1, .no_reclaim = 1}; + /*nr_reclaimed += shrink_list(lru, nr_to_scan, lruvec, memcg, sc);*/ + /* + * For the slow node we want the active list: start from the top of + * the active list; pages found at the bottom of the inactive list + * can be rotated to the top of the inactive list. + */ + /* + * For the fast node we want the inactive list: start from the bottom + * of the inactive list. Pages on the active list are kept as they are. + */ + /* + * A key open question is how many pages to scan each time, and what + * criteria to use when moving pages between the active/inactive lists. + */ + if (is_active_lru(lru)) + shrink_active_list(nr_to_scan_local, lruvec, &sc, lru); + else + shrink_inactive_list(nr_to_scan_local, lruvec, &sc, lru); + } + cond_resched(); + + return 0; +} + +static int shrink_lists(struct task_struct *p, struct mm_struct *mm, + const nodemask_t *slow, const nodemask_t *fast, unsigned long nr_to_scan) +{ + struct mem_cgroup *memcg = mem_cgroup_from_task(p); + int slow_nid, fast_nid; + int err = 0; + + if (!memcg) + return 0; + /* Let's handle the simplest situation first */ + if (!(nodes_weight(*slow) == 1 && nodes_weight(*fast) == 1)) + return 0; + + if (memcg == root_mem_cgroup) + return 0; + + slow_nid = first_node(*slow); + fast_nid = first_node(*fast); + + /* move pages between page lists in the slow node */ + shrink_lists_node_memcg(NODE_DATA(slow_nid), memcg, nr_to_scan); + + /* move pages between page lists in the fast node */ + shrink_lists_node_memcg(NODE_DATA(fast_nid), memcg, nr_to_scan); + + return err; +} + SYSCALL_DEFINE6(mm_manage, pid_t, pid, unsigned long, nr_pages, unsigned long, maxnode, const unsigned long __user *, slow_nodes, @@ -42,10 +108,14 @@ SYSCALL_DEFINE6(mm_manage, pid_t, pid, unsigned long, nr_pages, goto out; /* Check flags */ - if (flags & ~(MPOL_MF_MOVE_MT| + if (flags & ~( + MPOL_MF_MOVE| + MPOL_MF_MOVE_MT| MPOL_MF_MOVE_DMA| MPOL_MF_MOVE_CONCUR| - MPOL_MF_EXCHANGE)) + MPOL_MF_EXCHANGE| + MPOL_MF_SHRINK_LISTS| + MPOL_MF_MOVE_ALL)) return -EINVAL; /* Find the mm_struct */ @@ -94,6 +164,8 @@ SYSCALL_DEFINE6(mm_manage, pid_t, pid, unsigned long, nr_pages, set_bit(MMF_MM_MANAGE, &mm->flags); } + if (flags & MPOL_MF_SHRINK_LISTS) + shrink_lists(task, mm, slow, fast, nr_pages); clear_bit(MMF_MM_MANAGE, &mm->flags); mmput(mm); diff --git a/mm/vmscan.c b/mm/vmscan.c index 1d539d6..3693550 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -63,75 +63,6 @@ #define CREATE_TRACE_POINTS #include -struct scan_control { - /* How many pages shrink_list() should reclaim */ - unsigned long nr_to_reclaim; - - /* - * Nodemask of nodes allowed by the caller. If NULL, all nodes - * are scanned. - */ - nodemask_t *nodemask; - - /* - * The memory cgroup that hit its limit and as a result is the - * primary target of this reclaim invocation. - */ - struct mem_cgroup *target_mem_cgroup; - - /* Writepage batching in laptop mode; RECLAIM_WRITE */ - unsigned int may_writepage:1; - - /* Can mapped pages be reclaimed? */ - unsigned int may_unmap:1; - - /* Can pages be swapped as part of reclaim? */ - unsigned int may_swap:1; - - /* e.g. boosted watermark reclaim leaves slabs alone */ - unsigned int may_shrinkslab:1; - - /* - * Cgroups are not reclaimed below their configured memory.low, - * unless we threaten to OOM. If any cgroups are skipped due to - * memory.low and nothing was reclaimed, go back for memory.low.
- */ - unsigned int memcg_low_reclaim:1; - unsigned int memcg_low_skipped:1; - - unsigned int hibernation_mode:1; - - /* One of the zones is ready for compaction */ - unsigned int compaction_ready:1; - - /* Allocation order */ - s8 order; - - /* Scan (total_size >> priority) pages at once */ - s8 priority; - - /* The highest zone to isolate pages for reclaim from */ - s8 reclaim_idx; - - /* This context's GFP mask */ - gfp_t gfp_mask; - - /* Incremented by the number of inactive pages that were scanned */ - unsigned long nr_scanned; - - /* Number of pages freed so far during a call to shrink_zones() */ - unsigned long nr_reclaimed; - - struct { - unsigned int dirty; - unsigned int unqueued_dirty; - unsigned int congested; - unsigned int writeback; - unsigned int immediate; - unsigned int file_taken; - unsigned int taken; - } nr; -}; #ifdef ARCH_HAS_PREFETCH #define prefetch_prev_lru_page(_page, _base, _field) \ @@ -1261,6 +1192,15 @@ static unsigned long shrink_page_list(struct list_head *page_list, ; /* try to reclaim the page below */ } + /* + * Keep the page on the inactive list for migration in the + * next step. + */ + if (sc->no_reclaim) { + stat->nr_ref_keep++; + goto keep_locked; + } + /* * Anonymous process memory has backing store? * Try to allocate it some swap space here. @@ -1613,7 +1551,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode) * * returns how many pages were moved onto *@dst. */ -static unsigned long isolate_lru_pages(unsigned long nr_to_scan, +unsigned long isolate_lru_pages(unsigned long nr_to_scan, struct lruvec *lruvec, struct list_head *dst, unsigned long *nr_scanned, struct scan_control *sc, enum lru_list lru) @@ -1634,6 +1572,22 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, struct page *page; page = lru_to_page(src); + nr_pages = hpage_nr_pages(page); + + /* + * Move filtered pages to pages_skipped instead of leaving them + * on src, so the scan does not refetch them forever; they are + * spliced back to src after the loop. + */ + if (sc->isolate_only_base_page && nr_pages != 1) { + list_move(&page->lru, &pages_skipped); + continue; + } + if (sc->isolate_only_huge_page && nr_pages == 1) { + list_move(&page->lru, &pages_skipped); + continue; + } + prefetchw_prev_lru_page(page, src, flags); VM_BUG_ON_PAGE(!PageLRU(page), page); @@ -1653,7 +1598,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, scan++; switch (__isolate_lru_page(page, mode)) { case 0: - nr_pages = hpage_nr_pages(page); nr_taken += nr_pages; nr_zone_taken[page_zonenum(page)] += nr_pages; list_move(&page->lru, dst); @@ -1855,7 +1799,7 @@ static int current_may_throttle(void) * shrink_inactive_list() is a helper for shrink_node().
It returns the number * of reclaimed pages */ -static noinline_for_stack unsigned long +noinline_for_stack unsigned long shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec, struct scan_control *sc, enum lru_list lru) { @@ -2029,7 +1973,7 @@ unsigned move_active_pages_to_lru(struct lruvec *lruvec, return nr_moved; } -static void shrink_active_list(unsigned long nr_to_scan, +void shrink_active_list(unsigned long nr_to_scan, struct lruvec *lruvec, struct scan_control *sc, enum lru_list lru)

From patchwork Thu Apr 4 02:00:44 2019
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 23/25] memory manage: page migration based page manipulation between NUMA nodes.
Date: Wed, 3 Apr 2019 19:00:44 -0700
Message-Id: <20190404020046.32741-24-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com
From: Zi Yan

Users are expected to set the memcg max size to reflect their memory resource allocation policy. The syscall simply migrates pages belonging to the application's memcg between from_node and to_node, where from_node is considered fast memory and to_node slow memory. In the common case, active (hot) pages are migrated from to_node to from_node, and inactive (cold) pages are migrated from from_node to to_node.

Base pages and huge pages are migrated separately to achieve high throughput:
1. They are migrated via different calls.
2. 4KB base pages are not transferred via the multi-threaded routine.
3. All pages are migrated together if no optimization is used.
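Building on the sketch from patch 22 (same hypothetical wrapper, syscall number, and nodemasks), a caller combines the shrink and move flags so the LRU lists are re-sorted first and hot pages are then pulled into the fast node with the multi-threaded copy routine:

	/* MPOL_MF_MOVE is (1 << 1) in the uapi header; MPOL_MF_MOVE_MT is (1 << 6) */
	long ret = syscall(__NR_mm_manage, pid, 4096UL, maxnode,
			   &slow_nodes, &fast_nodes,
			   MPOL_MF_SHRINK_LISTS | MPOL_MF_MOVE | MPOL_MF_MOVE_MT);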
Signed-off-by: Zi Yan --- mm/memory_manage.c | 275 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 275 insertions(+) diff --git a/mm/memory_manage.c b/mm/memory_manage.c index e8dddbf..d63ad25 100644 --- a/mm/memory_manage.c +++ b/mm/memory_manage.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -15,6 +16,11 @@ #include "internal.h" +enum isolate_action { + ISOLATE_COLD_PAGES = 1, + ISOLATE_HOT_PAGES, + ISOLATE_HOT_AND_COLD_PAGES, +}; static unsigned long shrink_lists_node_memcg(pg_data_t *pgdat, struct mem_cgroup *memcg, unsigned long nr_to_scan) @@ -78,6 +84,272 @@ static int shrink_lists(struct task_struct *p, struct mm_struct *mm, return err; } +static unsigned long isolate_pages_from_lru_list(pg_data_t *pgdat, + struct mem_cgroup *memcg, unsigned long nr_pages, + struct list_head *base_page_list, + struct list_head *huge_page_list, + unsigned long *nr_taken_base_page, + unsigned long *nr_taken_huge_page, + enum isolate_action action) +{ + struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg); + enum lru_list lru; + unsigned long nr_all_taken = 0; + + if (nr_pages == ULONG_MAX) + nr_pages = memcg_size_node(memcg, pgdat->node_id); + + lru_add_drain_all(); + + for_each_evictable_lru(lru) { + unsigned long nr_scanned, nr_taken; + int file = is_file_lru(lru); + struct scan_control sc = {.may_unmap = 1}; + + if (action == ISOLATE_COLD_PAGES && is_active_lru(lru)) + continue; + if (action == ISOLATE_HOT_PAGES && !is_active_lru(lru)) + continue; + + spin_lock_irq(&pgdat->lru_lock); + + /* Isolate base pages */ + sc.isolate_only_base_page = 1; + nr_taken = isolate_lru_pages(nr_pages, lruvec, base_page_list, + &nr_scanned, &sc, lru); + /* Isolate huge pages */ + sc.isolate_only_base_page = 0; + sc.isolate_only_huge_page = 1; + nr_taken += isolate_lru_pages(nr_pages - nr_scanned, lruvec, + huge_page_list, &nr_scanned, &sc, lru); + + __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken); + + spin_unlock_irq(&pgdat->lru_lock); + + nr_all_taken += nr_taken; + + if (nr_all_taken > nr_pages) + break; + } + + return nr_all_taken; +} + +static int migrate_to_node(struct list_head *page_list, int nid, + enum migrate_mode mode) +{ + bool migrate_concur = mode & MIGRATE_CONCUR; + int num = 0; + int from_nid; + int err; + + if (list_empty(page_list)) + return num; + + from_nid = page_to_nid(list_first_entry(page_list, struct page, lru)); + + if (migrate_concur) + err = migrate_pages_concur(page_list, alloc_new_node_page, + NULL, nid, mode, MR_SYSCALL); + else + err = migrate_pages(page_list, alloc_new_node_page, + NULL, nid, mode, MR_SYSCALL); + + if (err) { + struct page *page; + + list_for_each_entry(page, page_list, lru) + num += hpage_nr_pages(page); + pr_debug("%d pages failed to migrate from %d to %d\n", + num, from_nid, nid); + + putback_movable_pages(page_list); + } + return num; +} + +static inline int _putback_overflow_pages(unsigned long max_nr_pages, + struct list_head *page_list, unsigned long *nr_remaining_pages) +{ + struct page *page; + LIST_HEAD(putback_list); + + if (list_empty(page_list)) + return max_nr_pages; + + *nr_remaining_pages = 0; + /* in case we need to drop the whole list */ + page = list_first_entry(page_list, struct page, lru); + if (max_nr_pages <= (2 * hpage_nr_pages(page))) { + max_nr_pages = 0; + putback_movable_pages(page_list); + goto out; + } + + list_for_each_entry(page, page_list, lru) { + int nr_pages = hpage_nr_pages(page); + /* drop just one more page to avoid using up free space */ + if 
(max_nr_pages <= (2 * nr_pages)) { + max_nr_pages = 0; + break; + } + max_nr_pages -= nr_pages; + *nr_remaining_pages += nr_pages; + } + + /* we did not scan all pages in page_list, we need to put back some */ + if (&page->lru != page_list) { + list_cut_position(&putback_list, page_list, &page->lru); + putback_movable_pages(page_list); + list_splice(&putback_list, page_list); + } +out: + return max_nr_pages; +} + +static int putback_overflow_pages(unsigned long max_nr_base_pages, + unsigned long max_nr_huge_pages, + long nr_free_pages, + struct list_head *base_page_list, + struct list_head *huge_page_list, + unsigned long *nr_base_pages, + unsigned long *nr_huge_pages) +{ + if (nr_free_pages < 0) { + if ((-nr_free_pages) > max_nr_base_pages) { + nr_free_pages += max_nr_base_pages; + max_nr_base_pages = 0; + } + + if ((-nr_free_pages) > max_nr_huge_pages) { + nr_free_pages = 0; + max_nr_huge_pages = 0; + } + } + /* + * Count pages in the page lists and subtract the number from max_nr_*; + * when max_nr_* goes to zero, drop the remaining pages. + */ + max_nr_huge_pages += _putback_overflow_pages(nr_free_pages/2 + max_nr_base_pages, + base_page_list, nr_base_pages); + return _putback_overflow_pages(nr_free_pages/2 + max_nr_huge_pages, + huge_page_list, nr_huge_pages); +} + +static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, + const nodemask_t *slow, const nodemask_t *fast, + unsigned long nr_pages, int flags) +{ + bool migrate_mt = flags & MPOL_MF_MOVE_MT; + bool migrate_concur = flags & MPOL_MF_MOVE_CONCUR; + bool migrate_dma = flags & MPOL_MF_MOVE_DMA; + bool move_hot_and_cold_pages = flags & MPOL_MF_MOVE_ALL; + struct mem_cgroup *memcg = mem_cgroup_from_task(p); + int err = 0; + unsigned long nr_isolated_slow_pages; + unsigned long nr_isolated_slow_base_pages = 0; + unsigned long nr_isolated_slow_huge_pages = 0; + unsigned long nr_isolated_fast_pages; + /* in case no pages leave the fast node, we migrate all isolated + * pages from the slow node */ + unsigned long nr_isolated_fast_base_pages = ULONG_MAX; + unsigned long nr_isolated_fast_huge_pages = ULONG_MAX; + unsigned long max_nr_pages_fast_node, nr_pages_fast_node; + unsigned long nr_pages_slow_node, nr_active_pages_slow_node; + long nr_free_pages_fast_node; + int slow_nid, fast_nid; + enum migrate_mode mode = MIGRATE_SYNC | + (migrate_mt ? MIGRATE_MT : MIGRATE_SINGLETHREAD) | + (migrate_dma ? MIGRATE_DMA : MIGRATE_SINGLETHREAD) | + (migrate_concur ?
MIGRATE_CONCUR : MIGRATE_SINGLETHREAD); + enum isolate_action isolate_action = + move_hot_and_cold_pages?ISOLATE_HOT_AND_COLD_PAGES:ISOLATE_HOT_PAGES; + LIST_HEAD(slow_base_page_list); + LIST_HEAD(slow_huge_page_list); + + if (!memcg) + return 0; + /* Let's handle simplest situation first */ + if (!(nodes_weight(*slow) == 1 && nodes_weight(*fast) == 1)) + return 0; + + /* Only work on specific cgroup not the global root */ + if (memcg == root_mem_cgroup) + return 0; + + slow_nid = first_node(*slow); + fast_nid = first_node(*fast); + + max_nr_pages_fast_node = memcg_max_size_node(memcg, fast_nid); + nr_pages_fast_node = memcg_size_node(memcg, fast_nid); + nr_active_pages_slow_node = active_inactive_size_memcg_node(memcg, + slow_nid, true); + nr_pages_slow_node = memcg_size_node(memcg, slow_nid); + + nr_free_pages_fast_node = max_nr_pages_fast_node - nr_pages_fast_node; + + /* do not migrate in more pages than fast node can hold */ + nr_pages = min_t(unsigned long, max_nr_pages_fast_node, nr_pages); + /* do not migrate away more pages than slow node has */ + nr_pages = min_t(unsigned long, nr_pages_slow_node, nr_pages); + + /* if fast node has enough space, migrate all possible pages in slow node */ + if (nr_pages != ULONG_MAX && + nr_free_pages_fast_node > 0 && + nr_active_pages_slow_node < nr_free_pages_fast_node) { + isolate_action = ISOLATE_HOT_AND_COLD_PAGES; + } + + nr_isolated_slow_pages = isolate_pages_from_lru_list(NODE_DATA(slow_nid), + memcg, nr_pages, &slow_base_page_list, &slow_huge_page_list, + &nr_isolated_slow_base_pages, &nr_isolated_slow_huge_pages, + isolate_action); + + if (max_nr_pages_fast_node != ULONG_MAX && + (nr_free_pages_fast_node < 0 || + nr_free_pages_fast_node < nr_isolated_slow_pages)) { + LIST_HEAD(fast_base_page_list); + LIST_HEAD(fast_huge_page_list); + + nr_isolated_fast_base_pages = 0; + nr_isolated_fast_huge_pages = 0; + /* isolate pages on fast node to make space */ + nr_isolated_fast_pages = isolate_pages_from_lru_list(NODE_DATA(fast_nid), + memcg, + nr_isolated_slow_pages - nr_free_pages_fast_node, + &fast_base_page_list, &fast_huge_page_list, + &nr_isolated_fast_base_pages, &nr_isolated_fast_huge_pages, + move_hot_and_cold_pages?ISOLATE_HOT_AND_COLD_PAGES:ISOLATE_COLD_PAGES); + + /* Migrate pages to slow node */ + /* No multi-threaded migration for base pages */ + nr_isolated_fast_base_pages -= + migrate_to_node(&fast_base_page_list, slow_nid, mode & ~MIGRATE_MT); + + nr_isolated_fast_huge_pages -= + migrate_to_node(&fast_huge_page_list, slow_nid, mode); + } + + if (nr_isolated_fast_base_pages != ULONG_MAX && + nr_isolated_fast_huge_pages != ULONG_MAX) + putback_overflow_pages(nr_isolated_fast_base_pages, + nr_isolated_fast_huge_pages, nr_free_pages_fast_node, + &slow_base_page_list, &slow_huge_page_list, + &nr_isolated_slow_base_pages, + &nr_isolated_slow_huge_pages); + + /* Migrate pages to fast node */ + /* No multi-threaded migration for base pages */ + nr_isolated_slow_base_pages -= + migrate_to_node(&slow_base_page_list, fast_nid, mode & ~MIGRATE_MT); + + nr_isolated_slow_huge_pages -= + migrate_to_node(&slow_huge_page_list, fast_nid, mode); + + return err; +} + SYSCALL_DEFINE6(mm_manage, pid_t, pid, unsigned long, nr_pages, unsigned long, maxnode, const unsigned long __user *, slow_nodes, @@ -167,6 +439,9 @@ SYSCALL_DEFINE6(mm_manage, pid_t, pid, unsigned long, nr_pages, if (flags & MPOL_MF_SHRINK_LISTS) shrink_lists(task, mm, slow, fast, nr_pages); + if (flags & MPOL_MF_MOVE) + err = do_mm_manage(task, mm, slow, fast, nr_pages, flags); + 
clear_bit(MMF_MM_MANAGE, &mm->flags); mmput(mm); out:

From patchwork Thu Apr 4 02:00:45 2019
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 24/25] memory manage: limit migration batch size.
Date: Wed, 3 Apr 2019 19:00:45 -0700
Message-Id: <20190404020046.32741-25-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

Make migration batch size adjustable to avoid excessive migration overheads when a lot of pages are under migration.
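The new knob appears as /proc/sys/vm/migration_batch_size (the path follows from the vm_table entry below). A sketch of a small helper a management tool might use to raise the batch size before a large migration, with error handling kept minimal:

#include <stdio.h>

/* write a new vm.migration_batch_size; returns 0 on success */
static int set_migration_batch_size(int nr_pages)
{
	FILE *f = fopen("/proc/sys/vm/migration_batch_size", "w");

	if (!f)
		return -1;
	fprintf(f, "%d\n", nr_pages);
	return fclose(f) ? -1 : 0;
}

For example, set_migration_batch_size(32) doubles the default of 16 set below; running sysctl -w vm.migration_batch_size=32 is equivalent.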
Signed-off-by: Zi Yan --- kernel/sysctl.c | 8 ++++++++ mm/memory_manage.c | 60 ++++++++++++++++++++++++++++++++++++------------------ 2 files changed, 48 insertions(+), 20 deletions(-) diff --git a/kernel/sysctl.c b/kernel/sysctl.c index b8712eb..b92e2da9 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -105,6 +105,7 @@ extern int accel_page_copy; extern unsigned int limit_mt_num; extern int use_all_dma_chans; extern int limit_dma_chans; +extern int migration_batch_size; /* External variables not in a header file.
*/ extern int suid_dumpable; @@ -1470,6 +1471,13 @@ static struct ctl_table vm_table[] = { .extra1 = &zero, }, { + .procname = "migration_batch_size", + .data = &migration_batch_size, + .maxlen = sizeof(migration_batch_size), + .mode = 0644, + .proc_handler = proc_dointvec, + }, + { .procname = "hugetlb_shm_group", .data = &sysctl_hugetlb_shm_group, .maxlen = sizeof(gid_t), diff --git a/mm/memory_manage.c b/mm/memory_manage.c index d63ad25..8b76fcf 100644 --- a/mm/memory_manage.c +++ b/mm/memory_manage.c @@ -16,6 +16,8 @@ #include "internal.h" +int migration_batch_size = 16; + enum isolate_action { ISOLATE_COLD_PAGES = 1, ISOLATE_HOT_PAGES, @@ -137,35 +139,49 @@ static unsigned long isolate_pages_from_lru_list(pg_data_t *pgdat, } static int migrate_to_node(struct list_head *page_list, int nid, - enum migrate_mode mode) + enum migrate_mode mode, int batch_size) { bool migrate_concur = mode & MIGRATE_CONCUR; + bool unlimited_batch_size = (batch_size <=0 || !migrate_concur); int num = 0; - int from_nid; + int from_nid = -1; int err; if (list_empty(page_list)) return num; - from_nid = page_to_nid(list_first_entry(page_list, struct page, lru)); + while (!list_empty(page_list)) { + LIST_HEAD(batch_page_list); + int i; - if (migrate_concur) - err = migrate_pages_concur(page_list, alloc_new_node_page, - NULL, nid, mode, MR_SYSCALL); - else - err = migrate_pages(page_list, alloc_new_node_page, - NULL, nid, mode, MR_SYSCALL); + /* it should move all pages to batch_page_list if !migrate_concur */ + for (i = 0; i < batch_size || unlimited_batch_size; i++) { + struct page *item = list_first_entry_or_null(page_list, struct page, lru); + if (!item) + break; + list_move(&item->lru, &batch_page_list); + } - if (err) { - struct page *page; + from_nid = page_to_nid(list_first_entry(&batch_page_list, struct page, lru)); - list_for_each_entry(page, page_list, lru) - num += hpage_nr_pages(page); - pr_debug("%d pages failed to migrate from %d to %d\n", - num, from_nid, nid); + if (migrate_concur) + err = migrate_pages_concur(&batch_page_list, alloc_new_node_page, + NULL, nid, mode, MR_SYSCALL); + else + err = migrate_pages(&batch_page_list, alloc_new_node_page, + NULL, nid, mode, MR_SYSCALL); - putback_movable_pages(page_list); + if (err) { + struct page *page; + + list_for_each_entry(page, &batch_page_list, lru) + num += hpage_nr_pages(page); + + putback_movable_pages(&batch_page_list); + } } + pr_debug("%d pages failed to migrate from %d to %d\n", + num, from_nid, nid); return num; } @@ -325,10 +341,12 @@ static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, /* Migrate pages to slow node */ /* No multi-threaded migration for base pages */ nr_isolated_fast_base_pages -= - migrate_to_node(&fast_base_page_list, slow_nid, mode & ~MIGRATE_MT); + migrate_to_node(&fast_base_page_list, slow_nid, + mode & ~MIGRATE_MT, migration_batch_size); nr_isolated_fast_huge_pages -= - migrate_to_node(&fast_huge_page_list, slow_nid, mode); + migrate_to_node(&fast_huge_page_list, slow_nid, mode, + migration_batch_size); } if (nr_isolated_fast_base_pages != ULONG_MAX && @@ -342,10 +360,12 @@ static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, /* Migrate pages to fast node */ /* No multi-threaded migration for base pages */ nr_isolated_slow_base_pages -= - migrate_to_node(&slow_base_page_list, fast_nid, mode & ~MIGRATE_MT); + migrate_to_node(&slow_base_page_list, fast_nid, mode & ~MIGRATE_MT, + migration_batch_size); nr_isolated_slow_huge_pages -= - migrate_to_node(&slow_huge_page_list, fast_nid, mode); 
+ migrate_to_node(&slow_huge_page_list, fast_nid, mode, + migration_batch_size); return err; }

From patchwork Thu Apr 4 02:00:46 2019
From: Zi Yan
To: Dave Hansen, Yang Shi, Keith Busch, Fengguang Wu, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Daniel Jordan, Michal Hocko, "Kirill A. Shutemov", Andrew Morton, Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove, Nitin Gupta, Javier Cabezas, David Nellans, Zi Yan
Subject: [RFC PATCH 25/25] memory manage: use exchange pages to memory manage to improve throughput.
Date: Wed, 3 Apr 2019 19:00:46 -0700
Message-Id: <20190404020046.32741-26-zi.yan@sent.com>
In-Reply-To: <20190404020046.32741-1-zi.yan@sent.com>
References: <20190404020046.32741-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

1. Exclude file-backed base pages from exchanging.
2. Split THPs in exchange_pages() if THP migration support is disabled.
3. If THP migration is supported, only exchange THPs.
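With the same hypothetical wrapper as in patch 22, page exchange is requested by adding MPOL_MF_EXCHANGE (1 << 8) on top of the move flags, so isolated hot and cold pages swap places even when neither node has free space:

	/* exchange hot pages on the slow node with cold pages on the fast node,
	   using concurrent, multi-threaded copying */
	long ret = syscall(__NR_mm_manage, pid, 4096UL, maxnode,
			   &slow_nodes, &fast_nodes,
			   MPOL_MF_MOVE | MPOL_MF_MOVE_MT | MPOL_MF_MOVE_CONCUR |
			   MPOL_MF_EXCHANGE);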
Signed-off-by: Zi Yan --- mm/memory_manage.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 173 insertions(+) diff --git a/mm/memory_manage.c b/mm/memory_manage.c index 8b76fcf..d3d07b7 100644 --- a/mm/memory_manage.c +++ b/mm/memory_manage.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -253,6 +254,147 @@ static int putback_overflow_pages(unsigned long max_nr_base_pages, huge_page_list, nr_huge_pages); } +static int add_pages_to_exchange_list(struct list_head *from_pagelist, + struct list_head *to_pagelist, struct exchange_page_info *info_list, + struct list_head *exchange_list, unsigned long info_list_size) +{ + unsigned long info_list_index = 0; + LIST_HEAD(failed_from_list); + LIST_HEAD(failed_to_list); + + while (!list_empty(from_pagelist) && !list_empty(to_pagelist)) { + struct page *from_page, *to_page; + struct exchange_page_info *one_pair = &info_list[info_list_index]; + int rc; + + from_page = list_first_entry_or_null(from_pagelist, struct page, lru); + to_page = list_first_entry_or_null(to_pagelist, struct page, lru); + + if (!from_page || !to_page) + break; + + if (!thp_migration_supported() && PageTransHuge(from_page)) { + lock_page(from_page); + rc = split_huge_page_to_list(from_page, &from_page->lru); + unlock_page(from_page); + if (rc) { + list_move(&from_page->lru, &failed_from_list); + continue; + } + } + + if (!thp_migration_supported() && PageTransHuge(to_page)) { + lock_page(to_page); + rc = split_huge_page_to_list(to_page, &to_page->lru); + unlock_page(to_page); + if (rc) { + list_move(&to_page->lru, &failed_to_list); + continue; + } + } + + if (hpage_nr_pages(from_page) != hpage_nr_pages(to_page)) { + if (!(hpage_nr_pages(from_page) == 1 || hpage_nr_pages(from_page) == HPAGE_PMD_NR)) { + list_del(&from_page->lru); + list_add(&from_page->lru, &failed_from_list); + }
+ if (!(hpage_nr_pages(to_page) == 1 || hpage_nr_pages(to_page) == HPAGE_PMD_NR)) { + list_del(&to_page->lru); + list_add(&to_page->lru, &failed_to_list); + } + continue; + } + + /* Exclude file-backed pages; exchanging them concurrently is not + * implemented yet. */ + if (page_mapping(from_page)) { + list_del(&from_page->lru); + list_add(&from_page->lru, &failed_from_list); + continue; + } + if (page_mapping(to_page)) { + list_del(&to_page->lru); + list_add(&to_page->lru, &failed_to_list); + continue; + } + + list_del(&from_page->lru); + list_del(&to_page->lru); + + one_pair->from_page = from_page; + one_pair->to_page = to_page; + + list_add_tail(&one_pair->list, exchange_list); + + info_list_index++; + if (info_list_index >= info_list_size) + break; + } + list_splice(&failed_from_list, from_pagelist); + list_splice(&failed_to_list, to_pagelist); + + return info_list_index; +} + +static unsigned long exchange_pages_between_nodes(unsigned long nr_from_pages, + unsigned long nr_to_pages, struct list_head *from_page_list, + struct list_head *to_page_list, int batch_size, + bool huge_page, enum migrate_mode mode) +{ + struct exchange_page_info *info_list; + unsigned long info_list_size = min_t(unsigned long, + nr_from_pages, nr_to_pages) / (huge_page ? HPAGE_PMD_NR : 1); + unsigned long added_size = 0; + bool migrate_concur = mode & MIGRATE_CONCUR; + LIST_HEAD(exchange_list); + + /* non-concurrent mode does not need to split into batches */ + if (!migrate_concur || batch_size <= 0) + batch_size = info_list_size; + + /* prepare for huge page split */ + if (!thp_migration_supported() && huge_page) { + batch_size = batch_size * HPAGE_PMD_NR; + info_list_size = info_list_size * HPAGE_PMD_NR; + } + + info_list = kvzalloc(sizeof(struct exchange_page_info)*batch_size, + GFP_KERNEL); + if (!info_list) + return 0; + + while (!list_empty(from_page_list) && !list_empty(to_page_list)) { + unsigned long nr_added_pages; + INIT_LIST_HEAD(&exchange_list); + + nr_added_pages = add_pages_to_exchange_list(from_page_list, to_page_list, + info_list, &exchange_list, batch_size); + + /* + * Nothing to exchange, we bail out.
+ * + * In case from_page_list and to_page_list both only have file-backed + * pages left */ + if (!nr_added_pages) + break; + + added_size += nr_added_pages; + + VM_BUG_ON(added_size > info_list_size); + + if (migrate_concur) + exchange_pages_concur(&exchange_list, mode, MR_SYSCALL); + else + exchange_pages(&exchange_list, mode, MR_SYSCALL); + + memset(info_list, 0, sizeof(struct exchange_page_info)*batch_size); + } + + kvfree(info_list); + + return info_list_size; +} + static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, const nodemask_t *slow, const nodemask_t *fast, unsigned long nr_pages, int flags) @@ -261,6 +403,7 @@ static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, bool migrate_concur = flags & MPOL_MF_MOVE_CONCUR; bool migrate_dma = flags & MPOL_MF_MOVE_DMA; bool move_hot_and_cold_pages = flags & MPOL_MF_MOVE_ALL; + bool migrate_exchange_pages = flags & MPOL_MF_EXCHANGE; struct mem_cgroup *memcg = mem_cgroup_from_task(p); int err = 0; unsigned long nr_isolated_slow_pages; @@ -338,6 +481,35 @@ static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, &nr_isolated_fast_base_pages, &nr_isolated_fast_huge_pages, move_hot_and_cold_pages?ISOLATE_HOT_AND_COLD_PAGES:ISOLATE_COLD_PAGES); + if (migrate_exchange_pages) { + unsigned long nr_exchange_pages; + + /* + * base pages can include file-backed ones, we do not handle them + * at the moment + */ + if (!thp_migration_supported()) { + nr_exchange_pages = exchange_pages_between_nodes(nr_isolated_slow_base_pages, + nr_isolated_fast_base_pages, &slow_base_page_list, + &fast_base_page_list, migration_batch_size, false, mode); + + nr_isolated_fast_base_pages -= nr_exchange_pages; + } + + /* THP page exchange */ + nr_exchange_pages = exchange_pages_between_nodes(nr_isolated_slow_huge_pages, + nr_isolated_fast_huge_pages, &slow_huge_page_list, + &fast_huge_page_list, migration_batch_size, true, mode); + + /* split THP above, so we do not need to multiply the counter */ + if (!thp_migration_supported()) + nr_isolated_fast_huge_pages -= nr_exchange_pages; + else + nr_isolated_fast_huge_pages -= nr_exchange_pages * HPAGE_PMD_NR; + + goto migrate_out; + } else { +migrate_out: /* Migrate pages to slow node */ /* No multi-threaded migration for base pages */ nr_isolated_fast_base_pages -= @@ -347,6 +519,7 @@ static int do_mm_manage(struct task_struct *p, struct mm_struct *mm, nr_isolated_fast_huge_pages -= migrate_to_node(&fast_huge_page_list, slow_nid, mode, migration_batch_size); + } } if (nr_isolated_fast_base_pages != ULONG_MAX &&