From patchwork Tue Sep 11 00:59:48 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Jordan
X-Patchwork-Id: 10594961
From: Daniel Jordan
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Cc: aaron.lu@intel.com, ak@linux.intel.com, akpm@linux-foundation.org,
 dave.dice@oracle.com, dave.hansen@linux.intel.com, hannes@cmpxchg.org,
 levyossi@icloud.com, ldufour@linux.vnet.ibm.com,
 mgorman@techsingularity.net, mhocko@kernel.org,
 Pavel.Tatashin@microsoft.com, steven.sistare@oracle.com,
 tim.c.chen@intel.com, vdavydov.dev@gmail.com, ying.huang@intel.com
Subject: [RFC PATCH v2 7/8] mm: introduce smp_list_splice to prepare for
 concurrent LRU adds
Date: Mon, 10 Sep 2018 20:59:48 -0400
Message-Id: <20180911005949.5635-4-daniel.m.jordan@oracle.com>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20180911004240.4758-1-daniel.m.jordan@oracle.com>
References: <20180911004240.4758-1-daniel.m.jordan@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
List-ID:

Now that we splice a local list onto the LRU, prepare for multiple tasks
doing this concurrently by adding a variant of the kernel's list
splicing API, list_splice, that's designed to work with multiple tasks.

Although there is naturally less parallelism to be gained from locking
the LRU head this way, the main benefit of doing this is to allow
removals to happen concurrently. The way lru_lock is today, an add
needlessly blocks removal of any page but the first in the LRU.

For now, hold lru_lock as writer to serialize the adds to ensure the
function is correct for a single thread at a time.

Yosef Lev came up with this algorithm.

Suggested-by: Yosef Lev
Signed-off-by: Daniel Jordan
---
 include/linux/list.h |  1 +
 lib/list.c           | 60 ++++++++++++++++++++++++++++++++++++++------
 mm/swap.c            |  3 ++-
 3 files changed, 56 insertions(+), 8 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index bb80fe9b48cf..6d964ea44f1a 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -48,6 +48,7 @@ static inline bool __list_del_entry_valid(struct list_head *entry)
 #endif
 
 extern void smp_list_del(struct list_head *entry);
+extern void smp_list_splice(struct list_head *list, struct list_head *head);
 
 /*
  * Insert a new entry between two known consecutive entries.
diff --git a/lib/list.c b/lib/list.c
index 22188fc0316d..d6a834ef1543 100644
--- a/lib/list.c
+++ b/lib/list.c
@@ -10,17 +10,18 @@
 #include
 
 /*
- * smp_list_del is a variant of list_del that allows concurrent list removals
- * under certain assumptions. The idea is to get away from overly coarse
- * synchronization, such as using a lock to guard an entire list, which
- * serializes all operations even though those operations might be happening on
- * disjoint parts.
+ * smp_list_del and smp_list_splice are variants of list_del and list_splice,
+ * respectively, that allow concurrent list operations under certain
+ * assumptions. The idea is to get away from overly coarse synchronization,
+ * such as using a lock to guard an entire list, which serializes all
+ * operations even though those operations might be happening on disjoint
+ * parts.
  *
  * If you want to use other functions from the list API concurrently,
  * additional synchronization may be necessary. For example, you could use a
  * rwlock as a two-mode lock, where readers use the lock in shared mode and are
- * allowed to call smp_list_del concurrently, and writers use the lock in
- * exclusive mode and are allowed to use all list operations.
+ * allowed to call smp_list_* functions concurrently, and writers use the lock
+ * in exclusive mode and are allowed to use all list operations.
  */
 
 /**
@@ -156,3 +157,48 @@ void smp_list_del(struct list_head *entry)
 	entry->next = LIST_POISON1;
 	entry->prev = LIST_POISON2;
 }
+
+/**
+ * smp_list_splice - thread-safe splice of two lists
+ * @list: the new list to add
+ * @head: the place to add it in the first list
+ *
+ * Safely handles concurrent smp_list_splice operations onto the same list head
+ * and concurrent smp_list_del operations of any list entry except @head.
+ * Assumes that @head cannot be removed.
+ */
+void smp_list_splice(struct list_head *list, struct list_head *head)
+{
+	struct list_head *first = list->next;
+	struct list_head *last = list->prev;
+	struct list_head *succ;
+
+	/*
+	 * Lock the front of @head by replacing its next pointer with NULL.
+	 * Should another thread be adding to the front, wait until it's done.
+	 */
+	succ = READ_ONCE(head->next);
+	while (succ == NULL || cmpxchg(&head->next, succ, NULL) != succ) {
+		cpu_relax();
+		succ = READ_ONCE(head->next);
+	}
+
+	first->prev = head;
+	last->next = succ;
+
+	/*
+	 * It is safe to write to succ, head's successor, because locking head
+	 * prevents succ from being removed in smp_list_del.
+	 */
+	succ->prev = last;
+
+	/*
+	 * Pairs with the implied full barrier before the cmpxchg above.
+	 * Ensures the write that unlocks the head is seen last to avoid list
+	 * corruption.
+	 */
+	smp_wmb();
+
+	/* Simultaneously complete the splice and unlock the head node. */
+	WRITE_ONCE(head->next, first);
+}
diff --git a/mm/swap.c b/mm/swap.c
index 07b951727a11..fe3098c09815 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include
 
 #include "internal.h"
 
@@ -1019,7 +1020,7 @@ void __pagevec_lru_add(struct pagevec *pvec)
 			pgdat = splice->pgdat;
 			write_lock_irqsave(&pgdat->lru_lock, flags);
 		}
-		list_splice(&splice->list, splice->lru);
+		smp_list_splice(&splice->list, splice->lru);
 	}
 
 	while (!list_empty(&singletons)) {
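
The head-locking protocol in smp_list_splice may be easier to follow outside the kernel, so here is a minimal userspace sketch of the same idea using C11 atomics. It is an illustration under stated assumptions, not the patch's implementation. struct model_node and the model_* functions are hypothetical names. atomic_compare_exchange_weak stands in for cmpxchg, and a release store stands in for smp_wmb() followed by WRITE_ONCE(). The empty-list check is borrowed from list_splice() and is not part of the patch. The sketch models concurrent splices only and does not cover interaction with smp_list_del.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct list_head. */
struct model_node {
	_Atomic(struct model_node *) next;	/* NULL means "front is locked" */
	struct model_node *prev;
};

static void model_init(struct model_node *n)
{
	atomic_store(&n->next, n);
	n->prev = n;
}

/* Plain single-threaded tail insert, used only to build example lists. */
static void model_add_tail(struct model_node *n, struct model_node *head)
{
	struct model_node *last = head->prev;

	atomic_store(&n->next, head);
	n->prev = last;
	atomic_store(&last->next, n);
	head->prev = n;
}

/* Splice the entries of @list in front of @head's current first entry. */
static void model_splice(struct model_node *list, struct model_node *head)
{
	struct model_node *first = atomic_load(&list->next);
	struct model_node *last = list->prev;
	struct model_node *succ;

	/* Assumption: skip empty lists as list_splice() does. */
	if (first == list)
		return;

	/* Lock the front of @head by swapping its next pointer for NULL. */
	for (;;) {
		succ = atomic_load(&head->next);
		if (succ != NULL &&
		    atomic_compare_exchange_weak(&head->next, &succ, NULL))
			break;
	}

	/* Link the new entries while other splicers spin on NULL. */
	first->prev = head;
	atomic_store_explicit(&last->next, succ, memory_order_relaxed);
	succ->prev = last;

	/* Publish the new first entry and unlock in one release store. */
	atomic_store_explicit(&head->next, first, memory_order_release);
}

/* Count entries and check that every back link matches its forward link. */
static size_t model_length(struct model_node *head)
{
	size_t n = 0;
	struct model_node *p;

	for (p = atomic_load(&head->next); p != head;
	     p = atomic_load(&p->next)) {
		assert(atomic_load(&p->prev->next) == p);
		n++;
	}
	return n;
}
```

A single-threaded call such as model_splice(&local, &lru) leaves the entries of local at the front of lru, ahead of whatever lru already held. model_length(&lru) then walks the result and checks that each back link agrees with the forward link.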