From patchwork Fri Sep 14 14:59:22 2018
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: linux-mm@kvack.org
Cc: tglx@linutronix.de, Vlastimil Babka, frederic@kernel.org
Subject: [PATCH 0/2] mm/swap: Add locking for pagevec
Date: Fri, 14 Sep 2018 16:59:22 +0200
Message-Id: <20180914145924.22055-1-bigeasy@linutronix.de>

The swap code synchronizes its access to the
(four) pagevec structs (which are allocated per-CPU) by disabling preemption. This works, and the one struct that needs to be accessed from interrupt context is additionally protected by disabling interrupts. This was audited manually and there is no lockdep coverage for it. There is one case where the per-CPU data of a remote CPU needs to be accessed; this is solved by starting a worker on the remote CPU and waiting for it to finish.

I measured the invocation of lru_add_drain_all() and ensured that it would invoke the drain function, but the drain function would not do anything except the locking (preempt / interrupts on/off) of the individual pagevecs. On a Xeon E5-2650 (2 sockets, 8 cores dual threaded, 32 CPUs in total) I drained CPU4 and measured how long it took, in microseconds:

 t-771   [001] ....   183.165619: lru_add_drain_all_test: took 92
 t-771   [001] ....   183.165710: lru_add_drain_all_test: took 87
 t-771   [001] ....   183.165781: lru_add_drain_all_test: took 68
 t-771   [001] ....   183.165826: lru_add_drain_all_test: took 43
 t-771   [001] ....   183.165837: lru_add_drain_all_test: took 9
 t-771   [001] ....   183.165847: lru_add_drain_all_test: took 9
 t-771   [001] ....   183.165858: lru_add_drain_all_test: took 9
 t-771   [001] ....   183.165868: lru_add_drain_all_test: took 9
 t-771   [001] ....   183.165878: lru_add_drain_all_test: took 9
 t-771   [001] ....   183.165889: lru_add_drain_all_test: took 9

It is mostly the wake-up from idle that takes long; once the CPU is busy and cache hot it goes down to 9us. If all CPUs are busy in user land then

 t-1484  [001] .... 40864.452481: lru_add_drain_all_test: took 12
 t-1484  [001] .... 40864.452492: lru_add_drain_all_test: took 8
 t-1484  [001] .... 40864.452500: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452508: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452516: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452524: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452532: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452540: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452547: lru_add_drain_all_test: took 7
 t-1484  [001] .... 40864.452555: lru_add_drain_all_test: took 7

it goes down to 7us once the cache is hot. Invoking the same test on every CPU:

 t-768   [000] ....    61.508781: lru_add_drain_all_test: took 133
 t-768   [000] ....    61.508892: lru_add_drain_all_test: took 105
 t-768   [000] ....    61.509004: lru_add_drain_all_test: took 108
 t-768   [000] ....    61.509112: lru_add_drain_all_test: took 104
 t-768   [000] ....    61.509220: lru_add_drain_all_test: took 104
 t-768   [000] ....    61.509333: lru_add_drain_all_test: took 109
 t-768   [000] ....    61.509414: lru_add_drain_all_test: took 78
 t-768   [000] ....    61.509493: lru_add_drain_all_test: took 76
 t-768   [000] ....    61.509558: lru_add_drain_all_test: took 63
 t-768   [000] ....    61.509623: lru_add_drain_all_test: took 62

on an idle machine, and once the CPUs are busy:

 t-849   [020] ....   379.429727: lru_add_drain_all_test: took 57
 t-849   [020] ....   379.429777: lru_add_drain_all_test: took 47
 t-849   [020] ....   379.429823: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.429870: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.429916: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.429962: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.430009: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.430055: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.430101: lru_add_drain_all_test: took 45
 t-849   [020] ....   379.430147: lru_add_drain_all_test: took 45

so we get down to 45us.

If the preemption-based locking gets replaced with a per-CPU spin_lock(), then we gain a locking scope on the operation. The spin_lock() should not add much overhead because it is not contended. Having the lock there not only adds lockdep coverage, it also allows accessing the data from a remote CPU. So the work can be done on the CPU that asked for it, and there is no need to wake a CPU from idle (or user land).
With this series applied, the test again. Idle box, all CPUs:

 t-861   [000] ....   861.051780: lru_add_drain_all_test: took 16
 t-861   [000] ....   861.051789: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051797: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051805: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051813: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051821: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051829: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051837: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051844: lru_add_drain_all_test: took 7
 t-861   [000] ....   861.051852: lru_add_drain_all_test: took 7

which is almost the same as the "busy, one CPU" case above. Invoking the test for only a single remote CPU:

 t-863   [020] ....   906.579885: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579887: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579889: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579889: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579890: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579891: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579892: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579892: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579893: lru_add_drain_all_test: took 0
 t-863   [020] ....   906.579894: lru_add_drain_all_test: took 0

and it is less than a microsecond.

Sebastian