From: Leonardo Bras
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
    Muchun Song, Andrew Morton, Christoph Lameter, Pekka Enberg,
    David Rientjes, Joonsoo Kim, Vlastimil Babka,
    Hyeonggon Yoo <42.hyeyoo@gmail.com>, Leonardo Bras,
    Thomas Gleixner, Marcelo Tosatti
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org
Subject: [RFC PATCH v1 0/4] Introduce QPW for per-cpu operations
Date: Sat, 22 Jun 2024 00:58:08 -0300
Message-ID: <20240622035815.569665-1-leobras@redhat.com>
X-Patchwork-Submitter: Leonardo Bras
X-Patchwork-Id: 13708195

The problem:
Some places in the kernel implement a parallel programming strategy
consisting of local_locks() for most of the work, with some rare remote
operations scheduled on the target cpu. This keeps cache bouncing low,
since the cacheline tends to stay mostly local, and avoids the cost of
locks in non-RT kernels, even though the very few remote operations will
be expensive due to scheduling overhead.

On the other hand, for RT workloads this can be a problem: getting an
important workload scheduled out to deal with remote requests is sure to
introduce unexpected deadline misses.

The idea:
Currently, with PREEMPT_RT=y, local_locks() become per-cpu spinlocks.
In this case, instead of scheduling work on a remote cpu, it should be
safe to grab that remote cpu's per-cpu spinlock and run the required
work locally. The major cost, which is un/locking in every local
function, already happens in PREEMPT_RT.

Also, there is no need to worry about extra cache bouncing: the
cacheline invalidation already happens due to schedule_work_on().

This will avoid schedule_work_on(), and thus avoid scheduling out an RT
workload.

For patches 2, 3 & 4, I noticed that just grabbing the lock and
executing the function locally is much faster than scheduling it on a
remote cpu.

Proposed solution:
A new interface called Queue PerCPU Work (QPW), which should replace
Work Queue in the above-mentioned use case.

If PREEMPT_RT=n, this interface just wraps the current local_locks +
WorkQueue behavior, so no change in runtime is expected.

If PREEMPT_RT=y, queue_percpu_work_on(cpu,...) will lock that cpu's
per-cpu structure and perform the work on it locally. This is possible
because, in functions that can be used for performing remote work on
remote per-cpu structures, the local_lock (which is already a this_cpu
spinlock()) will be replaced by a qpw_spinlock(), which is able to get
the per_cpu spinlock() for the cpu passed as a parameter.
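
As a rough illustration of the PREEMPT_RT=y path described above, here
is a small userspace model. It is only a sketch and not the code in this
series: pthread mutexes stand in for the per-cpu spinlocks, and every
model_* name, the struct layout and NR_MODEL_CPUS are made up for the
example. The point it tries to show is that once the lock helper takes
the cpu as a parameter, the caller can run the "remote" work itself
instead of scheduling it on the target cpu.

```c
/* Userspace model only: pthread mutexes stand in for per-cpu spinlocks. */
#include <assert.h>
#include <pthread.h>

#define NR_MODEL_CPUS 4 /* hypothetical cpu count for the model */

/* Hypothetical per-cpu structure: a lock plus the data it protects. */
struct model_pcp {
	pthread_mutex_t lock;
	int cached_pages;
};

static struct model_pcp model_pcps[NR_MODEL_CPUS];

static void model_init(void)
{
	for (int cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		pthread_mutex_init(&model_pcps[cpu].lock, NULL);
		model_pcps[cpu].cached_pages = 0;
	}
}

/* Stand-in for a qpw-style lock: the cpu is explicit, not "this cpu". */
static void model_qpw_lock(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	pthread_mutex_lock(&model_pcps[cpu].lock);
}

static void model_qpw_unlock(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	pthread_mutex_unlock(&model_pcps[cpu].lock);
}

/* Local fast path: takes the same per-cpu lock the remote path uses. */
static void model_cache_page(int this_cpu)
{
	model_qpw_lock(this_cpu);
	model_pcps[this_cpu].cached_pages++;
	model_qpw_unlock(this_cpu);
}

/* Work function: drains the given cpu's cache, returns how many pages. */
static int model_drain(int cpu)
{
	int drained;

	model_qpw_lock(cpu);
	drained = model_pcps[cpu].cached_pages;
	model_pcps[cpu].cached_pages = 0;
	model_qpw_unlock(cpu);
	return drained;
}

/*
 * RT-style remote operation in this model: no work item is scheduled on
 * 'cpu'; the caller runs fn itself and relies on the per-cpu lock.
 */
static int model_queue_percpu_work_on(int cpu, int (*fn)(int cpu))
{
	return fn(cpu);
}
```

In this model, after model_init(), a caller acting for any cpu can do
model_queue_percpu_work_on(2, model_drain) and get back the number of
pages that model_cache_page(2) had accumulated, without anything being
run on behalf of cpu 2. The PREEMPT_RT=n path is not modeled here.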

Patch 1 implements the QPW interface, and patches 2, 3 & 4 replace the
current local_lock + WorkQueue interface with the QPW interface in swap,
memcontrol & slub.

Please let me know what you think of it, and please suggest
improvements.

Thanks a lot!
Leo

Leonardo Bras (4):
  Introducing qpw_lock() and per-cpu queue & flush work
  swap: apply new queue_percpu_work_on() interface
  memcontrol: apply new queue_percpu_work_on() interface
  slub: apply new queue_percpu_work_on() interface

 include/linux/qpw.h | 88 +++++++++++++++++++++++++++++++++++++++++++++
 mm/memcontrol.c     | 20 ++++++-----
 mm/slub.c           | 26 ++++++++------
 mm/swap.c           | 26 +++++++-------
 4 files changed, 127 insertions(+), 33 deletions(-)
 create mode 100644 include/linux/qpw.h

base-commit: 50736169ecc8387247fe6a00932852ce7b057083