From patchwork Wed Aug 17 21:04:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexei Starovoitov
X-Patchwork-Id: 12946477
From: Alexei Starovoitov
To: davem@davemloft.net
Cc: daniel@iogearbox.net, andrii@kernel.org, tj@kernel.org, memxor@gmail.com,
 delyank@fb.com, linux-mm@kvack.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 bpf-next 07/12] bpf: Optimize call_rcu in non-preallocated hash map.
Date: Wed, 17 Aug 2022 14:04:14 -0700
Message-Id: <20220817210419.95560-8-alexei.starovoitov@gmail.com>
X-Mailer: git-send-email 2.36.1
In-Reply-To: <20220817210419.95560-1-alexei.starovoitov@gmail.com>
References: <20220817210419.95560-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov

Doing call_rcu() a million times a second becomes a bottleneck.
Convert non-preallocated hash map from call_rcu to SLAB_TYPESAFE_BY_RCU.
The rcu critical section is no longer observed for one htab element
which makes non-preallocated hash map behave just like preallocated hash map.
The map elements are released back to kernel memory after observing
rcu critical section.
This improves 'map_perf_test 4' performance from 100k events per second
to 250k events per second.

bpf_mem_alloc + percpu_counter + typesafe_by_rcu provide 10x performance
boost to non-preallocated hash map and make it within a few % of preallocated
map while consuming a fraction of the memory.

Signed-off-by: Alexei Starovoitov
---
 kernel/bpf/hashtab.c                      |  8 ++++++--
 kernel/bpf/memalloc.c                     |  2 +-
 tools/testing/selftests/bpf/progs/timer.c | 11 -----------
 3 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 65ebe5a719f5..3c1d15fd052a 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -944,8 +944,12 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
 		__pcpu_freelist_push(&htab->freelist, &l->fnode);
 	} else {
 		dec_elem_count(htab);
-		l->htab = htab;
-		call_rcu(&l->rcu, htab_elem_free_rcu);
+		if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH) {
+			l->htab = htab;
+			call_rcu(&l->rcu, htab_elem_free_rcu);
+		} else {
+			htab_elem_free(htab, l);
+		}
 	}
 }
 
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 8de268922380..a43630371b9f 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -332,7 +332,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size)
 		return -ENOMEM;
 	size += LLIST_NODE_SZ; /* room for llist_node */
 	snprintf(buf, sizeof(buf), "bpf-%u", size);
-	kmem_cache = kmem_cache_create(buf, size, 8, 0, NULL);
+	kmem_cache = kmem_cache_create(buf, size, 8, SLAB_TYPESAFE_BY_RCU, NULL);
 	if (!kmem_cache) {
 		free_percpu(pc);
 		return -ENOMEM;
diff --git a/tools/testing/selftests/bpf/progs/timer.c b/tools/testing/selftests/bpf/progs/timer.c
index 5f5309791649..0053c5402173 100644
--- a/tools/testing/selftests/bpf/progs/timer.c
+++ b/tools/testing/selftests/bpf/progs/timer.c
@@ -208,17 +208,6 @@ static int timer_cb2(void *map, int *key, struct hmap_elem *val)
 		 */
 		bpf_map_delete_elem(map, key);
 
-		/* in non-preallocated hashmap both 'key' and 'val' are RCU
-		 * protected and still valid though this element was deleted
-		 * from the map. Arm this timer for ~35 seconds. When callback
-		 * finishes the call_rcu will invoke:
-		 *   htab_elem_free_rcu
-		 *     check_and_free_timer
-		 *       bpf_timer_cancel_and_free
-		 * to cancel this 35 second sleep and delete the timer for real.
-		 */
-		if (bpf_timer_start(&val->timer, 1ull << 35, 0) != 0)
-			err |= 256;
 		ok |= 4;
 	}
 	return 0;
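
The commit message says a deleted element is no longer held for an RCU grace period and that the non-preallocated map now behaves like the preallocated one. With SLAB_TYPESAFE_BY_RCU the slab memory itself is kept across a grace period, but an individual object may be reused right away, so a reader holding an old pointer has to recheck that it still sees the element it looked up. The following is only a single-threaded userspace sketch of that consequence. All names in it (model_elem, model_alloc, model_free, model_read, model_demo) are hypothetical and do not exist in the kernel, and it models neither RCU nor the real htab code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; these names are illustrative, not kernel APIs. */
#define POOL_SIZE 4

struct model_elem {
	int key;
	int value;
	int in_use;
	struct model_elem *next_free;
};

static struct model_elem pool[POOL_SIZE];
static struct model_elem *free_list;

/* Put every pool slot on a LIFO free list. */
static void model_pool_init(void)
{
	memset(pool, 0, sizeof(pool));
	free_list = NULL;
	for (int i = POOL_SIZE - 1; i >= 0; i--) {
		pool[i].next_free = free_list;
		free_list = &pool[i];
	}
}

static struct model_elem *model_alloc(int key, int value)
{
	struct model_elem *e = free_list;

	if (!e)
		return NULL;
	free_list = e->next_free;
	e->key = key;
	e->value = value;
	e->in_use = 1;
	return e;
}

/*
 * Immediate free: the slot goes straight back on the free list and can be
 * handed out again at once. The memory keeps its type (it is always a
 * struct model_elem), which is the property the sketch relies on.
 */
static void model_free(struct model_elem *e)
{
	assert(e != NULL);
	e->in_use = 0;
	e->next_free = free_list;
	free_list = e;
}

/* A reader must revalidate the key before trusting the value. */
static int model_read(const struct model_elem *e, int expected_key, int *out)
{
	if (!e->in_use || e->key != expected_key)
		return -1;
	*out = e->value;
	return 0;
}

/* Returns 0 when a stale pointer is detected after the slot is reused. */
int model_demo(void)
{
	int v = 0;
	struct model_elem *a, *b, *stale;

	model_pool_init();
	a = model_alloc(1, 100);
	stale = a;                 /* reader keeps a pointer to key 1 */
	model_free(a);             /* deleted, no grace period in this model */
	b = model_alloc(2, 200);   /* LIFO free list hands back the same slot */

	if (b != stale)
		return -1;
	if (model_read(stale, 1, &v) == 0)
		return -2;         /* stale read must be rejected */
	if (model_read(stale, 2, &v) != 0 || v != 200)
		return -3;
	return 0;
}
```

Calling model_demo() exercises the delete-then-reuse sequence. The removed selftest code in timer.c relied on 'key' and 'val' staying valid after bpf_map_delete_elem(); the model shows why that assumption stops holding once elements can be reused immediately.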