From patchwork Tue Nov 29 15:58:19 2022
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13058743
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, "Paul E. McKenney"
Cc: Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v2 1/4] rcu/kvfree: Switch to a generic linked list API
Date: Tue, 29 Nov 2022 16:58:19 +0100
Message-Id: <20221129155822.538434-2-urezki@gmail.com>
In-Reply-To: <20221129155822.538434-1-urezki@gmail.com>
References: <20221129155822.538434-1-urezki@gmail.com>
X-Mailing-List: rcu@vger.kernel.org

To make the code more readable and less confusing, switch to the standard circular doubly linked list API. This simplifies the code, since the basic list operations are well defined and documented.

Please note that this patch does not introduce any functional change; it is limited to refactoring the existing code.
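For readers unfamiliar with the API this patch switches to, the following is a minimal userspace sketch of the circular doubly linked list semantics; the names mirror the kernel's <linux/list.h>, but this is an illustrative re-implementation (with a simplified list_replace_init()), not the kernel code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace sketch of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head->prev = head;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Insert "entry" right after "head", i.e. at the front of the list. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Detach the whole chain from "old" onto "new_head" in O(1) and
 * reinitialize "old"; this models how the patch moves krcp->bulk_head[]
 * onto krwp->bulk_head_free[] with list_replace_init(). The explicit
 * empty-list check is a simplification of the kernel version.
 */
static void list_replace_init(struct list_head *old, struct list_head *new_head)
{
	if (list_empty(old)) {
		INIT_LIST_HEAD(new_head);
		return;
	}
	new_head->next = old->next;
	new_head->prev = old->prev;
	new_head->next->prev = new_head;
	new_head->prev->next = new_head;
	INIT_LIST_HEAD(old);
}

/* Walk the ring and count entries, as list_for_each_entry() would. */
static int list_count(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

The O(1) detach is the key win: the monitor path hands a whole batch of blocks to the free channel in one constant-time operation instead of walking a hand-rolled singly linked chain.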
Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 89 +++++++++++++++++++++++------------------------ 1 file changed, 43 insertions(+), 46 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 219b4b516f38..1c79b244aac9 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2740,13 +2740,13 @@ EXPORT_SYMBOL_GPL(call_rcu); /** * struct kvfree_rcu_bulk_data - single block to store kvfree_rcu() pointers + * @list: List node. All blocks are linked between each other * @nr_records: Number of active pointers in the array - * @next: Next bulk object in the block chain * @records: Array of the kvfree_rcu() pointers */ struct kvfree_rcu_bulk_data { + struct list_head list; unsigned long nr_records; - struct kvfree_rcu_bulk_data *next; void *records[]; }; @@ -2762,21 +2762,21 @@ struct kvfree_rcu_bulk_data { * struct kfree_rcu_cpu_work - single batch of kfree_rcu() requests * @rcu_work: Let queue_rcu_work() invoke workqueue handler after grace period * @head_free: List of kfree_rcu() objects waiting for a grace period - * @bkvhead_free: Bulk-List of kvfree_rcu() objects waiting for a grace period + * @bulk_head_free: Bulk-List of kvfree_rcu() objects waiting for a grace period * @krcp: Pointer to @kfree_rcu_cpu structure */ struct kfree_rcu_cpu_work { struct rcu_work rcu_work; struct rcu_head *head_free; - struct kvfree_rcu_bulk_data *bkvhead_free[FREE_N_CHANNELS]; + struct list_head bulk_head_free[FREE_N_CHANNELS]; struct kfree_rcu_cpu *krcp; }; /** * struct kfree_rcu_cpu - batch up kfree_rcu() requests for RCU grace period * @head: List of kfree_rcu() objects not yet waiting for a grace period - * @bkvhead: Bulk-List of kvfree_rcu() objects not yet waiting for a grace period + * @bulk_head: Bulk-List of kvfree_rcu() objects not yet waiting for a grace period * @krw_arr: Array of batches of kfree_rcu() objects waiting for a grace period * @lock: Synchronize access to this structure * @monitor_work: Promote @head to @head_free after 
KFREE_DRAIN_JIFFIES @@ -2800,7 +2800,7 @@ struct kfree_rcu_cpu_work { */ struct kfree_rcu_cpu { struct rcu_head *head; - struct kvfree_rcu_bulk_data *bkvhead[FREE_N_CHANNELS]; + struct list_head bulk_head[FREE_N_CHANNELS]; struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES]; raw_spinlock_t lock; struct delayed_work monitor_work; @@ -2895,12 +2895,13 @@ drain_page_cache(struct kfree_rcu_cpu *krcp) /* * This function is invoked in workqueue context after a grace period. - * It frees all the objects queued on ->bkvhead_free or ->head_free. + * It frees all the objects queued on ->bulk_head_free or ->head_free. */ static void kfree_rcu_work(struct work_struct *work) { unsigned long flags; - struct kvfree_rcu_bulk_data *bkvhead[FREE_N_CHANNELS], *bnext; + struct kvfree_rcu_bulk_data *bnode, *n; + struct list_head bulk_head[FREE_N_CHANNELS]; struct rcu_head *head, *next; struct kfree_rcu_cpu *krcp; struct kfree_rcu_cpu_work *krwp; @@ -2912,10 +2913,8 @@ static void kfree_rcu_work(struct work_struct *work) raw_spin_lock_irqsave(&krcp->lock, flags); // Channels 1 and 2. - for (i = 0; i < FREE_N_CHANNELS; i++) { - bkvhead[i] = krwp->bkvhead_free[i]; - krwp->bkvhead_free[i] = NULL; - } + for (i = 0; i < FREE_N_CHANNELS; i++) + list_replace_init(&krwp->bulk_head_free[i], &bulk_head[i]); // Channel 3. head = krwp->head_free; @@ -2924,36 +2923,33 @@ static void kfree_rcu_work(struct work_struct *work) // Handle the first two channels. for (i = 0; i < FREE_N_CHANNELS; i++) { - for (; bkvhead[i]; bkvhead[i] = bnext) { - bnext = bkvhead[i]->next; - debug_rcu_bhead_unqueue(bkvhead[i]); + list_for_each_entry_safe(bnode, n, &bulk_head[i], list) { + debug_rcu_bhead_unqueue(bnode); rcu_lock_acquire(&rcu_callback_map); if (i == 0) { // kmalloc() / kfree(). 
trace_rcu_invoke_kfree_bulk_callback( - rcu_state.name, bkvhead[i]->nr_records, - bkvhead[i]->records); + rcu_state.name, bnode->nr_records, + bnode->records); - kfree_bulk(bkvhead[i]->nr_records, - bkvhead[i]->records); + kfree_bulk(bnode->nr_records, bnode->records); } else { // vmalloc() / vfree(). - for (j = 0; j < bkvhead[i]->nr_records; j++) { + for (j = 0; j < bnode->nr_records; j++) { trace_rcu_invoke_kvfree_callback( - rcu_state.name, - bkvhead[i]->records[j], 0); + rcu_state.name, bnode->records[j], 0); - vfree(bkvhead[i]->records[j]); + vfree(bnode->records[j]); } } rcu_lock_release(&rcu_callback_map); raw_spin_lock_irqsave(&krcp->lock, flags); - if (put_cached_bnode(krcp, bkvhead[i])) - bkvhead[i] = NULL; + if (put_cached_bnode(krcp, bnode)) + bnode = NULL; raw_spin_unlock_irqrestore(&krcp->lock, flags); - if (bkvhead[i]) - free_page((unsigned long) bkvhead[i]); + if (bnode) + free_page((unsigned long) bnode); cond_resched_tasks_rcu_qs(); } @@ -2989,7 +2985,7 @@ need_offload_krc(struct kfree_rcu_cpu *krcp) int i; for (i = 0; i < FREE_N_CHANNELS; i++) - if (krcp->bkvhead[i]) + if (!list_empty(&krcp->bulk_head[i])) return true; return !!krcp->head; @@ -3026,21 +3022,20 @@ static void kfree_rcu_monitor(struct work_struct *work) for (i = 0; i < KFREE_N_BATCHES; i++) { struct kfree_rcu_cpu_work *krwp = &(krcp->krw_arr[i]); - // Try to detach bkvhead or head and attach it over any + // Try to detach bulk_head or head and attach it over any // available corresponding free channel. It can be that // a previous RCU batch is in progress, it means that // immediately to queue another one is not possible so // in that case the monitor work is rearmed. 
- if ((krcp->bkvhead[0] && !krwp->bkvhead_free[0]) || - (krcp->bkvhead[1] && !krwp->bkvhead_free[1]) || + if ((!list_empty(&krcp->bulk_head[0]) && list_empty(&krwp->bulk_head_free[0])) || + (!list_empty(&krcp->bulk_head[1]) && list_empty(&krwp->bulk_head_free[1])) || (krcp->head && !krwp->head_free)) { + // Channel 1 corresponds to the SLAB-pointer bulk path. // Channel 2 corresponds to vmalloc-pointer bulk path. for (j = 0; j < FREE_N_CHANNELS; j++) { - if (!krwp->bkvhead_free[j]) { - krwp->bkvhead_free[j] = krcp->bkvhead[j]; - krcp->bkvhead[j] = NULL; - } + if (list_empty(&krwp->bulk_head_free[j])) + list_replace_init(&krcp->bulk_head[j], &krwp->bulk_head_free[j]); } // Channel 3 corresponds to both SLAB and vmalloc @@ -3152,10 +3147,11 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp, return false; idx = !!is_vmalloc_addr(ptr); + bnode = list_first_entry_or_null(&(*krcp)->bulk_head[idx], + struct kvfree_rcu_bulk_data, list); /* Check if a new block is required. */ - if (!(*krcp)->bkvhead[idx] || - (*krcp)->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) { + if (!bnode || bnode->nr_records == KVFREE_BULK_MAX_ENTR) { bnode = get_cached_bnode(*krcp); if (!bnode && can_alloc) { krc_this_cpu_unlock(*krcp, *flags); @@ -3179,18 +3175,13 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp, if (!bnode) return false; - /* Initialize the new block. */ + // Initialize the new block and attach it. bnode->nr_records = 0; - bnode->next = (*krcp)->bkvhead[idx]; - - /* Attach it to the head. */ - (*krcp)->bkvhead[idx] = bnode; + list_add(&bnode->list, &(*krcp)->bulk_head[idx]); } /* Finally insert. */ - (*krcp)->bkvhead[idx]->records - [(*krcp)->bkvhead[idx]->nr_records++] = ptr; - + bnode->records[bnode->nr_records++] = ptr; return true; } @@ -4779,7 +4770,7 @@ struct workqueue_struct *rcu_gp_wq; static void __init kfree_rcu_batch_init(void) { int cpu; - int i; + int i, j; /* Clamp it to [0:100] seconds interval. 
*/ if (rcu_delay_page_cache_fill_msec < 0 || @@ -4799,8 +4790,14 @@ static void __init kfree_rcu_batch_init(void) for (i = 0; i < KFREE_N_BATCHES; i++) { INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work); krcp->krw_arr[i].krcp = krcp; + + for (j = 0; j < FREE_N_CHANNELS; j++) + INIT_LIST_HEAD(&krcp->krw_arr[i].bulk_head_free[j]); } + for (i = 0; i < FREE_N_CHANNELS; i++) + INIT_LIST_HEAD(&krcp->bulk_head[i]); + INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor); INIT_DELAYED_WORK(&krcp->page_cache_work, fill_page_cache_func); krcp->initialized = true;

From patchwork Tue Nov 29 15:58:20 2022
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13058742
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, "Paul E. McKenney"
Cc: Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v2 2/4] rcu/kvfree: Move bulk/list reclaim to separate functions
Date: Tue, 29 Nov 2022 16:58:20 +0100
Message-Id: <20221129155822.538434-3-urezki@gmail.com>
In-Reply-To: <20221129155822.538434-1-urezki@gmail.com>
References: <20221129155822.538434-1-urezki@gmail.com>
X-Mailing-List: rcu@vger.kernel.org

There are two different paths by which memory is reclaimed. Currently both are open-coded, which makes the code messy and harder to read. Introduce two separate functions, kvfree_rcu_list() and kvfree_rcu_bulk(), to cover the two independent cases.

Please note that this patch does not introduce any functional change; it is limited to refactoring the existing code.

Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 114 ++++++++++++++++++++++++++-------------------- 1 file changed, 65 insertions(+), 49 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1c79b244aac9..445f8c11a9a3 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2893,6 +2893,65 @@ drain_page_cache(struct kfree_rcu_cpu *krcp) return freed; } +static void +kvfree_rcu_bulk(struct kfree_rcu_cpu *krcp, + struct kvfree_rcu_bulk_data *bnode, int idx) +{ + unsigned long flags; + int i; + + debug_rcu_bhead_unqueue(bnode); + + rcu_lock_acquire(&rcu_callback_map); + if (idx == 0) { // kmalloc() / kfree(). + trace_rcu_invoke_kfree_bulk_callback( + rcu_state.name, bnode->nr_records, + bnode->records); + + kfree_bulk(bnode->nr_records, bnode->records); + } else { // vmalloc() / vfree().
+ for (i = 0; i < bnode->nr_records; i++) { + trace_rcu_invoke_kvfree_callback( + rcu_state.name, bnode->records[i], 0); + + vfree(bnode->records[i]); + } + } + rcu_lock_release(&rcu_callback_map); + + raw_spin_lock_irqsave(&krcp->lock, flags); + if (put_cached_bnode(krcp, bnode)) + bnode = NULL; + raw_spin_unlock_irqrestore(&krcp->lock, flags); + + if (bnode) + free_page((unsigned long) bnode); + + cond_resched_tasks_rcu_qs(); +} + +static void +kvfree_rcu_list(struct rcu_head *head) +{ + struct rcu_head *next; + + for (; head; head = next) { + void *ptr = (void *) head->func; + unsigned long offset = (void *) head - ptr; + + next = head->next; + debug_rcu_head_unqueue((struct rcu_head *)ptr); + rcu_lock_acquire(&rcu_callback_map); + trace_rcu_invoke_kvfree_callback(rcu_state.name, head, offset); + + if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset))) + kvfree(ptr); + + rcu_lock_release(&rcu_callback_map); + cond_resched_tasks_rcu_qs(); + } +} + /* * This function is invoked in workqueue context after a grace period. * It frees all the objects queued on ->bulk_head_free or ->head_free. @@ -2902,10 +2961,10 @@ static void kfree_rcu_work(struct work_struct *work) unsigned long flags; struct kvfree_rcu_bulk_data *bnode, *n; struct list_head bulk_head[FREE_N_CHANNELS]; - struct rcu_head *head, *next; + struct rcu_head *head; struct kfree_rcu_cpu *krcp; struct kfree_rcu_cpu_work *krwp; - int i, j; + int i; krwp = container_of(to_rcu_work(work), struct kfree_rcu_cpu_work, rcu_work); @@ -2922,38 +2981,9 @@ static void kfree_rcu_work(struct work_struct *work) raw_spin_unlock_irqrestore(&krcp->lock, flags); // Handle the first two channels. - for (i = 0; i < FREE_N_CHANNELS; i++) { - list_for_each_entry_safe(bnode, n, &bulk_head[i], list) { - debug_rcu_bhead_unqueue(bnode); - - rcu_lock_acquire(&rcu_callback_map); - if (i == 0) { // kmalloc() / kfree(). 
- trace_rcu_invoke_kfree_bulk_callback( - rcu_state.name, bnode->nr_records, - bnode->records); - - kfree_bulk(bnode->nr_records, bnode->records); - } else { // vmalloc() / vfree(). - for (j = 0; j < bnode->nr_records; j++) { - trace_rcu_invoke_kvfree_callback( - rcu_state.name, bnode->records[j], 0); - - vfree(bnode->records[j]); - } - } - rcu_lock_release(&rcu_callback_map); - - raw_spin_lock_irqsave(&krcp->lock, flags); - if (put_cached_bnode(krcp, bnode)) - bnode = NULL; - raw_spin_unlock_irqrestore(&krcp->lock, flags); - - if (bnode) - free_page((unsigned long) bnode); - - cond_resched_tasks_rcu_qs(); - } - } + for (i = 0; i < FREE_N_CHANNELS; i++) + list_for_each_entry_safe(bnode, n, &bulk_head[i], list) + kvfree_rcu_bulk(krcp, bnode, i); /* * This is used when the "bulk" path can not be used for the @@ -2962,21 +2992,7 @@ static void kfree_rcu_work(struct work_struct *work) * queued on a linked list through their rcu_head structures. * This list is named "Channel 3". */ - for (; head; head = next) { - void *ptr = (void *) head->func; - unsigned long offset = (void *) head - ptr; - - next = head->next; - debug_rcu_head_unqueue((struct rcu_head *)ptr); - rcu_lock_acquire(&rcu_callback_map); - trace_rcu_invoke_kvfree_callback(rcu_state.name, head, offset); - - if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset))) - kvfree(ptr); - - rcu_lock_release(&rcu_callback_map); - cond_resched_tasks_rcu_qs(); - } + kvfree_rcu_list(head); } static bool

From patchwork Tue Nov 29 15:58:21 2022
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13058744
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, "Paul E. McKenney"
Cc: Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v2 3/4] rcu/kvfree: Move need_offload_krc() out of krcp->lock
Date: Tue, 29 Nov 2022 16:58:21 +0100
Message-Id: <20221129155822.538434-4-urezki@gmail.com>
In-Reply-To: <20221129155822.538434-1-urezki@gmail.com>
References: <20221129155822.538434-1-urezki@gmail.com>
X-Mailing-List: rcu@vger.kernel.org

Currently the need_offload_krc() function requires krcp->lock to be held, because krcp->head cannot be checked concurrently. Fix this by updating krcp->head with the WRITE_ONCE() macro, so that readers can safely observe a valid value without taking the lock.

Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 445f8c11a9a3..c94c17194299 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3058,7 +3058,7 @@ static void kfree_rcu_monitor(struct work_struct *work) // objects queued on the linked list.
if (!krwp->head_free) { krwp->head_free = krcp->head; - krcp->head = NULL; + WRITE_ONCE(krcp->head, NULL); } WRITE_ONCE(krcp->count, 0); @@ -3072,6 +3072,8 @@ static void kfree_rcu_monitor(struct work_struct *work) } } + raw_spin_unlock_irqrestore(&krcp->lock, flags); + // If there is nothing to detach, it means that our job is // successfully done here. In case of having at least one // of the channels that is still busy we should rearm the @@ -3079,8 +3081,6 @@ // still in progress. if (need_offload_krc(krcp)) schedule_delayed_monitor_work(krcp); - - raw_spin_unlock_irqrestore(&krcp->lock, flags); } static enum hrtimer_restart @@ -3250,7 +3250,7 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr) head->func = ptr; head->next = krcp->head; - krcp->head = head; + WRITE_ONCE(krcp->head, head); success = true; } @@ -3327,15 +3327,12 @@ static struct shrinker kfree_rcu_shrinker = { void __init kfree_rcu_scheduler_running(void) { int cpu; - unsigned long flags; for_each_possible_cpu(cpu) { struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu); - raw_spin_lock_irqsave(&krcp->lock, flags); if (need_offload_krc(krcp)) schedule_delayed_monitor_work(krcp); - raw_spin_unlock_irqrestore(&krcp->lock, flags); } }

From patchwork Tue Nov 29 15:58:22 2022
X-Patchwork-Submitter: Uladzislau Rezki
X-Patchwork-Id: 13058745
From: "Uladzislau Rezki (Sony)"
To: LKML, RCU, "Paul E. McKenney"
Cc: Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Uladzislau Rezki, Oleksiy Avramchenko
Subject: [PATCH v2 4/4] rcu/kvfree: Use a polled API to speedup a reclaim process
Date: Tue, 29 Nov 2022 16:58:22 +0100
Message-Id: <20221129155822.538434-5-urezki@gmail.com>
In-Reply-To: <20221129155822.538434-1-urezki@gmail.com>
References: <20221129155822.538434-1-urezki@gmail.com>
X-Mailing-List: rcu@vger.kernel.org

Currently all objects placed into a batch require a full grace period, after which the objects in the batch become eligible to be freed. The problem is that many pointers may already have passed several GP sequences, so there is no need to delay them further; such objects can be reclaimed right away, without waiting.

In order to reduce the memory footprint, this patch introduces a per-page grace-period-controlling mechanism. It allows us to distinguish pointers for which a grace period has already passed from those for which it has not. The reclaim thread, in its turn, frees memory in reverse order, starting from the tail, because a GP has most likely already passed for the pages there. If a page's GP sequence indicates that a grace period is not yet complete, we bail out and request one more grace period in order to finish draining the remaining pages.
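The mechanism above can be modeled in a few lines of userspace C. This is a sketch, not the kernel implementation: a plain counter stands in for the RCU grace-period sequence (the real encoding in get_state_synchronize_rcu()/poll_state_synchronize_rcu() is more involved), and advance_gp() models a grace period completing.

```c
#include <assert.h>
#include <stdbool.h>

/* Global counter standing in for the RCU grace-period sequence. */
static unsigned long gp_seq;

/* Snapshot: a grace period is "passed" once gp_seq reaches this value. */
static unsigned long get_state(void)
{
	return gp_seq + 1;
}

/* Model a grace period completing. */
static void advance_gp(void)
{
	gp_seq++;
}

static bool poll_state(unsigned long snap)
{
	return gp_seq >= snap;
}

/*
 * Walk the per-page snapshots from the tail (oldest pages first, as the
 * patch does with list_for_each_entry_safe_reverse()) and "free" pages
 * whose grace period has already elapsed. Bail out at the first page
 * that still needs a GP: every page in front of it is newer, so a
 * single extra grace period will cover all of them. Returns the number
 * of pages freed.
 */
static int drain_ready(const unsigned long *snaps, int n)
{
	int freed = 0;
	int i;

	for (i = n - 1; i >= 0; i--) {	/* snaps[n - 1] is the tail */
		if (!poll_state(snaps[i]))
			break;
		freed++;
	}
	return freed;
}
```

The reverse walk is what makes the single bail-out correct: snapshots grow monotonically from tail to head, so the first not-yet-ready page implies every page ahead of it also needs the one extra grace period.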
Test example:

kvm.sh --memory 10G --torture rcuscale --allcpus --duration 1 \
	--kconfig CONFIG_NR_CPUS=64 \
	--kconfig CONFIG_RCU_NOCB_CPU=y \
	--kconfig CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y \
	--kconfig CONFIG_RCU_LAZY=n \
	--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 \
	rcuscale.holdoff=20 rcuscale.kfree_loops=10000 \
	torture.disable_onoff_at_boot" --trust-make

Total time taken by all kfree'ers: 8535693700 ns, loops: 10000, batches: 1188, memory footprint: 2248MB
Total time taken by all kfree'ers: 8466933582 ns, loops: 10000, batches: 1157, memory footprint: 2820MB
Total time taken by all kfree'ers: 5375602446 ns, loops: 10000, batches: 1130, memory footprint: 6502MB
Total time taken by all kfree'ers: 7523283832 ns, loops: 10000, batches: 1006, memory footprint: 3343MB
Total time taken by all kfree'ers: 6459171956 ns, loops: 10000, batches: 1150, memory footprint: 6549MB

Total time taken by all kfree'ers: 8560060176 ns, loops: 10000, batches: 1787, memory footprint: 61MB
Total time taken by all kfree'ers: 8573885501 ns, loops: 10000, batches: 1777, memory footprint: 93MB
Total time taken by all kfree'ers: 8320000202 ns, loops: 10000, batches: 1727, memory footprint: 66MB
Total time taken by all kfree'ers: 8552718794 ns, loops: 10000, batches: 1790, memory footprint: 75MB
Total time taken by all kfree'ers: 8601368792 ns, loops: 10000, batches: 1724, memory footprint: 62MB

Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 47 +++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 8 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index c94c17194299..44279ca488ef 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2741,11 +2741,13 @@ EXPORT_SYMBOL_GPL(call_rcu); /** * struct kvfree_rcu_bulk_data - single block to store kvfree_rcu() pointers * @list: List node.
All blocks are linked between each other + * @gp_snap: Snapshot of RCU state for objects placed to this bulk * @nr_records: Number of active pointers in the array * @records: Array of the kvfree_rcu() pointers */ struct kvfree_rcu_bulk_data { struct list_head list; + unsigned long gp_snap; unsigned long nr_records; void *records[]; }; @@ -2762,13 +2764,15 @@ struct kvfree_rcu_bulk_data { * struct kfree_rcu_cpu_work - single batch of kfree_rcu() requests * @rcu_work: Let queue_rcu_work() invoke workqueue handler after grace period * @head_free: List of kfree_rcu() objects waiting for a grace period + * @head_free_gp_snap: Snapshot of RCU state for objects placed to "@head_free" * @bulk_head_free: Bulk-List of kvfree_rcu() objects waiting for a grace period * @krcp: Pointer to @kfree_rcu_cpu structure */ struct kfree_rcu_cpu_work { - struct rcu_work rcu_work; + struct work_struct rcu_work; struct rcu_head *head_free; + unsigned long head_free_gp_snap; struct list_head bulk_head_free[FREE_N_CHANNELS]; struct kfree_rcu_cpu *krcp; }; @@ -2964,10 +2968,11 @@ static void kfree_rcu_work(struct work_struct *work) struct rcu_head *head; struct kfree_rcu_cpu *krcp; struct kfree_rcu_cpu_work *krwp; + unsigned long head_free_gp_snap; int i; - krwp = container_of(to_rcu_work(work), - struct kfree_rcu_cpu_work, rcu_work); + krwp = container_of(work, + struct kfree_rcu_cpu_work, rcu_work); krcp = krwp->krcp; raw_spin_lock_irqsave(&krcp->lock, flags); @@ -2978,12 +2983,29 @@ static void kfree_rcu_work(struct work_struct *work) // Channel 3. head = krwp->head_free; krwp->head_free = NULL; + head_free_gp_snap = krwp->head_free_gp_snap; raw_spin_unlock_irqrestore(&krcp->lock, flags); // Handle the first two channels. - for (i = 0; i < FREE_N_CHANNELS; i++) + for (i = 0; i < FREE_N_CHANNELS; i++) { + // Start from the tail page, so a GP is likely passed for it. + list_for_each_entry_safe_reverse(bnode, n, &bulk_head[i], list) { + // Not yet ready? Bail out since we need one more GP. 
+ if (!poll_state_synchronize_rcu(bnode->gp_snap)) + break; + + list_del_init(&bnode->list); + kvfree_rcu_bulk(krcp, bnode, i); + } + + // Please note a request for one more extra GP can + // occur only once for all objects in this batch. + if (!list_empty(&bulk_head[i])) + synchronize_rcu(); + list_for_each_entry_safe(bnode, n, &bulk_head[i], list) kvfree_rcu_bulk(krcp, bnode, i); + } /* * This is used when the "bulk" path can not be used for the @@ -2992,7 +3014,10 @@ static void kfree_rcu_work(struct work_struct *work) * queued on a linked list through their rcu_head structures. * This list is named "Channel 3". */ - kvfree_rcu_list(head); + if (head) { + cond_synchronize_rcu(head_free_gp_snap); + kvfree_rcu_list(head); + } } static bool @@ -3059,6 +3084,11 @@ static void kfree_rcu_monitor(struct work_struct *work) if (!krwp->head_free) { krwp->head_free = krcp->head; WRITE_ONCE(krcp->head, NULL); + + // Take a snapshot for this krwp. Please note no more + // any objects can be added to attached head_free channel + // therefore fixate a GP for it here. + krwp->head_free_gp_snap = get_state_synchronize_rcu(); } WRITE_ONCE(krcp->count, 0); @@ -3068,7 +3098,7 @@ static void kfree_rcu_monitor(struct work_struct *work) // be that the work is in the pending state when // channels have been detached following by each // other. - queue_rcu_work(system_wq, &krwp->rcu_work); + queue_work(system_wq, &krwp->rcu_work); } } @@ -3196,8 +3226,9 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp, list_add(&bnode->list, &(*krcp)->bulk_head[idx]); } - /* Finally insert. */ + // Finally insert and update the GP for this page. 
bnode->records[bnode->nr_records++] = ptr; + bnode->gp_snap = get_state_synchronize_rcu(); return true; } @@ -4801,7 +4832,7 @@ static void __init kfree_rcu_batch_init(void) struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu); for (i = 0; i < KFREE_N_BATCHES; i++) { - INIT_RCU_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work); + INIT_WORK(&krcp->krw_arr[i].rcu_work, kfree_rcu_work); krcp->krw_arr[i].krcp = krcp; for (j = 0; j < FREE_N_CHANNELS; j++)