From patchwork Mon Mar 27 20:11:24 2023
From: Mike Snitzer <snitzer@kernel.org>
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:24 -0400
Message-Id: <20230327201143.51026-2-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 01/20] dm bufio: remove unused dm_bufio_release_move interface

From: Joe Thornber <ejt@redhat.com>

dm_bufio_release_move was only used by a multi-snapshot DM target that
never went upstream; remove it.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-bufio.c    | 77 ----------------------------------------
 include/linux/dm-bufio.h |  6 ----
 2 files changed, 83 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index cf077f9b30c3..79434b38f368 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1415,83 +1415,6 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
 }
 EXPORT_SYMBOL_GPL(dm_bufio_issue_discard);
 
-/*
- * We first delete any other buffer that may be at that new location.
- *
- * Then, we write the buffer to the original location if it was dirty.
- *
- * Then, if we are the only one who is holding the buffer, relink the buffer
- * in the buffer tree for the new location.
- *
- * If there was someone else holding the buffer, we write it to the new
- * location but not relink it, because that other user needs to have the buffer
- * at the same place.
- */
-void dm_bufio_release_move(struct dm_buffer *b, sector_t new_block)
-{
-	struct dm_bufio_client *c = b->c;
-	struct dm_buffer *new;
-
-	BUG_ON(dm_bufio_in_request());
-
-	dm_bufio_lock(c);
-
-retry:
-	new = __find(c, new_block);
-	if (new) {
-		if (new->hold_count) {
-			__wait_for_free_buffer(c);
-			goto retry;
-		}
-
-		/*
-		 * FIXME: Is there any point waiting for a write that's going
-		 * to be overwritten in a bit?
-		 */
-		__make_buffer_clean(new);
-		__unlink_buffer(new);
-		__free_buffer_wake(new);
-	}
-
-	BUG_ON(!b->hold_count);
-	BUG_ON(test_bit(B_READING, &b->state));
-
-	__write_dirty_buffer(b, NULL);
-	if (b->hold_count == 1) {
-		wait_on_bit_io(&b->state, B_WRITING,
-			       TASK_UNINTERRUPTIBLE);
-		set_bit(B_DIRTY, &b->state);
-		b->dirty_start = 0;
-		b->dirty_end = c->block_size;
-		__unlink_buffer(b);
-		__link_buffer(b, new_block, LIST_DIRTY);
-	} else {
-		sector_t old_block;
-
-		wait_on_bit_lock_io(&b->state, B_WRITING,
-				    TASK_UNINTERRUPTIBLE);
-		/*
-		 * Relink buffer to "new_block" so that write_callback
-		 * sees "new_block" as a block number.
-		 * After the write, link the buffer back to old_block.
-		 * All this must be done in bufio lock, so that block number
-		 * change isn't visible to other threads.
-		 */
-		old_block = b->block;
-		__unlink_buffer(b);
-		__link_buffer(b, new_block, b->list_mode);
-		submit_io(b, REQ_OP_WRITE, write_endio);
-		wait_on_bit_io(&b->state, B_WRITING,
-			       TASK_UNINTERRUPTIBLE);
-		__unlink_buffer(b);
-		__link_buffer(b, old_block, b->list_mode);
-	}
-
-	dm_bufio_unlock(c);
-	dm_bufio_release(b);
-}
-EXPORT_SYMBOL_GPL(dm_bufio_release_move);
-
 static void forget_buffer_locked(struct dm_buffer *b)
 {
 	if (likely(!b->hold_count) && likely(!smp_load_acquire(&b->state))) {
diff --git a/include/linux/dm-bufio.h b/include/linux/dm-bufio.h
index 2056743aaaaa..681656a1c03d 100644
--- a/include/linux/dm-bufio.h
+++ b/include/linux/dm-bufio.h
@@ -130,12 +130,6 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c);
  */
 int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t count);
 
-/*
- * Like dm_bufio_release but also move the buffer to the new
- * block. dm_bufio_write_dirty_buffers is needed to commit the new block.
- */
-void dm_bufio_release_move(struct dm_buffer *b, sector_t new_block);
-
 /*
  * Free the given buffer.
  * This is just a hint, if the buffer is in use or dirty, this function

From patchwork Mon Mar 27 20:11:25 2023
From: Mike Snitzer <snitzer@kernel.org>
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:25 -0400
Message-Id: <20230327201143.51026-3-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 02/20] dm bufio: use WARN_ON in dm_bufio_client_destroy and dm_bufio_exit

Crashing with BUG_ON when tearing down is excessive: these assertions
catch leaked buffers, and a leak is not worth crashing the system over.
Relax them to WARN_ONs.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-bufio.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 79434b38f368..dac9f1f84c34 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1828,8 +1828,8 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c)
 
 	mutex_unlock(&dm_bufio_clients_lock);
 
-	BUG_ON(!RB_EMPTY_ROOT(&c->buffer_tree));
-	BUG_ON(c->need_reserved_buffers);
+	WARN_ON(!RB_EMPTY_ROOT(&c->buffer_tree));
+	WARN_ON(c->need_reserved_buffers);
 
 	while (!list_empty(&c->reserved_buffers)) {
 		struct dm_buffer *b = list_entry(c->reserved_buffers.next,
@@ -1843,7 +1843,7 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c)
 		DMERR("leaked buffer count %d: %ld", i, c->n_buffers[i]);
 
 	for (i = 0; i < LIST_SIZE; i++)
-		BUG_ON(c->n_buffers[i]);
+		WARN_ON(c->n_buffers[i]);
 
 	kmem_cache_destroy(c->slab_cache);
 	kmem_cache_destroy(c->slab_buffer);
@@ -2082,7 +2082,7 @@ static void __exit dm_bufio_exit(void)
 		bug = 1;
 	}
 
-	BUG_ON(bug);
+	WARN_ON(bug); /* leaks are not worth crashing the system */
 }
 
 module_init(dm_bufio_init)
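
[An illustrative aside, not part of the patch: the behavioural
difference chosen above is that BUG_ON() halts the kernel when its
condition is true, while WARN_ON() prints a stack trace and lets
execution continue. In sketch form -- teardown_check() is a
hypothetical helper, not from dm-bufio:

static void teardown_check(long leaked)
{
	/* Old behaviour: BUG_ON(leaked) would crash the machine over a leak. */
	WARN_ON(leaked);	/* New behaviour: report loudly, keep running. */
}
]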
From patchwork Mon Mar 27 20:11:26 2023
From: Mike Snitzer <snitzer@kernel.org>
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:26 -0400
Message-Id: <20230327201143.51026-4-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 03/20] dm bufio: never crash if dm_bufio_in_request()

All of these BUG_ONs guard against coding mistakes that would call
bufio from an inappropriate context, so they are entirely avoidable.
Proper testing during development will catch any such bug; it's best
to relax all of them from BUG_ON to WARN_ON_ONCE and fail gracefully.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-bufio.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index dac9f1f84c34..d63f94ab1d9f 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1152,7 +1152,8 @@ EXPORT_SYMBOL_GPL(dm_bufio_get);
 void *dm_bufio_read(struct dm_bufio_client *c, sector_t block,
 		    struct dm_buffer **bp)
 {
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON_ONCE(dm_bufio_in_request()))
+		return ERR_PTR(-EINVAL);
 
 	return new_read(c, block, NF_READ, bp);
 }
@@ -1161,7 +1162,8 @@ EXPORT_SYMBOL_GPL(dm_bufio_read);
 void *dm_bufio_new(struct dm_bufio_client *c, sector_t block,
 		   struct dm_buffer **bp)
 {
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON_ONCE(dm_bufio_in_request()))
+		return ERR_PTR(-EINVAL);
 
 	return new_read(c, block, NF_FRESH, bp);
 }
@@ -1174,7 +1176,8 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
 
 	LIST_HEAD(write_list);
 
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON_ONCE(dm_bufio_in_request()))
+		return; /* should never happen */
 
 	blk_start_plug(&plug);
 	dm_bufio_lock(c);
@@ -1281,7 +1284,8 @@ void dm_bufio_write_dirty_buffers_async(struct dm_bufio_client *c)
 {
 	LIST_HEAD(write_list);
 
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON_ONCE(dm_bufio_in_request()))
+		return; /* should never happen */
 
 	dm_bufio_lock(c);
 	__write_dirty_buffers_async(c, 0, &write_list);
@@ -1386,7 +1390,8 @@ int dm_bufio_issue_flush(struct dm_bufio_client *c)
 		.count = 0,
 	};
 
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON_ONCE(dm_bufio_in_request()))
+		return -EINVAL;
 
 	return dm_io(&io_req, 1, &io_reg, NULL);
 }
@@ -1409,7 +1414,8 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c
 		.count = block_to_sector(c, count),
 	};
 
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON_ONCE(dm_bufio_in_request()))
+		return -EINVAL; /* discards are optional */
 
 	return dm_io(&io_req, 1, &io_reg, NULL);
 }
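
[An illustrative aside, not part of the patch: WARN_ON_ONCE() evaluates
to the condition it tested, so it can gate an early error return
directly; the first misuse prints a stack trace, later ones stay
silent, and the caller fails gracefully instead of taking the machine
down. A minimal sketch of the idiom -- fetch_block() is a hypothetical
wrapper:

static void *fetch_block(struct dm_bufio_client *c, sector_t block,
			 struct dm_buffer **bp)
{
	/* Warn once on misuse, then fail soft with an error pointer. */
	if (WARN_ON_ONCE(dm_bufio_in_request()))
		return ERR_PTR(-EINVAL);

	return dm_bufio_read(c, block, bp);
}
]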
From patchwork Mon Mar 27 20:11:27 2023
From: Mike Snitzer <snitzer@kernel.org>
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:27 -0400
Message-Id: <20230327201143.51026-5-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 04/20] dm bufio: don't bug for clear developer oversight

These BUG_ONs flag clear developer oversight and are easily avoided.
It is reasonable to relax them to WARN_ON: the warnings still offer
some assurance that future coding mistakes will be caught (if changes
are tested properly), without crashing the kernel.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-bufio.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index d63f94ab1d9f..64fb5fd39bd9 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -378,8 +378,10 @@ static void adjust_total_allocated(struct dm_buffer *b, bool unlink)
  */
 static void __cache_size_refresh(void)
 {
-	BUG_ON(!mutex_is_locked(&dm_bufio_clients_lock));
-	BUG_ON(dm_bufio_client_count < 0);
+	if (WARN_ON(!mutex_is_locked(&dm_bufio_clients_lock)))
+		return;
+	if (WARN_ON(dm_bufio_client_count < 0))
+		return;
 
 	dm_bufio_cache_size_latch = READ_ONCE(dm_bufio_cache_size);
 
@@ -1536,7 +1538,8 @@ static void drop_buffers(struct dm_bufio_client *c)
 	int i;
 	bool warned = false;
 
-	BUG_ON(dm_bufio_in_request());
+	if (WARN_ON(dm_bufio_in_request()))
+		return; /* should never happen */
 
 	/*
 	 * An optimization so that the buffers are not written one-by-one.
@@ -1556,7 +1559,7 @@ static void drop_buffers(struct dm_bufio_client *c)
 				       (unsigned long long)b->block, b->hold_count, i);
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
 			stack_trace_print(b->stack_entries, b->stack_len, 1);
-			/* mark unclaimed to avoid BUG_ON below */
+			/* mark unclaimed to avoid WARN_ON below */
 			b->hold_count = 0;
 #endif
 		}
@@ -1567,7 +1570,7 @@ static void drop_buffers(struct dm_bufio_client *c)
 #endif
 
 	for (i = 0; i < LIST_SIZE; i++)
-		BUG_ON(!list_empty(&c->lru[i]));
+		WARN_ON(!list_empty(&c->lru[i]));
 
 	dm_bufio_unlock(c);
 }

From patchwork Mon Mar 27 20:11:28 2023
From: Mike Snitzer <snitzer@kernel.org>
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:28 -0400
Message-Id: <20230327201143.51026-6-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 05/20] dm bufio: add LRU abstraction

From: Joe Thornber <ejt@redhat.com>

This LRU abstraction uses a CLOCK algorithm: entries are held in a
circular list and a 'hit' merely sets a reference bit. This avoids
relinking list nodes on every access, which would require a write lock
to protect the list. None of the LRU methods are threadsafe; locking
must be done at a higher level.

Code that uses this new LRU will be introduced in the next 2 commits.
As such, this commit will cause "defined but not used" compiler
warnings that the next 2 commits will resolve.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-bufio.c | 235 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 235 insertions(+)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 64fb5fd39bd9..a0bb0de0d4e7 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -66,6 +66,241 @@
 #define LIST_DIRTY 1
 #define LIST_SIZE 2
 
+/*--------------------------------------------------------------*/
+
+/*
+ * Rather than use an LRU list, we use a clock algorithm where entries
+ * are held in a circular list.  When an entry is 'hit' a reference bit
+ * is set.  The least recently used entry is approximated by running a
+ * cursor around the list selecting unreferenced entries. Referenced
+ * entries have their reference bit cleared as the cursor passes them.
+ */
+struct lru_entry {
+	struct list_head list;
+	atomic_t referenced;
+};
+
+struct lru_iter {
+	struct lru *lru;
+	struct list_head list;
+	struct lru_entry *stop;
+	struct lru_entry *e;
+};
+
+struct lru {
+	struct list_head *cursor;
+	unsigned long count;
+
+	struct list_head iterators;
+};
+
+/*--------------*/
+
+static void lru_init(struct lru *lru)
+{
+	lru->cursor = NULL;
+	lru->count = 0;
+	INIT_LIST_HEAD(&lru->iterators);
+}
+
+static void lru_destroy(struct lru *lru)
+{
+	WARN_ON_ONCE(lru->cursor);
+	WARN_ON_ONCE(!list_empty(&lru->iterators));
+}
+
+/*
+ * Insert a new entry into the lru.
+ */
+static void lru_insert(struct lru *lru, struct lru_entry *le)
+{
+	/*
+	 * Don't be tempted to set to 1, makes the lru aspect
+	 * perform poorly.
+	 */
+	atomic_set(&le->referenced, 0);
+
+	if (lru->cursor) {
+		list_add_tail(&le->list, lru->cursor);
+	} else {
+		INIT_LIST_HEAD(&le->list);
+		lru->cursor = &le->list;
+	}
+	lru->count++;
+}
+
+/*--------------*/
+
+/*
+ * Convert a list_head pointer to an lru_entry pointer.
+ */
+static inline struct lru_entry *to_le(struct list_head *l)
+{
+	return container_of(l, struct lru_entry, list);
+}
+
+/*
+ * Initialize an lru_iter and add it to the list of cursors in the lru.
+ */
+static void lru_iter_begin(struct lru *lru, struct lru_iter *it)
+{
+	it->lru = lru;
+	it->stop = lru->cursor ? to_le(lru->cursor->prev) : NULL;
+	it->e = lru->cursor ? to_le(lru->cursor) : NULL;
+	list_add(&it->list, &lru->iterators);
+}
+
+/*
+ * Remove an lru_iter from the list of cursors in the lru.
+ */
+static inline void lru_iter_end(struct lru_iter *it)
+{
+	list_del(&it->list);
+}
+
+/* Predicate function type to be used with lru_iter_next */
+typedef bool (*iter_predicate)(struct lru_entry *le, void *context);
+
+/*
+ * Advance the cursor to the next entry that passes the
+ * predicate, and return that entry.  Returns NULL if the
+ * iteration is complete.
+ */
+static struct lru_entry *lru_iter_next(struct lru_iter *it,
+				       iter_predicate pred, void *context)
+{
+	struct lru_entry *e;
+
+	while (it->e) {
+		e = it->e;
+
+		/* advance the cursor */
+		if (it->e == it->stop)
+			it->e = NULL;
+		else
+			it->e = to_le(it->e->list.next);
+
+		if (pred(e, context))
+			return e;
+	}
+
+	return NULL;
+}
+
+/*
+ * Invalidate a specific lru_entry and update all cursors in
+ * the lru accordingly.
+ */
+static void lru_iter_invalidate(struct lru *lru, struct lru_entry *e)
+{
+	struct lru_iter *it;
+
+	list_for_each_entry(it, &lru->iterators, list) {
+		/* Move it->e forwards if necessary. */
+		if (it->e == e) {
+			it->e = to_le(it->e->list.next);
+			if (it->e == e)
+				it->e = NULL;
+		}
+
+		/* Move it->stop backwards if necessary. */
+		if (it->stop == e) {
+			it->stop = to_le(it->stop->list.prev);
+			if (it->stop == e)
+				it->stop = NULL;
+		}
+	}
+}
+
+/*--------------*/
+
+/*
+ * Remove a specific entry from the lru.
+ */
+static void lru_remove(struct lru *lru, struct lru_entry *le)
+{
+	lru_iter_invalidate(lru, le);
+	if (lru->count == 1) {
+		lru->cursor = NULL;
+	} else {
+		if (lru->cursor == &le->list)
+			lru->cursor = lru->cursor->next;
+		list_del(&le->list);
+	}
+	lru->count--;
+}
+
+/*
+ * Mark as referenced.
+ */
+static inline void lru_reference(struct lru_entry *le)
+{
+	atomic_set(&le->referenced, 1);
+}
+
+/*--------------*/
+
+/*
+ * Remove the least recently used entry (approx), that passes the predicate.
+ * Returns NULL on failure.
+ */
+enum evict_result {
+	ER_EVICT,
+	ER_DONT_EVICT,
+	ER_STOP, /* stop looking for something to evict */
+};
+
+typedef enum evict_result (*le_predicate)(struct lru_entry *le, void *context);
+
+static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *context)
+{
+	unsigned long tested = 0;
+	struct list_head *h = lru->cursor;
+	struct lru_entry *le;
+
+	if (!h)
+		return NULL;
+	/*
+	 * In the worst case we have to loop around twice. Once to clear
+	 * the reference flags, and then again to discover the predicate
+	 * fails for all entries.
+	 */
+	while (tested < lru->count) {
+		le = container_of(h, struct lru_entry, list);
+
+		if (atomic_read(&le->referenced)) {
+			atomic_set(&le->referenced, 0);
+		} else {
+			tested++;
+			switch (pred(le, context)) {
+			case ER_EVICT:
+				/*
+				 * Adjust the cursor, so we start the next
+				 * search from here.
+				 */
+				lru->cursor = le->list.next;
+				lru_remove(lru, le);
+				return le;
+
+			case ER_DONT_EVICT:
+				break;
+
+			case ER_STOP:
+				lru->cursor = le->list.next;
+				return NULL;
+			}
+		}
+
+		h = h->next;
+
+		cond_resched();
+	}
+
+	return NULL;
+}
+
+/*--------------------------------------------------------------*/
+
 /*
  * Linking of buffers:
  * All buffers are linked to buffer_tree with their node field.
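
[To make the CLOCK behaviour above concrete, here is a small usage
sketch; it is not part of the patch, and my_entry and evict_any are
hypothetical names. The caller is assumed to hold whatever
higher-level lock protects the lru:

struct my_entry {
	struct lru_entry lru;	/* embeds this object in the clock ring */
	void *payload;
};

/* Predicate that accepts whichever candidate the cursor offers. */
static enum evict_result evict_any(struct lru_entry *le, void *context)
{
	return ER_EVICT;
}

static void example(struct lru *lru, struct my_entry *e)
{
	struct lru_entry *victim;

	lru_insert(lru, &e->lru);	/* reference bit deliberately starts at 0 */
	lru_reference(&e->lru);		/* a 'hit' only sets the bit; no relinking */

	/*
	 * The cursor sweeps the ring: referenced entries are skipped and
	 * their bit is cleared; the first unreferenced entry the predicate
	 * accepts is unlinked from the ring and returned to the caller.
	 */
	victim = lru_evict(lru, evict_any, NULL);
	if (victim)
		kfree(container_of(victim, struct my_entry, lru));
}
]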
From patchwork Mon Mar 27 20:11:29 2023
From: Mike Snitzer <snitzer@kernel.org>
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:29 -0400
Message-Id: <20230327201143.51026-7-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 06/20] dm bufio: add dm_buffer_cache abstraction

From: Joe Thornber <ejt@redhat.com>

The buffer cache is responsible for managing the holder count,
tracking clean/dirty state, and choosing buffers via predicates.
Higher-level code is responsible for allocation of buffers, IO and
eviction/cache sizing.

The buffer cache has threadsafe methods for acquiring a reference to
an existing buffer. All other methods in the buffer cache are _not_
threadsafe, and only contain enough locking to guarantee the safe
methods.

Rather than a single mutex, sharded rw_semaphores are used to allow
concurrent threads to 'get' buffers. Each rw_semaphore protects its
own rbtree of buffer entries.

Code that uses this new dm_buffer_cache abstraction will be introduced
in a following commit. This commit moves the dm_buffer struct so that
the finer-grained dm_buffer changes in the next commit are easier to
see. It also introduces temporary dm_buffer struct members that allow
this intermediate commit to compile (they will be removed in the next
commit). As such, this commit will cause "defined but not used"
compiler warnings that the next commit will resolve.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
 drivers/md/dm-bufio.c | 588 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 526 insertions(+), 62 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index a0bb0de0d4e7..9daff9b77cee 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -301,6 +301,517 @@ static struct lru_entry *lru_evict(struct lru *lru, le_predicate pred, void *con
 
 /*--------------------------------------------------------------*/
 
+/*
+ * Buffer state bits.
+ */
+#define B_READING	0
+#define B_WRITING	1
+#define B_DIRTY		2
+
+/*
+ * Describes how the block was allocated:
+ * kmem_cache_alloc(), __get_free_pages() or vmalloc().
+ * See the comment at alloc_buffer_data.
+ */
+enum data_mode {
+	DATA_MODE_SLAB = 0,
+	DATA_MODE_GET_FREE_PAGES = 1,
+	DATA_MODE_VMALLOC = 2,
+	DATA_MODE_LIMIT = 3
+};
+
+struct dm_buffer {
+	struct rb_node node;
+	struct list_head lru_list;
+	struct list_head global_list;
+
+	sector_t block;
+	void *data;
+	unsigned char data_mode;		/* DATA_MODE_* */
+
+	unsigned int accessed;
+	unsigned int __hold_count;
+	unsigned long last_accessed;
+
+	unsigned char list_mode;		/* LIST_* */
+	blk_status_t read_error;
+	blk_status_t write_error;
+	unsigned long state;
+	unsigned int dirty_start;
+	unsigned int dirty_end;
+	unsigned int write_start;
+	unsigned int write_end;
+	struct dm_bufio_client *c;
+	struct list_head write_list;
+	void (*end_io)(struct dm_buffer *b, blk_status_t bs);
+#ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
+#define MAX_STACK 10
+	unsigned int stack_len;
+	unsigned long stack_entries[MAX_STACK];
+#endif
+	/* Temp members to allow dm_buffer_cache code to compile */
+	atomic_t hold_count;
+	struct lru_entry lru;
+};
+
+/*--------------------------------------------------------------*/
+
+/*
+ * The buffer cache manages buffers, particularly:
+ *  - inc/dec of holder count
+ *  - setting the last_accessed field
+ *  - maintains clean/dirty state along with lru
+ *  - selecting buffers that match predicates
+ *
+ * It does *not* handle:
+ *  - allocation/freeing of buffers.
+ *  - IO
+ *  - Eviction or cache sizing.
+ *
+ * cache_get() and cache_put() are threadsafe, you do not need to
+ * protect these calls with a surrounding mutex.  All the other
+ * methods are not threadsafe; they do use locking primitives, but
+ * only enough to ensure get/put are threadsafe.
+ */
+
+#define NR_LOCKS 64
+#define LOCKS_MASK (NR_LOCKS - 1)
+
+struct buffer_tree {
+	struct rw_semaphore lock;
+	struct rb_root root;
+} ____cacheline_aligned_in_smp;
+
+struct dm_buffer_cache {
+	/*
+	 * We spread entries across multiple trees to reduce contention
+	 * on the locks.
+	 */
+	struct buffer_tree trees[NR_LOCKS];
+	struct lru lru[LIST_SIZE];
+};
+
+static inline unsigned int cache_index(sector_t block)
+{
+	return block & LOCKS_MASK;
+}
+
+static inline void cache_read_lock(struct dm_buffer_cache *bc, sector_t block)
+{
+	down_read(&bc->trees[cache_index(block)].lock);
+}
+
+static inline void cache_read_unlock(struct dm_buffer_cache *bc, sector_t block)
+{
+	up_read(&bc->trees[cache_index(block)].lock);
+}
+
+static inline void cache_write_lock(struct dm_buffer_cache *bc, sector_t block)
+{
+	down_write(&bc->trees[cache_index(block)].lock);
+}
+
+static inline void cache_write_unlock(struct dm_buffer_cache *bc, sector_t block)
+{
+	up_write(&bc->trees[cache_index(block)].lock);
+}
+
+static inline struct dm_buffer *le_to_buffer(struct lru_entry *le)
+{
+	return container_of(le, struct dm_buffer, lru);
+}
+
+static struct dm_buffer *list_to_buffer(struct list_head *l)
+{
+	struct lru_entry *le = list_entry(l, struct lru_entry, list);
+
+	if (!le)
+		return NULL;
+
+	return le_to_buffer(le);
+}
+
+static void cache_init(struct dm_buffer_cache *bc)
+{
+	unsigned int i;
+
+	for (i = 0; i < NR_LOCKS; i++) {
+		init_rwsem(&bc->trees[i].lock);
+		bc->trees[i].root = RB_ROOT;
+	}
+
+	lru_init(&bc->lru[LIST_CLEAN]);
+	lru_init(&bc->lru[LIST_DIRTY]);
+}
+
+static void cache_destroy(struct dm_buffer_cache *bc)
+{
+	unsigned int i;
+
+	for (i = 0; i < NR_LOCKS; i++)
+		WARN_ON_ONCE(!RB_EMPTY_ROOT(&bc->trees[i].root));
+
+	lru_destroy(&bc->lru[LIST_CLEAN]);
+	lru_destroy(&bc->lru[LIST_DIRTY]);
+}
+
+/*--------------*/
+
+/*
+ * not threadsafe, or racey depending how you look at it
+ */
+static inline unsigned long cache_count(struct dm_buffer_cache *bc, int list_mode)
+{
+	return bc->lru[list_mode].count;
+}
+
+static inline unsigned long cache_total(struct dm_buffer_cache *bc)
+{
+	return cache_count(bc, LIST_CLEAN) + cache_count(bc, LIST_DIRTY);
+}
+
+/*--------------*/
+
+/*
+ * Gets a specific buffer, indexed by block.
+ * If the buffer is found then its holder count will be incremented and
+ * lru_reference will be called.
+ *
+ * threadsafe
+ */
+static struct dm_buffer *__cache_get(const struct rb_root *root, sector_t block)
+{
+	struct rb_node *n = root->rb_node;
+	struct dm_buffer *b;
+
+	while (n) {
+		b = container_of(n, struct dm_buffer, node);
+
+		if (b->block == block)
+			return b;
+
+		n = block < b->block ? n->rb_left : n->rb_right;
+	}
+
+	return NULL;
+}
+
+static void __cache_inc_buffer(struct dm_buffer *b)
+{
+	atomic_inc(&b->hold_count);
+	WRITE_ONCE(b->last_accessed, jiffies);
+}
+
+static struct dm_buffer *cache_get(struct dm_buffer_cache *bc, sector_t block)
+{
+	struct dm_buffer *b;
+
+	cache_read_lock(bc, block);
+	b = __cache_get(&bc->trees[cache_index(block)].root, block);
+	if (b) {
+		lru_reference(&b->lru);
+		__cache_inc_buffer(b);
+	}
+	cache_read_unlock(bc, block);
+
+	return b;
+}
+
+/*--------------*/
+
+/*
+ * Returns true if the hold count hits zero.
+ * threadsafe
+ */
+static bool cache_put(struct dm_buffer_cache *bc, struct dm_buffer *b)
+{
+	bool r;
+
+	cache_read_lock(bc, b->block);
+	BUG_ON(!atomic_read(&b->hold_count));
+	r = atomic_dec_and_test(&b->hold_count);
+	cache_read_unlock(bc, b->block);
+
+	return r;
+}
+
+/*--------------*/
+
+typedef enum evict_result (*b_predicate)(struct dm_buffer *, void *);
+
+/*
+ * Evicts a buffer based on a predicate.  The oldest buffer that
+ * matches the predicate will be selected.  In addition to the
+ * predicate the hold_count of the selected buffer will be zero.
+ */ +struct evict_wrapper { + b_predicate pred; + void *context; +}; + +/* + * Wraps the buffer predicate turning it into an lru predicate. Adds + * extra test for hold_count. + */ +static enum evict_result __evict_pred(struct lru_entry *le, void *context) +{ + struct evict_wrapper *w = context; + struct dm_buffer *b = le_to_buffer(le); + + if (atomic_read(&b->hold_count)) + return ER_DONT_EVICT; + + return w->pred(b, w->context); +} + +static struct dm_buffer *cache_evict(struct dm_buffer_cache *bc, int list_mode, + b_predicate pred, void *context) +{ + struct evict_wrapper w = {.pred = pred, .context = context}; + struct lru_entry *le; + struct dm_buffer *b; + + le = lru_evict(&bc->lru[list_mode], __evict_pred, &w); + if (!le) + return NULL; + + b = le_to_buffer(le); + /* __evict_pred will have locked the appropriate tree. */ + rb_erase(&b->node, &bc->trees[cache_index(b->block)].root); + + return b; +} + +/*--------------*/ + +/* + * Mark a buffer as clean or dirty. Not threadsafe. + */ +static void cache_mark(struct dm_buffer_cache *bc, struct dm_buffer *b, int list_mode) +{ + cache_write_lock(bc, b->block); + if (list_mode != b->list_mode) { + lru_remove(&bc->lru[b->list_mode], &b->lru); + b->list_mode = list_mode; + lru_insert(&bc->lru[b->list_mode], &b->lru); + } + cache_write_unlock(bc, b->block); +} + +/*--------------*/ + +/* + * Runs through the lru associated with 'old_mode', if the predicate matches then + * it moves them to 'new_mode'. Not threadsafe. + */ +static void cache_mark_many(struct dm_buffer_cache *bc, int old_mode, int new_mode, + b_predicate pred, void *context) +{ + struct lru_entry *le; + struct dm_buffer *b; + struct evict_wrapper w = {.pred = pred, .context = context}; + + while (true) { + le = lru_evict(&bc->lru[old_mode], __evict_pred, &w); + if (!le) + break; + + b = le_to_buffer(le); + b->list_mode = new_mode; + lru_insert(&bc->lru[b->list_mode], &b->lru); + } +} + +/*--------------*/ + +/* + * Iterates through all clean or dirty entries calling a function for each + * entry. The callback may terminate the iteration early. Not threadsafe. + */ + +/* + * Iterator functions should return one of these actions to indicate + * how the iteration should proceed. + */ +enum it_action { + IT_NEXT, + IT_COMPLETE, +}; + +typedef enum it_action (*iter_fn)(struct dm_buffer *b, void *context); + +static void cache_iterate(struct dm_buffer_cache *bc, int list_mode, + iter_fn fn, void *context) +{ + struct lru *lru = &bc->lru[list_mode]; + struct lru_entry *le, *first; + + if (!lru->cursor) + return; + + first = le = to_le(lru->cursor); + do { + struct dm_buffer *b = le_to_buffer(le); + + switch (fn(b, context)) { + case IT_NEXT: + break; + + case IT_COMPLETE: + return; + } + cond_resched(); + + le = to_le(le->list.next); + } while (le != first); +} + +/*--------------*/ + +/* + * Passes ownership of the buffer to the cache. Returns false if the + * buffer was already present (in which case ownership does not pass). + * eg, a race with another thread. + * + * Holder count should be 1 on insertion. + * + * Not threadsafe. + */ +static bool __cache_insert(struct rb_root *root, struct dm_buffer *b) +{ + struct rb_node **new = &root->rb_node, *parent = NULL; + struct dm_buffer *found; + + while (*new) { + found = container_of(*new, struct dm_buffer, node); + + if (found->block == b->block) + return false; + + parent = *new; + new = b->block < found->block ? 
+ &found->node.rb_left : &found->node.rb_right; + } + + rb_link_node(&b->node, parent, new); + rb_insert_color(&b->node, root); + + return true; +} + +static bool cache_insert(struct dm_buffer_cache *bc, struct dm_buffer *b) +{ + bool r; + + if (WARN_ON_ONCE(b->list_mode >= LIST_SIZE)) + return false; + + cache_write_lock(bc, b->block); + BUG_ON(atomic_read(&b->hold_count) != 1); + r = __cache_insert(&bc->trees[cache_index(b->block)].root, b); + if (r) + lru_insert(&bc->lru[b->list_mode], &b->lru); + cache_write_unlock(bc, b->block); + + return r; +} + +/*--------------*/ + +/* + * Removes buffer from cache, ownership of the buffer passes back to the caller. + * Fails if the hold_count is not one (ie. the caller holds the only reference). + * + * Not threadsafe. + */ +static bool cache_remove(struct dm_buffer_cache *bc, struct dm_buffer *b) +{ + bool r; + + cache_write_lock(bc, b->block); + + if (atomic_read(&b->hold_count) != 1) { + r = false; + } else { + r = true; + rb_erase(&b->node, &bc->trees[cache_index(b->block)].root); + lru_remove(&bc->lru[b->list_mode], &b->lru); + } + + cache_write_unlock(bc, b->block); + + return r; +} + +/*--------------*/ + +typedef void (*b_release)(struct dm_buffer *); + +static struct dm_buffer *__find_next(struct rb_root *root, sector_t block) +{ + struct rb_node *n = root->rb_node; + struct dm_buffer *b; + struct dm_buffer *best = NULL; + + while (n) { + b = container_of(n, struct dm_buffer, node); + + if (b->block == block) + return b; + + if (block <= b->block) { + n = n->rb_left; + best = b; + } else { + n = n->rb_right; + } + } + + return best; +} + +static void __remove_range(struct dm_buffer_cache *bc, + struct rb_root *root, + sector_t begin, sector_t end, + b_predicate pred, b_release release) +{ + struct dm_buffer *b; + + while (true) { + cond_resched(); + + b = __find_next(root, begin); + if (!b || (b->block >= end)) + break; + + begin = b->block + 1; + + if (atomic_read(&b->hold_count)) + continue; + + if (pred(b, NULL) == ER_EVICT) { + rb_erase(&b->node, root); + lru_remove(&bc->lru[b->list_mode], &b->lru); + release(b); + } + } +} + +static void cache_remove_range(struct dm_buffer_cache *bc, + sector_t begin, sector_t end, + b_predicate pred, b_release release) +{ + unsigned int i; + + for (i = 0; i < NR_LOCKS; i++) { + down_write(&bc->trees[i].lock); + __remove_range(bc, &bc->trees[i].root, begin, end, pred, release); + up_write(&bc->trees[i].lock); + } +} + +/*----------------------------------------------------------------*/ + /* * Linking of buffers: * All buffers are linked to buffer_tree with their node field. @@ -352,53 +863,6 @@ struct dm_bufio_client { atomic_long_t need_shrink; }; -/* - * Buffer state bits. - */ -#define B_READING 0 -#define B_WRITING 1 -#define B_DIRTY 2 - -/* - * Describes how the block was allocated: - * kmem_cache_alloc(), __get_free_pages() or vmalloc(). - * See the comment at alloc_buffer_data. 
- */ -enum data_mode { - DATA_MODE_SLAB = 0, - DATA_MODE_GET_FREE_PAGES = 1, - DATA_MODE_VMALLOC = 2, - DATA_MODE_LIMIT = 3 -}; - -struct dm_buffer { - struct rb_node node; - struct list_head lru_list; - struct list_head global_list; - sector_t block; - void *data; - unsigned char data_mode; /* DATA_MODE_* */ - unsigned char list_mode; /* LIST_* */ - blk_status_t read_error; - blk_status_t write_error; - unsigned int accessed; - unsigned int hold_count; - unsigned long state; - unsigned long last_accessed; - unsigned int dirty_start; - unsigned int dirty_end; - unsigned int write_start; - unsigned int write_end; - struct dm_bufio_client *c; - struct list_head write_list; - void (*end_io)(struct dm_buffer *buf, blk_status_t stat); -#ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING -#define MAX_STACK 10 - unsigned int stack_len; - unsigned long stack_entries[MAX_STACK]; -#endif -}; - static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled); /*----------------------------------------------------------------*/ @@ -516,7 +980,7 @@ static struct dm_buffer *__find(struct dm_bufio_client *c, sector_t block) return NULL; } -static struct dm_buffer *__find_next(struct dm_bufio_client *c, sector_t block) +static struct dm_buffer *__find_next_old(struct dm_bufio_client *c, sector_t block) { struct rb_node *n = c->buffer_tree.rb_node; struct dm_buffer *b; @@ -1040,7 +1504,7 @@ static void __flush_write_list(struct list_head *write_list) */ static void __make_buffer_clean(struct dm_buffer *b) { - BUG_ON(b->hold_count); + BUG_ON(b->__hold_count); /* smp_load_acquire() pairs with read_endio()'s smp_mb__before_atomic() */ if (!smp_load_acquire(&b->state)) /* fast case */ @@ -1067,7 +1531,7 @@ static struct dm_buffer *__get_unclaimed_buffer(struct dm_bufio_client *c) unlikely(test_bit_acquire(B_READING, &b->state))) continue; - if (!b->hold_count) { + if (!b->__hold_count) { __make_buffer_clean(b); __unlink_buffer(b); return b; @@ -1081,7 +1545,7 @@ static struct dm_buffer *__get_unclaimed_buffer(struct dm_bufio_client *c) list_for_each_entry_reverse(b, &c->lru[LIST_DIRTY], lru_list) { BUG_ON(test_bit(B_READING, &b->state)); - if (!b->hold_count) { + if (!b->__hold_count) { __make_buffer_clean(b); __unlink_buffer(b); return b; @@ -1283,7 +1747,7 @@ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, __check_watermark(c, write_list); b = new_b; - b->hold_count = 1; + b->__hold_count = 1; b->read_error = 0; b->write_error = 0; __link_buffer(b, block, LIST_CLEAN); @@ -1311,7 +1775,7 @@ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, if (nf == NF_GET && unlikely(test_bit_acquire(B_READING, &b->state))) return NULL; - b->hold_count++; + b->__hold_count++; __relink_lru(b, test_bit(B_DIRTY, &b->state) || test_bit(B_WRITING, &b->state)); return b; @@ -1460,10 +1924,10 @@ void dm_bufio_release(struct dm_buffer *b) dm_bufio_lock(c); - BUG_ON(!b->hold_count); + BUG_ON(!b->__hold_count); - b->hold_count--; - if (!b->hold_count) { + b->__hold_count--; + if (!b->__hold_count) { wake_up(&c->free_buffer_wait); /* @@ -1564,12 +2028,12 @@ int dm_bufio_write_dirty_buffers(struct dm_bufio_client *c) if (test_bit(B_WRITING, &b->state)) { if (buffers_processed < c->n_buffers[LIST_DIRTY]) { dropped_lock = 1; - b->hold_count++; + b->__hold_count++; dm_bufio_unlock(c); wait_on_bit_io(&b->state, B_WRITING, TASK_UNINTERRUPTIBLE); dm_bufio_lock(c); - b->hold_count--; + b->__hold_count--; } else wait_on_bit_io(&b->state, B_WRITING, TASK_UNINTERRUPTIBLE); @@ -1660,7 +2124,7 @@ 
EXPORT_SYMBOL_GPL(dm_bufio_issue_discard); static void forget_buffer_locked(struct dm_buffer *b) { - if (likely(!b->hold_count) && likely(!smp_load_acquire(&b->state))) { + if (likely(!b->__hold_count) && likely(!smp_load_acquire(&b->state))) { __unlink_buffer(b); __free_buffer_wake(b); } @@ -1694,7 +2158,7 @@ void dm_bufio_forget_buffers(struct dm_bufio_client *c, sector_t block, sector_t while (block < end_block) { dm_bufio_lock(c); - b = __find_next(c, block); + b = __find_next_old(c, block); if (b) { block = b->block + 1; forget_buffer_locked(b); } @@ -1791,7 +2255,7 @@ static void drop_buffers(struct dm_bufio_client *c) WARN_ON(!warned); warned = true; DMERR("leaked buffer %llx, hold count %u, list %d", - (unsigned long long)b->block, b->hold_count, i); + (unsigned long long)b->block, b->__hold_count, i); #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING stack_trace_print(b->stack_entries, b->stack_len, 1); /* mark unclaimed to avoid WARN_ON below */ @@ -1828,7 +2292,7 @@ static bool __try_evict_buffer(struct dm_buffer *b, gfp_t gfp) return false; } - if (b->hold_count) + if (b->__hold_count) return false; __make_buffer_clean(b);
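The dm_buffer_cache introduced above stripes its index across NR_LOCKS (64) red/black trees, each guarded by its own rw_semaphore, and tracks buffer holders with an atomic_t so that cache_get()/cache_put() never need the client mutex. As a rough user-space sketch of that locking scheme (illustration only, not kernel code: a hash chain stands in for the rb-tree, pthread rwlocks for rw_semaphores, and C11 atomics for the kernel's atomic_t):

/* Minimal user-space model of the striped buffer cache above.
 * Illustration only. Build: cc sketch.c -lpthread */
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_LOCKS   64
#define LOCKS_MASK (NR_LOCKS - 1)

struct buffer {
	uint64_t block;
	atomic_int hold_count;
	struct buffer *next;	/* hash chain; stands in for rb_node */
};

struct tree {
	pthread_rwlock_t lock;
	struct buffer *head;
};

static struct tree trees[NR_LOCKS];

static unsigned int cache_index(uint64_t block)
{
	return block & LOCKS_MASK;
}

/* Lookup takes only one stripe's read lock; the hold count is bumped
 * atomically before the lock is dropped, as in __cache_inc_buffer(). */
static struct buffer *cache_get(uint64_t block)
{
	struct tree *t = &trees[cache_index(block)];
	struct buffer *b;

	pthread_rwlock_rdlock(&t->lock);
	for (b = t->head; b; b = b->next)
		if (b->block == block) {
			atomic_fetch_add(&b->hold_count, 1);
			break;
		}
	pthread_rwlock_unlock(&t->lock);
	return b;
}

/* Returns nonzero when the hold count drops to zero. */
static int cache_put(struct buffer *b)
{
	struct tree *t = &trees[cache_index(b->block)];
	int zero;

	pthread_rwlock_rdlock(&t->lock);
	zero = atomic_fetch_sub(&b->hold_count, 1) == 1;
	pthread_rwlock_unlock(&t->lock);
	return zero;
}

/* Insertion takes a write lock, but only on one of the 64 stripes. */
static int cache_insert(struct buffer *b)
{
	struct tree *t = &trees[cache_index(b->block)];
	struct buffer *cur;
	int ok = 1;

	pthread_rwlock_wrlock(&t->lock);
	for (cur = t->head; cur; cur = cur->next)
		if (cur->block == b->block) {
			ok = 0;	/* lost a race: already present */
			break;
		}
	if (ok) {
		b->next = t->head;
		t->head = b;
	}
	pthread_rwlock_unlock(&t->lock);
	return ok;
}

int main(void)
{
	struct buffer *b, *got;
	int i;

	for (i = 0; i < NR_LOCKS; i++)
		pthread_rwlock_init(&trees[i].lock, NULL);

	b = calloc(1, sizeof(*b));
	b->block = 12345;
	atomic_init(&b->hold_count, 1);	/* holder count is 1 on insertion */
	cache_insert(b);

	got = cache_get(12345);
	printf("block %llu, holders %d\n", (unsigned long long)got->block,
	       atomic_load(&got->hold_count));
	cache_put(got);
	cache_put(b);
	return 0;
}

Because only get/put are fully threadsafe, everything else in the real code (marking dirty, eviction, range removal) still takes the per-stripe write lock or relies on the client mutex, exactly as the comment block above states.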
From patchwork Mon Mar 27 20:11:30 2023
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 13190031
X-Patchwork-Delegate: snitzer@redhat.com
From: Mike Snitzer
To: dm-devel@redhat.com
Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer, nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com
Date: Mon, 27 Mar 2023 16:11:30 -0400
Message-Id: <20230327201143.51026-8-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 07/20] dm bufio: improve concurrent IO performance

From: Joe Thornber

When multiple threads perform IO to a thin device, the underlying dm_bufio object can become a bottleneck, slowing down access to btree nodes that store the thin metadata. Prior to this commit, each bufio instance had a single mutex that was taken for every bufio operation.

This commit concentrates on improving the common case where a user of dm_bufio wishes to access, but not modify, a buffer which is already within the dm_bufio cache.

Implementation::

The code has been refactored, pulling out an 'lru' abstraction and a 'buffer cache' abstraction (see the two previous commits). This commit updates the higher level bufio code (which performs allocation of buffers, IO and eviction/cache sizing) to leverage both abstractions. It also deals with the delicate locking requirements of both abstractions to provide finer-grained locking. The result is significantly better concurrent IO performance.

Before this commit, bufio had a global lru list that it used to evict the oldest, clean buffers from _all_ clients. With the new locking we don't want different ways to access the same buffer, so instead do_global_cleanup() loops around the clients asking them to free buffers older than a certain time.

This commit also converts many old BUG_ONs to WARN_ON_ONCE; see the lru_evict and cache_evict code in particular. They will return ER_DONT_EVICT if a given buffer somehow violates an invariant that should _never_ be violated. [Aside from revising this commit's header and fixing coding style and whitespace nits: this switch to WARN_ON_ONCE is Mike Snitzer's lone contribution to this commit.]

Testing::

Some of the low level functions have been unit tested using dm-unit: https://github.com/jthornber/dm-unit/blob/main/src/tests/bufio.rs

Higher level concurrency and IO is tested via a test-only target found here: https://github.com/jthornber/linux/blob/2023-03-24-thin-concurrency-9/drivers/md/dm-bufio-test.c

The associated userland side of these tests is here: https://github.com/jthornber/dmtest-python/blob/main/src/dmtest/bufio/bufio_tests.py

In addition the full dmtest suite of tests (dm-thin, dm-cache, etc) has been run (~450 tests).

Performance::

Most bufio operations have unchanged performance. But if multiple threads are attempting to get buffers concurrently, and these buffers are already in the cache, then there's a big speed up (a toy user-space model of this effect is sketched below).
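As a rough illustration of why the fast path scales (this is not the dmtest benchmark above; the thread count, iteration count and the trivial array read standing in for a buffer lookup are all arbitrary assumptions), compare one global mutex taken for every operation with a read lock on one of 64 stripes:

/* Toy contention demo. Illustration only. Build: cc demo.c -lpthread */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NR_THREADS 16
#define NR_LOCKS   64
#define ITERS      1000000UL

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_rwlock_t striped[NR_LOCKS];
static volatile uint64_t table[NR_LOCKS];	/* stands in for buffer data */

static void *global_worker(void *arg)
{
	unsigned long i, off = (uintptr_t)arg;
	uint64_t sum = 0;

	for (i = off; i < off + ITERS; i++) {
		pthread_mutex_lock(&global_lock);	/* old: one lock for everything */
		sum += table[i & (NR_LOCKS - 1)];
		pthread_mutex_unlock(&global_lock);
	}
	return (void *)(uintptr_t)sum;
}

static void *striped_worker(void *arg)
{
	unsigned long i, off = (uintptr_t)arg;
	uint64_t sum = 0;

	for (i = off; i < off + ITERS; i++) {
		unsigned int idx = i & (NR_LOCKS - 1);	/* cache_index() analogue */
		pthread_rwlock_rdlock(&striped[idx]);	/* new: shared, per stripe */
		sum += table[idx];
		pthread_rwlock_unlock(&striped[idx]);
	}
	return (void *)(uintptr_t)sum;
}

static double run(void *(*fn)(void *))
{
	pthread_t t[NR_THREADS];
	struct timespec a, b;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &a);
	for (i = 0; i < NR_THREADS; i++)
		pthread_create(&t[i], NULL, fn, (void *)(uintptr_t)(i * 1024));
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(t[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &b);
	return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
	int i;

	for (i = 0; i < NR_LOCKS; i++)
		pthread_rwlock_init(&striped[i], NULL);

	printf("global mutex:    %.2fs\n", run(global_worker));
	printf("striped rwlocks: %.2fs\n", run(striped_worker));
	return 0;
}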
Eg, one test has 16 'hotspot' threads simulating btree lookups while another thread dirties the whole device. In this case the hotspot threads acquire the buffers about 25 times faster. Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer --- drivers/md/dm-bufio.c | 959 +++++++++++++++++++++--------------------- 1 file changed, 487 insertions(+), 472 deletions(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index 9daff9b77cee..1e000ec73bd6 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -321,37 +321,42 @@ enum data_mode { }; struct dm_buffer { + /* protected by the locks in dm_buffer_cache */ struct rb_node node; - struct list_head lru_list; - struct list_head global_list; + /* immutable, so don't need protecting */ sector_t block; void *data; unsigned char data_mode; /* DATA_MODE_* */ - unsigned int accessed; - unsigned int __hold_count; + /* + * These two fields are used in isolation, so do not need + * a surrounding lock. + */ + atomic_t hold_count; unsigned long last_accessed; + /* + * Everything else is protected by the mutex in + * dm_bufio_client + */ + unsigned long state; + struct lru_entry lru; unsigned char list_mode; /* LIST_* */ blk_status_t read_error; blk_status_t write_error; - unsigned long state; unsigned int dirty_start; unsigned int dirty_end; unsigned int write_start; unsigned int write_end; - struct dm_bufio_client *c; struct list_head write_list; + struct dm_bufio_client *c; void (*end_io)(struct dm_buffer *b, blk_status_t bs); #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING #define MAX_STACK 10 unsigned int stack_len; unsigned long stack_entries[MAX_STACK]; #endif - /* Temp members to allow dm_buffer_cache code to compile */ - atomic_t hold_count; - struct lru_entry lru; }; /*--------------------------------------------------------------*/ @@ -814,7 +819,7 @@ static void cache_remove_range(struct dm_buffer_cache *bc, /* * Linking of buffers: - * All buffers are linked to buffer_tree with their node field. + * All buffers are linked to buffer_cache with their node field. * * Clean buffers that are not being written (B_WRITING not set) * are linked to lru[LIST_CLEAN] with their lru_list field. @@ -832,9 +837,6 @@ struct dm_bufio_client { spinlock_t spinlock; bool no_sleep; - struct list_head lru[LIST_SIZE]; - unsigned long n_buffers[LIST_SIZE]; - struct block_device *bdev; unsigned int block_size; s8 sectors_per_block_bits; @@ -849,7 +851,7 @@ struct dm_bufio_client { unsigned int minimum_buffers; - struct rb_root buffer_tree; + struct dm_buffer_cache cache; wait_queue_head_t free_buffer_wait; sector_t start; @@ -861,6 +863,11 @@ struct dm_bufio_client { struct shrinker shrinker; struct work_struct shrink_work; atomic_long_t need_shrink; + + /* + * Used by global_cleanup to sort the clients list. 
+ */ + unsigned long oldest_buffer; }; static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled); @@ -877,14 +884,6 @@ static void dm_bufio_lock(struct dm_bufio_client *c) mutex_lock_nested(&c->lock, dm_bufio_in_request()); } -static int dm_bufio_trylock(struct dm_bufio_client *c) -{ - if (static_branch_unlikely(&no_sleep_enabled) && c->no_sleep) - return spin_trylock_bh(&c->spinlock); - else - return mutex_trylock(&c->lock); -} - static void dm_bufio_unlock(struct dm_bufio_client *c) { if (static_branch_unlikely(&no_sleep_enabled) && c->no_sleep) @@ -913,10 +912,6 @@ static unsigned long dm_bufio_cache_size_latch; static DEFINE_SPINLOCK(global_spinlock); -static LIST_HEAD(global_queue); - -static unsigned long global_num; - /* * Buffers are freed after this timeout */ @@ -958,78 +953,6 @@ static void buffer_record_stack(struct dm_buffer *b) } #endif -/* - *---------------------------------------------------------------- - * A red/black tree acts as an index for all the buffers. - *---------------------------------------------------------------- - */ -static struct dm_buffer *__find(struct dm_bufio_client *c, sector_t block) -{ - struct rb_node *n = c->buffer_tree.rb_node; - struct dm_buffer *b; - - while (n) { - b = container_of(n, struct dm_buffer, node); - - if (b->block == block) - return b; - - n = block < b->block ? n->rb_left : n->rb_right; - } - - return NULL; -} - -static struct dm_buffer *__find_next_old(struct dm_bufio_client *c, sector_t block) -{ - struct rb_node *n = c->buffer_tree.rb_node; - struct dm_buffer *b; - struct dm_buffer *best = NULL; - - while (n) { - b = container_of(n, struct dm_buffer, node); - - if (b->block == block) - return b; - - if (block <= b->block) { - n = n->rb_left; - best = b; - } else { - n = n->rb_right; - } - } - - return best; -} - -static void __insert(struct dm_bufio_client *c, struct dm_buffer *b) -{ - struct rb_node **new = &c->buffer_tree.rb_node, *parent = NULL; - struct dm_buffer *found; - - while (*new) { - found = container_of(*new, struct dm_buffer, node); - - if (found->block == b->block) { - BUG_ON(found != b); - return; - } - - parent = *new; - new = b->block < found->block ? 
- &found->node.rb_left : &found->node.rb_right; - } - - rb_link_node(&b->node, parent, new); - rb_insert_color(&b->node, &c->buffer_tree); -} - -static void __remove(struct dm_bufio_client *c, struct dm_buffer *b) -{ - rb_erase(&b->node, &c->buffer_tree); -} - /*----------------------------------------------------------------*/ static void adjust_total_allocated(struct dm_buffer *b, bool unlink) @@ -1057,16 +980,9 @@ static void adjust_total_allocated(struct dm_buffer *b, bool unlink) if (dm_bufio_current_allocated > dm_bufio_peak_allocated) dm_bufio_peak_allocated = dm_bufio_current_allocated; - b->accessed = 1; - if (!unlink) { - list_add(&b->global_list, &global_queue); - global_num++; if (dm_bufio_current_allocated > dm_bufio_cache_size) queue_work(dm_bufio_wq, &dm_bufio_replacement_work); - } else { - list_del(&b->global_list); - global_num--; } spin_unlock(&global_spinlock); @@ -1196,6 +1112,7 @@ static struct dm_buffer *alloc_buffer(struct dm_bufio_client *c, gfp_t gfp_mask) kmem_cache_free(c->slab_buffer, b); return NULL; } + adjust_total_allocated(b, false); #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING b->stack_len = 0; @@ -1210,61 +1127,11 @@ static void free_buffer(struct dm_buffer *b) { struct dm_bufio_client *c = b->c; + adjust_total_allocated(b, true); free_buffer_data(c, b->data, b->data_mode); kmem_cache_free(c->slab_buffer, b); } -/* - * Link buffer to the buffer tree and clean or dirty queue. - */ -static void __link_buffer(struct dm_buffer *b, sector_t block, int dirty) -{ - struct dm_bufio_client *c = b->c; - - c->n_buffers[dirty]++; - b->block = block; - b->list_mode = dirty; - list_add(&b->lru_list, &c->lru[dirty]); - __insert(b->c, b); - b->last_accessed = jiffies; - - adjust_total_allocated(b, false); -} - -/* - * Unlink buffer from the buffer tree and dirty or clean queue. - */ -static void __unlink_buffer(struct dm_buffer *b) -{ - struct dm_bufio_client *c = b->c; - - BUG_ON(!c->n_buffers[b->list_mode]); - - c->n_buffers[b->list_mode]--; - __remove(b->c, b); - list_del(&b->lru_list); - - adjust_total_allocated(b, true); -} - -/* - * Place the buffer to the head of dirty or clean LRU queue. - */ -static void __relink_lru(struct dm_buffer *b, int dirty) -{ - struct dm_bufio_client *c = b->c; - - b->accessed = 1; - - BUG_ON(!c->n_buffers[b->list_mode]); - - c->n_buffers[b->list_mode]--; - c->n_buffers[dirty]++; - b->list_mode = dirty; - list_move(&b->lru_list, &c->lru[dirty]); - b->last_accessed = jiffies; -} - /* *-------------------------------------------------------------------------- * Submit I/O on the buffer. 
@@ -1504,7 +1371,7 @@ static void __flush_write_list(struct list_head *write_list) */ static void __make_buffer_clean(struct dm_buffer *b) { - BUG_ON(b->__hold_count); + BUG_ON(atomic_read(&b->hold_count)); /* smp_load_acquire() pairs with read_endio()'s smp_mb__before_atomic() */ if (!smp_load_acquire(&b->state)) /* fast case */ @@ -1515,6 +1382,36 @@ static void __make_buffer_clean(struct dm_buffer *b) wait_on_bit_io(&b->state, B_WRITING, TASK_UNINTERRUPTIBLE); } +static enum evict_result is_clean(struct dm_buffer *b, void *context) +{ + struct dm_bufio_client *c = context; + + /* These should never happen */ + if (WARN_ON_ONCE(test_bit(B_WRITING, &b->state))) + return ER_DONT_EVICT; + if (WARN_ON_ONCE(test_bit(B_DIRTY, &b->state))) + return ER_DONT_EVICT; + if (WARN_ON_ONCE(b->list_mode != LIST_CLEAN)) + return ER_DONT_EVICT; + + if (static_branch_unlikely(&no_sleep_enabled) && c->no_sleep && + unlikely(test_bit(B_READING, &b->state))) + return ER_DONT_EVICT; + + return ER_EVICT; +} + +static enum evict_result is_dirty(struct dm_buffer *b, void *context) +{ + /* These should never happen */ + if (WARN_ON_ONCE(test_bit(B_READING, &b->state))) + return ER_DONT_EVICT; + if (WARN_ON_ONCE(b->list_mode != LIST_DIRTY)) + return ER_DONT_EVICT; + + return ER_EVICT; +} + /* * Find some buffer that is not held by anybody, clean it, unlink it and * return it. @@ -1523,34 +1420,20 @@ static struct dm_buffer *__get_unclaimed_buffer(struct dm_bufio_client *c) { struct dm_buffer *b; - list_for_each_entry_reverse(b, &c->lru[LIST_CLEAN], lru_list) { - BUG_ON(test_bit(B_WRITING, &b->state)); - BUG_ON(test_bit(B_DIRTY, &b->state)); - - if (static_branch_unlikely(&no_sleep_enabled) && c->no_sleep && - unlikely(test_bit_acquire(B_READING, &b->state))) - continue; - - if (!b->__hold_count) { - __make_buffer_clean(b); - __unlink_buffer(b); - return b; - } - cond_resched(); + b = cache_evict(&c->cache, LIST_CLEAN, is_clean, c); + if (b) { + /* this also waits for pending reads */ + __make_buffer_clean(b); + return b; } if (static_branch_unlikely(&no_sleep_enabled) && c->no_sleep) return NULL; - list_for_each_entry_reverse(b, &c->lru[LIST_DIRTY], lru_list) { - BUG_ON(test_bit(B_READING, &b->state)); - - if (!b->__hold_count) { - __make_buffer_clean(b); - __unlink_buffer(b); - return b; - } - cond_resched(); + b = cache_evict(&c->cache, LIST_DIRTY, is_dirty, NULL); + if (b) { + __make_buffer_clean(b); + return b; } return NULL; @@ -1571,7 +1454,12 @@ static void __wait_for_free_buffer(struct dm_bufio_client *c) set_current_state(TASK_UNINTERRUPTIBLE); dm_bufio_unlock(c); - io_schedule(); + /* + * It's possible to miss a wake up event since we don't always + * hold c->lock when wake_up is called. So we have a timeout here, + * just in case. 
+ */ + io_schedule_timeout(5 * HZ); remove_wait_queue(&c->free_buffer_wait, &wait); @@ -1629,9 +1517,8 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client } if (!list_empty(&c->reserved_buffers)) { - b = list_entry(c->reserved_buffers.next, - struct dm_buffer, lru_list); - list_del(&b->lru_list); + b = list_to_buffer(c->reserved_buffers.next); + list_del(&b->lru.list); c->need_reserved_buffers++; return b; @@ -1665,36 +1552,56 @@ static void __free_buffer_wake(struct dm_buffer *b) { struct dm_bufio_client *c = b->c; + b->block = -1; if (!c->need_reserved_buffers) free_buffer(b); else { - list_add(&b->lru_list, &c->reserved_buffers); + list_add(&b->lru.list, &c->reserved_buffers); c->need_reserved_buffers--; } wake_up(&c->free_buffer_wait); } +static enum evict_result cleaned(struct dm_buffer *b, void *context) +{ + if (WARN_ON_ONCE(test_bit(B_READING, &b->state))) + return ER_DONT_EVICT; /* should never happen */ + + if (test_bit(B_DIRTY, &b->state) || test_bit(B_WRITING, &b->state)) + return ER_DONT_EVICT; + else + return ER_EVICT; +} + +static void __move_clean_buffers(struct dm_bufio_client *c) +{ + cache_mark_many(&c->cache, LIST_DIRTY, LIST_CLEAN, cleaned, NULL); +} + +struct write_context { + int no_wait; + struct list_head *write_list; +}; + +static enum it_action write_one(struct dm_buffer *b, void *context) +{ + struct write_context *wc = context; + + if (wc->no_wait && test_bit(B_WRITING, &b->state)) + return IT_COMPLETE; + + __write_dirty_buffer(b, wc->write_list); + return IT_NEXT; +} + static void __write_dirty_buffers_async(struct dm_bufio_client *c, int no_wait, struct list_head *write_list) { - struct dm_buffer *b, *tmp; + struct write_context wc = {.no_wait = no_wait, .write_list = write_list}; - list_for_each_entry_safe_reverse(b, tmp, &c->lru[LIST_DIRTY], lru_list) { - BUG_ON(test_bit(B_READING, &b->state)); - - if (!test_bit(B_DIRTY, &b->state) && - !test_bit(B_WRITING, &b->state)) { - __relink_lru(b, LIST_CLEAN); - continue; - } - - if (no_wait && test_bit(B_WRITING, &b->state)) - return; - - __write_dirty_buffer(b, write_list); - cond_resched(); - } + __move_clean_buffers(c); + cache_iterate(&c->cache, LIST_DIRTY, write_one, &wc); } /* @@ -1705,7 +1612,8 @@ static void __write_dirty_buffers_async(struct dm_bufio_client *c, int no_wait, static void __check_watermark(struct dm_bufio_client *c, struct list_head *write_list) { - if (c->n_buffers[LIST_DIRTY] > c->n_buffers[LIST_CLEAN] * DM_BUFIO_WRITEBACK_RATIO) + if (cache_count(&c->cache, LIST_DIRTY) > + cache_count(&c->cache, LIST_CLEAN) * DM_BUFIO_WRITEBACK_RATIO) __write_dirty_buffers_async(c, 1, write_list); } @@ -1715,6 +1623,21 @@ static void __check_watermark(struct dm_bufio_client *c, *-------------------------------------------------------------- */ +static void cache_put_and_wake(struct dm_bufio_client *c, struct dm_buffer *b) +{ + /* + * Relying on waitqueue_active() is racey, but we sleep + * with schedule_timeout anyway. + */ + if (cache_put(&c->cache, b) && + unlikely(waitqueue_active(&c->free_buffer_wait))) + wake_up(&c->free_buffer_wait); +} + +/* + * This assumes you have already checked the cache to see if the buffer + * is already present (it will recheck after dropping the lock for allocation). 
+ */ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, enum new_flag nf, int *need_submit, struct list_head *write_list) @@ -1723,11 +1646,8 @@ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, *need_submit = 0; - b = __find(c, block); - if (b) - goto found_buffer; - - if (nf == NF_GET) + /* This can't be called with NF_GET */ + if (WARN_ON_ONCE(nf == NF_GET)) return NULL; new_b = __alloc_buffer_wait(c, nf); @@ -1738,7 +1658,7 @@ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, * We've had a period where the mutex was unlocked, so need to * recheck the buffer tree. */ - b = __find(c, block); + b = cache_get(&c->cache, block); if (b) { __free_buffer_wake(new_b); goto found_buffer; @@ -1747,24 +1667,35 @@ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, __check_watermark(c, write_list); b = new_b; - b->__hold_count = 1; + atomic_set(&b->hold_count, 1); + WRITE_ONCE(b->last_accessed, jiffies); + b->block = block; b->read_error = 0; b->write_error = 0; - __link_buffer(b, block, LIST_CLEAN); + b->list_mode = LIST_CLEAN; - if (nf == NF_FRESH) { + if (nf == NF_FRESH) b->state = 0; - return b; + else { + b->state = 1 << B_READING; + *need_submit = 1; } - b->state = 1 << B_READING; - *need_submit = 1; + /* + * We mustn't insert into the cache until the B_READING state + * is set. Otherwise another thread could get it and use + * it before it had been read. + */ + cache_insert(&c->cache, b); return b; found_buffer: - if (nf == NF_PREFETCH) + if (nf == NF_PREFETCH) { + cache_put_and_wake(c, b); return NULL; + } + /* * Note: it is essential that we don't wait for the buffer to be * read if dm_bufio_get function is used. Both dm_bufio_get and @@ -1772,12 +1703,11 @@ static struct dm_buffer *__bufio_new(struct dm_bufio_client *c, sector_t block, * If the user called both dm_bufio_prefetch and dm_bufio_get on * the same buffer, it would deadlock if we waited. */ - if (nf == NF_GET && unlikely(test_bit_acquire(B_READING, &b->state))) + if (nf == NF_GET && unlikely(test_bit_acquire(B_READING, &b->state))) { + cache_put_and_wake(c, b); return NULL; + } - b->__hold_count++; - __relink_lru(b, test_bit(B_DIRTY, &b->state) || - test_bit(B_WRITING, &b->state)); return b; } @@ -1807,18 +1737,50 @@ static void read_endio(struct dm_buffer *b, blk_status_t status) static void *new_read(struct dm_bufio_client *c, sector_t block, enum new_flag nf, struct dm_buffer **bp) { - int need_submit; + int need_submit = 0; struct dm_buffer *b; LIST_HEAD(write_list); - dm_bufio_lock(c); - b = __bufio_new(c, block, nf, &need_submit, &write_list); + *bp = NULL; + + /* + * Fast path, hopefully the block is already in the cache. No need + * to get the client lock for this. + */ + b = cache_get(&c->cache, block); + if (b) { + if (nf == NF_PREFETCH) { + cache_put_and_wake(c, b); + return NULL; + } + + /* + * Note: it is essential that we don't wait for the buffer to be + * read if dm_bufio_get function is used. Both dm_bufio_get and + * dm_bufio_prefetch can be used in the driver request routine. + * If the user called both dm_bufio_prefetch and dm_bufio_get on + * the same buffer, it would deadlock if we waited. 
+ */ + if (nf == NF_GET && unlikely(test_bit_acquire(B_READING, &b->state))) { + cache_put_and_wake(c, b); + return NULL; + } + } + + if (!b) { + if (nf == NF_GET) + return NULL; + + dm_bufio_lock(c); + b = __bufio_new(c, block, nf, &need_submit, &write_list); + dm_bufio_unlock(c); + } + #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING - if (b && b->hold_count == 1) + if (b && (atomic_read(&b->hold_count) == 1)) buffer_record_stack(b); #endif - dm_bufio_unlock(c); __flush_write_list(&write_list); @@ -1881,12 +1843,19 @@ void dm_bufio_prefetch(struct dm_bufio_client *c, return; /* should never happen */ blk_start_plug(&plug); - dm_bufio_lock(c); for (; n_blocks--; block++) { int need_submit; struct dm_buffer *b; + b = cache_get(&c->cache, block); + if (b) { + /* already in cache */ + cache_put_and_wake(c, b); + continue; + } + + dm_bufio_lock(c); b = __bufio_new(c, block, NF_PREFETCH, &need_submit, &write_list); if (unlikely(!list_empty(&write_list))) { @@ -1909,10 +1878,9 @@ void dm_bufio_prefetch(struct dm_bufio_client *c, goto flush_plug; dm_bufio_lock(c); } + dm_bufio_unlock(c); } - dm_bufio_unlock(c); - flush_plug: blk_finish_plug(&plug); } @@ -1922,29 +1890,28 @@ void dm_bufio_release(struct dm_buffer *b) { struct dm_bufio_client *c = b->c; - dm_bufio_lock(c); + /* + * If there were errors on the buffer, and the buffer is not + * to be written, free the buffer. There is no point in caching + * invalid buffer. + */ + if ((b->read_error || b->write_error) && + !test_bit_acquire(B_READING, &b->state) && + !test_bit(B_WRITING, &b->state) && + !test_bit(B_DIRTY, &b->state)) { + dm_bufio_lock(c); - BUG_ON(!b->__hold_count); - - b->__hold_count--; - if (!b->__hold_count) { - wake_up(&c->free_buffer_wait); - - /* - * If there were errors on the buffer, and the buffer is not - * to be written, free the buffer. There is no point in caching - * invalid buffer. - */ - if ((b->read_error || b->write_error) && - !test_bit_acquire(B_READING, &b->state) && - !test_bit(B_WRITING, &b->state) && - !test_bit(B_DIRTY, &b->state)) { - __unlink_buffer(b); + /* cache remove can fail if there are other holders */ + if (cache_remove(&c->cache, b)) { __free_buffer_wake(b); + dm_bufio_unlock(c); + return; } + + dm_bufio_unlock(c); } - dm_bufio_unlock(c); + cache_put_and_wake(c, b); } EXPORT_SYMBOL_GPL(dm_bufio_release); @@ -1963,7 +1930,7 @@ void dm_bufio_mark_partial_buffer_dirty(struct dm_buffer *b, if (!test_and_set_bit(B_DIRTY, &b->state)) { b->dirty_start = start; b->dirty_end = end; - __relink_lru(b, LIST_DIRTY); + cache_mark(&c->cache, b, LIST_DIRTY); } else { if (start < b->dirty_start) b->dirty_start = start; @@ -2002,11 +1969,19 @@ EXPORT_SYMBOL_GPL(dm_bufio_write_dirty_buffers_async); * * Finally, we flush hardware disk cache. 
*/ +static bool is_writing(struct lru_entry *e, void *context) +{ + struct dm_buffer *b = le_to_buffer(e); + + return test_bit(B_WRITING, &b->state); +} + int dm_bufio_write_dirty_buffers(struct dm_bufio_client *c) { int a, f; - unsigned long buffers_processed = 0; - struct dm_buffer *b, *tmp; + unsigned long nr_buffers; + struct lru_entry *e; + struct lru_iter it; LIST_HEAD(write_list); @@ -2016,52 +1991,32 @@ int dm_bufio_write_dirty_buffers(struct dm_bufio_client *c) __flush_write_list(&write_list); dm_bufio_lock(c); -again: - list_for_each_entry_safe_reverse(b, tmp, &c->lru[LIST_DIRTY], lru_list) { - int dropped_lock = 0; - - if (buffers_processed < c->n_buffers[LIST_DIRTY]) - buffers_processed++; + nr_buffers = cache_count(&c->cache, LIST_DIRTY); + lru_iter_begin(&c->cache.lru[LIST_DIRTY], &it); + while ((e = lru_iter_next(&it, is_writing, c))) { + struct dm_buffer *b = le_to_buffer(e); + __cache_inc_buffer(b); BUG_ON(test_bit(B_READING, &b->state)); - if (test_bit(B_WRITING, &b->state)) { - if (buffers_processed < c->n_buffers[LIST_DIRTY]) { - dropped_lock = 1; - b->__hold_count++; - dm_bufio_unlock(c); - wait_on_bit_io(&b->state, B_WRITING, - TASK_UNINTERRUPTIBLE); - dm_bufio_lock(c); - b->__hold_count--; - } else - wait_on_bit_io(&b->state, B_WRITING, - TASK_UNINTERRUPTIBLE); + if (nr_buffers) { + nr_buffers--; + dm_bufio_unlock(c); + wait_on_bit_io(&b->state, B_WRITING, TASK_UNINTERRUPTIBLE); + dm_bufio_lock(c); + } else { + wait_on_bit_io(&b->state, B_WRITING, TASK_UNINTERRUPTIBLE); } - if (!test_bit(B_DIRTY, &b->state) && - !test_bit(B_WRITING, &b->state)) - __relink_lru(b, LIST_CLEAN); + if (!test_bit(B_DIRTY, &b->state) && !test_bit(B_WRITING, &b->state)) + cache_mark(&c->cache, b, LIST_CLEAN); + + cache_put_and_wake(c, b); cond_resched(); - - /* - * If we dropped the lock, the list is no longer consistent, - * so we must restart the search. - * - * In the most common case, the buffer just processed is - * relinked to the clean list, so we won't loop scanning the - * same buffer again and again. - * - * This may livelock if there is another thread simultaneously - * dirtying buffers, so we count the number of buffers walked - * and if it exceeds the total number of buffers, it means that - * someone is doing some writes simultaneously with us. In - * this case, stop, dropping the lock. - */ - if (dropped_lock) - goto again; } + lru_iter_end(&it); + wake_up(&c->free_buffer_wait); dm_bufio_unlock(c); @@ -2122,12 +2077,23 @@ int dm_bufio_issue_discard(struct dm_bufio_client *c, sector_t block, sector_t c } EXPORT_SYMBOL_GPL(dm_bufio_issue_discard); -static void forget_buffer_locked(struct dm_buffer *b) +static bool forget_buffer(struct dm_bufio_client *c, sector_t block) { - if (likely(!b->__hold_count) && likely(!smp_load_acquire(&b->state))) { - __unlink_buffer(b); - __free_buffer_wake(b); + struct dm_buffer *b; + + b = cache_get(&c->cache, block); + if (b) { + if (likely(!smp_load_acquire(&b->state))) { + if (cache_remove(&c->cache, b)) + __free_buffer_wake(b); + else + cache_put_and_wake(c, b); + } else { + cache_put_and_wake(c, b); + } } + + return b ? 
true : false; } /* @@ -2138,38 +2104,22 @@ static void forget_buffer_locked(struct dm_buffer *b) */ void dm_bufio_forget(struct dm_bufio_client *c, sector_t block) { - struct dm_buffer *b; - dm_bufio_lock(c); - - b = __find(c, block); - if (b) - forget_buffer_locked(b); - + forget_buffer(c, block); dm_bufio_unlock(c); } EXPORT_SYMBOL_GPL(dm_bufio_forget); +static enum evict_result idle(struct dm_buffer *b, void *context) +{ + return b->state ? ER_DONT_EVICT : ER_EVICT; +} + void dm_bufio_forget_buffers(struct dm_bufio_client *c, sector_t block, sector_t n_blocks) { - struct dm_buffer *b; - sector_t end_block = block + n_blocks; - - while (block < end_block) { - dm_bufio_lock(c); - - b = __find_next_old(c, block); - if (b) { - block = b->block + 1; - forget_buffer_locked(b); - } - - dm_bufio_unlock(c); - - if (!b) - break; - } - + dm_bufio_lock(c); + cache_remove_range(&c->cache, block, block + n_blocks, idle, __free_buffer_wake); + dm_bufio_unlock(c); } EXPORT_SYMBOL_GPL(dm_bufio_forget_buffers); @@ -2231,11 +2181,26 @@ struct dm_bufio_client *dm_bufio_get_client(struct dm_buffer *b) } EXPORT_SYMBOL_GPL(dm_bufio_get_client); +static enum it_action warn_leak(struct dm_buffer *b, void *context) +{ + bool *warned = context; + + WARN_ON(!(*warned)); + *warned = true; + DMERR("leaked buffer %llx, hold count %u, list %d", + (unsigned long long)b->block, atomic_read(&b->hold_count), b->list_mode); +#ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING + stack_trace_print(b->stack_entries, b->stack_len, 1); + /* mark unclaimed to avoid WARN_ON at end of drop_buffers() */ + atomic_set(&b->hold_count, 0); +#endif + return IT_NEXT; +} + static void drop_buffers(struct dm_bufio_client *c) { - struct dm_buffer *b; int i; - bool warned = false; + struct dm_buffer *b; if (WARN_ON(dm_bufio_in_request())) return; /* should never happen */ @@ -2250,18 +2215,11 @@ static void drop_buffers(struct dm_bufio_client *c) while ((b = __get_unclaimed_buffer(c))) __free_buffer_wake(b); - for (i = 0; i < LIST_SIZE; i++) - list_for_each_entry(b, &c->lru[i], lru_list) { - WARN_ON(!warned); - warned = true; - DMERR("leaked buffer %llx, hold count %u, list %d", - (unsigned long long)b->block, b->__hold_count, i); -#ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING - stack_trace_print(b->stack_entries, b->stack_len, 1); - /* mark unclaimed to avoid WARN_ON below */ - b->hold_count = 0; -#endif - } + for (i = 0; i < LIST_SIZE; i++) { + bool warned = false; + + cache_iterate(&c->cache, i, warn_leak, &warned); + } #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING while ((b = __get_unclaimed_buffer(c))) @@ -2269,39 +2227,11 @@ static void drop_buffers(struct dm_bufio_client *c) #endif for (i = 0; i < LIST_SIZE; i++) - WARN_ON(!list_empty(&c->lru[i])); + WARN_ON(cache_count(&c->cache, i)); dm_bufio_unlock(c); } -/* - * We may not be able to evict this buffer if IO pending or the client - * is still using it. Caller is expected to know buffer is too old. - * - * And if GFP_NOFS is used, we must not do any I/O because we hold - * dm_bufio_clients_lock and we would risk deadlock if the I/O gets - * rerouted to different bufio client. 
- */ -static bool __try_evict_buffer(struct dm_buffer *b, gfp_t gfp) -{ - if (!(gfp & __GFP_FS) || - (static_branch_unlikely(&no_sleep_enabled) && b->c->no_sleep)) { - if (test_bit_acquire(B_READING, &b->state) || - test_bit(B_WRITING, &b->state) || - test_bit(B_DIRTY, &b->state)) - return false; - } - - if (b->__hold_count) - return false; - - __make_buffer_clean(b); - __unlink_buffer(b); - __free_buffer_wake(b); - - return true; -} - static unsigned long get_retain_buffers(struct dm_bufio_client *c) { unsigned long retain_bytes = READ_ONCE(dm_bufio_retain_bytes); @@ -2317,22 +2247,28 @@ static unsigned long get_retain_buffers(struct dm_bufio_client *c) static void __scan(struct dm_bufio_client *c) { int l; - struct dm_buffer *b, *tmp; + struct dm_buffer *b; unsigned long freed = 0; - unsigned long count = c->n_buffers[LIST_CLEAN] + - c->n_buffers[LIST_DIRTY]; unsigned long retain_target = get_retain_buffers(c); + unsigned long count = cache_total(&c->cache); for (l = 0; l < LIST_SIZE; l++) { - list_for_each_entry_safe_reverse(b, tmp, &c->lru[l], lru_list) { + while (true) { if (count - freed <= retain_target) atomic_long_set(&c->need_shrink, 0); if (!atomic_long_read(&c->need_shrink)) - return; - if (__try_evict_buffer(b, GFP_KERNEL)) { - atomic_long_dec(&c->need_shrink); - freed++; - } + break; + + b = cache_evict(&c->cache, l, + l == LIST_CLEAN ? is_clean : is_dirty, c); + if (!b) + break; + + __make_buffer_clean(b); + __free_buffer_wake(b); + + atomic_long_dec(&c->need_shrink); + freed++; cond_resched(); } } @@ -2361,8 +2297,7 @@ static unsigned long dm_bufio_shrink_scan(struct shrinker *shrink, struct shrink static unsigned long dm_bufio_shrink_count(struct shrinker *shrink, struct shrink_control *sc) { struct dm_bufio_client *c = container_of(shrink, struct dm_bufio_client, shrinker); - unsigned long count = READ_ONCE(c->n_buffers[LIST_CLEAN]) + - READ_ONCE(c->n_buffers[LIST_DIRTY]); + unsigned long count = cache_total(&c->cache); unsigned long retain_target = get_retain_buffers(c); unsigned long queued_for_cleanup = atomic_long_read(&c->need_shrink); @@ -2390,7 +2325,6 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign { int r; struct dm_bufio_client *c; - unsigned int i; char slab_name[27]; if (!block_size || block_size & ((1 << SECTOR_SHIFT) - 1)) { @@ -2404,7 +2338,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign r = -ENOMEM; goto bad_client; } - c->buffer_tree = RB_ROOT; + cache_init(&c->cache); c->bdev = bdev; c->block_size = block_size; @@ -2421,11 +2355,6 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign static_branch_inc(&no_sleep_enabled); } - for (i = 0; i < LIST_SIZE; i++) { - INIT_LIST_HEAD(&c->lru[i]); - c->n_buffers[i] = 0; - } - mutex_init(&c->lock); spin_lock_init(&c->spinlock); INIT_LIST_HEAD(&c->reserved_buffers); @@ -2497,9 +2426,9 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign bad: while (!list_empty(&c->reserved_buffers)) { - struct dm_buffer *b = list_entry(c->reserved_buffers.next, - struct dm_buffer, lru_list); - list_del(&b->lru_list); + struct dm_buffer *b = list_to_buffer(c->reserved_buffers.next); + + list_del(&b->lru.list); free_buffer(b); } kmem_cache_destroy(c->slab_cache); @@ -2536,23 +2465,23 @@ void dm_bufio_client_destroy(struct dm_bufio_client *c) mutex_unlock(&dm_bufio_clients_lock); - WARN_ON(!RB_EMPTY_ROOT(&c->buffer_tree)); WARN_ON(c->need_reserved_buffers); while (!list_empty(&c->reserved_buffers)) { - 
struct dm_buffer *b = list_entry(c->reserved_buffers.next, - struct dm_buffer, lru_list); - list_del(&b->lru_list); + struct dm_buffer *b = list_to_buffer(c->reserved_buffers.next); + + list_del(&b->lru.list); free_buffer(b); } for (i = 0; i < LIST_SIZE; i++) - if (c->n_buffers[i]) - DMERR("leaked buffer count %d: %ld", i, c->n_buffers[i]); + if (cache_count(&c->cache, i)) + DMERR("leaked buffer count %d: %lu", i, cache_count(&c->cache, i)); for (i = 0; i < LIST_SIZE; i++) - WARN_ON(c->n_buffers[i]); + WARN_ON(cache_count(&c->cache, i)); + cache_destroy(&c->cache); kmem_cache_destroy(c->slab_cache); kmem_cache_destroy(c->slab_buffer); dm_io_client_destroy(c->dm_io); @@ -2569,6 +2498,8 @@ void dm_bufio_set_sector_offset(struct dm_bufio_client *c, sector_t start) } EXPORT_SYMBOL_GPL(dm_bufio_set_sector_offset); +/*--------------------------------------------------------------*/ + static unsigned int get_max_age_hz(void) { unsigned int max_age = READ_ONCE(dm_bufio_max_age); @@ -2581,13 +2512,74 @@ static unsigned int get_max_age_hz(void) static bool older_than(struct dm_buffer *b, unsigned long age_hz) { - return time_after_eq(jiffies, b->last_accessed + age_hz); + return time_after_eq(jiffies, READ_ONCE(b->last_accessed) + age_hz); } -static void __evict_old_buffers(struct dm_bufio_client *c, unsigned long age_hz) +struct evict_params { + gfp_t gfp; + unsigned long age_hz; + + /* + * This gets updated with the largest last_accessed (ie. most + * recently used) of the evicted buffers. It will not be reinitialised + * by __evict_many(), so you can use it across multiple invocations. + */ + unsigned long last_accessed; +}; + +/* + * We may not be able to evict this buffer if IO pending or the client + * is still using it. + * + * And if GFP_NOFS is used, we must not do any I/O because we hold + * dm_bufio_clients_lock and we would risk deadlock if the I/O gets + * rerouted to different bufio client. + */ +static enum evict_result select_for_evict(struct dm_buffer *b, void *context) +{ + struct evict_params *params = context; + + if (!(params->gfp & __GFP_FS) || + (static_branch_unlikely(&no_sleep_enabled) && b->c->no_sleep)) { + if (test_bit_acquire(B_READING, &b->state) || + test_bit(B_WRITING, &b->state) || + test_bit(B_DIRTY, &b->state)) + return ER_DONT_EVICT; + } + + return older_than(b, params->age_hz) ? 
ER_EVICT : ER_STOP; +} + +static unsigned long __evict_many(struct dm_bufio_client *c, + struct evict_params *params, + int list_mode, unsigned long max_count) +{ + unsigned long count; + unsigned long last_accessed; + struct dm_buffer *b; + + for (count = 0; count < max_count; count++) { + b = cache_evict(&c->cache, list_mode, select_for_evict, params); + if (!b) + break; + + last_accessed = READ_ONCE(b->last_accessed); + if (time_after_eq(params->last_accessed, last_accessed)) + params->last_accessed = last_accessed; + + __make_buffer_clean(b); + __free_buffer_wake(b); + + cond_resched(); + } + + return count; +} + +static void evict_old_buffers(struct dm_bufio_client *c, unsigned long age_hz) { - struct dm_buffer *b, *tmp; - unsigned long retain_target = get_retain_buffers(c); + struct evict_params params = {.gfp = 0, .age_hz = age_hz, .last_accessed = 0}; + unsigned long retain = get_retain_buffers(c); unsigned long count; LIST_HEAD(write_list); @@ -2600,91 +2592,13 @@ static void __evict_old_buffers(struct dm_bufio_client *c, unsigned long age_hz) dm_bufio_lock(c); } - count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY]; - list_for_each_entry_safe_reverse(b, tmp, &c->lru[LIST_CLEAN], lru_list) { - if (count <= retain_target) - break; - - if (!older_than(b, age_hz)) - break; - - if (__try_evict_buffer(b, 0)) - count--; - - cond_resched(); - } + count = cache_total(&c->cache); + if (count > retain) + __evict_many(c, &params, LIST_CLEAN, count - retain); dm_bufio_unlock(c); } -static void do_global_cleanup(struct work_struct *w) -{ - struct dm_bufio_client *locked_client = NULL; - struct dm_bufio_client *current_client; - struct dm_buffer *b; - unsigned int spinlock_hold_count; - unsigned long threshold = dm_bufio_cache_size - - dm_bufio_cache_size / DM_BUFIO_LOW_WATERMARK_RATIO; - unsigned long loops = global_num * 2; - - mutex_lock(&dm_bufio_clients_lock); - - while (1) { - cond_resched(); - - spin_lock(&global_spinlock); - if (unlikely(dm_bufio_current_allocated <= threshold)) - break; - - spinlock_hold_count = 0; -get_next: - if (!loops--) - break; - if (unlikely(list_empty(&global_queue))) - break; - b = list_entry(global_queue.prev, struct dm_buffer, global_list); - - if (b->accessed) { - b->accessed = 0; - list_move(&b->global_list, &global_queue); - if (likely(++spinlock_hold_count < 16)) - goto get_next; - spin_unlock(&global_spinlock); - continue; - } - - current_client = b->c; - if (unlikely(current_client != locked_client)) { - if (locked_client) - dm_bufio_unlock(locked_client); - - if (!dm_bufio_trylock(current_client)) { - spin_unlock(&global_spinlock); - dm_bufio_lock(current_client); - locked_client = current_client; - continue; - } - - locked_client = current_client; - } - - spin_unlock(&global_spinlock); - - if (unlikely(!__try_evict_buffer(b, GFP_KERNEL))) { - spin_lock(&global_spinlock); - list_move(&b->global_list, &global_queue); - spin_unlock(&global_spinlock); - } - } - - spin_unlock(&global_spinlock); - - if (locked_client) - dm_bufio_unlock(locked_client); - - mutex_unlock(&dm_bufio_clients_lock); -} - static void cleanup_old_buffers(void) { unsigned long max_age_hz = get_max_age_hz(); @@ -2695,7 +2609,7 @@ static void cleanup_old_buffers(void) __cache_size_refresh(); list_for_each_entry(c, &dm_bufio_all_clients, client_list) - __evict_old_buffers(c, max_age_hz); + evict_old_buffers(c, max_age_hz); mutex_unlock(&dm_bufio_clients_lock); } @@ -2708,6 +2622,107 @@ static void work_fn(struct work_struct *w) DM_BUFIO_WORK_TIMER_SECS * HZ); }
+/*--------------------------------------------------------------*/ + +/* + * Global cleanup tries to evict the oldest buffers from across _all_ + * the clients. It does this by repeatedly evicting a few buffers from + * the client that holds the oldest buffer. It's approximate, but hopefully + * good enough. + */ +static struct dm_bufio_client *__pop_client(void) +{ + struct list_head *h; + + if (list_empty(&dm_bufio_all_clients)) + return NULL; + + h = dm_bufio_all_clients.next; + list_del(h); + return container_of(h, struct dm_bufio_client, client_list); +} + +/* + * Inserts the client in the global client list based on its + * 'oldest_buffer' field. + */ +static void __insert_client(struct dm_bufio_client *new_client) +{ + struct dm_bufio_client *c; + struct list_head *h = dm_bufio_all_clients.next; + + while (h != &dm_bufio_all_clients) { + c = container_of(h, struct dm_bufio_client, client_list); + if (time_after_eq(c->oldest_buffer, new_client->oldest_buffer)) + break; + h = h->next; + } + + list_add_tail(&new_client->client_list, h); +} + +static unsigned long __evict_a_few(unsigned long nr_buffers) +{ + unsigned long count; + struct dm_bufio_client *c; + struct evict_params params = { + .gfp = GFP_KERNEL, + .age_hz = 0, + /* set to jiffies in case there are no buffers in this client */ + .last_accessed = jiffies + }; + + c = __pop_client(); + if (!c) + return 0; + + dm_bufio_lock(c); + count = __evict_many(c, &params, LIST_CLEAN, nr_buffers); + dm_bufio_unlock(c); + + if (count) + c->oldest_buffer = params.last_accessed; + __insert_client(c); + + return count; +} + +static void check_watermarks(void) +{ + LIST_HEAD(write_list); + struct dm_bufio_client *c; + + mutex_lock(&dm_bufio_clients_lock); + list_for_each_entry(c, &dm_bufio_all_clients, client_list) { + dm_bufio_lock(c); + __check_watermark(c, &write_list); + dm_bufio_unlock(c); + } + mutex_unlock(&dm_bufio_clients_lock); + + __flush_write_list(&write_list); +} + +static void evict_old(void) +{ + unsigned long threshold = dm_bufio_cache_size - + dm_bufio_cache_size / DM_BUFIO_LOW_WATERMARK_RATIO; + + mutex_lock(&dm_bufio_clients_lock); + while (dm_bufio_current_allocated > threshold) { + if (!__evict_a_few(64)) + break; + cond_resched(); + } + mutex_unlock(&dm_bufio_clients_lock); +} + +static void do_global_cleanup(struct work_struct *w) +{ + check_watermarks(); + evict_old(); +} + /* *-------------------------------------------------------------- * Module setup
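The rotation implemented above (pop the client with the oldest buffer, evict a small batch, re-insert it sorted by its new oldest timestamp) is easy to model outside the kernel. Everything in the following sketch is an assumption for illustration: plain integers stand in for jiffies, a linear scan stands in for the sorted dm_bufio_all_clients list, and the client names and counts are invented.

/* Stand-alone model of the global cleanup rotation above.
 * Illustration only. */
#include <stdio.h>

#define NR_CLIENTS 3
#define BATCH      2	/* the kernel evicts up to 64 per turn */

struct client {
	const char *name;
	unsigned long oldest;		/* age of this client's oldest buffer */
	unsigned long nr_buffers;	/* buffers the client still caches */
};

static struct client clients[NR_CLIENTS] = {
	{ "thin-meta",  10, 5 },
	{ "cache-meta", 30, 4 },
	{ "era-meta",   20, 6 },
};

/* Models __pop_client(): the head of the sorted list is the client
 * whose oldest buffer is oldest overall. */
static struct client *pop_oldest(void)
{
	struct client *best = NULL;
	int i;

	for (i = 0; i < NR_CLIENTS; i++)
		if (clients[i].nr_buffers &&
		    (!best || clients[i].oldest < best->oldest))
			best = &clients[i];
	return best;
}

static unsigned long total(void)
{
	unsigned long n = 0;
	int i;

	for (i = 0; i < NR_CLIENTS; i++)
		n += clients[i].nr_buffers;
	return n;
}

int main(void)
{
	unsigned long threshold = 8;	/* stand-in for the low watermark */

	while (total() > threshold) {
		struct client *c = pop_oldest();
		unsigned long n;

		if (!c)
			break;
		n = c->nr_buffers < BATCH ? c->nr_buffers : BATCH;
		c->nr_buffers -= n;
		/* models updating c->oldest_buffer from params.last_accessed */
		c->oldest += 15;
		printf("evicted %lu from %s, %lu buffers remain in total\n",
		       n, c->name, total());
	}
	return 0;
}

As in evict_old(), the loop stops as soon as allocation falls below the threshold or a pass yields nothing, so the scheme stays approximate but cheap.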
u11-20020a05622a198b00b003e26a405631mr22093845qtc.10.1679948000120; Mon, 27 Mar 2023 13:13:20 -0700 (PDT) Received: from localhost (pool-68-160-166-30.bstnma.fios.verizon.net. [68.160.166.30]) by smtp.gmail.com with ESMTPSA id t23-20020ac865d7000000b003b635a5d56csm15119212qto.30.2023.03.27.13.13.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:19 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:31 -0400 Message-Id: <20230327201143.51026-9-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Subject: [dm-devel] [dm-6.4 PATCH v3 08/20] dm bufio: add lock_history optimization for cache iterators X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org From: Joe Thornber Sometimes it is beneficial to repeatedly get and drop locks as part of an iteration. Introduce lock_history struct to help avoid redundant drop and gets of the same lock. Optimizes cache_iterate, cache_mark_many and cache_evict. Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer --- drivers/md/dm-bufio.c | 119 +++++++++++++++++++++++++++++++++++++++--- 1 file changed, 111 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index 1e000ec73bd6..9ac50024006d 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -421,6 +421,70 @@ static inline void cache_write_unlock(struct dm_buffer_cache *bc, sector_t block up_write(&bc->trees[cache_index(block)].lock); } +/* + * Sometimes we want to repeatedly get and drop locks as part of an iteration. + * This struct helps avoid redundant drop and gets of the same lock. + */ +struct lock_history { + struct dm_buffer_cache *cache; + bool write; + unsigned int previous; +}; + +static void lh_init(struct lock_history *lh, struct dm_buffer_cache *cache, bool write) +{ + lh->cache = cache; + lh->write = write; + lh->previous = NR_LOCKS; /* indicates no previous */ +} + +static void __lh_lock(struct lock_history *lh, unsigned int index) +{ + if (lh->write) + down_write(&lh->cache->trees[index].lock); + else + down_read(&lh->cache->trees[index].lock); +} + +static void __lh_unlock(struct lock_history *lh, unsigned int index) +{ + if (lh->write) + up_write(&lh->cache->trees[index].lock); + else + up_read(&lh->cache->trees[index].lock); +} + +/* + * Make sure you call this since it will unlock the final lock. + */ +static void lh_exit(struct lock_history *lh) +{ + if (lh->previous != NR_LOCKS) { + __lh_unlock(lh, lh->previous); + lh->previous = NR_LOCKS; + } +} + +/* + * Named 'next' because there is no corresponding + * 'up/unlock' call since it's done automatically. 
+ */ +static void lh_next(struct lock_history *lh, sector_t b) +{ + unsigned int index = cache_index(b); + + if (lh->previous != NR_LOCKS) { + if (lh->previous != index) { + __lh_unlock(lh, lh->previous); + __lh_lock(lh, index); + lh->previous = index; + } + } else { + __lh_lock(lh, index); + lh->previous = index; + } +} + static inline struct dm_buffer *le_to_buffer(struct lru_entry *le) { return container_of(le, struct dm_buffer, lru); @@ -550,6 +614,7 @@ typedef enum evict_result (*b_predicate)(struct dm_buffer *, void *); * predicate the hold_count of the selected buffer will be zero. */ struct evict_wrapper { + struct lock_history *lh; b_predicate pred; void *context; }; @@ -563,16 +628,19 @@ static enum evict_result __evict_pred(struct lru_entry *le, void *context) struct evict_wrapper *w = context; struct dm_buffer *b = le_to_buffer(le); + lh_next(w->lh, b->block); + if (atomic_read(&b->hold_count)) return ER_DONT_EVICT; return w->pred(b, w->context); } -static struct dm_buffer *cache_evict(struct dm_buffer_cache *bc, int list_mode, - b_predicate pred, void *context) +static struct dm_buffer *__cache_evict(struct dm_buffer_cache *bc, int list_mode, + b_predicate pred, void *context, + struct lock_history *lh) { - struct evict_wrapper w = {.pred = pred, .context = context}; + struct evict_wrapper w = {.lh = lh, .pred = pred, .context = context}; struct lru_entry *le; struct dm_buffer *b; @@ -587,6 +655,19 @@ static struct dm_buffer *cache_evict(struct dm_buffer_cache *bc, int list_mode, return b; } +static struct dm_buffer *cache_evict(struct dm_buffer_cache *bc, int list_mode, + b_predicate pred, void *context) +{ + struct dm_buffer *b; + struct lock_history lh; + + lh_init(&lh, bc, true); + b = __cache_evict(bc, list_mode, pred, context, &lh); + lh_exit(&lh); + + return b; +} + /*--------------*/ /* @@ -609,12 +690,12 @@ static void cache_mark(struct dm_buffer_cache *bc, struct dm_buffer *b, int list * Runs through the lru associated with 'old_mode', if the predicate matches then * it moves them to 'new_mode'. Not threadsafe. 
*/ -static void cache_mark_many(struct dm_buffer_cache *bc, int old_mode, int new_mode, - b_predicate pred, void *context) +static void __cache_mark_many(struct dm_buffer_cache *bc, int old_mode, int new_mode, + b_predicate pred, void *context, struct lock_history *lh) { struct lru_entry *le; struct dm_buffer *b; - struct evict_wrapper w = {.pred = pred, .context = context}; + struct evict_wrapper w = {.lh = lh, .pred = pred, .context = context}; while (true) { le = lru_evict(&bc->lru[old_mode], __evict_pred, &w); @@ -627,6 +708,16 @@ static void cache_mark_many(struct dm_buffer_cache *bc, int old_mode, int new_mo } } +static void cache_mark_many(struct dm_buffer_cache *bc, int old_mode, int new_mode, + b_predicate pred, void *context) +{ + struct lock_history lh; + + lh_init(&lh, bc, true); + __cache_mark_many(bc, old_mode, new_mode, pred, context, &lh); + lh_exit(&lh); +} + /*--------------*/ /* @@ -645,8 +736,8 @@ enum it_action { typedef enum it_action (*iter_fn)(struct dm_buffer *b, void *context); -static void cache_iterate(struct dm_buffer_cache *bc, int list_mode, - iter_fn fn, void *context) +static void __cache_iterate(struct dm_buffer_cache *bc, int list_mode, + iter_fn fn, void *context, struct lock_history *lh) { struct lru *lru = &bc->lru[list_mode]; struct lru_entry *le, *first; @@ -658,6 +749,8 @@ static void cache_iterate(struct dm_buffer_cache *bc, int list_mode, do { struct dm_buffer *b = le_to_buffer(le); + lh_next(lh, b->block); + switch (fn(b, context)) { case IT_NEXT: break; @@ -671,6 +764,16 @@ static void cache_iterate(struct dm_buffer_cache *bc, int list_mode, } while (le != first); } +static void cache_iterate(struct dm_buffer_cache *bc, int list_mode, + iter_fn fn, void *context) +{ + struct lock_history lh; + + lh_init(&lh, bc, false); + __cache_iterate(bc, list_mode, fn, context, &lh); + lh_exit(&lh); +} + /*--------------*/ /* From patchwork Mon Mar 27 20:11:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13190032 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 60271C77B60 for ; Mon, 27 Mar 2023 20:13:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679948026; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=EnaoKPkU00S/nGjKbG7I0LQ15J/reY0+x1A1eiH0Sg0=; b=DednicIwBsx3u7yfp8CrbgLj2cmNm53E7Mjn0zXE2t08upLVd8gUV3jGGB4O1E2q7RJuh+ Ly0V6VBL9puOTKGsDSLqVurOYXDeq5eAvflhKXKkozl3KOeFel0uM7C8iOGGRGz2Yb8Xot dRqCoIfvjAETVQkEUlOdVQOBR/txYrU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-158-41s3GkvRPFm4VuFB2AbRPg-1; Mon, 27 Mar 2023 16:13:45 -0400 X-MC-Unique: 41s3GkvRPFm4VuFB2AbRPg-1 Received: from smtp.corp.redhat.com 
[68.160.166.30]) by smtp.gmail.com with ESMTPSA id 201-20020a3708d2000000b007456efa7f73sm12278213qki.85.2023.03.27.13.13.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:21 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:32 -0400 Message-Id: <20230327201143.51026-10-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 Subject: [dm-devel] [dm-6.4 PATCH v3 09/20] dm bufio: move dm_bufio_client members to avoid spanning cachelines X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org Movement also consolidates holes in dm_bufio_client struct. But the overall size of the struct isn't changed. Signed-off-by: Mike Snitzer --- drivers/md/dm-bufio.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index 9ac50024006d..e5459741335d 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -936,13 +936,16 @@ static void cache_remove_range(struct dm_buffer_cache *bc, * context. */ struct dm_bufio_client { - struct mutex lock; - spinlock_t spinlock; - bool no_sleep; - struct block_device *bdev; unsigned int block_size; s8 sectors_per_block_bits; + + bool no_sleep; + struct mutex lock; + spinlock_t spinlock; + + int async_write_error; + void (*alloc_callback)(struct dm_buffer *buf); void (*write_callback)(struct dm_buffer *buf); struct kmem_cache *slab_buffer; @@ -954,23 +957,22 @@ struct dm_bufio_client { unsigned int minimum_buffers; - struct dm_buffer_cache cache; - wait_queue_head_t free_buffer_wait; - sector_t start; - int async_write_error; - - struct list_head client_list; - struct shrinker shrinker; struct work_struct shrink_work; atomic_long_t need_shrink; + wait_queue_head_t free_buffer_wait; + + struct list_head client_list; + /* * Used by global_cleanup to sort the clients list. 
*/ unsigned long oldest_buffer; + + struct dm_buffer_cache cache; }; static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled); From patchwork Mon Mar 27 20:11:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13190033 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D8F4CC77B61 for ; Mon, 27 Mar 2023 20:13:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679948026; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=+1Hsrztrs4GMdaYIKIj2ULkxaaI+CTD+Qh86Q7AfnVQ=; b=Fl+02pto18xNx9zRUt5PJS/GgIabqmO+hqZmiL2DgeSdFmeCtSndL1J7Mevo0A0guczFyF VBY1XsFTRKvrUyB6+Wq2Rjaswcui4YsGSUCR82CBwzwBKwioSla3lWPLImRdyApJtRy8PU Kda170FFdbfE5VvPozANDaTlbKlgv5g= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-102-f0-W3dFmMx671p2hNI55Xw-1; Mon, 27 Mar 2023 16:13:43 -0400 X-MC-Unique: f0-W3dFmMx671p2hNI55Xw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B5A5D2812958; Mon, 27 Mar 2023 20:13:41 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (unknown [10.30.29.100]) by smtp.corp.redhat.com (Postfix) with ESMTP id A47AEC15BB8; Mon, 27 Mar 2023 20:13:41 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (localhost [IPv6:::1]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id 62AB919465B2; Mon, 27 Mar 2023 20:13:26 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id B867519465A4 for ; Mon, 27 Mar 2023 20:13:25 +0000 (UTC) Received: by smtp.corp.redhat.com (Postfix) id 9DC6F4020C83; Mon, 27 Mar 2023 20:13:25 +0000 (UTC) Received: from mimecast-mx02.redhat.com (mimecast04.extmail.prod.ext.rdu2.redhat.com [10.11.55.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 96B434020C82 for ; Mon, 27 Mar 2023 20:13:25 +0000 (UTC) Received: from us-smtp-1.mimecast.com (us-smtp-2.mimecast.com [205.139.110.61]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7CEB4100DEB1 for ; Mon, 27 Mar 2023 20:13:25 +0000 (UTC) Received: from mail-qt1-f182.google.com (mail-qt1-f182.google.com [209.85.160.182]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-628-Ac9qx-WFO7Cy_pJCMeh0Ag-3; Mon, 27 Mar 2023 16:13:24 -0400 X-MC-Unique: 
Ac9qx-WFO7Cy_pJCMeh0Ag-3 Received: by mail-qt1-f182.google.com with SMTP id ga7so9828563qtb.2 for ; Mon, 27 Mar 2023 13:13:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679948003; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=W8N6ipnCV5hoHhg/CORvER6nf9MSwcb0kouKMhOxJx4=; b=caD7cQJ42aUijsCXjwbxDRVpx8jl8eXNZCGipNlZOqsg3vPw1Wl4ZwVLOjpOPUgoI4 QpgMcQTpQFdTUol5Svy3NziTTt4go4Zj2Vl8ShlRippiJ/iGrD/r5BQ6YcgmwCt270ZN Ur2rfXqGXaIfD2TsL5R9bkyLgsUs6q37IKPq35huPCHBuYtMZEIAG3pe2ob17gMWdLB2 vjtRmyurK5q5PGjQqhn54zBcUK7+vSkSDoGtlZ8PhJe6h6U5Rax/aK4o+GzG0gd3fntr YHRzoX7BKhEMzysz4o4TtcU2zmCec80ftVb5OWtR1U9Vf42lQUWpOIu0DaztNiYRWO9k R1vA== X-Gm-Message-State: AO0yUKUa8yOEzyKY2nknDvqXsQw+1rJook1gJhHXCbEVfzIixuYQGjkB O+ZvDCC0V7nadOyFXaelC6hDbbeY26pCYZf90s49YBQH8lp/MD+vBzS7Zy1OuE4kUxkwtPAlcp8 +ZKByXJkj1qOXdNOxEDM86REsYIfRBgnGV1Y0ZsyOWYBMTajyPKwzBiFRy2Hh1G8gz8lz/IejAD g= X-Google-Smtp-Source: AK7set9+sfwAJb0hRAS53hihYhDVSWnGpR49MCgCFLmIkZazfZgDsImOqvtZXurStI3emQaDQ0Dvkw== X-Received: by 2002:a05:622a:1315:b0:3d5:500a:4819 with SMTP id v21-20020a05622a131500b003d5500a4819mr22550964qtk.23.1679948003245; Mon, 27 Mar 2023 13:13:23 -0700 (PDT) Received: from localhost (pool-68-160-166-30.bstnma.fios.verizon.net. [68.160.166.30]) by smtp.gmail.com with ESMTPSA id y126-20020a376484000000b007468765b411sm11876942qkb.45.2023.03.27.13.13.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:22 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:33 -0400 Message-Id: <20230327201143.51026-11-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Subject: [dm-devel] [dm-6.4 PATCH v3 10/20] dm bufio: use waitqueue_active in __free_buffer_wake X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org From: Mikulas Patocka Save one spinlock by using waitqueue_active. We hold the bufio lock at this place, so no one can add entries to the waitqueue at this point. Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-bufio.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index e5459741335d..cca43ed13fd1 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -1665,7 +1665,12 @@ static void __free_buffer_wake(struct dm_buffer *b) c->need_reserved_buffers--; } - wake_up(&c->free_buffer_wait); + /* + * We hold the bufio lock here, so no one can add entries to the + * wait queue anyway. 
+ */ + if (unlikely(waitqueue_active(&c->free_buffer_wait))) + wake_up(&c->free_buffer_wait); } static enum evict_result cleaned(struct dm_buffer *b, void *context) From patchwork Mon Mar 27 20:11:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13190035 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8598EC6FD1D for ; Mon, 27 Mar 2023 20:13:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679948032; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=wC/XujJj3ZvjqKDMeFFXKX7Qe40iLFRGtgkJQgagcdQ=; b=CPFEdaYb0z/O/a7ehkZvC2wGPRrV9IqoMZBMFlgq4i6ILRup8aeKWqwuAJwavUJgHh1wdP NZkqa3X9meBC05CQMxH6nxdg7MN7IkYhE6WjHSr7fnRQJ8J7QATzWbz8kl25npU23KJwmQ NycyikothpmlXIho2s3zbNkZO910/lU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-549-3fQ-TuFyPoqXnV6wYkJa3w-1; Mon, 27 Mar 2023 16:13:48 -0400 X-MC-Unique: 3fQ-TuFyPoqXnV6wYkJa3w-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A12BD811E7C; Mon, 27 Mar 2023 20:13:46 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (unknown [10.30.29.100]) by smtp.corp.redhat.com (Postfix) with ESMTP id 916012166B29; Mon, 27 Mar 2023 20:13:46 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (localhost [IPv6:::1]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id 19D0319472DC; Mon, 27 Mar 2023 20:13:28 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id 5F0AB19465A2 for ; Mon, 27 Mar 2023 20:13:27 +0000 (UTC) Received: by smtp.corp.redhat.com (Postfix) id 4FD354020C84; Mon, 27 Mar 2023 20:13:27 +0000 (UTC) Received: from mimecast-mx02.redhat.com (mimecast06.extmail.prod.ext.rdu2.redhat.com [10.11.55.22]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 486204020C82 for ; Mon, 27 Mar 2023 20:13:27 +0000 (UTC) Received: from us-smtp-1.mimecast.com (us-smtp-delivery-1.mimecast.com [205.139.110.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2E4C4185A791 for ; Mon, 27 Mar 2023 20:13:27 +0000 (UTC) Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id 
us-mta-106-v6rciKTEOZaKrufUv62xcA-2; Mon, 27 Mar 2023 16:13:25 -0400 X-MC-Unique: v6rciKTEOZaKrufUv62xcA-2 Received: by mail-qt1-f176.google.com with SMTP id bz27so9824180qtb.1 for ; Mon, 27 Mar 2023 13:13:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679948005; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pVnf9GI1VNVce+UTfVz3CQAGt+fJmIpbfH8UFMjTV9Y=; b=WtkkegYR9ZXewfTelmSlIrv0EEx1SRgHW+L3K+00Y1zAr8bCF+kOpSkpIoZ00ZVbDz xjh/di4qn8/q7dQFtT5I5xOJihe1Ao+3XIc/yUrK3Xiv6xK+JDP4rrLLZSNFBZN/M21E obvmKmuv8fcr10ApYbz/9hA+0sAKDF0xWMnQLrcLOlwlBSiAHTzGXSw4N/5oujik3tBT rbUFcSGDUN2KzMDghZpbVCaojLwLqMCyXAkGsxNfp1TAcEv8q/2nkqNIMGFVBaj4H38J 9/143grlDmKXjpZbTM4FLPjUjH2fPNN3iyHm8gThA/UPskgrQH0egLqkKBaOM7/OSml1 u5aQ== X-Gm-Message-State: AO0yUKWFMUQi79KS8rRb/PipXOSJT6muSnJOBLROAFvXii2CiTeF5zWO qNcgW99JWkB8BXG6tkxjVI/hNiJGuhToBYsr1tfe5XRKUUZ2wp1ugl6emhwsnHyYu6TPe+hnmvE Vhe4kJ4lkdM29UR4os0ssaloOHk3qxqPyWevxQKX3j83P8UWMHz3xwG3zYi2fBC9gIH9OY1iBo4 c= X-Google-Smtp-Source: AK7set/QhxRjmAYqq9vwOTVV6ZCx9K0pKe5GG2wkdvifb91iLlhC2h6StgNwdnfJ1O/ta0hPff4PHw== X-Received: by 2002:a05:622a:1a03:b0:3e3:8427:fb56 with SMTP id f3-20020a05622a1a0300b003e38427fb56mr22558795qtb.63.1679948004961; Mon, 27 Mar 2023 13:13:24 -0700 (PDT) Received: from localhost (pool-68-160-166-30.bstnma.fios.verizon.net. [68.160.166.30]) by smtp.gmail.com with ESMTPSA id 75-20020a370a4e000000b006ff8a122a1asm11508110qkk.78.2023.03.27.13.13.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:24 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:34 -0400 Message-Id: <20230327201143.51026-12-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Subject: [dm-devel] [dm-6.4 PATCH v3 11/20] dm bufio: use multi-page bio vector X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org From: Mikulas Patocka The kernel supports multi page bio vector entries, so we can use them in dm-bufio as an optimization. 
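As a rough illustration of what this buys (a userspace model only; PAGE_SZ and the add_segment() mock are stand-ins, not kernel APIs): the old use_bio() added the buffer one page-bounded chunk at a time, while a multi-page bio_vec lets a single segment describe the whole physically contiguous run.

#include <stdio.h>

#define PAGE_SZ 4096UL

/* mock stand-in for bio_add_page()/__bio_add_page() */
static void add_segment(unsigned long page, unsigned long len, unsigned long off)
{
	printf("segment: page %lu len %lu off %lu\n", page, len, off);
}

int main(void)
{
	unsigned long addr = 10 * PAGE_SZ + 512; /* buffer start, not page aligned */
	unsigned long len = 3 * PAGE_SZ;         /* 12KiB transfer */

	/* old scheme: one single-page segment per loop iteration */
	for (unsigned long p = addr, left = len; left; ) {
		unsigned long step = PAGE_SZ - (p % PAGE_SZ);

		if (step > left)
			step = left;
		add_segment(p / PAGE_SZ, step, p % PAGE_SZ);
		left -= step;
		p += step;
	}

	/* new scheme: a single multi-page segment covers the same range */
	add_segment(addr / PAGE_SZ, len, addr % PAGE_SZ);
	return 0;
}

The diff below collapses that per-page loop into a single __bio_add_page() call.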
Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-bufio.c | 24 ++++-------------------- 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c index cca43ed13fd1..ae552644a0b4 100644 --- a/drivers/md/dm-bufio.c +++ b/drivers/md/dm-bufio.c @@ -1312,19 +1312,14 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector, { struct bio *bio; char *ptr; - unsigned int vec_size, len; + unsigned int len; - vec_size = b->c->block_size >> PAGE_SHIFT; - if (unlikely(b->c->sectors_per_block_bits < PAGE_SHIFT - SECTOR_SHIFT)) - vec_size += 2; - - bio = bio_kmalloc(vec_size, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN); + bio = bio_kmalloc(1, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOWARN); if (!bio) { -dmio: use_dmio(b, op, sector, n_sectors, offset); return; } - bio_init(bio, b->c->bdev, bio->bi_inline_vecs, vec_size, op); + bio_init(bio, b->c->bdev, bio->bi_inline_vecs, 1, op); bio->bi_iter.bi_sector = sector; bio->bi_end_io = bio_complete; bio->bi_private = b; @@ -1332,18 +1327,7 @@ static void use_bio(struct dm_buffer *b, enum req_op op, sector_t sector, ptr = (char *)b->data + offset; len = n_sectors << SECTOR_SHIFT; - do { - unsigned int this_step = min((unsigned int)(PAGE_SIZE - offset_in_page(ptr)), len); - - if (!bio_add_page(bio, virt_to_page(ptr), this_step, - offset_in_page(ptr))) { - bio_put(bio); - goto dmio; - } - - len -= this_step; - ptr += this_step; - } while (len > 0); + __bio_add_page(bio, virt_to_page(ptr), len, offset_in_page(ptr)); submit_bio(bio); } From patchwork Mon Mar 27 20:11:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13190037 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 314F7C761A6 for ; Mon, 27 Mar 2023 20:14:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679948040; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=6XusNNgPye6YWNO8uR7HXXGvCeN2EUngVHyjX/W7QXA=; b=iLhfy+u/DOngXKN57wkFTSSrJ6y2mNyXFm1rwIYT0yolcl/hp4od0SsPBHItOAexnJfF33 FAKE8TAejEsE0lyeE8OvHUZ57phtxefKvtZGvQgkDptXf5ux6LBFvFWnj7o13tiQpYBRjQ Kz+Dbu4LSxe81qZTVFAJSaNqwvHQK5Y= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-600-dqRQrk--MtaovBndaWL80A-1; Mon, 27 Mar 2023 16:13:58 -0400 X-MC-Unique: dqRQrk--MtaovBndaWL80A-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 718942812951; Mon, 27 Mar 2023 20:13:56 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (unknown 
[68.160.166.30]) by smtp.gmail.com with ESMTPSA id jy3-20020a0562142b4300b005dd8b934595sm3228566qvb.45.2023.03.27.13.13.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:26 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:35 -0400 Message-Id: <20230327201143.51026-13-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 Subject: [dm-devel] [dm-6.4 PATCH v3 12/20] dm thin: speed up cell_defer_no_holder() X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org From: Joe Thornber Reduce the time that a spinlock is held in cell_defer_no_holder(). Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer --- drivers/md/dm-thin.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 13d4677baafd..00323428919e 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -883,15 +883,17 @@ static void cell_defer_no_holder(struct thin_c *tc, struct dm_bio_prison_cell *c { struct pool *pool = tc->pool; unsigned long flags; - int has_work; + struct bio_list bios; - spin_lock_irqsave(&tc->lock, flags); - cell_release_no_holder(pool, cell, &tc->deferred_bio_list); - has_work = !bio_list_empty(&tc->deferred_bio_list); - spin_unlock_irqrestore(&tc->lock, flags); + bio_list_init(&bios); + cell_release_no_holder(pool, cell, &bios); - if (has_work) + if (!bio_list_empty(&bios)) { + spin_lock_irqsave(&tc->lock, flags); + bio_list_merge(&tc->deferred_bio_list, &bios); + spin_unlock_irqrestore(&tc->lock, flags); wake_worker(pool); + } } static void thin_defer_bio(struct thin_c *tc, struct bio *bio); From patchwork Mon Mar 27 20:11:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13190038 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CD1F1C6FD1D for ; Mon, 27 Mar 2023 20:14:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679948042; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=LvzAMU/K1xUgAE8aF/ZwaChC12KsnRw4Xns3a301pG4=; b=E/QIVWu9iNmDxKozqak8ZPSZLhiK5elCUaMvbwdjOCbT9CVSmxsHtxI8OZSidtNIp1253x 
localhost (pool-68-160-166-30.bstnma.fios.verizon.net. [68.160.166.30]) by smtp.gmail.com with ESMTPSA id m64-20020a375843000000b0073b8512d2dbsm16899758qkb.72.2023.03.27.13.13.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:28 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:36 -0400 Message-Id: <20230327201143.51026-14-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Subject: [dm-devel] [dm-6.4 PATCH v3 13/20] dm: split discards further if target sets max_discard_granularity X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org The block core (bio_split_discard) will already split discards based on the 'discard_granularity' and 'max_discard_sectors' queue_limits. But the DM thin target also needs to ensure that it doesn't receive a discard that spans a 'max_discard_sectors' boundary. Introduce a dm_target 'max_discard_granularity' flag that if set will cause DM core to split discard bios relative to 'max_discard_sectors'. This treats 'discard_granularity' as a "min_discard_granularity" and 'max_discard_sectors' as a "max_discard_granularity". Requested-by: Joe Thornber Signed-off-by: Mike Snitzer --- drivers/md/dm.c | 25 +++++++++++++++++++------ include/linux/device-mapper.h | 6 ++++++ include/uapi/linux/dm-ioctl.h | 4 ++-- 3 files changed, 27 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index b6ace995b9ca..6eb0748a3bb2 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1162,7 +1162,8 @@ static inline sector_t max_io_len_target_boundary(struct dm_target *ti, return ti->len - target_offset; } -static sector_t max_io_len(struct dm_target *ti, sector_t sector) +static sector_t __max_io_len(struct dm_target *ti, sector_t sector, + unsigned int max_granularity) { sector_t target_offset = dm_target_offset(ti, sector); sector_t len = max_io_len_target_boundary(ti, target_offset); @@ -1173,11 +1174,16 @@ static sector_t max_io_len(struct dm_target *ti, sector_t sector) * explains why stacked chunk_sectors based splitting via * bio_split_to_limits() isn't possible here. 
*/ - if (!ti->max_io_len) + if (!max_granularity) return len; return min_t(sector_t, len, min(queue_max_sectors(ti->table->md->queue), - blk_chunk_sectors_left(target_offset, ti->max_io_len))); + blk_chunk_sectors_left(target_offset, max_granularity))); +} + +static inline sector_t max_io_len(struct dm_target *ti, sector_t sector) +{ + return __max_io_len(ti, sector, ti->max_io_len); } int dm_set_target_max_io_len(struct dm_target *ti, sector_t len) @@ -1562,12 +1568,13 @@ static void __send_empty_flush(struct clone_info *ci) } static void __send_changing_extent_only(struct clone_info *ci, struct dm_target *ti, - unsigned int num_bios) + unsigned int num_bios, + unsigned int max_granularity) { unsigned int len, bios; len = min_t(sector_t, ci->sector_count, - max_io_len_target_boundary(ti, dm_target_offset(ti, ci->sector))); + __max_io_len(ti, ci->sector, max_granularity)); atomic_add(num_bios, &ci->io->io_count); bios = __send_duplicate_bios(ci, ti, num_bios, &len); @@ -1603,10 +1610,16 @@ static blk_status_t __process_abnormal_io(struct clone_info *ci, struct dm_target *ti) { unsigned int num_bios = 0; + unsigned int max_granularity = 0; switch (bio_op(ci->bio)) { case REQ_OP_DISCARD: num_bios = ti->num_discard_bios; + if (ti->max_discard_granularity) { + struct queue_limits *limits = + dm_get_queue_limits(ti->table->md); + max_granularity = limits->max_discard_sectors; + } break; case REQ_OP_SECURE_ERASE: num_bios = ti->num_secure_erase_bios; @@ -1627,7 +1640,7 @@ static blk_status_t __process_abnormal_io(struct clone_info *ci, if (unlikely(!num_bios)) return BLK_STS_NOTSUPP; - __send_changing_extent_only(ci, ti, num_bios); + __send_changing_extent_only(ci, ti, num_bios, max_granularity); return BLK_STS_OK; } diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h index 7975483816e4..8aa6b3ea91fa 100644 --- a/include/linux/device-mapper.h +++ b/include/linux/device-mapper.h @@ -358,6 +358,12 @@ struct dm_target { */ bool discards_supported:1; + /* + * Set if this target requires that discards be split on both + * 'discard_granularity' and 'max_discard_sectors' boundaries. + */ + bool max_discard_granularity:1; + /* * Set if we need to limit the number of in-flight bios when swapping. 
*/ diff --git a/include/uapi/linux/dm-ioctl.h b/include/uapi/linux/dm-ioctl.h index 7edf335778ba..1990b5700f69 100644 --- a/include/uapi/linux/dm-ioctl.h +++ b/include/uapi/linux/dm-ioctl.h @@ -286,9 +286,9 @@ enum { #define DM_DEV_SET_GEOMETRY _IOWR(DM_IOCTL, DM_DEV_SET_GEOMETRY_CMD, struct dm_ioctl) #define DM_VERSION_MAJOR 4 -#define DM_VERSION_MINOR 47 +#define DM_VERSION_MINOR 48 #define DM_VERSION_PATCHLEVEL 0 -#define DM_VERSION_EXTRA "-ioctl (2022-07-28)" +#define DM_VERSION_EXTRA "-ioctl (2023-03-01)" /* Status bits */ #define DM_READONLY_FLAG (1 << 0) /* In/Out */ From patchwork Mon Mar 27 20:11:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Snitzer X-Patchwork-Id: 13190036 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2906C76195 for ; Mon, 27 Mar 2023 20:13:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679948038; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=vdWE55nQbwbRw/CMHvMl6f05SIsVDe3XReENiPyA8pY=; b=IP1oEIrss8ZgXwbpiV7e9QHiKwN5+q/fqLEb2mZMNkAm0UtOzaAkwa2AwyQBkme02Sb7P3 yDjP/kR9dwHUKTcN1fxX2HFcCyuLHH1V1HdhG7qBPttmNlL/KhLqF83TfUdOYoqMwfhEkL 1R6ULJvWG3qJhmscBhPvjNtSEC5+cDE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-616-8CNxD95UNdu-ATJ7gBJUkA-1; Mon, 27 Mar 2023 16:13:55 -0400 X-MC-Unique: 8CNxD95UNdu-ATJ7gBJUkA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4A7F6858297; Mon, 27 Mar 2023 20:13:53 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (unknown [10.30.29.100]) by smtp.corp.redhat.com (Postfix) with ESMTP id 37ABE14171C6; Mon, 27 Mar 2023 20:13:53 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (localhost [IPv6:::1]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id E45D819472D2; Mon, 27 Mar 2023 20:13:36 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id 3FB711946A50 for ; Mon, 27 Mar 2023 20:13:33 +0000 (UTC) Received: by smtp.corp.redhat.com (Postfix) id 2630840B3EDA; Mon, 27 Mar 2023 20:13:33 +0000 (UTC) Received: from mimecast-mx02.redhat.com (mimecast06.extmail.prod.ext.rdu2.redhat.com [10.11.55.22]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 1E0E240B3ED9 for ; Mon, 27 Mar 2023 20:13:33 +0000 (UTC) Received: from us-smtp-1.mimecast.com (us-smtp-1.mimecast.com [205.139.110.61]) (using TLSv1.2 with 
cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F1E82185A790 for ; Mon, 27 Mar 2023 20:13:32 +0000 (UTC) Received: from mail-qt1-f178.google.com (mail-qt1-f178.google.com [209.85.160.178]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-292-ly6G0KNxMSmVFWCKczhcHw-4; Mon, 27 Mar 2023 16:13:31 -0400 X-MC-Unique: ly6G0KNxMSmVFWCKczhcHw-4 Received: by mail-qt1-f178.google.com with SMTP id s12so6189326qtx.11 for ; Mon, 27 Mar 2023 13:13:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679948011; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=VKZdoNXFigBuaARZaAptCA6/0fq1wlNkMtf1bLoETbM=; b=0u0e4JHNyggokcsoSB2pTytke49xjFmuVI9lZDq9KdXJL0jtCjcxbOoBAgnHPCnKH1 c6eClkSBeQWFiXzQ+a7xhkAdvzMZXdamDH/x6wKoSgZ1MHENrFkXdNnR9mKHnlQoyIRk Zfm8AALcHMVTLRmWRQ9kimEW8V6yYm82duo20wTiqNPjOHIpySkgfclSfcwLz/yI844g JIo0hdyvozkRuBM+FA42yasoWDQj53CcuLW2BnjsQqp95/jkGIJyLkAI3hQ65lUauryi 8svITwBK/Yw+x3I+DsARP0tBxallaor6SLSR837McQi1iylKwkSBMZyEIBhdT4jliG75 rNMA== X-Gm-Message-State: AO0yUKUymRk+8VLWKYhoXuT6avNQUkptCAEH6TyU/9Rf+wGFAu6uQ9IF CEUaOyvVO7buCbWNkc3KTjT4xl1J1+tCAsS3BKjuYoqXjLOQ9+G0ADLHS8QXnDrsjUf2ylhHZQ+ Z7eNtwBnswikLY1PILd7B8pGo0I5uvMgsg7mfk9FSKBjRs64YO/BnDGI5m38s08pNpFtr7UkCtG g= X-Google-Smtp-Source: AK7set9KXUMs1rqs7kAqxN9Yk6OnkJzXX9Ry0bTCMOPwZUEbPgMi3QtSYmMLZ60y5R3fllURrlRHWA== X-Received: by 2002:a05:622a:647:b0:3bf:d9ee:882d with SMTP id a7-20020a05622a064700b003bfd9ee882dmr22445529qtb.40.1679948010612; Mon, 27 Mar 2023 13:13:30 -0700 (PDT) Received: from localhost (pool-68-160-166-30.bstnma.fios.verizon.net. [68.160.166.30]) by smtp.gmail.com with ESMTPSA id j9-20020a05620a288900b0070648cf78bdsm16879656qkp.54.2023.03.27.13.13.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Mar 2023 13:13:30 -0700 (PDT) From: Mike Snitzer To: dm-devel@redhat.com Date: Mon, 27 Mar 2023 16:11:37 -0400 Message-Id: <20230327201143.51026-15-snitzer@kernel.org> In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org> References: <20230327201143.51026-1-snitzer@kernel.org> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Subject: [dm-devel] [dm-6.4 PATCH v3 14/20] dm bio prison v1: improve concurrent IO performance X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer , nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com Errors-To: dm-devel-bounces@redhat.com Sender: "dm-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: kernel.org From: Joe Thornber Split the bio prison into multiple regions, with a separate rbtree and associated lock for each region. To get fast bio prison locking and not damage the performance of discards too much the bio-prison now stipulates that discards should not cross a BIO_PRISON_MAX_RANGE boundary. Because the range of a key (block_end - block_begin) must not exceed BIO_PRISON_MAX_RANGE: break_up_discard_bio() now ensures the data range reflected in PHYSICAL key doesn't exceed BIO_PRISON_MAX_RANGE. 
And splitting the thin target's discards (handled with VIRTUAL key) is achieved by updating dm-thin.c to set limits->max_discard_sectors in terms of BIO_PRISON_MAX_RANGE _and_ setting the thin and thin-pool targets' max_discard_granularity to true. Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer --- drivers/md/dm-bio-prison-v1.c | 87 +++++++++++++++++++++------------ drivers/md/dm-bio-prison-v1.h | 10 ++++ drivers/md/dm-thin.c | 92 ++++++++++++++++++++--------------- 3 files changed, 121 insertions(+), 68 deletions(-) diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c index c4c05d5d8909..2b8af861e5f6 100644 --- a/drivers/md/dm-bio-prison-v1.c +++ b/drivers/md/dm-bio-prison-v1.c @@ -16,11 +16,17 @@ /*----------------------------------------------------------------*/ +#define NR_LOCKS 64 +#define LOCK_MASK (NR_LOCKS - 1) #define MIN_CELLS 1024 +struct prison_region { + spinlock_t lock; + struct rb_root cell; +} ____cacheline_aligned_in_smp; + struct dm_bio_prison { - spinlock_t lock; - struct rb_root cells; + struct prison_region regions[NR_LOCKS]; mempool_t cell_pool; }; @@ -34,13 +40,17 @@ static struct kmem_cache *_cell_cache; */ struct dm_bio_prison *dm_bio_prison_create(void) { - struct dm_bio_prison *prison = kzalloc(sizeof(*prison), GFP_KERNEL); int ret; + unsigned i; + struct dm_bio_prison *prison = kzalloc(sizeof(*prison), GFP_KERNEL); if (!prison) return NULL; - spin_lock_init(&prison->lock); + for (i = 0; i < NR_LOCKS; i++) { + spin_lock_init(&prison->regions[i].lock); + prison->regions[i].cell = RB_ROOT; + } ret = mempool_init_slab_pool(&prison->cell_pool, MIN_CELLS, _cell_cache); if (ret) { @@ -48,8 +58,6 @@ struct dm_bio_prison *dm_bio_prison_create(void) return NULL; } - prison->cells = RB_ROOT; - return prison; } EXPORT_SYMBOL_GPL(dm_bio_prison_create); @@ -107,14 +115,26 @@ static int cmp_keys(struct dm_cell_key *lhs, return 0; } -static int __bio_detain(struct dm_bio_prison *prison, +static unsigned lock_nr(struct dm_cell_key *key) +{ + return (key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) & LOCK_MASK; +} + +static void check_range(struct dm_cell_key *key) +{ + BUG_ON(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE); + BUG_ON((key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) != + ((key->block_end - 1) >> BIO_PRISON_MAX_RANGE_SHIFT)); +} + +static int __bio_detain(struct rb_root *root, struct dm_cell_key *key, struct bio *inmate, struct dm_bio_prison_cell *cell_prealloc, struct dm_bio_prison_cell **cell_result) { int r; - struct rb_node **new = &prison->cells.rb_node, *parent = NULL; + struct rb_node **new = &root->rb_node, *parent = NULL; while (*new) { struct dm_bio_prison_cell *cell = @@ -139,7 +159,7 @@ static int __bio_detain(struct dm_bio_prison *prison, *cell_result = cell_prealloc; rb_link_node(&cell_prealloc->node, parent, new); - rb_insert_color(&cell_prealloc->node, &prison->cells); + rb_insert_color(&cell_prealloc->node, root); return 0; } @@ -151,10 +171,12 @@ static int bio_detain(struct dm_bio_prison *prison, struct dm_bio_prison_cell **cell_result) { int r; + unsigned l = lock_nr(key); + check_range(key); - spin_lock_irq(&prison->lock); - r = __bio_detain(prison, key, inmate, cell_prealloc, cell_result); - spin_unlock_irq(&prison->lock); + spin_lock_irq(&prison->regions[l].lock); + r = __bio_detain(&prison->regions[l].cell, key, inmate, cell_prealloc, cell_result); + spin_unlock_irq(&prison->regions[l].lock); return r; } @@ -181,11 +203,11 @@ EXPORT_SYMBOL_GPL(dm_get_cell); /* * @inmates must have been initialised 
---
 drivers/md/dm-bio-prison-v1.c | 87 +++++++++++++++++++++------------
 drivers/md/dm-bio-prison-v1.h | 10 ++++
 drivers/md/dm-thin.c          | 92 ++++++++++++++++++++---------------
 3 files changed, 121 insertions(+), 68 deletions(-)

diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
index c4c05d5d8909..2b8af861e5f6 100644
--- a/drivers/md/dm-bio-prison-v1.c
+++ b/drivers/md/dm-bio-prison-v1.c
@@ -16,11 +16,17 @@

 /*----------------------------------------------------------------*/

+#define NR_LOCKS 64
+#define LOCK_MASK (NR_LOCKS - 1)
 #define MIN_CELLS 1024

+struct prison_region {
+	spinlock_t lock;
+	struct rb_root cell;
+} ____cacheline_aligned_in_smp;
+
 struct dm_bio_prison {
-	spinlock_t lock;
-	struct rb_root cells;
+	struct prison_region regions[NR_LOCKS];
 	mempool_t cell_pool;
 };

@@ -34,13 +40,17 @@ static struct kmem_cache *_cell_cache;
  */
 struct dm_bio_prison *dm_bio_prison_create(void)
 {
-	struct dm_bio_prison *prison = kzalloc(sizeof(*prison), GFP_KERNEL);
 	int ret;
+	unsigned i;
+	struct dm_bio_prison *prison = kzalloc(sizeof(*prison), GFP_KERNEL);

 	if (!prison)
 		return NULL;

-	spin_lock_init(&prison->lock);
+	for (i = 0; i < NR_LOCKS; i++) {
+		spin_lock_init(&prison->regions[i].lock);
+		prison->regions[i].cell = RB_ROOT;
+	}

 	ret = mempool_init_slab_pool(&prison->cell_pool, MIN_CELLS, _cell_cache);
 	if (ret) {
@@ -48,8 +58,6 @@ struct dm_bio_prison *dm_bio_prison_create(void)
 		return NULL;
 	}

-	prison->cells = RB_ROOT;
-
 	return prison;
 }
 EXPORT_SYMBOL_GPL(dm_bio_prison_create);
@@ -107,14 +115,26 @@ static int cmp_keys(struct dm_cell_key *lhs,
 	return 0;
 }

-static int __bio_detain(struct dm_bio_prison *prison,
+static unsigned lock_nr(struct dm_cell_key *key)
+{
+	return (key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) & LOCK_MASK;
+}
+
+static void check_range(struct dm_cell_key *key)
+{
+	BUG_ON(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE);
+	BUG_ON((key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) !=
+	       ((key->block_end - 1) >> BIO_PRISON_MAX_RANGE_SHIFT));
+}
+
+static int __bio_detain(struct rb_root *root,
 			struct dm_cell_key *key,
 			struct bio *inmate,
 			struct dm_bio_prison_cell *cell_prealloc,
 			struct dm_bio_prison_cell **cell_result)
 {
 	int r;
-	struct rb_node **new = &prison->cells.rb_node, *parent = NULL;
+	struct rb_node **new = &root->rb_node, *parent = NULL;

 	while (*new) {
 		struct dm_bio_prison_cell *cell =
@@ -139,7 +159,7 @@ static int __bio_detain(struct dm_bio_prison *prison,

 	*cell_result = cell_prealloc;
 	rb_link_node(&cell_prealloc->node, parent, new);
-	rb_insert_color(&cell_prealloc->node, &prison->cells);
+	rb_insert_color(&cell_prealloc->node, root);

 	return 0;
 }
@@ -151,10 +171,12 @@ static int bio_detain(struct dm_bio_prison *prison,
 		      struct dm_bio_prison_cell **cell_result)
 {
 	int r;
+	unsigned l = lock_nr(key);
+	check_range(key);

-	spin_lock_irq(&prison->lock);
-	r = __bio_detain(prison, key, inmate, cell_prealloc, cell_result);
-	spin_unlock_irq(&prison->lock);
+	spin_lock_irq(&prison->regions[l].lock);
+	r = __bio_detain(&prison->regions[l].cell, key, inmate, cell_prealloc, cell_result);
+	spin_unlock_irq(&prison->regions[l].lock);

 	return r;
 }
@@ -181,11 +203,11 @@ EXPORT_SYMBOL_GPL(dm_get_cell);
 /*
  * @inmates must have been initialised prior to this call
  */
-static void __cell_release(struct dm_bio_prison *prison,
+static void __cell_release(struct rb_root *root,
 			   struct dm_bio_prison_cell *cell,
 			   struct bio_list *inmates)
 {
-	rb_erase(&cell->node, &prison->cells);
+	rb_erase(&cell->node, root);

 	if (inmates) {
 		if (cell->holder)
@@ -198,20 +220,22 @@ void dm_cell_release(struct dm_bio_prison *prison,
 		     struct dm_bio_prison_cell *cell,
 		     struct bio_list *bios)
 {
-	spin_lock_irq(&prison->lock);
-	__cell_release(prison, cell, bios);
-	spin_unlock_irq(&prison->lock);
+	unsigned l = lock_nr(&cell->key);
+
+	spin_lock_irq(&prison->regions[l].lock);
+	__cell_release(&prison->regions[l].cell, cell, bios);
+	spin_unlock_irq(&prison->regions[l].lock);
 }
 EXPORT_SYMBOL_GPL(dm_cell_release);

 /*
  * Sometimes we don't want the holder, just the additional bios.
  */
-static void __cell_release_no_holder(struct dm_bio_prison *prison,
+static void __cell_release_no_holder(struct rb_root *root,
 				     struct dm_bio_prison_cell *cell,
 				     struct bio_list *inmates)
 {
-	rb_erase(&cell->node, &prison->cells);
+	rb_erase(&cell->node, root);
 	bio_list_merge(inmates, &cell->bios);
 }

@@ -219,11 +243,12 @@ void dm_cell_release_no_holder(struct dm_bio_prison *prison,
 			       struct dm_bio_prison_cell *cell,
 			       struct bio_list *inmates)
 {
+	unsigned l = lock_nr(&cell->key);
 	unsigned long flags;

-	spin_lock_irqsave(&prison->lock, flags);
-	__cell_release_no_holder(prison, cell, inmates);
-	spin_unlock_irqrestore(&prison->lock, flags);
+	spin_lock_irqsave(&prison->regions[l].lock, flags);
+	__cell_release_no_holder(&prison->regions[l].cell, cell, inmates);
+	spin_unlock_irqrestore(&prison->regions[l].lock, flags);
 }
 EXPORT_SYMBOL_GPL(dm_cell_release_no_holder);

@@ -248,18 +273,19 @@ void dm_cell_visit_release(struct dm_bio_prison *prison,
 			   void *context,
 			   struct dm_bio_prison_cell *cell)
 {
-	spin_lock_irq(&prison->lock);
+	unsigned l = lock_nr(&cell->key);
+	spin_lock_irq(&prison->regions[l].lock);
 	visit_fn(context, cell);
-	rb_erase(&cell->node, &prison->cells);
-	spin_unlock_irq(&prison->lock);
+	rb_erase(&cell->node, &prison->regions[l].cell);
+	spin_unlock_irq(&prison->regions[l].lock);
 }
 EXPORT_SYMBOL_GPL(dm_cell_visit_release);

-static int __promote_or_release(struct dm_bio_prison *prison,
+static int __promote_or_release(struct rb_root *root,
 				struct dm_bio_prison_cell *cell)
 {
 	if (bio_list_empty(&cell->bios)) {
-		rb_erase(&cell->node, &prison->cells);
+		rb_erase(&cell->node, root);
 		return 1;
 	}

@@ -271,10 +297,11 @@ int dm_cell_promote_or_release(struct dm_bio_prison *prison,
 			       struct dm_bio_prison_cell *cell)
 {
 	int r;
+	unsigned l = lock_nr(&cell->key);

-	spin_lock_irq(&prison->lock);
-	r = __promote_or_release(prison, cell);
-	spin_unlock_irq(&prison->lock);
+	spin_lock_irq(&prison->regions[l].lock);
+	r = __promote_or_release(&prison->regions[l].cell, cell);
+	spin_unlock_irq(&prison->regions[l].lock);

 	return r;
 }
diff --git a/drivers/md/dm-bio-prison-v1.h b/drivers/md/dm-bio-prison-v1.h
index dfbf1e94cb75..0b8acd6708fb 100644
--- a/drivers/md/dm-bio-prison-v1.h
+++ b/drivers/md/dm-bio-prison-v1.h
@@ -34,6 +34,16 @@ struct dm_cell_key {
 	dm_block_t block_begin, block_end;
 };

+/*
+ * The range of a key (block_end - block_begin) must not
+ * exceed BIO_PRISON_MAX_RANGE. Also the range must not
+ * cross a similarly sized boundary.
+ *
+ * Must be a power of 2.
+ */
+#define BIO_PRISON_MAX_RANGE 1024
+#define BIO_PRISON_MAX_RANGE_SHIFT 10
+
 /*
  * Treat this as opaque, only in header so callers can manage allocation
  * themselves.
  */
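Both halves of the new invariant can be read off check_range() above; a
small standalone sketch (illustration only, using the same constants):

    #include <stdbool.h>
    #include <stdint.h>

    /* A key [begin, end) is usable iff it fits in one 1024-block window. */
    static bool key_in_one_window(uint64_t begin, uint64_t end)
    {
    	return (end - begin) <= 1024 &&			/* range small enough */
    	       (begin >> 10) == ((end - 1) >> 10);	/* no window crossing */
    }

    /*
     * key_in_one_window(1000, 1024) -> true  (24 blocks, one window)
     * key_in_one_window(1000, 1030) -> false (crosses the boundary at 1024,
     *                                         would need two region locks)
     */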
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 00323428919e..33ad5695f959 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1674,54 +1674,69 @@ static void break_up_discard_bio(struct thin_c *tc, dm_block_t begin, dm_block_t
 	struct dm_cell_key data_key;
 	struct dm_bio_prison_cell *data_cell;
 	struct dm_thin_new_mapping *m;
-	dm_block_t virt_begin, virt_end, data_begin;
+	dm_block_t virt_begin, virt_end, data_begin, data_end;
+	dm_block_t len, next_boundary;

 	while (begin != end) {
-		r = ensure_next_mapping(pool);
-		if (r)
-			/* we did our best */
-			return;
-
 		r = dm_thin_find_mapped_range(tc->td, begin, end, &virt_begin, &virt_end,
 					      &data_begin, &maybe_shared);
-		if (r)
+		if (r) {
 			/*
 			 * Silently fail, letting any mappings we've
 			 * created complete.
 			 */
 			break;
-
-		build_key(tc->td, PHYSICAL, data_begin, data_begin + (virt_end - virt_begin), &data_key);
-		if (bio_detain(tc->pool, &data_key, NULL, &data_cell)) {
-			/* contention, we'll give up with this range */
-			begin = virt_end;
-			continue;
 		}

-		/*
-		 * IO may still be going to the destination block. We must
-		 * quiesce before we can do the removal.
-		 */
-		m = get_next_mapping(pool);
-		m->tc = tc;
-		m->maybe_shared = maybe_shared;
-		m->virt_begin = virt_begin;
-		m->virt_end = virt_end;
-		m->data_block = data_begin;
-		m->cell = data_cell;
-		m->bio = bio;
+		data_end = data_begin + (virt_end - virt_begin);

 		/*
-		 * The parent bio must not complete before sub discard bios are
-		 * chained to it (see end_discard's bio_chain)!
-		 *
-		 * This per-mapping bi_remaining increment is paired with
-		 * the implicit decrement that occurs via bio_endio() in
-		 * end_discard().
+		 * Make sure the data region obeys the bio prison restrictions.
 		 */
-		bio_inc_remaining(bio);
-		if (!dm_deferred_set_add_work(pool->all_io_ds, &m->list))
-			pool->process_prepared_discard(m);
+		while (data_begin < data_end) {
+			r = ensure_next_mapping(pool);
+			if (r)
+				return; /* we did our best */
+
+			next_boundary = ((data_begin >> BIO_PRISON_MAX_RANGE_SHIFT) + 1)
+				<< BIO_PRISON_MAX_RANGE_SHIFT;
+			len = min_t(sector_t, data_end - data_begin, next_boundary - data_begin);
+
+			build_key(tc->td, PHYSICAL, data_begin, data_begin + len, &data_key);
+			if (bio_detain(tc->pool, &data_key, NULL, &data_cell)) {
+				/* contention, we'll give up with this range */
+				data_begin += len;
+				continue;
+			}
+
+			/*
+			 * IO may still be going to the destination block. We must
+			 * quiesce before we can do the removal.
+			 */
+			m = get_next_mapping(pool);
+			m->tc = tc;
+			m->maybe_shared = maybe_shared;
+			m->virt_begin = virt_begin;
+			m->virt_end = virt_begin + len;
+			m->data_block = data_begin;
+			m->cell = data_cell;
+			m->bio = bio;
+
+			/*
+			 * The parent bio must not complete before sub discard bios are
+			 * chained to it (see end_discard's bio_chain)!
+			 *
+			 * This per-mapping bi_remaining increment is paired with
+			 * the implicit decrement that occurs via bio_endio() in
+			 * end_discard().
+			 */
+			bio_inc_remaining(bio);
+			if (!dm_deferred_set_add_work(pool->all_io_ds, &m->list))
+				pool->process_prepared_discard(m);
+
+			virt_begin += len;
+			data_begin += len;
+		}

 		begin = virt_end;
 	}
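A worked example of the splitting arithmetic above (with
BIO_PRISON_MAX_RANGE_SHIFT = 10, i.e. 1024-block windows): for
data_begin = 1000 and data_end = 2100 the inner loop builds three
PHYSICAL keys:

    next_boundary = ((1000 >> 10) + 1) << 10 = 1024, len = min(1100, 24)   -> key [1000, 1024)
    next_boundary = ((1024 >> 10) + 1) << 10 = 2048, len = min(1076, 1024) -> key [1024, 2048)
    next_boundary = ((2048 >> 10) + 1) << 10 = 3072, len = min(52, 1024)   -> key [2048, 2100)

so no key ever crosses a window boundary and each iteration takes
exactly one region lock.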
@@ -3380,13 +3395,13 @@ static int pool_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	 */
 	if (pf.discard_enabled && pf.discard_passdown) {
 		ti->num_discard_bios = 1;
-
 		/*
 		 * Setting 'discards_supported' circumvents the normal
 		 * stacking of discard limits (this keeps the pool and
 		 * thin devices' discard limits consistent).
 		 */
 		ti->discards_supported = true;
+		ti->max_discard_granularity = true;
 	}

 	ti->private = pt;
@@ -4096,7 +4111,7 @@ static struct target_type pool_target = {
 	.name = "thin-pool",
 	.features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE |
 		    DM_TARGET_IMMUTABLE,
-	.version = {1, 22, 0},
+	.version = {1, 23, 0},
 	.module = THIS_MODULE,
 	.ctr = pool_ctr,
 	.dtr = pool_dtr,
@@ -4261,6 +4276,7 @@ static int thin_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	if (tc->pool->pf.discard_enabled) {
 		ti->discards_supported = true;
 		ti->num_discard_bios = 1;
+		ti->max_discard_granularity = true;
 	}

 	mutex_unlock(&dm_thin_pool_table.mutex);
@@ -4476,12 +4492,12 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
 		return;

 	limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT;
-	limits->max_discard_sectors = 2048 * 1024 * 16; /* 16G */
+	limits->max_discard_sectors = pool->sectors_per_block * BIO_PRISON_MAX_RANGE;
}

 static struct target_type thin_target = {
 	.name = "thin",
-	.version = {1, 22, 0},
+	.version = {1, 23, 0},
 	.module = THIS_MODULE,
 	.ctr = thin_ctr,
 	.dtr = thin_dtr,
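The thin_io_hints() change above ties the discard limit to the pool's
block size instead of a fixed 16GiB. As an illustrative example
(assuming a pool with 128-sector, i.e. 64KiB, blocks):
max_discard_sectors becomes 128 * 1024 = 131072 sectors (64MiB), so DM
core splits incoming discards into bios whose VIRTUAL keys span at most
BIO_PRISON_MAX_RANGE thin blocks each.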
From patchwork Mon Mar 27 20:11:38 2023
From: Mike Snitzer
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:38 -0400
Message-Id: <20230327201143.51026-16-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 15/20] dm bio prison v1: add dm_cell_key_has_valid_range

Don't have bio_detain() BUG_ON if a dm_cell_key is beyond
BIO_PRISON_MAX_RANGE or spans a boundary.

Update dm-thin.c:build_key() to use dm_cell_key_has_valid_range(), which
performs the same checking without BUG_ON. Also update
process_discard_bio() to check the discard bio that DM core passes in
(having first imposed max_discard_granularity-based splitting).

dm_cell_key_has_valid_range() merely WARN_ON_ONCEs and returns false if
a key is invalid, because an invalid key is a programmer error that
should be caught with proper testing. So relax the BUG_ONs to
WARN_ON_ONCE.

Signed-off-by: Mike Snitzer
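The caller-visible pattern this enables looks roughly like the
following (a sketch, not code from the patch; the error handling shown
is illustrative):

    struct dm_cell_key key;

    /* build the key as usual, then validate it before detaining */
    if (!dm_cell_key_has_valid_range(&key)) {
    	/* WARN_ON_ONCE already fired inside the helper */
    	bio_endio(bio);	/* fail this bio instead of BUG_ON'ing the box */
    	return;
    }
    /* key is now safe to pass to bio_detain() */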
---
 drivers/md/dm-bio-prison-v1.c | 14 +++++++++-----
 drivers/md/dm-bio-prison-v1.h |  5 +++++
 drivers/md/dm-thin.c          | 21 +++++++++++++------
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
index 2b8af861e5f6..78bb559b521c 100644
--- a/drivers/md/dm-bio-prison-v1.c
+++ b/drivers/md/dm-bio-prison-v1.c
@@ -120,12 +120,17 @@ static unsigned lock_nr(struct dm_cell_key *key)
 	return (key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) & LOCK_MASK;
 }

-static void check_range(struct dm_cell_key *key)
+bool dm_cell_key_has_valid_range(struct dm_cell_key *key)
 {
-	BUG_ON(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE);
-	BUG_ON((key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) !=
-	       ((key->block_end - 1) >> BIO_PRISON_MAX_RANGE_SHIFT));
+	if (WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE))
+		return false;
+	if (WARN_ON_ONCE((key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) !=
+			 (key->block_end - 1) >> BIO_PRISON_MAX_RANGE_SHIFT))
+		return false;
+
+	return true;
 }
+EXPORT_SYMBOL(dm_cell_key_has_valid_range);

 static int __bio_detain(struct rb_root *root,
 			struct dm_cell_key *key,
@@ -172,7 +177,6 @@ static int bio_detain(struct dm_bio_prison *prison,
 {
 	int r;
 	unsigned l = lock_nr(key);
-	check_range(key);

 	spin_lock_irq(&prison->regions[l].lock);
 	r = __bio_detain(&prison->regions[l].cell, key, inmate, cell_prealloc, cell_result);
diff --git a/drivers/md/dm-bio-prison-v1.h b/drivers/md/dm-bio-prison-v1.h
index 0b8acd6708fb..2a097ed0d85e 100644
--- a/drivers/md/dm-bio-prison-v1.h
+++ b/drivers/md/dm-bio-prison-v1.h
@@ -83,6 +83,11 @@ int dm_get_cell(struct dm_bio_prison *prison,
 		struct dm_bio_prison_cell *cell_prealloc,
 		struct dm_bio_prison_cell **cell_result);

+/*
+ * Returns false if key is beyond BIO_PRISON_MAX_RANGE or spans a boundary.
+ */
+bool dm_cell_key_has_valid_range(struct dm_cell_key *key);
+
 /*
  * An atomic op that combines retrieving or creating a cell, and adding a
  * bio to it.
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 33ad5695f959..2b13c949bd72 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -118,25 +118,27 @@ enum lock_space {
 	PHYSICAL
 };

-static void build_key(struct dm_thin_device *td, enum lock_space ls,
+static bool build_key(struct dm_thin_device *td, enum lock_space ls,
 		      dm_block_t b, dm_block_t e, struct dm_cell_key *key)
 {
 	key->virtual = (ls == VIRTUAL);
 	key->dev = dm_thin_dev_id(td);
 	key->block_begin = b;
 	key->block_end = e;
+
+	return dm_cell_key_has_valid_range(key);
 }

 static void build_data_key(struct dm_thin_device *td, dm_block_t b,
 			   struct dm_cell_key *key)
 {
-	build_key(td, PHYSICAL, b, b + 1llu, key);
+	(void) build_key(td, PHYSICAL, b, b + 1llu, key);
 }

 static void build_virtual_key(struct dm_thin_device *td, dm_block_t b,
 			      struct dm_cell_key *key)
 {
-	build_key(td, VIRTUAL, b, b + 1llu, key);
+	(void) build_key(td, VIRTUAL, b, b + 1llu, key);
 }

 /*----------------------------------------------------------------*/
@@ -1702,7 +1704,8 @@ static void break_up_discard_bio(struct thin_c *tc, dm_block_t begin, dm_block_t
 				<< BIO_PRISON_MAX_RANGE_SHIFT;
 			len = min_t(sector_t, data_end - data_begin, next_boundary - data_begin);

-			build_key(tc->td, PHYSICAL, data_begin, data_begin + len, &data_key);
+			/* This key is certainly within range given the above splitting */
+			(void) build_key(tc->td, PHYSICAL, data_begin, data_begin + len, &data_key);
 			if (bio_detain(tc->pool, &data_key, NULL, &data_cell)) {
 				/* contention, we'll give up with this range */
 				data_begin += len;
@@ -1778,8 +1781,13 @@ static void process_discard_bio(struct thin_c *tc, struct bio *bio)
 		return;
 	}

-	build_key(tc->td, VIRTUAL, begin, end, &virt_key);
-	if (bio_detain(tc->pool, &virt_key, bio, &virt_cell))
+	if (unlikely(!build_key(tc->td, VIRTUAL, begin, end, &virt_key))) {
+		DMERR_LIMIT("Discard doesn't respect bio prison limits");
+		bio_endio(bio);
+		return;
+	}
+
+	if (bio_detain(tc->pool, &virt_key, bio, &virt_cell)) {
 		/*
 		 * Potential starvation issue: We're relying on the
 		 * fs/application being well behaved, and not trying to
 		 * send IO to a region at the same time as discarding it.
 		 * If they do this persistently then it's possible this
 		 * cell will never be granted.
 		 */
 		return;
+	}

 	tc->pool->process_discard_cell(tc, virt_cell);
 }
From patchwork Mon Mar 27 20:11:39 2023
From: Mike Snitzer
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:39 -0400
Message-Id: <20230327201143.51026-17-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 16/20] dm: add dm_num_sharded_locks()

A simple helper for use when DM core code needs to size, based on
num_online_cpus(), its data structures that split locks.

dm_num_sharded_locks() rounds num_online_cpus() up to the next power of
2, capping the result at DM_MAX_SHARDED_LOCKS (64). This heuristic may
evolve as warranted, but as-is it serves as a more informed basis for
sizing the sharded lock structs in dm-bufio's dm_buffer_cache
(buffer_trees) and dm-bio-prison-v1's dm_bio_prison (prison_regions).

Signed-off-by: Mike Snitzer
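Illustrative values for the heuristic:

    num_online_cpus() = 1  -> roundup_pow_of_two(1)  = 1   -> 1 lock
    num_online_cpus() = 6  -> roundup_pow_of_two(6)  = 8   -> 8 locks
    num_online_cpus() = 96 -> roundup_pow_of_two(96) = 128 -> capped at 64 locks

The result is always a power of two, which the callers rely on for
cheap mask-based indexing.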
---
 drivers/md/dm.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 22eaed188907..18450282d0d9 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include

 #include "dm-stats.h"

@@ -228,4 +229,13 @@ void dm_free_md_mempools(struct dm_md_mempools *pools);
  */
 unsigned int dm_get_reserved_bio_based_ios(void);

+#define DM_MAX_SHARDED_LOCKS 64
+
+static inline unsigned int dm_num_sharded_locks(void)
+{
+	unsigned int num_locks = roundup_pow_of_two(num_online_cpus());
+
+	return min_t(unsigned int, num_locks, DM_MAX_SHARDED_LOCKS);
+}
+
 #endif
From patchwork Mon Mar 27 20:11:40 2023
From: Mike Snitzer
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:40 -0400
Message-Id: <20230327201143.51026-18-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 17/20] dm bufio: prepare to intelligently size dm_buffer_cache's buffer_trees

Add a num_locks member to the dm_buffer_cache struct and use it rather
than the NR_LOCKS magic value (64). The next commit will size the
dm_buffer_cache's buffer_trees according to dm_num_sharded_locks().

Signed-off-by: Mike Snitzer
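One detail worth noting before the diff: cache_index() keeps the cheap
mask in place of a modulo, which is only equivalent when the lock count
is a power of two -- a property dm_num_sharded_locks() guarantees.
Annotated excerpt (comment added here for illustration):

    static inline unsigned int cache_index(sector_t block, unsigned int num_locks)
    {
    	/* equals block % num_locks only because num_locks is a power of 2 */
    	return block & (num_locks - 1);
    }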
---
 drivers/md/dm-bufio.c | 48 ++++++++++++++++++++++++--------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index ae552644a0b4..2250799a70e4 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -380,7 +380,6 @@ struct dm_buffer {
  */

 #define NR_LOCKS 64
-#define LOCKS_MASK (NR_LOCKS - 1)

 struct buffer_tree {
 	struct rw_semaphore lock;
@@ -388,37 +387,38 @@ struct buffer_tree {
 } ____cacheline_aligned_in_smp;

 struct dm_buffer_cache {
+	struct lru lru[LIST_SIZE];
 	/*
 	 * We spread entries across multiple trees to reduce contention
 	 * on the locks.
 	 */
+	unsigned int num_locks;
 	struct buffer_tree trees[NR_LOCKS];
-	struct lru lru[LIST_SIZE];
 };

-static inline unsigned int cache_index(sector_t block)
+static inline unsigned int cache_index(sector_t block, unsigned int num_locks)
 {
-	return block & LOCKS_MASK;
+	return block & (num_locks - 1);
 }

 static inline void cache_read_lock(struct dm_buffer_cache *bc, sector_t block)
 {
-	down_read(&bc->trees[cache_index(block)].lock);
+	down_read(&bc->trees[cache_index(block, bc->num_locks)].lock);
 }

 static inline void cache_read_unlock(struct dm_buffer_cache *bc, sector_t block)
 {
-	up_read(&bc->trees[cache_index(block)].lock);
+	up_read(&bc->trees[cache_index(block, bc->num_locks)].lock);
 }

 static inline void cache_write_lock(struct dm_buffer_cache *bc, sector_t block)
 {
-	down_write(&bc->trees[cache_index(block)].lock);
+	down_write(&bc->trees[cache_index(block, bc->num_locks)].lock);
 }

 static inline void cache_write_unlock(struct dm_buffer_cache *bc, sector_t block)
 {
-	up_write(&bc->trees[cache_index(block)].lock);
+	up_write(&bc->trees[cache_index(block, bc->num_locks)].lock);
 }

 /*
@@ -429,13 +429,15 @@ struct lock_history {
 	struct dm_buffer_cache *cache;
 	bool write;
 	unsigned int previous;
+	unsigned int no_previous;
 };

 static void lh_init(struct lock_history *lh, struct dm_buffer_cache *cache, bool write)
 {
 	lh->cache = cache;
 	lh->write = write;
-	lh->previous = NR_LOCKS; /* indicates no previous */
+	lh->no_previous = cache->num_locks;
+	lh->previous = lh->no_previous;
 }

 static void __lh_lock(struct lock_history *lh, unsigned int index)
@@ -459,9 +461,9 @@ static void __lh_unlock(struct lock_history *lh, unsigned int index)
  */
 static void lh_exit(struct lock_history *lh)
 {
-	if (lh->previous != NR_LOCKS) {
+	if (lh->previous != lh->no_previous) {
 		__lh_unlock(lh, lh->previous);
-		lh->previous = NR_LOCKS;
+		lh->previous = lh->no_previous;
 	}
 }

@@ -471,9 +473,9 @@ static void lh_exit(struct lock_history *lh)
  */
 static void lh_next(struct lock_history *lh, sector_t b)
 {
-	unsigned int index = cache_index(b);
+	unsigned int index = cache_index(b, lh->no_previous); /* no_previous is num_locks */

-	if (lh->previous != NR_LOCKS) {
+	if (lh->previous != lh->no_previous) {
 		if (lh->previous != index) {
 			__lh_unlock(lh, lh->previous);
 			__lh_lock(lh, index);
@@ -500,11 +502,13 @@ static struct dm_buffer *list_to_buffer(struct list_head *l)
 	return le_to_buffer(le);
 }

-static void cache_init(struct dm_buffer_cache *bc)
+static void cache_init(struct dm_buffer_cache *bc, unsigned int num_locks)
 {
 	unsigned int i;

-	for (i = 0; i < NR_LOCKS; i++) {
+	bc->num_locks = num_locks;
+
+	for (i = 0; i < bc->num_locks; i++) {
 		init_rwsem(&bc->trees[i].lock);
 		bc->trees[i].root = RB_ROOT;
 	}
@@ -517,7 +521,7 @@ static void cache_destroy(struct dm_buffer_cache *bc)
 {
 	unsigned int i;

-	for (i = 0; i < NR_LOCKS; i++)
+	for (i = 0; i < bc->num_locks; i++)
 		WARN_ON_ONCE(!RB_EMPTY_ROOT(&bc->trees[i].root));

 	lru_destroy(&bc->lru[LIST_CLEAN]);
@@ -576,7 +580,7 @@ static struct dm_buffer *cache_get(struct dm_buffer_cache *bc, sector_t block)
 	struct dm_buffer *b;

 	cache_read_lock(bc, block);
-	b = __cache_get(&bc->trees[cache_index(block)].root, block);
+	b = __cache_get(&bc->trees[cache_index(block, bc->num_locks)].root, block);
 	if (b) {
 		lru_reference(&b->lru);
 		__cache_inc_buffer(b);
@@ -650,7 +654,7 @@ static struct dm_buffer *__cache_evict(struct dm_buffer_cache *bc, int list_mode
 	b = le_to_buffer(le);
 	/* __evict_pred will have locked the appropriate tree. */
-	rb_erase(&b->node, &bc->trees[cache_index(b->block)].root);
+	rb_erase(&b->node, &bc->trees[cache_index(b->block, bc->num_locks)].root);

 	return b;
 }
@@ -816,7 +820,7 @@ static bool cache_insert(struct dm_buffer_cache *bc, struct dm_buffer *b)
 	cache_write_lock(bc, b->block);
 	BUG_ON(atomic_read(&b->hold_count) != 1);
-	r = __cache_insert(&bc->trees[cache_index(b->block)].root, b);
+	r = __cache_insert(&bc->trees[cache_index(b->block, bc->num_locks)].root, b);
 	if (r)
 		lru_insert(&bc->lru[b->list_mode], &b->lru);
 	cache_write_unlock(bc, b->block);
@@ -842,7 +846,7 @@ static bool cache_remove(struct dm_buffer_cache *bc, struct dm_buffer *b)
 		r = false;
 	} else {
 		r = true;
-		rb_erase(&b->node, &bc->trees[cache_index(b->block)].root);
+		rb_erase(&b->node, &bc->trees[cache_index(b->block, bc->num_locks)].root);
 		lru_remove(&bc->lru[b->list_mode], &b->lru);
 	}

@@ -911,7 +915,7 @@ static void cache_remove_range(struct dm_buffer_cache *bc,
 {
 	unsigned int i;

-	for (i = 0; i < NR_LOCKS; i++) {
+	for (i = 0; i < bc->num_locks; i++) {
 		down_write(&bc->trees[i].lock);
 		__remove_range(bc, &bc->trees[i].root, begin, end, pred, release);
 		up_write(&bc->trees[i].lock);
@@ -2432,7 +2436,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
 		r = -ENOMEM;
 		goto bad_client;
 	}
-	cache_init(&c->cache);
+	cache_init(&c->cache, NR_LOCKS);

 	c->bdev = bdev;
 	c->block_size = block_size;
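A side effect visible in the lock_history changes above: the sentinel
meaning "no shard locked yet" used to be the compile-time NR_LOCKS;
once the lock count is per-cache it has to travel with the iterator
state. Roughly (annotated from the diff, comments added here):

    struct lock_history {
    	struct dm_buffer_cache *cache;
    	bool write;
    	unsigned int previous;		/* last shard locked, or no_previous */
    	unsigned int no_previous;	/* cache->num_locks, the "none" sentinel */
    };

lh_next() then only drops and retakes a lock when consecutive blocks
actually hash to different shards, which is what keeps ordered
traversals cheap.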
From patchwork Mon Mar 27 20:11:41 2023
From: Mike Snitzer
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:41 -0400
Message-Id: <20230327201143.51026-19-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 18/20] dm bufio: intelligently size dm_buffer_cache's buffer_trees

Size the dm_buffer_cache's number of buffer_tree structs using
dm_num_sharded_locks().

Signed-off-by: Mike Snitzer
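The mechanism used below is a trailing flexible array member sized at
client-create time; a self-contained userspace sketch of the same
pattern (illustration only, struct names invented):

    #include <stdlib.h>

    struct shard {			/* stands in for buffer_tree */
    	int lock;
    };

    struct cache {
    	unsigned int num_locks;
    	struct shard shards[];		/* flexible array member, must be last */
    };

    static struct cache *cache_create(unsigned int num_locks)
    {
    	struct cache *c = calloc(1, sizeof(*c) + num_locks * sizeof(struct shard));

    	if (c)
    		c->num_locks = num_locks;
    	return c;
    }

This is also why the cache member of dm_bufio_client gains a
"must be last member" comment in the diff below.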
---
 drivers/md/dm-bufio.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 2250799a70e4..7dc53f3d0739 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -21,6 +21,8 @@
 #include
 #include

+#include "dm.h"
+
 #define DM_MSG_PREFIX "bufio"

 /*
@@ -379,8 +381,6 @@ struct dm_buffer {
  * only enough to ensure get/put are threadsafe.
  */

-#define NR_LOCKS 64
-
 struct buffer_tree {
 	struct rw_semaphore lock;
 	struct rb_root root;
@@ -393,7 +393,7 @@ struct dm_buffer_cache {
 	 * on the locks.
 	 */
 	unsigned int num_locks;
-	struct buffer_tree trees[NR_LOCKS];
+	struct buffer_tree trees[];
 };

 static inline unsigned int cache_index(sector_t block, unsigned int num_locks)
@@ -976,7 +976,7 @@ struct dm_bufio_client {
 	 */
 	unsigned long oldest_buffer;

-	struct dm_buffer_cache cache;
+	struct dm_buffer_cache cache; /* must be last member */
 };

 static DEFINE_STATIC_KEY_FALSE(no_sleep_enabled);
@@ -2422,6 +2422,7 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
 				   unsigned int flags)
 {
 	int r;
+	unsigned int num_locks;
 	struct dm_bufio_client *c;
 	char slab_name[27];

@@ -2431,12 +2432,13 @@ struct dm_bufio_client *dm_bufio_client_create(struct block_device *bdev, unsign
 		goto bad_client;
 	}

-	c = kzalloc(sizeof(*c), GFP_KERNEL);
+	num_locks = dm_num_sharded_locks();
+	c = kzalloc(sizeof(*c) + (num_locks * sizeof(struct buffer_tree)), GFP_KERNEL);
 	if (!c) {
 		r = -ENOMEM;
 		goto bad_client;
 	}
-	cache_init(&c->cache, NR_LOCKS);
+	cache_init(&c->cache, num_locks);

 	c->bdev = bdev;
 	c->block_size = block_size;
From patchwork Mon Mar 27 20:11:42 2023
From: Mike Snitzer
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:42 -0400
Message-Id: <20230327201143.51026-20-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 19/20] dm bio prison v1: prepare to intelligently size dm_bio_prison's prison_regions

Add a num_locks member to the dm_bio_prison struct and use it rather
than the NR_LOCKS magic value (64). The next commit will size the
dm_bio_prison's prison_regions according to dm_num_sharded_locks().

Signed-off-by: Mike Snitzer
---
 drivers/md/dm-bio-prison-v1.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
index 78bb559b521c..a7930ad1878b 100644
--- a/drivers/md/dm-bio-prison-v1.c
+++ b/drivers/md/dm-bio-prison-v1.c
@@ -17,7 +17,6 @@
 /*----------------------------------------------------------------*/

 #define NR_LOCKS 64
-#define LOCK_MASK (NR_LOCKS - 1)
 #define MIN_CELLS 1024

 struct prison_region {
@@ -26,8 +25,9 @@ struct prison_region {
 } ____cacheline_aligned_in_smp;

 struct dm_bio_prison {
-	struct prison_region regions[NR_LOCKS];
 	mempool_t cell_pool;
+	unsigned int num_locks;
+	struct prison_region regions[NR_LOCKS];
 };

 static struct kmem_cache *_cell_cache;
@@ -46,8 +46,9 @@ struct dm_bio_prison *dm_bio_prison_create(void)
 	if (!prison)
 		return NULL;
+	prison->num_locks = NR_LOCKS;

-	for (i = 0; i < NR_LOCKS; i++) {
+	for (i = 0; i < prison->num_locks; i++) {
 		spin_lock_init(&prison->regions[i].lock);
 		prison->regions[i].cell = RB_ROOT;
 	}
@@ -115,9 +116,9 @@ static int cmp_keys(struct dm_cell_key *lhs,
 	return 0;
 }

-static unsigned lock_nr(struct dm_cell_key *key)
+static unsigned lock_nr(struct dm_cell_key *key, unsigned int num_locks)
 {
-	return (key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) & LOCK_MASK;
+	return (key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) & (num_locks - 1);
 }

 bool dm_cell_key_has_valid_range(struct dm_cell_key *key)
@@ -176,7 +177,7 @@ static int bio_detain(struct dm_bio_prison *prison,
 		      struct dm_bio_prison_cell **cell_result)
 {
 	int r;
-	unsigned l = lock_nr(key);
+	unsigned l = lock_nr(key, prison->num_locks);

 	spin_lock_irq(&prison->regions[l].lock);
 	r = __bio_detain(&prison->regions[l].cell, key, inmate, cell_prealloc, cell_result);
@@ -224,7 +225,7 @@ void dm_cell_release(struct dm_bio_prison *prison,
 		     struct dm_bio_prison_cell *cell,
 		     struct bio_list *bios)
 {
-	unsigned l = lock_nr(&cell->key);
+	unsigned l = lock_nr(&cell->key, prison->num_locks);

 	spin_lock_irq(&prison->regions[l].lock);
 	__cell_release(&prison->regions[l].cell, cell, bios);
@@ -247,7 +248,7 @@ void dm_cell_release_no_holder(struct dm_bio_prison *prison,
 			       struct dm_bio_prison_cell *cell,
 			       struct bio_list *inmates)
 {
-	unsigned l = lock_nr(&cell->key);
+	unsigned l = lock_nr(&cell->key, prison->num_locks);
 	unsigned long flags;

 	spin_lock_irqsave(&prison->regions[l].lock, flags);
@@ -277,7 +278,7 @@ void dm_cell_visit_release(struct dm_bio_prison *prison,
 			   void *context,
 			   struct dm_bio_prison_cell *cell)
 {
-	unsigned l = lock_nr(&cell->key);
+	unsigned l = lock_nr(&cell->key, prison->num_locks);
 	spin_lock_irq(&prison->regions[l].lock);
 	visit_fn(context, cell);
 	rb_erase(&cell->node, &prison->regions[l].cell);
@@ -301,7 +302,7 @@ int dm_cell_promote_or_release(struct dm_bio_prison *prison,
 			       struct dm_bio_prison_cell *cell)
 {
 	int r;
-	unsigned l = lock_nr(&cell->key);
+	unsigned l = lock_nr(&cell->key, prison->num_locks);

 	spin_lock_irq(&prison->regions[l].lock);
 	r = __promote_or_release(&prison->regions[l].cell, cell);
From patchwork Mon Mar 27 20:11:43 2023
From: Mike Snitzer
To: dm-devel@redhat.com
Date: Mon, 27 Mar 2023 16:11:43 -0400
Message-Id: <20230327201143.51026-21-snitzer@kernel.org>
In-Reply-To: <20230327201143.51026-1-snitzer@kernel.org>
References: <20230327201143.51026-1-snitzer@kernel.org>
Subject: [dm-devel] [dm-6.4 PATCH v3 20/20] dm bio prison v1: intelligently size dm_bio_prison's prison_regions
Cc: axboe@kernel.dk, ebiggers@kernel.org, keescook@chromium.org, heinzm@redhat.com, Mike Snitzer, nhuck@google.com, linux-block@vger.kernel.org, ejt@redhat.com, mpatocka@redhat.com, luomeng12@huawei.com

Size the dm_bio_prison's number of prison_region structs using
dm_num_sharded_locks().
Signed-off-by: Mike Snitzer
---
 drivers/md/dm-bio-prison-v1.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
index a7930ad1878b..1d560a83b144 100644
--- a/drivers/md/dm-bio-prison-v1.c
+++ b/drivers/md/dm-bio-prison-v1.c
@@ -16,7 +16,6 @@
 
 /*----------------------------------------------------------------*/
 
-#define NR_LOCKS 64
 #define MIN_CELLS 1024
 
 struct prison_region {
@@ -27,7 +26,7 @@ struct prison_region {
 struct dm_bio_prison {
 	mempool_t cell_pool;
 	unsigned int num_locks;
-	struct prison_region regions[NR_LOCKS];
+	struct prison_region regions[];
 };
 
 static struct kmem_cache *_cell_cache;
@@ -41,12 +40,14 @@ static struct kmem_cache *_cell_cache;
 struct dm_bio_prison *dm_bio_prison_create(void)
 {
 	int ret;
-	unsigned i;
-	struct dm_bio_prison *prison = kzalloc(sizeof(*prison), GFP_KERNEL);
+	unsigned int i, num_locks;
+	struct dm_bio_prison *prison;
 
+	num_locks = dm_num_sharded_locks();
+	prison = kzalloc(struct_size(prison, regions, num_locks), GFP_KERNEL);
 	if (!prison)
 		return NULL;
-	prison->num_locks = NR_LOCKS;
+	prison->num_locks = num_locks;
 
 	for (i = 0; i < prison->num_locks; i++) {
 		spin_lock_init(&prison->regions[i].lock);
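This is the standard flexible-array-member pattern: struct_size(prison,
regions, num_locks) evaluates to sizeof(*prison) plus num_locks trailing
struct prison_region entries, saturating rather than wrapping on overflow,
so a single kzalloc() sizes the prison exactly for the lock count chosen at
creation time. dm_num_sharded_locks() itself is introduced earlier in this
series and its definition is not shown here; a plausible stand-in, assuming
it scales the shard count with the online CPU count and caps it at the old
fixed NR_LOCKS value of 64, is:

#include <linux/cpumask.h>
#include <linux/log2.h>
#include <linux/minmax.h>

/*
 * Hypothetical stand-in -- the real dm_num_sharded_locks() is defined
 * elsewhere in this series.  Round the online CPU count up to a power
 * of two (so lock_nr() can mask instead of divide) and cap it at 64.
 */
static inline unsigned int dm_num_sharded_locks(void)
{
	return min_t(unsigned int, roundup_pow_of_two(num_online_cpus()), 64);
}

Keeping the result a power of two lets the hot-path hash in lock_nr()
reduce with a single AND, while the cap bounds the prison's allocation on
very large machines.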