From patchwork Sun Mar 9 22:28:57 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009054
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 1/7] dm: don't change md if dm_table_set_restrictions() fails
Date: Sun, 9 Mar 2025 18:28:57 -0400
Message-ID: <20250309222904.449803-2-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
X-Mailing-List: dm-devel@lists.linux.dev

__bind() was changing the disk capacity, geometry and mempools of the
mapped device before calling dm_table_set_restrictions(), which could
fail, forcing dm to drop the new table. Failing here would leave the
device using the old table but with the wrong capacity and mempools.

Move dm_table_set_restrictions() earlier in __bind(). Since it needs the
capacity to be set, save the old capacity and restore it on failure.

Signed-off-by: Benjamin Marzinski
Reviewed-by: Damien Le Moal
---
 drivers/md/dm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 5ab7574c0c76..f5c5ccb6f8d2 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2421,21 +2421,29 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 			       struct queue_limits *limits)
 {
 	struct dm_table *old_map;
-	sector_t size;
+	sector_t size, old_size;
 	int ret;

 	lockdep_assert_held(&md->suspend_lock);

 	size = dm_table_get_size(t);

+	old_size = dm_get_size(md);
+	set_capacity(md->disk, size);
+
+	ret = dm_table_set_restrictions(t, md->queue, limits);
+	if (ret) {
+		set_capacity(md->disk, old_size);
+		old_map = ERR_PTR(ret);
+		goto out;
+	}
+
 	/*
 	 * Wipe any geometry if the size of the table changed.
 	 */
-	if (size != dm_get_size(md))
+	if (size != old_size)
 		memset(&md->geometry, 0, sizeof(md->geometry));

-	set_capacity(md->disk, size);
-
 	dm_table_event_callback(t, event_callback, md);

 	if (dm_table_request_based(t)) {
@@ -2468,12 +2476,6 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		t->mempools = NULL;
 	}

-	ret = dm_table_set_restrictions(t, md->queue, limits);
-	if (ret) {
-		old_map = ERR_PTR(ret);
-		goto out;
-	}
-
 	old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
 	rcu_assign_pointer(md->map, (void *)t);
 	md->immutable_target_type = dm_table_get_immutable_target_type(t);

From patchwork Sun Mar 9 22:28:58 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009058
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 2/7] dm: free table mempools if not used in __bind
Date: Sun, 9 Mar 2025 18:28:58 -0400
Message-ID: <20250309222904.449803-3-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
X-Mailing-List: dm-devel@lists.linux.dev

With request-based dm, the mempools don't need reloading when switching
tables, but the unused table mempools are not freed until the active
table is finally freed. Free them immediately if they are not needed.

Signed-off-by: Benjamin Marzinski
Reviewed-by: Damien Le Moal
---
 drivers/md/dm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f5c5ccb6f8d2..292414da871d 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2461,10 +2461,10 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		 * requests in the queue may refer to bio from the old bioset,
 		 * so you must walk through the queue to unprep.
 		 */
-		if (!md->mempools) {
+		if (!md->mempools)
 			md->mempools = t->mempools;
-			t->mempools = NULL;
-		}
+		else
+			dm_free_md_mempools(t->mempools);
 	} else {
 		/*
 		 * The md may already have mempools that need changing.
@@ -2473,8 +2473,8 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		 */
 		dm_free_md_mempools(md->mempools);
 		md->mempools = t->mempools;
-		t->mempools = NULL;
 	}
+	t->mempools = NULL;

 	old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
 	rcu_assign_pointer(md->map, (void *)t);

From patchwork Sun Mar 9 22:28:59 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009059
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 3/7] dm: handle failures in dm_table_set_restrictions
Date: Sun, 9 Mar 2025 18:28:59 -0400
Message-ID: <20250309222904.449803-4-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
X-Mailing-List: dm-devel@lists.linux.dev

If dm_table_set_restrictions() fails while swapping tables,
device-mapper will continue using the previous table. It must be sure to
leave the mapped_device in its previous state on failure. Otherwise
device-mapper could end up using the old table with settings from the
unused table.

Do not update the mapped device in dm_set_zones_restrictions(). Wait
until dm_table_set_restrictions() is sure to succeed before updating the
md zoned settings. Do the same with the dax settings, and if
dm_revalidate_zones() fails, restore the original queue limits.

Signed-off-by: Benjamin Marzinski
---
 drivers/md/dm-table.c | 24 ++++++++++++++++--------
 drivers/md/dm-zone.c  | 26 ++++++++++++++++++--------
 drivers/md/dm.h       |  1 +
 3 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0ef5203387b2..4003e84af11d 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1836,6 +1836,7 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 			      struct queue_limits *limits)
 {
 	int r;
+	struct queue_limits old_limits;

 	if (!dm_table_supports_nowait(t))
 		limits->features &= ~BLK_FEAT_NOWAIT;
@@ -1862,16 +1863,11 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (dm_table_supports_flush(t))
 		limits->features |= BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA;

-	if (dm_table_supports_dax(t, device_not_dax_capable)) {
+	if (dm_table_supports_dax(t, device_not_dax_capable))
 		limits->features |= BLK_FEAT_DAX;
-		if (dm_table_supports_dax(t, device_not_dax_synchronous_capable))
-			set_dax_synchronous(t->md->dax_dev);
-	} else
+	else
 		limits->features &= ~BLK_FEAT_DAX;

-	if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL))
-		dax_write_cache(t->md->dax_dev, true);
-
 	/* For a zoned table, setup the zone related queue attributes. */
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    (limits->features & BLK_FEAT_ZONED)) {
@@ -1883,6 +1879,7 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (dm_table_supports_atomic_writes(t))
 		limits->features |= BLK_FEAT_ATOMIC_WRITES;

+	old_limits = q->limits;
 	r = queue_limits_set(q, limits);
 	if (r)
 		return r;
@@ -1894,10 +1891,21 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    (limits->features & BLK_FEAT_ZONED)) {
 		r = dm_revalidate_zones(t, q);
-		if (r)
+		if (r) {
+			queue_limits_set(q, &old_limits);
 			return r;
+		}
 	}

+	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED))
+		dm_finalize_zone_settings(t, limits);
+
+	if (dm_table_supports_dax(t, device_not_dax_synchronous_capable))
+		set_dax_synchronous(t->md->dax_dev);
+
+	if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL))
+		dax_write_cache(t->md->dax_dev, true);
+
 	dm_update_crypto_profile(q, t);
 	return 0;
 }
diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index 20edd3fabbab..681058feb63b 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -340,12 +340,8 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 	 * mapped device queue as needing zone append emulation.
 	 */
 	WARN_ON_ONCE(queue_is_mq(q));
-	if (dm_table_supports_zone_append(t)) {
-		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
-	} else {
-		set_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+	if (!dm_table_supports_zone_append(t))
 		lim->max_hw_zone_append_sectors = 0;
-	}

 	/*
 	 * Determine the max open and max active zone limits for the mapped
@@ -383,9 +379,6 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 		lim->zone_write_granularity = 0;
 		lim->chunk_sectors = 0;
 		lim->features &= ~BLK_FEAT_ZONED;
-		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
-		md->nr_zones = 0;
-		disk->nr_zones = 0;
 		return 0;
 	}

@@ -408,6 +401,23 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 	return 0;
 }

+void dm_finalize_zone_settings(struct dm_table *t, struct queue_limits *lim)
+{
+	struct mapped_device *md = t->md;
+
+	if (lim->features & BLK_FEAT_ZONED) {
+		if (dm_table_supports_zone_append(t))
+			clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+		else
+			set_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+	} else {
+		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+		md->nr_zones = 0;
+		md->disk->nr_zones = 0;
+	}
+}
+
+
 /*
  * IO completion callback called from clone_endio().
  */
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index a0a8ff119815..e5d3a9f46a91 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -102,6 +102,7 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t);
 int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 			      struct queue_limits *lim);
 int dm_revalidate_zones(struct dm_table *t, struct request_queue *q);
+void dm_finalize_zone_settings(struct dm_table *t, struct queue_limits *lim);
 void dm_zone_endio(struct dm_io *io, struct bio *clone);
 #ifdef CONFIG_BLK_DEV_ZONED
 int dm_blk_report_zones(struct gendisk *disk, sector_t sector,

From patchwork Sun Mar 9 22:29:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009053
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 4/7] dm: fix dm_blk_report_zones
Date: Sun, 9 Mar 2025 18:29:00 -0400
Message-ID: <20250309222904.449803-5-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
X-Mailing-List: dm-devel@lists.linux.dev

If dm_get_live_table() returned NULL, dm_put_live_table() was never
called. Also, if md->zone_revalidate_map is set, check that
dm_blk_report_zones() is being called from the process that set it in
__bind(). Otherwise the zone resources could change while accessing
them. Finally, it is possible that md->zone_revalidate_map will change
while calling this function. Only read it once, so that we are always
using the same value. Otherwise we might miss a call to
dm_put_live_table().

Signed-off-by: Benjamin Marzinski
Reviewed-by: Damien Le Moal
---
 drivers/md/dm-core.h |  1 +
 drivers/md/dm-zone.c | 23 +++++++++++++++--------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 3637761f3585..f3a3f2ef6322 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -141,6 +141,7 @@ struct mapped_device {
 #ifdef CONFIG_BLK_DEV_ZONED
 	unsigned int nr_zones;
 	void *zone_revalidate_map;
+	struct task_struct *revalidate_map_task;
 #endif

 #ifdef CONFIG_IMA
diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index 681058feb63b..11e19281bb64 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -56,24 +56,29 @@ int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
 {
 	struct mapped_device *md = disk->private_data;
 	struct dm_table *map;
-	int srcu_idx, ret;
+	struct dm_table *zone_revalidate_map = md->zone_revalidate_map;
+	int srcu_idx, ret = -EIO;

-	if (!md->zone_revalidate_map) {
-		/* Regular user context */
+	if (!zone_revalidate_map || md->revalidate_map_task != current) {
+		/*
+		 * Regular user context or
+		 * Zone revalidation during __bind() is in progress, but this
+		 * call is from a different process
+		 */
 		if (dm_suspended_md(md))
 			return -EAGAIN;

 		map = dm_get_live_table(md, &srcu_idx);
-		if (!map)
-			return -EIO;
 	} else {
 		/* Zone revalidation during __bind() */
-		map = md->zone_revalidate_map;
+		map = zone_revalidate_map;
 	}

-	ret = dm_blk_do_report_zones(md, map, sector, nr_zones, cb, data);
+	if (map)
+		ret = dm_blk_do_report_zones(md, map, sector, nr_zones, cb,
+					     data);

-	if (!md->zone_revalidate_map)
+	if (!zone_revalidate_map)
 		dm_put_live_table(md, srcu_idx);

 	return ret;
@@ -175,7 +180,9 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q)
 	 * our table for dm_blk_report_zones() to use directly.
 	 */
 	md->zone_revalidate_map = t;
+	md->revalidate_map_task = current;
 	ret = blk_revalidate_disk_zones(disk);
+	md->revalidate_map_task = NULL;
 	md->zone_revalidate_map = NULL;

 	if (ret) {

From patchwork Sun Mar 9 22:29:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009055
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 5/7] blk-zoned: clean up zone settings for devices without zwplugs
Date: Sun, 9 Mar 2025 18:29:01 -0400
Message-ID: <20250309222904.449803-6-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
X-Mailing-List: dm-devel@lists.linux.dev

Previously, disk_free_zone_resources() would only clean up zoned
settings on a disk if the disk had write plugs allocated. Make it clean
up zoned settings on any disk, so disks that don't allocate write plugs
can use it as well.

Signed-off-by: Benjamin Marzinski
---
 block/blk-zoned.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 761ea662ddc3..d7dc89cbdccb 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1363,24 +1363,23 @@ static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk,

 void disk_free_zone_resources(struct gendisk *disk)
 {
-	if (!disk->zone_wplugs_pool)
-		return;
-
-	if (disk->zone_wplugs_wq) {
-		destroy_workqueue(disk->zone_wplugs_wq);
-		disk->zone_wplugs_wq = NULL;
-	}
+	if (disk->zone_wplugs_pool) {
+		if (disk->zone_wplugs_wq) {
+			destroy_workqueue(disk->zone_wplugs_wq);
+			disk->zone_wplugs_wq = NULL;
+		}

-	disk_destroy_zone_wplugs_hash_table(disk);
+		disk_destroy_zone_wplugs_hash_table(disk);

-	/*
-	 * Wait for the zone write plugs to be RCU-freed before
-	 * destorying the mempool.
-	 */
-	rcu_barrier();
+		/*
+		 * Wait for the zone write plugs to be RCU-freed before
+		 * destorying the mempool.
+		 */
+		rcu_barrier();
 
-	mempool_destroy(disk->zone_wplugs_pool);
-	disk->zone_wplugs_pool = NULL;
+		mempool_destroy(disk->zone_wplugs_pool);
+		disk->zone_wplugs_pool = NULL;
+	}
 
 	disk_set_conv_zones_bitmap(disk, NULL);
 	disk->zone_capacity = 0;

From patchwork Sun Mar 9 22:29:02 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009056
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org, Damien Le Moal, Christoph Hellwig
Subject: [RFC PATCH 6/7] blk-zoned: modify blk_revalidate_disk_zones for bio-based drivers
Date: Sun, 9 Mar 2025 18:29:02 -0400
Message-ID: <20250309222904.449803-7-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: dm-devel@lists.linux.dev

If device-mapper is swapping zoned tables and blk_revalidate_disk_zones()
fails, the disk must retain its current zoned resources, since
device-mapper will fall back to the previous table and the zoned settings
need to match that table. Allocating unnecessary zwplugs is acceptable,
but the zoned configuration must not change. Otherwise, it can run into
errors like bdev_zone_is_seq() reading invalid memory because
disk->conv_zones_bitmap is the wrong size.

However, if device-mapper did not previously have a zoned table, it
should free the zoned resources instead of leaving them allocated and
unused.

To solve this, do not free the zone resources when
blk_revalidate_disk_zones() fails for bio-based drivers. Additionally,
delay copying the zoned settings to the gendisk until
disk_update_zone_resources() can no longer fail, and do not freeze the
queue for bio-based drivers, since this will hang if there are any
plugged zone write bios. Also, export disk_free_zone_resources() so that
device-mapper can choose when to free the zoned resources.
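
For readers who want the "commit last" ordering in isolation, here is a
rough userspace sketch. It is not the kernel code: struct zone_settings,
struct zone_args, update_zone_settings() and the commit_limits callback
are hypothetical stand-ins for the gendisk zone fields, struct
blk_revalidate_zone_args, disk_update_zone_resources() and the queue
limits commit. It only illustrates that the live settings are written
after the last step that can fail.

```c
#include <stddef.h>

/* Hypothetical stand-in for the zone fields kept in struct gendisk. */
struct zone_settings {
	unsigned int nr_zones;
	unsigned int zone_capacity;
	unsigned int last_zone_capacity;
};

/* Hypothetical stand-in for struct blk_revalidate_zone_args. */
struct zone_args {
	unsigned int nr_zones;
	unsigned int nr_conv_zones;
	unsigned int zone_capacity;
	unsigned int last_zone_capacity;
};

/*
 * Validate first, commit last: 'live' is written only after every step
 * that can fail has succeeded, so a failure leaves the old settings
 * intact. 'commit_limits' models the queue limits commit and returns 0
 * on success or a negative value on failure.
 */
static int update_zone_settings(struct zone_settings *live,
				const struct zone_args *args,
				int (*commit_limits)(void))
{
	int ret;

	if (args->nr_conv_zones >= args->nr_zones)
		return -1;	/* invalid layout; 'live' is untouched */

	ret = commit_limits();
	if (ret)
		return ret;	/* commit failed; 'live' is untouched */

	live->nr_zones = args->nr_zones;
	live->zone_capacity = args->zone_capacity;
	live->last_zone_capacity = args->last_zone_capacity;
	return 0;
}

/* Hypothetical commit callbacks used to exercise both outcomes. */
static int commit_ok(void) { return 0; }
static int commit_fail(void) { return -5; }
```

With this ordering, a caller that falls back to its previous table after
a failure still sees settings that match that table.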

Signed-off-by: Benjamin Marzinski
---
 block/blk-zoned.c      | 49 +++++++++++++++++++++++-------------------
 block/blk.h            |  4 ----
 drivers/md/dm-zone.c   |  3 +++
 include/linux/blkdev.h |  4 ++++
 4 files changed, 34 insertions(+), 26 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index d7dc89cbdccb..3bec289d27db 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1343,22 +1343,17 @@ static void disk_destroy_zone_wplugs_hash_table(struct gendisk *disk)
 	disk->zone_wplugs_hash_bits = 0;
 }
 
-static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk,
-					       unsigned long *bitmap)
+static void disk_set_conv_zones_bitmap(struct gendisk *disk,
+				       unsigned long *bitmap)
 {
-	unsigned int nr_conv_zones = 0;
 	unsigned long flags;
 
 	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
-	if (bitmap)
-		nr_conv_zones = bitmap_weight(bitmap, disk->nr_zones);
 	bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap,
 				     lockdep_is_held(&disk->zone_wplugs_lock));
 	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
 
 	kfree_rcu_mightsleep(bitmap);
-
-	return nr_conv_zones;
 }
 
 void disk_free_zone_resources(struct gendisk *disk)
@@ -1386,6 +1381,7 @@ void disk_free_zone_resources(struct gendisk *disk)
 	disk->last_zone_capacity = 0;
 	disk->nr_zones = 0;
 }
+EXPORT_SYMBOL_GPL(disk_free_zone_resources);
 
 static inline bool disk_need_zone_resources(struct gendisk *disk)
 {
@@ -1434,24 +1430,23 @@ struct blk_revalidate_zone_args {
 
 /*
  * Update the disk zone resources information and device queue limits.
- * The disk queue is frozen when this is executed.
+ * The disk queue is frozen when this is executed on blk-mq drivers.
  */
 static int disk_update_zone_resources(struct gendisk *disk,
 				      struct blk_revalidate_zone_args *args)
 {
 	struct request_queue *q = disk->queue;
-	unsigned int nr_seq_zones, nr_conv_zones;
+	unsigned int nr_seq_zones, nr_conv_zones = 0;
 	unsigned int pool_size;
 	struct queue_limits lim;
+	int ret;
 
-	disk->nr_zones = args->nr_zones;
-	disk->zone_capacity = args->zone_capacity;
-	disk->last_zone_capacity = args->last_zone_capacity;
-	nr_conv_zones =
-		disk_set_conv_zones_bitmap(disk, args->conv_zones_bitmap);
-	if (nr_conv_zones >= disk->nr_zones) {
+	if (args->conv_zones_bitmap)
+		nr_conv_zones = bitmap_weight(args->conv_zones_bitmap,
+					      args->nr_zones);
+	if (nr_conv_zones >= args->nr_zones) {
 		pr_warn("%s: Invalid number of conventional zones %u / %u\n",
-			disk->disk_name, nr_conv_zones, disk->nr_zones);
+			disk->disk_name, nr_conv_zones, args->nr_zones);
 		return -ENODEV;
 	}
 
@@ -1463,7 +1458,7 @@ static int disk_update_zone_resources(struct gendisk *disk,
 	 * small ZNS namespace. For such case, assume that the zoned device has
 	 * no zone resource limits.
 	 */
-	nr_seq_zones = disk->nr_zones - nr_conv_zones;
+	nr_seq_zones = args->nr_zones - nr_conv_zones;
 	if (lim.max_open_zones >= nr_seq_zones)
 		lim.max_open_zones = 0;
 	if (lim.max_active_zones >= nr_seq_zones)
@@ -1493,7 +1488,19 @@ static int disk_update_zone_resources(struct gendisk *disk,
 	}
 
 commit:
-	return queue_limits_commit_update_frozen(q, &lim);
+	if (queue_is_mq(disk->queue))
+		ret = queue_limits_commit_update_frozen(q, &lim);
+	else
+		ret = queue_limits_commit_update(q, &lim);
+
+	if (!ret) {
+		disk->nr_zones = args->nr_zones;
+		disk->zone_capacity = args->zone_capacity;
+		disk->last_zone_capacity = args->last_zone_capacity;
+		disk_set_conv_zones_bitmap(disk, args->conv_zones_bitmap);
+	}
+
+	return ret;
 }
 
 static int blk_revalidate_conv_zone(struct blk_zone *zone, unsigned int idx,
@@ -1648,8 +1655,6 @@ static int blk_revalidate_zone_cb(struct blk_zone *zone, unsigned int idx,
  * and when the zone configuration of the gendisk changes (e.g. after a format).
  * Before calling this function, the device driver must already have set the
  * device zone size (chunk_sector limit) and the max zone append limit.
- * BIO based drivers can also use this function as long as the device queue
- * can be safely frozen.
  */
 int blk_revalidate_disk_zones(struct gendisk *disk)
 {
@@ -1709,13 +1714,13 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 
 	/*
 	 * Set the new disk zone parameters only once the queue is frozen and
-	 * all I/Os are completed.
+	 * all I/Os are completed on blk-mq drivers.
 	 */
 	if (ret > 0)
 		ret = disk_update_zone_resources(disk, &args);
 	else
 		pr_warn("%s: failed to revalidate zones\n", disk->disk_name);
-	if (ret) {
+	if (ret && queue_is_mq(disk->queue)) {
 		unsigned int memflags = blk_mq_freeze_queue(q);
 
 		disk_free_zone_resources(disk);
diff --git a/block/blk.h b/block/blk.h
index 90fa5f28ccab..c84af503b77b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -454,7 +454,6 @@ static inline struct bio *blk_queue_bounce(struct bio *bio,
 
 #ifdef CONFIG_BLK_DEV_ZONED
 void disk_init_zone_resources(struct gendisk *disk);
-void disk_free_zone_resources(struct gendisk *disk);
 static inline bool bio_zone_write_plugging(struct bio *bio)
 {
 	return bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING);
@@ -500,9 +499,6 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
 static inline void disk_init_zone_resources(struct gendisk *disk)
 {
 }
-static inline void disk_free_zone_resources(struct gendisk *disk)
-{
-}
 static inline bool bio_zone_write_plugging(struct bio *bio)
 {
 	return false;
diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index 11e19281bb64..ac86011640c3 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -159,6 +159,7 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q)
 	struct mapped_device *md = t->md;
 	struct gendisk *disk = md->disk;
 	int ret;
+	bool was_zoned = disk->nr_zones != 0;
 
 	if (!get_capacity(disk))
 		return 0;
@@ -187,6 +188,8 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q)
 
 	if (ret) {
 		DMERR("Revalidate zones failed %d", ret);
+		if (!was_zoned)
+			disk_free_zone_resources(disk);
 		return ret;
 	}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 248416ecd01c..51edf35ff715 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -690,12 +690,16 @@ static inline bool blk_queue_is_zoned(struct request_queue *q)
 }
 
 #ifdef CONFIG_BLK_DEV_ZONED
+void disk_free_zone_resources(struct gendisk *disk);
 static inline unsigned int disk_nr_zones(struct gendisk *disk)
 {
 	return disk->nr_zones;
 }
 bool blk_zone_plug_bio(struct bio *bio, unsigned int nr_segs);
 #else /* CONFIG_BLK_DEV_ZONED */
+static inline void disk_free_zone_resources(struct gendisk *disk)
+{
+}
 static inline unsigned int disk_nr_zones(struct gendisk *disk)
 {
 	return 0;

From patchwork Sun Mar 9 22:29:03 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009057
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org, Damien Le Moal, Christoph Hellwig
Subject: [RFC PATCH 7/7] dm: allow devices to revalidate existing zones
Date: Sun, 9 Mar 2025 18:29:03 -0400
Message-ID: <20250309222904.449803-8-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: dm-devel@lists.linux.dev

Previously, dm_revalidate_zones() only allowed devices that had no zone
resources set up to call blk_revalidate_disk_zones(). If the device
already had zone resources, disk->nr_zones would always equal
md->nr_zones, so dm_revalidate_zones() returned without doing any work.
Instead, always call blk_revalidate_disk_zones() when loading a new
zoned table.

However, if the device emulates zone append operations and already has
zone append emulation resources, the zone size (chunk_sectors) cannot
change when loading a new table. Otherwise, all those resources will be
garbage.

If emulated zone append operations are needed and the zone write pointer
offsets of the new table do not match those of the old table, writes to
the device will still fail. This patch allows users to safely grow and
shrink zoned devices, but swapping arbitrary zoned tables will still not
work.
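
The table-swap restriction can be read as a small predicate. The sketch
below is a userspace illustration only: struct zone_limits and
zone_swap_allowed() are hypothetical names, and the boolean
has_emulation_resources stands in for the disk->zone_wplugs_hash test
that this patch adds to dm_set_zones_restrictions().

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the limits the check compares. */
struct zone_limits {
	unsigned int max_hw_zone_append_sectors; /* 0: append is emulated */
	unsigned int chunk_sectors;              /* zone size in sectors */
};

/*
 * Return true if the new table may be loaded. The swap is refused only
 * when zone append must be emulated, emulation resources already exist,
 * and the zone size differs between the current and the new limits.
 */
static bool zone_swap_allowed(const struct zone_limits *cur,
			      const struct zone_limits *new_lim,
			      bool has_emulation_resources)
{
	if (new_lim->max_hw_zone_append_sectors)
		return true;	/* native zone append, nothing to emulate */
	if (!has_emulation_resources)
		return true;	/* nothing allocated that could go stale */
	return cur->chunk_sectors == new_lim->chunk_sectors;
}
```

Under these assumptions, growing or shrinking a device with an unchanged
zone size passes the check, while a change of zone size on a device with
existing emulation resources is rejected.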

Signed-off-by: Benjamin Marzinski
---
 drivers/md/dm-zone.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index ac86011640c3..7e9ebeee7eac 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -164,16 +164,8 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q)
 	if (!get_capacity(disk))
 		return 0;
 
-	/* Revalidate only if something changed. */
-	if (!disk->nr_zones || disk->nr_zones != md->nr_zones) {
-		DMINFO("%s using %s zone append",
-		       disk->disk_name,
-		       queue_emulates_zone_append(q) ? "emulated" : "native");
-		md->nr_zones = 0;
-	}
-
-	if (md->nr_zones)
-		return 0;
+	DMINFO("%s using %s zone append", disk->disk_name,
+	       queue_emulates_zone_append(q) ? "emulated" : "native");
 
 	/*
 	 * Our table is not live yet. So the call to dm_get_live_table()
@@ -392,6 +384,17 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 		return 0;
 	}
 
+	/*
+	 * If the device needs zone append emulation, and the device already has
+	 * zone append emulation resources, make sure that the chunk_sectors
+	 * hasn't changed size. Otherwise those resources will be garbage.
+	 */
+	if (!lim->max_hw_zone_append_sectors && disk->zone_wplugs_hash &&
+	    q->limits.chunk_sectors != lim->chunk_sectors) {
+		DMERR("Cannot change zone size when swapping tables");
+		return -EINVAL;
+	}
+
 	/*
 	 * Warn once (when the capacity is not yet set) if the mapped device is
 	 * partially using zone resources of the target devices as that leads to