From patchwork Sun Mar 9 22:28:57 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009060
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 1/7] dm: don't change md if dm_table_set_restrictions() fails
Date: Sun, 9 Mar 2025 18:28:57 -0400
Message-ID: <20250309222904.449803-2-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

__bind() was changing the disk capacity, geometry and mempools of the
mapped device before calling dm_table_set_restrictions(), which could
fail, forcing dm to drop the new table. Failing here would leave the
device using the old table but with the wrong capacity and mempools.

Move dm_table_set_restrictions() earlier in __bind(). Since it needs the
capacity to be set, save the old version and restore it on failure.

Signed-off-by: Benjamin Marzinski
Reviewed-by: Damien Le Moal
---
 drivers/md/dm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 5ab7574c0c76..f5c5ccb6f8d2 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2421,21 +2421,29 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 			       struct queue_limits *limits)
 {
 	struct dm_table *old_map;
-	sector_t size;
+	sector_t size, old_size;
 	int ret;
 
 	lockdep_assert_held(&md->suspend_lock);
 
 	size = dm_table_get_size(t);
 
+	old_size = dm_get_size(md);
+	set_capacity(md->disk, size);
+
+	ret = dm_table_set_restrictions(t, md->queue, limits);
+	if (ret) {
+		set_capacity(md->disk, old_size);
+		old_map = ERR_PTR(ret);
+		goto out;
+	}
+
 	/*
 	 * Wipe any geometry if the size of the table changed.
 	 */
-	if (size != dm_get_size(md))
+	if (size != old_size)
 		memset(&md->geometry, 0, sizeof(md->geometry));
 
-	set_capacity(md->disk, size);
-
 	dm_table_event_callback(t, event_callback, md);
 
 	if (dm_table_request_based(t)) {
@@ -2468,12 +2476,6 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		t->mempools = NULL;
 	}
 
-	ret = dm_table_set_restrictions(t, md->queue, limits);
-	if (ret) {
-		old_map = ERR_PTR(ret);
-		goto out;
-	}
-
 	old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
 	rcu_assign_pointer(md->map, (void *)t);
 	md->immutable_target_type = dm_table_get_immutable_target_type(t);

From patchwork Sun Mar 9 22:28:58 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009062
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 2/7] dm: free table mempools if not used in __bind
Date: Sun, 9 Mar 2025 18:28:58 -0400
Message-ID: <20250309222904.449803-3-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

With request-based dm, the mempools don't need reloading when switching
tables, but the unused table mempools are not freed until the active
table is finally freed. Free them immediately if they are not needed.

Signed-off-by: Benjamin Marzinski
Reviewed-by: Damien Le Moal
---
 drivers/md/dm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f5c5ccb6f8d2..292414da871d 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2461,10 +2461,10 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		 * requests in the queue may refer to bio from the old bioset,
 		 * so you must walk through the queue to unprep.
 		 */
-		if (!md->mempools) {
+		if (!md->mempools)
 			md->mempools = t->mempools;
-			t->mempools = NULL;
-		}
+		else
+			dm_free_md_mempools(t->mempools);
 	} else {
 		/*
 		 * The md may already have mempools that need changing.
@@ -2473,8 +2473,8 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
 		 */
 		dm_free_md_mempools(md->mempools);
 		md->mempools = t->mempools;
-		t->mempools = NULL;
 	}
+	t->mempools = NULL;
 
 	old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
 	rcu_assign_pointer(md->map, (void *)t);

From patchwork Sun Mar 9 22:28:59 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009067
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 3/7] dm: handle failures in dm_table_set_restrictions
Date: Sun, 9 Mar 2025 18:28:59 -0400
Message-ID: <20250309222904.449803-4-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

If dm_table_set_restrictions() fails while swapping tables, device-mapper
will continue using the previous table. It must be sure to leave the
mapped_device in its previous state on failure. Otherwise device-mapper
could end up using the old table with settings from the unused table.

Do not update the mapped device in dm_set_zones_restrictions(). Wait
until dm_table_set_restrictions() is sure to succeed before updating the
md zoned settings. Do the same with the dax settings, and if
dm_revalidate_zones() fails, restore the original queue limits.
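
The rollback approach described above can be sketched outside the kernel. This is a minimal user-space illustration under stated assumptions, not the driver code: struct limits, struct device, apply_limits() and the two revalidate callbacks are hypothetical stand-ins for the queue limits, the mapped device, queue_limits_set() and dm_revalidate_zones().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for queue_limits and mapped_device. */
struct limits {
	unsigned int features;
};

struct device {
	struct limits limits;	/* currently active limits */
	bool zoned_finalized;	/* side effect that must only follow success */
};

/* Stand-in for queue_limits_set(): commit new limits to the device. */
static int apply_limits(struct device *dev, const struct limits *lim)
{
	dev->limits = *lim;
	return 0;
}

/* Stand-ins for a revalidation step that succeeds or fails. */
static int revalidate_ok(struct device *dev)
{
	(void)dev;
	return 0;
}

static int revalidate_fail(struct device *dev)
{
	(void)dev;
	return -1;
}

static int set_restrictions(struct device *dev, const struct limits *lim,
			    int (*revalidate)(struct device *))
{
	struct limits old = dev->limits;	/* snapshot before committing */
	int r;

	r = apply_limits(dev, lim);
	if (r)
		return r;

	r = revalidate(dev);
	if (r) {
		apply_limits(dev, &old);	/* restore the previous limits */
		return r;
	}

	/* Nothing below can fail, so device state may be updated now. */
	dev->zoned_finalized = true;
	return 0;
}
```

Calling set_restrictions() with revalidate_fail leaves both the limits and the flag untouched, which is the property the commit message asks for: every update that cannot be undone is deferred until no later step can fail.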

Signed-off-by: Benjamin Marzinski
---
 drivers/md/dm-table.c | 24 ++++++++++++++++--------
 drivers/md/dm-zone.c  | 26 ++++++++++++++++++--------
 drivers/md/dm.h       |  1 +
 3 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 0ef5203387b2..4003e84af11d 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1836,6 +1836,7 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 			      struct queue_limits *limits)
 {
 	int r;
+	struct queue_limits old_limits;
 
 	if (!dm_table_supports_nowait(t))
 		limits->features &= ~BLK_FEAT_NOWAIT;
@@ -1862,16 +1863,11 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (dm_table_supports_flush(t))
 		limits->features |= BLK_FEAT_WRITE_CACHE | BLK_FEAT_FUA;
 
-	if (dm_table_supports_dax(t, device_not_dax_capable)) {
+	if (dm_table_supports_dax(t, device_not_dax_capable))
 		limits->features |= BLK_FEAT_DAX;
-		if (dm_table_supports_dax(t, device_not_dax_synchronous_capable))
-			set_dax_synchronous(t->md->dax_dev);
-	} else
+	else
 		limits->features &= ~BLK_FEAT_DAX;
 
-	if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL))
-		dax_write_cache(t->md->dax_dev, true);
-
 	/* For a zoned table, setup the zone related queue attributes. */
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    (limits->features & BLK_FEAT_ZONED)) {
@@ -1883,6 +1879,7 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (dm_table_supports_atomic_writes(t))
 		limits->features |= BLK_FEAT_ATOMIC_WRITES;
 
+	old_limits = q->limits;
 	r = queue_limits_set(q, limits);
 	if (r)
 		return r;
@@ -1894,10 +1891,21 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
 	    (limits->features & BLK_FEAT_ZONED)) {
 		r = dm_revalidate_zones(t, q);
-		if (r)
+		if (r) {
+			queue_limits_set(q, &old_limits);
 			return r;
+		}
 	}
 
+	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED))
+		dm_finalize_zone_settings(t, limits);
+
+	if (dm_table_supports_dax(t, device_not_dax_synchronous_capable))
+		set_dax_synchronous(t->md->dax_dev);
+
+	if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL))
+		dax_write_cache(t->md->dax_dev, true);
+
 	dm_update_crypto_profile(q, t);
 	return 0;
 }
diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index 20edd3fabbab..681058feb63b 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -340,12 +340,8 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 	 * mapped device queue as needing zone append emulation.
 	 */
 	WARN_ON_ONCE(queue_is_mq(q));
-	if (dm_table_supports_zone_append(t)) {
-		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
-	} else {
-		set_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+	if (!dm_table_supports_zone_append(t))
 		lim->max_hw_zone_append_sectors = 0;
-	}
 
 	/*
 	 * Determine the max open and max active zone limits for the mapped
@@ -383,9 +379,6 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 		lim->zone_write_granularity = 0;
 		lim->chunk_sectors = 0;
 		lim->features &= ~BLK_FEAT_ZONED;
-		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
-		md->nr_zones = 0;
-		disk->nr_zones = 0;
 		return 0;
 	}
 
@@ -408,6 +401,23 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 	return 0;
 }
 
+void dm_finalize_zone_settings(struct dm_table *t, struct queue_limits *lim)
+{
+	struct mapped_device *md = t->md;
+
+	if (lim->features & BLK_FEAT_ZONED) {
+		if (dm_table_supports_zone_append(t))
+			clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+		else
+			set_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+	} else {
+		clear_bit(DMF_EMULATE_ZONE_APPEND, &md->flags);
+		md->nr_zones = 0;
+		md->disk->nr_zones = 0;
+	}
+}
+
+
 /*
  * IO completion callback called from clone_endio().
  */
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index a0a8ff119815..e5d3a9f46a91 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -102,6 +102,7 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t);
 int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q,
 		struct queue_limits *lim);
 int dm_revalidate_zones(struct dm_table *t, struct request_queue *q);
+void dm_finalize_zone_settings(struct dm_table *t, struct queue_limits *lim);
 void dm_zone_endio(struct dm_io *io, struct bio *clone);
 #ifdef CONFIG_BLK_DEV_ZONED
 int dm_blk_report_zones(struct gendisk *disk, sector_t sector,

From patchwork Sun Mar 9 22:29:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009063
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 4/7] dm: fix dm_blk_report_zones
Date: Sun, 9 Mar 2025 18:29:00 -0400
Message-ID: <20250309222904.449803-5-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

If dm_get_live_table() returned NULL, dm_put_live_table() was never
called. Also, if md->zone_revalidate_map is set, check that
dm_blk_report_zones() is being called from the process that set it in
__bind(). Otherwise the zone resources could change while accessing
them. Finally, it is possible that md->zone_revalidate_map will change
while calling this function. Only read it once, so that we are always
using the same value. Otherwise we might miss a call to
dm_put_live_table().
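
The "read it once" point above can be illustrated with a small user-space sketch. Everything here is hypothetical and single-threaded: struct table, struct zoned_dev, get_live(), put_live() and work_and_flip() only mimic the shape of the problem, where the same shared pointer decides both whether a reference is taken and whether it is dropped.

```c
#include <stddef.h>

/* Hypothetical stand-ins for dm_table and mapped_device. */
struct table {
	int id;
};

struct zoned_dev {
	struct table *revalidate_map;	/* may change while report() runs */
	struct table *live_map;
	int live_refs;			/* outstanding get_live() calls */
};

static struct table *get_live(struct zoned_dev *dev)
{
	dev->live_refs++;
	return dev->live_map;
}

static void put_live(struct zoned_dev *dev)
{
	dev->live_refs--;
}

/* Callback that changes the shared pointer mid-call, as another context might. */
static struct table other_table = { .id = 2 };

static int work_and_flip(struct zoned_dev *dev, struct table *map)
{
	dev->revalidate_map = &other_table;
	return map->id;
}

static int report(struct zoned_dev *dev,
		  int (*work)(struct zoned_dev *, struct table *))
{
	struct table *snap = dev->revalidate_map;	/* read exactly once */
	struct table *map;
	int ret = -1;

	if (!snap)
		map = get_live(dev);
	else
		map = snap;

	if (map)
		ret = work(dev, map);

	/* The same snapshot decides the release, so get/put stay balanced. */
	if (!snap)
		put_live(dev);

	return ret;
}
```

If report() re-read dev->revalidate_map for the final check, the flip performed by work_and_flip() would skip put_live() and leave live_refs at 1. The sketch also releases the reference when get_live() returns NULL, matching the first fix described above.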

Signed-off-by: Benjamin Marzinski
Reviewed-by: Damien Le Moal
---
 drivers/md/dm-core.h |  1 +
 drivers/md/dm-zone.c | 23 +++++++++++++++--------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index 3637761f3585..f3a3f2ef6322 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -141,6 +141,7 @@ struct mapped_device {
 #ifdef CONFIG_BLK_DEV_ZONED
 	unsigned int nr_zones;
 	void *zone_revalidate_map;
+	struct task_struct *revalidate_map_task;
 #endif
 
 #ifdef CONFIG_IMA
diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
index 681058feb63b..11e19281bb64 100644
--- a/drivers/md/dm-zone.c
+++ b/drivers/md/dm-zone.c
@@ -56,24 +56,29 @@ int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
 {
 	struct mapped_device *md = disk->private_data;
 	struct dm_table *map;
-	int srcu_idx, ret;
+	struct dm_table *zone_revalidate_map = md->zone_revalidate_map;
+	int srcu_idx, ret = -EIO;
 
-	if (!md->zone_revalidate_map) {
-		/* Regular user context */
+	if (!zone_revalidate_map || md->revalidate_map_task != current) {
+		/*
+		 * Regular user context or
+		 * Zone revalidation during __bind() is in progress, but this
+		 * call is from a different process
+		 */
 		if (dm_suspended_md(md))
 			return -EAGAIN;
 
 		map = dm_get_live_table(md, &srcu_idx);
-		if (!map)
-			return -EIO;
 	} else {
 		/* Zone revalidation during __bind() */
-		map = md->zone_revalidate_map;
+		map = zone_revalidate_map;
 	}
 
-	ret = dm_blk_do_report_zones(md, map, sector, nr_zones, cb, data);
+	if (map)
+		ret = dm_blk_do_report_zones(md, map, sector, nr_zones, cb,
+					     data);
 
-	if (!md->zone_revalidate_map)
+	if (!zone_revalidate_map)
 		dm_put_live_table(md, srcu_idx);
 
 	return ret;
@@ -175,7 +180,9 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q)
 	 * our table for dm_blk_report_zones() to use directly.
 	 */
 	md->zone_revalidate_map = t;
+	md->revalidate_map_task = current;
 	ret = blk_revalidate_disk_zones(disk);
+	md->revalidate_map_task = NULL;
 	md->zone_revalidate_map = NULL;
 
 	if (ret) {

From patchwork Sun Mar 9 22:29:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 14009061
From: Benjamin Marzinski
To: Mikulas Patocka, Mike Snitzer, Jens Axboe
Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org,
 Damien Le Moal, Christoph Hellwig
Subject: [PATCH 5/7] blk-zoned: clean up zone settings for devices without zwplugs
Date: Sun, 9 Mar 2025 18:29:01 -0400
Message-ID: <20250309222904.449803-6-bmarzins@redhat.com>
In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com>
References: <20250309222904.449803-1-bmarzins@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Previously disk_free_zone_resources() would only clean up zoned settings
on a disk if the disk had write plugs allocated. Make it clean up zoned
settings on any disk, so disks that don't allocate write plugs can use it
as well.

Signed-off-by: Benjamin Marzinski
---
 block/blk-zoned.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 761ea662ddc3..d7dc89cbdccb 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1363,24 +1363,23 @@ static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk,
 
 void disk_free_zone_resources(struct gendisk *disk)
 {
-	if (!disk->zone_wplugs_pool)
-		return;
-
-	if (disk->zone_wplugs_wq) {
-		destroy_workqueue(disk->zone_wplugs_wq);
-		disk->zone_wplugs_wq = NULL;
-	}
+	if (disk->zone_wplugs_pool) {
+		if (disk->zone_wplugs_wq) {
+			destroy_workqueue(disk->zone_wplugs_wq);
+			disk->zone_wplugs_wq = NULL;
+		}
 
-	disk_destroy_zone_wplugs_hash_table(disk);
+		disk_destroy_zone_wplugs_hash_table(disk);
 
-	/*
-	 * Wait for the zone write plugs to be RCU-freed before
-	 * destorying the mempool.
-	 */
-	rcu_barrier();
+		/*
+		 * Wait for the zone write plugs to be RCU-freed before
+		 * destorying the mempool.
+ */ + rcu_barrier(); - mempool_destroy(disk->zone_wplugs_pool); - disk->zone_wplugs_pool = NULL; + mempool_destroy(disk->zone_wplugs_pool); + disk->zone_wplugs_pool = NULL; + } disk_set_conv_zones_bitmap(disk, NULL); disk->zone_capacity = 0; From patchwork Sun Mar 9 22:29:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Marzinski X-Patchwork-Id: 14009064 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 24BBD1E8336 for ; Sun, 9 Mar 2025 22:29:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741559355; cv=none; b=g/Ds/b2Bk3blR7S50Z7PA6eYOtI8ZUQbLINLI9wwS1OQ+RzW+bkk2kBBNr/CZuzv0BUOIZe+4BP69EAAUFwEz01y4DnVSpqvt5kI7rusTGW7jClHd4+gjw8EXaV1gl9DaiXpMWqzdRUqKwl1srtQr45U/orcYYEZh0esXRlTuTE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741559355; c=relaxed/simple; bh=wEBY7Oj29M2D0Dmp362L0Hp7USy1ZPKfP0Cp+xoOaHk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=e64in/IZgp3vVEDAJBQ6fLpjgGYUBdDK9Gno7dD7c249YzsR0COS3rgdcN/0aMQ76oukRjePPLk63Rc0xv7Io59fmAcL0CQIfr3PAfBzvsOQkNCqkOxzGw/e7/eqVx6+z7TPa61cU+8WAemyWRs1TYYIO5itXBIaipxPiLmv8g4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=DPM36FLN; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DPM36FLN" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741559352; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yTmrroXbLUYBu+54zjexbe1SJ0Kvrf2MRZCMFkFLWhY=; b=DPM36FLNP3a/Tg2oVlshjHx9OARRU66V2EiUcNBGBTurRR/W0YMy+in88OVuZPn1X8xnKJ aMxSBYXe4Qi45LBIUVDExNEDhJKR3logl2CEPSJnejHHx+5n7Trs3s4nOiYiPFE7wgLGto J3NVYya17oamehHDJzQN/gtyTvduODM= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-679-TCwUvQ0qM1edNUrtcDfbuA-1; Sun, 09 Mar 2025 18:29:08 -0400 X-MC-Unique: TCwUvQ0qM1edNUrtcDfbuA-1 X-Mimecast-MFC-AGG-ID: TCwUvQ0qM1edNUrtcDfbuA_1741559347 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CF90219560A3; Sun, 9 Mar 2025 22:29:07 +0000 (UTC) Received: from bmarzins-01.fast.eng.rdu2.dc.redhat.com (unknown [10.6.23.247]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id D6C511955BCB; Sun, 9 Mar 2025 22:29:06 +0000 (UTC) Received: from bmarzins-01.fast.eng.rdu2.dc.redhat.com (localhost [127.0.0.1]) by bmarzins-01.fast.eng.rdu2.dc.redhat.com (8.18.1/8.17.1) with ESMTPS id 529MT5po449843 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Sun, 9 Mar 2025 18:29:05 
-0400 Received: (from bmarzins@localhost) by bmarzins-01.fast.eng.rdu2.dc.redhat.com (8.18.1/8.18.1/Submit) id 529MT5b8449842; Sun, 9 Mar 2025 18:29:05 -0400 From: Benjamin Marzinski To: Mikulas Patocka , Mike Snitzer , Jens Axboe Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org, Damien Le Moal , Christoph Hellwig Subject: [RFC PATCH 6/7] blk-zoned: modify blk_revalidate_disk_zones for bio-based drivers Date: Sun, 9 Mar 2025 18:29:02 -0400 Message-ID: <20250309222904.449803-7-bmarzins@redhat.com> In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com> References: <20250309222904.449803-1-bmarzins@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12

If device-mapper is swapping zoned tables and blk_revalidate_disk_zones() fails, the disk must retain its current zoned resources, since device-mapper will fall back to the previous table and the zoned settings need to match that table. Allocating unnecessary zwplugs is acceptable, but the zoned configuration must not change. Otherwise the kernel can run into errors such as bdev_zone_is_seq() reading invalid memory because disk->conv_zones_bitmap is the wrong size. However, if device-mapper did not previously have a zoned table, it should free the zoned resources instead of leaving them allocated and unused.

To solve this, do not free the zone resources when blk_revalidate_disk_zones() fails for bio-based drivers. Additionally, delay copying the zoned settings to the gendisk until disk_update_zone_resources() can no longer fail, and do not freeze the queue for bio-based drivers, since this will hang if there are any plugged zone write bios. Also, export disk_free_zone_resources() so that device-mapper can choose when to free the zoned resources.
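
As an illustration only, the "delay copying the zoned settings until the update can no longer fail" ordering can be modelled in plain userspace C. This is a sketch, not the kernel code: struct zone_settings, struct model_disk, model_update_zone_resources() and the commit_ret parameter (standing in for the result of committing the queue limits) are all hypothetical names invented for this example.

```c
#include <errno.h>

/* Hypothetical stand-in for the zone fields kept in struct gendisk. */
struct zone_settings {
	unsigned int nr_zones;
	unsigned int zone_capacity;
	unsigned int last_zone_capacity;
};

/* Hypothetical stand-in for a disk holding the live zone settings. */
struct model_disk {
	struct zone_settings live;
};

/*
 * Validate the new settings and "commit" the limits first; only copy the
 * new settings into the disk once nothing can fail any more. On any
 * failure the live settings are left exactly as they were.
 *
 * commit_ret simulates the return value of the limits commit step.
 */
int model_update_zone_resources(struct model_disk *disk,
				const struct zone_settings *args,
				unsigned int nr_conv_zones,
				int commit_ret)
{
	/* A disk made up only of conventional zones is rejected. */
	if (nr_conv_zones >= args->nr_zones)
		return -ENODEV;

	/* The limits commit may still fail; nothing has been copied yet. */
	if (commit_ret)
		return commit_ret;

	/* Point of no return: publish the new settings. */
	disk->live = *args;
	return 0;
}
```

With this ordering, a caller that falls back to its previous table after a failed update still sees zone settings that match that table.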
Signed-off-by: Benjamin Marzinski --- block/blk-zoned.c | 49 +++++++++++++++++++++++------------------- block/blk.h | 4 ---- drivers/md/dm-zone.c | 3 +++ include/linux/blkdev.h | 4 ++++ 4 files changed, 34 insertions(+), 26 deletions(-) diff --git a/block/blk-zoned.c b/block/blk-zoned.c index d7dc89cbdccb..3bec289d27db 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -1343,22 +1343,17 @@ static void disk_destroy_zone_wplugs_hash_table(struct gendisk *disk) disk->zone_wplugs_hash_bits = 0; } -static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk, - unsigned long *bitmap) +static void disk_set_conv_zones_bitmap(struct gendisk *disk, + unsigned long *bitmap) { - unsigned int nr_conv_zones = 0; unsigned long flags; spin_lock_irqsave(&disk->zone_wplugs_lock, flags); - if (bitmap) - nr_conv_zones = bitmap_weight(bitmap, disk->nr_zones); bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap, lockdep_is_held(&disk->zone_wplugs_lock)); spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); kfree_rcu_mightsleep(bitmap); - - return nr_conv_zones; } void disk_free_zone_resources(struct gendisk *disk) @@ -1386,6 +1381,7 @@ void disk_free_zone_resources(struct gendisk *disk) disk->last_zone_capacity = 0; disk->nr_zones = 0; } +EXPORT_SYMBOL_GPL(disk_free_zone_resources); static inline bool disk_need_zone_resources(struct gendisk *disk) { @@ -1434,24 +1430,23 @@ struct blk_revalidate_zone_args { /* * Update the disk zone resources information and device queue limits. - * The disk queue is frozen when this is executed. + * The disk queue is frozen when this is executed on blk-mq drivers. 
*/ static int disk_update_zone_resources(struct gendisk *disk, struct blk_revalidate_zone_args *args) { struct request_queue *q = disk->queue; - unsigned int nr_seq_zones, nr_conv_zones; + unsigned int nr_seq_zones, nr_conv_zones = 0; unsigned int pool_size; struct queue_limits lim; + int ret; - disk->nr_zones = args->nr_zones; - disk->zone_capacity = args->zone_capacity; - disk->last_zone_capacity = args->last_zone_capacity; - nr_conv_zones = - disk_set_conv_zones_bitmap(disk, args->conv_zones_bitmap); - if (nr_conv_zones >= disk->nr_zones) { + if (args->conv_zones_bitmap) + nr_conv_zones = bitmap_weight(args->conv_zones_bitmap, + args->nr_zones); + if (nr_conv_zones >= args->nr_zones) { pr_warn("%s: Invalid number of conventional zones %u / %u\n", - disk->disk_name, nr_conv_zones, disk->nr_zones); + disk->disk_name, nr_conv_zones, args->nr_zones); return -ENODEV; } @@ -1463,7 +1458,7 @@ static int disk_update_zone_resources(struct gendisk *disk, * small ZNS namespace. For such case, assume that the zoned device has * no zone resource limits. 
*/ - nr_seq_zones = disk->nr_zones - nr_conv_zones; + nr_seq_zones = args->nr_zones - nr_conv_zones; if (lim.max_open_zones >= nr_seq_zones) lim.max_open_zones = 0; if (lim.max_active_zones >= nr_seq_zones) @@ -1493,7 +1488,19 @@ static int disk_update_zone_resources(struct gendisk *disk, } commit: - return queue_limits_commit_update_frozen(q, &lim); + if (queue_is_mq(disk->queue)) + ret = queue_limits_commit_update_frozen(q, &lim); + else + ret = queue_limits_commit_update(q, &lim); + + if (!ret) { + disk->nr_zones = args->nr_zones; + disk->zone_capacity = args->zone_capacity; + disk->last_zone_capacity = args->last_zone_capacity; + disk_set_conv_zones_bitmap(disk, args->conv_zones_bitmap); + } + + return ret; } static int blk_revalidate_conv_zone(struct blk_zone *zone, unsigned int idx, @@ -1648,8 +1655,6 @@ static int blk_revalidate_zone_cb(struct blk_zone *zone, unsigned int idx, * and when the zone configuration of the gendisk changes (e.g. after a format). * Before calling this function, the device driver must already have set the * device zone size (chunk_sector limit) and the max zone append limit. - * BIO based drivers can also use this function as long as the device queue - * can be safely frozen. */ int blk_revalidate_disk_zones(struct gendisk *disk) { @@ -1709,13 +1714,13 @@ int blk_revalidate_disk_zones(struct gendisk *disk) /* * Set the new disk zone parameters only once the queue is frozen and - * all I/Os are completed. + * all I/Os are completed on blk-mq drivers. 
*/ if (ret > 0) ret = disk_update_zone_resources(disk, &args); else pr_warn("%s: failed to revalidate zones\n", disk->disk_name); - if (ret) { + if (ret && queue_is_mq(disk->queue)) { unsigned int memflags = blk_mq_freeze_queue(q); disk_free_zone_resources(disk); diff --git a/block/blk.h b/block/blk.h index 90fa5f28ccab..c84af503b77b 100644 --- a/block/blk.h +++ b/block/blk.h @@ -454,7 +454,6 @@ static inline struct bio *blk_queue_bounce(struct bio *bio, #ifdef CONFIG_BLK_DEV_ZONED void disk_init_zone_resources(struct gendisk *disk); -void disk_free_zone_resources(struct gendisk *disk); static inline bool bio_zone_write_plugging(struct bio *bio) { return bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING); @@ -500,9 +499,6 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, static inline void disk_init_zone_resources(struct gendisk *disk) { } -static inline void disk_free_zone_resources(struct gendisk *disk) -{ -} static inline bool bio_zone_write_plugging(struct bio *bio) { return false; diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c index 11e19281bb64..ac86011640c3 100644 --- a/drivers/md/dm-zone.c +++ b/drivers/md/dm-zone.c @@ -159,6 +159,7 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q) struct mapped_device *md = t->md; struct gendisk *disk = md->disk; int ret; + bool was_zoned = disk->nr_zones != 0; if (!get_capacity(disk)) return 0; @@ -187,6 +188,8 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q) if (ret) { DMERR("Revalidate zones failed %d", ret); + if (!was_zoned) + disk_free_zone_resources(disk); return ret; } diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 248416ecd01c..51edf35ff715 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -690,12 +690,16 @@ static inline bool blk_queue_is_zoned(struct request_queue *q) } #ifdef CONFIG_BLK_DEV_ZONED +void disk_free_zone_resources(struct gendisk *disk); static inline unsigned int disk_nr_zones(struct gendisk 
*disk) { return disk->nr_zones; } bool blk_zone_plug_bio(struct bio *bio, unsigned int nr_segs); #else /* CONFIG_BLK_DEV_ZONED */ +static inline void disk_free_zone_resources(struct gendisk *disk) +{ +} static inline unsigned int disk_nr_zones(struct gendisk *disk) { return 0; From patchwork Sun Mar 9 22:29:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Marzinski X-Patchwork-Id: 14009066 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E6DF3380 for ; Sun, 9 Mar 2025 22:29:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741559355; cv=none; b=hpHmrvTUi9aYbsCWPTqK1GgekNg7iujfhnIEFeS1k0Fx4D1brfgAJi8HX1HW549Cy2tSeHYkYwfM19UW1rvuUHpk9hcPWDvKhiDMXusdkyXbCPBdDNoJjxNN+vhH+BCy1sm7PlVfFGREDLUU8xfbrv0cKRpKaVt44VgtXkJ4WxQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741559355; c=relaxed/simple; bh=NQESKpUO4ZMrFMIoAIQEI3ZvDpobn+WJnUTsMH0BTGI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LmGx3KC6Qt/jiquC/AL3zx5IKjvc996eDjHNeRYcGXbykD0J3Qor5IscFWxicxmkmIC7QUYw9Q7NZRl2OTVX3r1V90EPdyWEhHxpp0yfviUbS3FUJwLMt1Mllb+OLOjTfX53oUFL+X4YwfeZDKsI+06cSgNMcVYifF0c0OjCt4s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=jSSx0wiO; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="jSSx0wiO" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741559352; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fhIqTYUkB46Mb29OKVAlh9WVdW1uNLVb6c9d10hk6Co=; b=jSSx0wiOheh/XjsODIA3lOd/aHrpaRfSpJZce3gEHgTPWqV3yj+JZdEEQXoYFl1XKWVNyj 7CBhF6H7k2h0bzo5Tdm1vHN25cTZuJz9CS5qUS64xniGgCbUpYW5aeU65Vp3QwNgUPoyDF cLeRt4XdQndhQuej9HS9hHx7mMmJRms= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-649-UxP6utj-Mgefkedba8sHLw-1; Sun, 09 Mar 2025 18:29:09 -0400 X-MC-Unique: UxP6utj-Mgefkedba8sHLw-1 X-Mimecast-MFC-AGG-ID: UxP6utj-Mgefkedba8sHLw_1741559348 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 6B8701955D4B; Sun, 9 Mar 2025 22:29:08 +0000 (UTC) Received: from bmarzins-01.fast.eng.rdu2.dc.redhat.com (unknown [10.6.23.247]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 6B2CD180AF71; Sun, 9 Mar 2025 22:29:07 +0000 (UTC) Received: from bmarzins-01.fast.eng.rdu2.dc.redhat.com (localhost [127.0.0.1]) by bmarzins-01.fast.eng.rdu2.dc.redhat.com (8.18.1/8.17.1) with ESMTPS id 529MT5ZC449847 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); 
Sun, 9 Mar 2025 18:29:05 -0400 Received: (from bmarzins@localhost) by bmarzins-01.fast.eng.rdu2.dc.redhat.com (8.18.1/8.18.1/Submit) id 529MT5Df449846; Sun, 9 Mar 2025 18:29:05 -0400 From: Benjamin Marzinski To: Mikulas Patocka , Mike Snitzer , Jens Axboe Cc: dm-devel@lists.linux.dev, linux-block@vger.kernel.org, Damien Le Moal , Christoph Hellwig Subject: [RFC PATCH 7/7] dm: allow devices to revalidate existing zones Date: Sun, 9 Mar 2025 18:29:03 -0400 Message-ID: <20250309222904.449803-8-bmarzins@redhat.com> In-Reply-To: <20250309222904.449803-1-bmarzins@redhat.com> References: <20250309222904.449803-1-bmarzins@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93

dm_revalidate_zones() only allowed devices that had no zone resources set up to call blk_revalidate_disk_zones(). If the device already had zone resources, disk->nr_zones would always equal md->nr_zones, so dm_revalidate_zones() returned without doing any work. Instead, always call blk_revalidate_disk_zones() when loading a new zoned table.

However, if the device emulates zone append operations and already has zone append emulation resources, the zone size (chunk_sectors) cannot change when loading a new table. Otherwise, all those resources will be garbage. If emulated zone append operations are needed and the zone write pointer offsets of the new table do not match those of the old table, writes to the device will still fail.

This patch allows users to safely grow and shrink zoned devices, but swapping arbitrary zoned tables will still not work.
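
For readers who want to see which table swaps the restriction is meant to reject, here is a small userspace model of the condition. It is a sketch under assumptions: struct model_limits and model_check_zone_size() are hypothetical names, and has_emulation_resources stands in for "the disk already has zone append emulation resources" (the patch tests disk->zone_wplugs_hash for this).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of the queue limits relevant to the check. */
struct model_limits {
	unsigned int chunk_sectors;              /* zone size in sectors */
	unsigned int max_hw_zone_append_sectors; /* 0: zone append is emulated */
};

/*
 * Reject a table swap only when all three conditions hold:
 *  - the new table needs zone append emulation,
 *  - emulation resources already exist from the old table,
 *  - the zone size differs between the old and new limits.
 */
int model_check_zone_size(const struct model_limits *old_lim,
			  const struct model_limits *new_lim,
			  bool has_emulation_resources)
{
	if (!new_lim->max_hw_zone_append_sectors &&
	    has_emulation_resources &&
	    old_lim->chunk_sectors != new_lim->chunk_sectors)
		return -EINVAL;

	return 0;
}
```

In this model, growing or shrinking a device passes as long as the zone size stays the same, while a new table with a different zone size is refused once emulation resources exist.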
Signed-off-by: Benjamin Marzinski --- drivers/md/dm-zone.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c index ac86011640c3..7e9ebeee7eac 100644 --- a/drivers/md/dm-zone.c +++ b/drivers/md/dm-zone.c @@ -164,16 +164,8 @@ int dm_revalidate_zones(struct dm_table *t, struct request_queue *q) if (!get_capacity(disk)) return 0; - /* Revalidate only if something changed. */ - if (!disk->nr_zones || disk->nr_zones != md->nr_zones) { - DMINFO("%s using %s zone append", - disk->disk_name, - queue_emulates_zone_append(q) ? "emulated" : "native"); - md->nr_zones = 0; - } - - if (md->nr_zones) - return 0; + DMINFO("%s using %s zone append", disk->disk_name, + queue_emulates_zone_append(q) ? "emulated" : "native"); /* * Our table is not live yet. So the call to dm_get_live_table() @@ -392,6 +384,17 @@ int dm_set_zones_restrictions(struct dm_table *t, struct request_queue *q, return 0; } + /* + * If the device needs zone append emulation, and the device already has + * zone append emulation resources, make sure that the chunk_sectors + * hasn't changed size. Otherwise those resources will be garbage. + */ + if (!lim->max_hw_zone_append_sectors && disk->zone_wplugs_hash && + q->limits.chunk_sectors != lim->chunk_sectors) { + DMERR("Cannot change zone size when swapping tables"); + return -EINVAL; + } + /* * Warn once (when the capacity is not yet set) if the mapped device is * partially using zone resources of the target devices as that leads to