From patchwork Thu Dec 20 16:23:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739123 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B992D746 for ; Thu, 20 Dec 2018 16:29:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A7BB02906D for ; Thu, 20 Dec 2018 16:29:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9BDB8290BC; Thu, 20 Dec 2018 16:29:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 56AF32906D for ; Thu, 20 Dec 2018 16:29:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733065AbeLTQ3i (ORCPT ); Thu, 20 Dec 2018 11:29:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:49429 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733010AbeLTQ3i (ORCPT ); Thu, 20 Dec 2018 11:29:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id E7578103B4C3; Thu, 20 Dec 2018 17:23:44 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 01/17] drbd: narrow rcu_read_lock in drbd_sync_handshake Date: Thu, 20 Dec 2018 17:23:28 +0100 Message-Id: <20181220162344.8430-2-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Roland Kammerer So far there was the possibility that we called genlmsg_new(GFP_NOIO)/mutex_lock() while holding an rcu_read_lock(). This included cases like: drbd_sync_handshake (acquire the RCU lock) drbd_asb_recover_1p drbd_khelper drbd_bcast_event genlmsg_new(GFP_NOIO) --> may sleep drbd_sync_handshake (acquire the RCU lock) drbd_asb_recover_1p drbd_khelper notify_helper genlmsg_new(GFP_NOIO) --> may sleep drbd_sync_handshake (acquire the RCU lock) drbd_asb_recover_1p drbd_khelper notify_helper mutex_lock --> may sleep While using GFP_ATOMIC whould have been possible in the first two cases, the real fix is to narrow the rcu_read_lock. Reported-by: Jia-Ju Bai Reviewed-by: Lars Ellenberg Signed-off-by: Roland Kammerer --- drivers/block/drbd/drbd_receiver.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 61c392752fe4..1b9822f264d2 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -3364,7 +3364,7 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device, enum drbd_conns rv = C_MASK; enum drbd_disk_state mydisk; struct net_conf *nc; - int hg, rule_nr, rr_conflict, tentative; + int hg, rule_nr, rr_conflict, tentative, always_asbp; mydisk = device->state.disk; if (mydisk == D_NEGOTIATING) @@ -3415,8 +3415,12 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device, rcu_read_lock(); nc = rcu_dereference(peer_device->connection->net_conf); + always_asbp = nc->always_asbp; + rr_conflict = nc->rr_conflict; + tentative = nc->tentative; + rcu_read_unlock(); - if (hg == 100 || (hg == -100 && nc->always_asbp)) { + if (hg == 100 || (hg == -100 && always_asbp)) { int pcount = (device->state.role == R_PRIMARY) + (peer_role == R_PRIMARY); int forced = (hg == -100); @@ -3455,9 +3459,6 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device, "Sync from %s node\n", (hg < 0) ? "peer" : "this"); } - rr_conflict = nc->rr_conflict; - tentative = nc->tentative; - rcu_read_unlock(); if (hg == -100) { /* FIXME this log message is not correct if we end up here From patchwork Thu Dec 20 16:23:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739147 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D054F746 for ; Thu, 20 Dec 2018 16:35:16 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BF61A28E3D for ; Thu, 20 Dec 2018 16:35:16 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B37BB29087; Thu, 20 Dec 2018 16:35:16 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7300B28E3D for ; Thu, 20 Dec 2018 16:35:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387440AbeLTQfP (ORCPT ); Thu, 20 Dec 2018 11:35:15 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50097 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726604AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 4CFF4103B4CA; Thu, 20 Dec 2018 17:23:45 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 02/17] drbd: must not use connection after kref_put(&connection->kref) Date: Thu, 20 Dec 2018 17:23:29 +0100 Message-Id: <20181220162344.8430-3-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_state.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c index 0813c654c893..18d53fe60d1d 100644 --- a/drivers/block/drbd/drbd_state.c +++ b/drivers/block/drbd/drbd_state.c @@ -2109,9 +2109,8 @@ static int w_after_conn_state_ch(struct drbd_work *w, int unused) spin_unlock_irq(&connection->resource->req_lock); } } - kref_put(&connection->kref, drbd_destroy_connection); - conn_md_sync(connection); + kref_put(&connection->kref, drbd_destroy_connection); return 0; } From patchwork Thu Dec 20 16:23:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739135 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17562746 for ; Thu, 20 Dec 2018 16:34:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0714828DDF for ; Thu, 20 Dec 2018 16:34:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id ECC6C28E16; Thu, 20 Dec 2018 16:34:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4D40328DDF for ; Thu, 20 Dec 2018 16:34:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733185AbeLTQem (ORCPT ); Thu, 20 Dec 2018 11:34:42 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50102 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727387AbeLTQek (ORCPT ); Thu, 20 Dec 2018 11:34:40 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id AF600103B4CE; Thu, 20 Dec 2018 17:23:45 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 03/17] drbd: centralize printk reporting of new size into drbd_set_my_capacity() Date: Thu, 20 Dec 2018 17:23:30 +0100 Message-Id: <20181220162344.8430-4-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Previously, some implicit resizes that happend during handshake have not been reported as prominently as explicit resize. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_int.h | 8 +------- drivers/block/drbd/drbd_main.c | 17 ++++++++++++++++- drivers/block/drbd/drbd_nl.c | 3 --- 3 files changed, 17 insertions(+), 11 deletions(-) diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index 1e47db57b9d2..ab718582a092 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -1609,13 +1609,7 @@ static inline void drbd_tcp_quickack(struct socket *sock) } /* sets the number of 512 byte sectors of our virtual device */ -static inline void drbd_set_my_capacity(struct drbd_device *device, - sector_t size) -{ - /* set_capacity(device->this_bdev->bd_disk, size); */ - set_capacity(device->vdisk, size); - device->this_bdev->bd_inode->i_size = (loff_t)size << 9; -} +void drbd_set_my_capacity(struct drbd_device *device, sector_t size); /* * used to submit our private bio diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index f973a2a845c8..f9b4228cc2d9 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -2034,6 +2034,21 @@ void drbd_init_set_defaults(struct drbd_device *device) device->local_max_bio_size = DRBD_MAX_BIO_SIZE_SAFE; } +static void _drbd_set_my_capacity(struct drbd_device *device, sector_t size) +{ + /* set_capacity(device->this_bdev->bd_disk, size); */ + set_capacity(device->vdisk, size); + device->this_bdev->bd_inode->i_size = (loff_t)size << 9; +} + +void drbd_set_my_capacity(struct drbd_device *device, sector_t size) +{ + char ppb[10]; + _drbd_set_my_capacity(device, size); + drbd_info(device, "size = %s (%llu KB)\n", + ppsize(ppb, size>>1), (unsigned long long)size>>1); +} + void drbd_device_cleanup(struct drbd_device *device) { int i; @@ -2059,7 +2074,7 @@ void drbd_device_cleanup(struct drbd_device *device) } D_ASSERT(device, first_peer_device(device)->connection->net_conf == NULL); - drbd_set_my_capacity(device, 0); + _drbd_set_my_capacity(device, 0); if (device->bitmap) { /* maybe never allocated. */ drbd_bm_resize(device, 0, 1); diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index d15703b1ffe8..d02d38fd1288 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -921,7 +921,6 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct } prev; sector_t u_size, size; struct drbd_md *md = &device->ldev->md; - char ppb[10]; void *buffer; int md_moved, la_size_changed; @@ -999,8 +998,6 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct /* racy, see comments above. */ drbd_set_my_capacity(device, size); md->la_size_sect = size; - drbd_info(device, "size = %s (%llu KB)\n", ppsize(ppb, size>>1), - (unsigned long long)size>>1); } if (rv <= DS_ERROR) goto err_out; From patchwork Thu Dec 20 16:23:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739139 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3792114E5 for ; Thu, 20 Dec 2018 16:35:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2522E28E35 for ; Thu, 20 Dec 2018 16:35:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1863929129; Thu, 20 Dec 2018 16:35:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0A82328E35 for ; Thu, 20 Dec 2018 16:34:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733218AbeLTQet (ORCPT ); Thu, 20 Dec 2018 11:34:49 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50101 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730121AbeLTQek (ORCPT ); Thu, 20 Dec 2018 11:34:40 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 14AA2103B4D2; Thu, 20 Dec 2018 17:23:46 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 04/17] drbd: ignore "all zero" peer volume sizes in handshake Date: Thu, 20 Dec 2018 17:23:31 +0100 Message-Id: <20181220162344.8430-5-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP During handshake, if we are diskless ourselves, we used to accept any size presented by the peer. Which could be zero if that peer was just brought up and connected to us without having a disk attached first, in which case both peers would just "flip" their volume sizes. Now, even a diskless node will ignore "zero" sizes presented by a diskless peer. Also a currently Diskless Primary will refuse to shrink during handshake: it may be frozen, and waiting for a "suitable" local disk or peer to re-appear (on-no-data-accessible suspend-io). If the peer is smaller than what we used to be, it is not suitable. The logic for a diskless node during handshake is now supposed to be: believe the peer, if - I don't have a current size myself - we agree on the size anyways - I do have a current size, am Secondary, and he has the only disk - I do have a current size, am Primary, and he has the only disk, which is larger than my current size Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_receiver.c | 33 +++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 1b9822f264d2..fbf30fe45862 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -3981,6 +3981,7 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info struct o_qlim *o = (connection->agreed_features & DRBD_FF_WSAME) ? p->qlim : NULL; enum determine_dev_size dd = DS_UNCHANGED; sector_t p_size, p_usize, p_csize, my_usize; + sector_t new_size, cur_size; int ldsc = 0; /* local disk size changed */ enum dds_flags ddsf; @@ -3988,6 +3989,7 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info if (!peer_device) return config_unknown_volume(connection, pi); device = peer_device->device; + cur_size = drbd_get_capacity(device->this_bdev); p_size = be64_to_cpu(p->d_size); p_usize = be64_to_cpu(p->u_size); @@ -3998,7 +4000,6 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info device->p_size = p_size; if (get_ldev(device)) { - sector_t new_size, cur_size; rcu_read_lock(); my_usize = rcu_dereference(device->ldev->disk_conf)->disk_size; rcu_read_unlock(); @@ -4016,7 +4017,6 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info /* Never shrink a device with usable data during connect. But allow online shrinking if we are connected. */ new_size = drbd_new_dev_size(device, device->ldev, p_usize, 0); - cur_size = drbd_get_capacity(device->this_bdev); if (new_size < cur_size && device->state.disk >= D_OUTDATED && device->state.conn < C_CONNECTED) { @@ -4081,9 +4081,36 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info * * However, if he sends a zero current size, * take his (user-capped or) backing disk size anyways. + * + * Unless of course he does not have a disk himself. + * In which case we ignore this completely. */ + sector_t new_size = p_csize ?: p_usize ?: p_size; drbd_reconsider_queue_parameters(device, NULL, o); - drbd_set_my_capacity(device, p_csize ?: p_usize ?: p_size); + if (new_size == 0) { + /* Ignore, peer does not know nothing. */ + } else if (new_size == cur_size) { + /* nothing to do */ + } else if (cur_size != 0 && p_size == 0) { + drbd_warn(device, "Ignored diskless peer device size (peer:%llu != me:%llu sectors)!\n", + (unsigned long long)new_size, (unsigned long long)cur_size); + } else if (new_size < cur_size && device->state.role == R_PRIMARY) { + drbd_err(device, "The peer's device size is too small! (%llu < %llu sectors); demote me first!\n", + (unsigned long long)new_size, (unsigned long long)cur_size); + conn_request_state(peer_device->connection, NS(conn, C_DISCONNECTING), CS_HARD); + return -EIO; + } else { + /* I believe the peer, if + * - I don't have a current size myself + * - we agree on the size anyways + * - I do have a current size, am Secondary, + * and he has the only disk + * - I do have a current size, am Primary, + * and he has the only disk, + * which is larger than my current size + */ + drbd_set_my_capacity(device, new_size); + } } if (get_ldev(device)) { From patchwork Thu Dec 20 16:23:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739141 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 06BD3746 for ; Thu, 20 Dec 2018 16:35:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E8ED228E38 for ; Thu, 20 Dec 2018 16:35:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DBF91290EC; Thu, 20 Dec 2018 16:35:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 90E3828E38 for ; Thu, 20 Dec 2018 16:35:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733086AbeLTQek (ORCPT ); Thu, 20 Dec 2018 11:34:40 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50104 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730180AbeLTQej (ORCPT ); Thu, 20 Dec 2018 11:34:39 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 6E7EB103B4DE; Thu, 20 Dec 2018 17:23:46 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 05/17] drbd: disconnect, if the wrong UUIDs are attached on a connected peer Date: Thu, 20 Dec 2018 17:23:32 +0100 Message-Id: <20181220162344.8430-6-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP With "on-no-data-accessible suspend-io", DRBD requires the next attach or connect to be to the very same data generation uuid tag it lost last. If we first lost connection to the peer, then later lost connection to our own disk, we would usually refuse to re-connect to the peer, because it presents the wrong data set. However, if the peer first connects without a disk, and then attached its disk, we accepted that same wrong data set, which would be "unexpected" by any user of that DRBD and cause "undefined results" (read: very likely data corruption). The fix is to forcefully disconnect as soon as we notice that the peer attached to the "wrong" dataset. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_receiver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index fbf30fe45862..0a9f3c65f70a 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -4170,7 +4170,7 @@ static int receive_uuids(struct drbd_connection *connection, struct packet_info kfree(device->p_uuid); device->p_uuid = p_uuid; - if (device->state.conn < C_CONNECTED && + if ((device->state.conn < C_CONNECTED || device->state.pdsk == D_DISKLESS) && device->state.disk < D_INCONSISTENT && device->state.role == R_PRIMARY && (device->ed_uuid & ~((u64)1)) != (p_uuid[UI_CURRENT] & ~((u64)1))) { From patchwork Thu Dec 20 16:23:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739145 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9DD8114E5 for ; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8DD6E28E3D for ; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 81DDA29087; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3AACC28E3D for ; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730121AbeLTQfE (ORCPT ); Thu, 20 Dec 2018 11:35:04 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50088 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728109AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id C5CE9103B4DF; Thu, 20 Dec 2018 17:23:46 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 06/17] drbd: fix confusing error message during attach Date: Thu, 20 Dec 2018 17:23:33 +0100 Message-Id: <20181220162344.8430-7-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If we attach a (consistent) backing device, which knows about a last-agreed effective size, and that effective size is *larger* than the currently requested size, we refused to attach with ERR_DISK_TOO_SMALL Failure: (111) Low.dev. smaller than requested DRBD-dev. size. which is confusing to say the least. This patch changes the error code in that case to ERR_IMPLICIT_SHRINK Failure: (170) Implicit device shrinking not allowed. See kernel log. additional info from kernel: To-be-attached device has last effective > current size, and is consistent (9999 > 7777 sectors). Refusing to attach. It also allows to attach with an explicit size. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 49 ++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 5 deletions(-) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index d02d38fd1288..e4774f720de5 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -127,6 +127,35 @@ static int drbd_msg_put_info(struct sk_buff *skb, const char *info) return 0; } +__printf(2, 3) +static int drbd_msg_sprintf_info(struct sk_buff *skb, const char *fmt, ...) +{ + va_list args; + struct nlattr *nla, *txt; + int err = -EMSGSIZE; + int len; + + nla = nla_nest_start(skb, DRBD_NLA_CFG_REPLY); + if (!nla) + return err; + + txt = nla_reserve(skb, T_info_text, 256); + if (!txt) { + nla_nest_cancel(skb, nla); + return err; + } + va_start(args, fmt); + len = vscnprintf(nla_data(txt), 256, fmt, args); + va_end(args); + + /* maybe: retry with larger reserve, if truncated */ + txt->nla_len = nla_attr_size(len+1); + nlmsg_trim(skb, (char*)txt + NLA_ALIGN(txt->nla_len)); + nla_nest_end(skb, nla); + + return 0; +} + /* This would be a good candidate for a "pre_doit" hook, * and per-family private info->pointers. * But we need to stay compatible with older kernels. @@ -1947,11 +1976,21 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) } /* Prevent shrinking of consistent devices ! */ - if (drbd_md_test_flag(nbc, MDF_CONSISTENT) && - drbd_new_dev_size(device, nbc, nbc->disk_conf->disk_size, 0) < nbc->md.la_size_sect) { - drbd_warn(device, "refusing to truncate a consistent device\n"); - retcode = ERR_DISK_TOO_SMALL; - goto force_diskless_dec; + { + unsigned long long nsz = drbd_new_dev_size(device, nbc, nbc->disk_conf->disk_size, 0); + unsigned long long eff = nbc->md.la_size_sect; + if (drbd_md_test_flag(nbc, MDF_CONSISTENT) && nsz < eff) { + if (nsz == nbc->disk_conf->disk_size) { + drbd_warn(device, "truncating a consistent device during attach (%llu < %llu)\n", nsz, eff); + } else { + drbd_warn(device, "refusing to truncate a consistent device (%llu < %llu)\n", nsz, eff); + drbd_msg_sprintf_info(adm_ctx.reply_skb, + "To-be-attached device has last effective > current size, and is consistent\n" + "(%llu > %llu sectors). Refusing to attach.", eff, nsz); + retcode = ERR_IMPLICIT_SHRINK; + goto force_diskless_dec; + } + } } lock_all_resources(); From patchwork Thu Dec 20 16:23:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739137 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A638A746 for ; Thu, 20 Dec 2018 16:35:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 96AB628E38 for ; Thu, 20 Dec 2018 16:35:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8A9ED29129; Thu, 20 Dec 2018 16:35:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2AFF028E3D for ; Thu, 20 Dec 2018 16:35:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727387AbeLTQes (ORCPT ); Thu, 20 Dec 2018 11:34:48 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50103 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730443AbeLTQek (ORCPT ); Thu, 20 Dec 2018 11:34:40 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 2F1061045C23; Thu, 20 Dec 2018 17:23:47 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 07/17] drbd: attach on connected diskless peer must not shrink a consistent device Date: Thu, 20 Dec 2018 17:23:34 +0100 Message-Id: <20181220162344.8430-8-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If we would reject a new handshake, if the peer had attached first, and then connected, we should force disconnect if the peer first connects, and only then attaches. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_receiver.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 0a9f3c65f70a..85e3d846a23a 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -4014,12 +4014,13 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info if (device->state.conn == C_WF_REPORT_PARAMS) p_usize = min_not_zero(my_usize, p_usize); - /* Never shrink a device with usable data during connect. - But allow online shrinking if we are connected. */ + /* Never shrink a device with usable data during connect, + * or "attach" on the peer. + * But allow online shrinking if we are connected. */ new_size = drbd_new_dev_size(device, device->ldev, p_usize, 0); if (new_size < cur_size && device->state.disk >= D_OUTDATED && - device->state.conn < C_CONNECTED) { + (device->state.conn < C_CONNECTED || device->state.pdsk == D_DISKLESS)) { drbd_err(device, "The peer's disk size is too small! (%llu < %llu sectors)\n", (unsigned long long)new_size, (unsigned long long)cur_size); conn_request_state(peer_device->connection, NS(conn, C_DISCONNECTING), CS_HARD); @@ -4047,8 +4048,8 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info synchronize_rcu(); kfree(old_disk_conf); - drbd_info(device, "Peer sets u_size to %lu sectors\n", - (unsigned long)my_usize); + drbd_info(device, "Peer sets u_size to %lu sectors (old: %lu)\n", + (unsigned long)p_usize, (unsigned long)my_usize); } put_ldev(device); From patchwork Thu Dec 20 16:23:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739143 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 52FC3746 for ; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 43EAA28FE3 for ; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 37CBF29087; Thu, 20 Dec 2018 16:35:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DDA0728E3D for ; Thu, 20 Dec 2018 16:35:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733292AbeLTQfE (ORCPT ); Thu, 20 Dec 2018 11:35:04 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50094 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725372AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 8C3C21045C26; Thu, 20 Dec 2018 17:23:47 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 08/17] drbd: reject attach of unsuitable uuids even if connected Date: Thu, 20 Dec 2018 17:23:35 +0100 Message-Id: <20181220162344.8430-9-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Multiple failure scenario: a) all good Connected Primary/Secondary UpToDate/UpToDate b) lose disk on Primary, Connected Primary/Secondary Diskless/UpToDate c) continue to write to the device, changes only make it to the Secondary storage. d) lose disk on Secondary, Connected Primary/Secondary Diskless/Diskless e) now try to re-attach on Primary This would have succeeded before, even though that is clearly the wrong data set to attach to (missing the modifications from c). Because we only compared our "effective" and the "to-be-attached" data generation uuid tags if (device->state.conn < C_CONNECTED). Fix: change that constraint to (device->state.pdsk != D_UP_TO_DATE) compare the uuids, and reject the attach. This patch also tries to improve the reverse scenario: first lose Secondary, then Primary disk, then try to attach the disk on Secondary. Before this patch, the attach on the Secondary succeeds, but since commit drbd: disconnect, if the wrong UUIDs are attached on a connected peer the Primary will notice unsuitable data, and drop the connection hard. Though unfortunately at a point in time during the handshake where we cannot easily abort the attach on the peer without more refactoring of the handshake. We now reject any attach to "unsuitable" uuids, as long as we can see a Primary role, unless we already have access to "good" data. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 6 +++--- drivers/block/drbd/drbd_receiver.c | 19 +++++++++++++++++++ 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index e4774f720de5..4b934e543e2d 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -1960,9 +1960,9 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info) } } - if (device->state.conn < C_CONNECTED && - device->state.role == R_PRIMARY && device->ed_uuid && - (device->ed_uuid & ~((u64)1)) != (nbc->md.uuid[UI_CURRENT] & ~((u64)1))) { + if (device->state.pdsk != D_UP_TO_DATE && device->ed_uuid && + (device->state.role == R_PRIMARY || device->state.peer == R_PRIMARY) && + (device->ed_uuid & ~((u64)1)) != (nbc->md.uuid[UI_CURRENT] & ~((u64)1))) { drbd_err(device, "Can only attach to data with current UUID=%016llX\n", (unsigned long long)device->ed_uuid); retcode = ERR_DATA_NOT_CURRENT; diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 85e3d846a23a..76d74b2122d6 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -4397,6 +4397,25 @@ static int receive_state(struct drbd_connection *connection, struct packet_info if (peer_state.conn == C_AHEAD) ns.conn = C_BEHIND; + /* TODO: + * if (primary and diskless and peer uuid != effective uuid) + * abort attach on peer; + * + * If this node does not have good data, was already connected, but + * the peer did a late attach only now, trying to "negotiate" with me, + * AND I am currently Primary, possibly frozen, with some specific + * "effective" uuid, this should never be reached, really, because + * we first send the uuids, then the current state. + * + * In this scenario, we already dropped the connection hard + * when we received the unsuitable uuids (receive_uuids(). + * + * Should we want to change this, that is: not drop the connection in + * receive_uuids() already, then we would need to add a branch here + * that aborts the attach of "unsuitable uuids" on the peer in case + * this node is currently Diskless Primary. + */ + if (device->p_uuid && peer_state.disk >= D_NEGOTIATING && get_ldev_if_state(device, D_NEGOTIATING)) { int cr; /* consider resync */ From patchwork Thu Dec 20 16:23:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739133 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 47DCF13B5 for ; Thu, 20 Dec 2018 16:34:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 37EF128DDF for ; Thu, 20 Dec 2018 16:34:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2B83228E05; Thu, 20 Dec 2018 16:34:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C300828DDF for ; Thu, 20 Dec 2018 16:34:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728528AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50091 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725781AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id EF3901045C30; Thu, 20 Dec 2018 17:23:47 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 09/17] drbd: fix comment typos Date: Thu, 20 Dec 2018 17:23:36 +0100 Message-Id: <20181220162344.8430-10-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_receiver.c | 2 +- drivers/block/drbd/drbd_state.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 76d74b2122d6..3a0fe357b68b 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -4428,7 +4428,7 @@ static int receive_state(struct drbd_connection *connection, struct packet_info (peer_state.disk == D_NEGOTIATING || os.disk == D_NEGOTIATING)); /* if we have both been inconsistent, and the peer has been - * forced to be UpToDate with --overwrite-data */ + * forced to be UpToDate with --force */ cr |= test_bit(CONSIDER_RESYNC, &device->flags); /* if we had been plain connected, and the admin requested to * start a sync by "invalidate" or "invalidate-remote" */ diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c index 18d53fe60d1d..883155657273 100644 --- a/drivers/block/drbd/drbd_state.c +++ b/drivers/block/drbd/drbd_state.c @@ -1124,7 +1124,7 @@ static union drbd_state sanitize_state(struct drbd_device *device, union drbd_st ns.pdsk = D_UP_TO_DATE; } - /* Implications of the connection stat on the disk states */ + /* Implications of the connection state on the disk states */ disk_min = D_DISKLESS; disk_max = D_UP_TO_DATE; pdsk_min = D_INCONSISTENT; From patchwork Thu Dec 20 16:23:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739149 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 251E6746 for ; Thu, 20 Dec 2018 16:35:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1638429087 for ; Thu, 20 Dec 2018 16:35:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0A76629129; Thu, 20 Dec 2018 16:35:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BA38B29087 for ; Thu, 20 Dec 2018 16:35:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729137AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50096 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727387AbeLTQei (ORCPT ); Thu, 20 Dec 2018 11:34:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 548541045C31; Thu, 20 Dec 2018 17:23:48 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 10/17] drbd: do not block when adjusting "disk-options" while IO is frozen Date: Thu, 20 Dec 2018 17:23:37 +0100 Message-Id: <20181220162344.8430-11-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP "suspending" IO is overloaded. It can mean "do not allow new requests" (obviously), but it also may mean "must not complete pending IO", for example while the fencing handlers do their arbitration. When adjusting disk options, we suspend io (disallow new requests), then wait for the activity-log to become unused (drain all IO completions), and possibly replace it with a new activity log of different size. If the other "suspend IO" aspect is active, pending IO completions won't happen, and we would block forever (unkillable drbdsetup process). Fix this by skipping the activity log adjustment if the "al-extents" setting did not change. Also, in case it did change, fail early without blocking if it looks like we would block forever. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 37 ++++++++++++++++++++++++++++-------- 1 file changed, 29 insertions(+), 8 deletions(-) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index 4b934e543e2d..1958eb33b643 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -1540,6 +1540,30 @@ static void sanitize_disk_conf(struct drbd_device *device, struct disk_conf *dis } } +static int disk_opts_check_al_size(struct drbd_device *device, struct disk_conf *dc) +{ + int err = -EBUSY; + + if (device->act_log && + device->act_log->nr_elements == dc->al_extents) + return 0; + + drbd_suspend_io(device); + /* If IO completion is currently blocked, we would likely wait + * "forever" for the activity log to become unused. So we don't. */ + if (atomic_read(&device->ap_bio_cnt)) + goto out; + + wait_event(device->al_wait, lc_try_lock(device->act_log)); + drbd_al_shrink(device); + err = drbd_check_al_size(device, dc); + lc_unlock(device->act_log); + wake_up(&device->al_wait); +out: + drbd_resume_io(device); + return err; +} + int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info) { struct drbd_config_context adm_ctx; @@ -1602,15 +1626,12 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info) } } - drbd_suspend_io(device); - wait_event(device->al_wait, lc_try_lock(device->act_log)); - drbd_al_shrink(device); - err = drbd_check_al_size(device, new_disk_conf); - lc_unlock(device->act_log); - wake_up(&device->al_wait); - drbd_resume_io(device); - + err = disk_opts_check_al_size(device, new_disk_conf); if (err) { + /* Could be just "busy". Ignore? + * Introduce dedicated error code? */ + drbd_msg_put_info(adm_ctx.reply_skb, + "Try again without changing current al-extents setting"); retcode = ERR_NOMEM; goto fail_unlock; } From patchwork Thu Dec 20 16:23:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739159 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 54871746 for ; Thu, 20 Dec 2018 16:39:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4418C2923E for ; Thu, 20 Dec 2018 16:39:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 37F2B29241; Thu, 20 Dec 2018 16:39:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BF0F82923E for ; Thu, 20 Dec 2018 16:39:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729004AbeLTQji (ORCPT ); Thu, 20 Dec 2018 11:39:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50122 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729103AbeLTQjh (ORCPT ); Thu, 20 Dec 2018 11:39:37 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id B6A2C1045C32; Thu, 20 Dec 2018 17:23:48 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 11/17] drbd: avoid spurious self-outdating with concurrent disconnect / down Date: Thu, 20 Dec 2018 17:23:38 +0100 Message-Id: <20181220162344.8430-12-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If peers are "simultaneously" told to disconnect from each other, either explicitly, or implicitly by taking down the resource, with bad timing, one side may see its disconnect "fail" with a result of "state change failed by peer", and interpret this as "please oudate yourself". Try to catch this by checking for current connection status, and possibly retry as local-only state change instead. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index 1958eb33b643..82915880c5e9 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -2711,8 +2711,10 @@ int drbd_adm_connect(struct sk_buff *skb, struct genl_info *info) static enum drbd_state_rv conn_try_disconnect(struct drbd_connection *connection, bool force) { + enum drbd_conns cstate; enum drbd_state_rv rv; +repeat: rv = conn_request_state(connection, NS(conn, C_DISCONNECTING), force ? CS_HARD : 0); @@ -2730,6 +2732,11 @@ static enum drbd_state_rv conn_try_disconnect(struct drbd_connection *connection break; case SS_CW_FAILED_BY_PEER: + spin_lock_irq(&connection->resource->req_lock); + cstate = connection->cstate; + spin_unlock_irq(&connection->resource->req_lock); + if (cstate <= C_WF_CONNECTION) + goto repeat; /* The peer probably wants to see us outdated. */ rv = conn_request_state(connection, NS2(conn, C_DISCONNECTING, disk, D_OUTDATED), 0); From patchwork Thu Dec 20 16:23:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739161 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5D6BF746 for ; Thu, 20 Dec 2018 16:39:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4B7A62923E for ; Thu, 20 Dec 2018 16:39:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3EE6529241; Thu, 20 Dec 2018 16:39:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E38E72923E for ; Thu, 20 Dec 2018 16:39:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729103AbeLTQji (ORCPT ); Thu, 20 Dec 2018 11:39:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50128 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729924AbeLTQji (ORCPT ); Thu, 20 Dec 2018 11:39:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 1C2331045C33; Thu, 20 Dec 2018 17:23:49 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 12/17] drbd: fix print_st_err()'s prototype to match the definition Date: Thu, 20 Dec 2018 17:23:39 +0100 Message-Id: <20181220162344.8430-13-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Luc Van Oostenryck print_st_err() is defined with its 4th argument taking an 'enum drbd_state_rv' but its prototype use an int for it. Fix this by using 'enum drbd_state_rv' in the prototype too. Signed-off-by: Luc Van Oostenryck Signed-off-by: Roland Kammerer Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_state.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/drbd/drbd_state.h b/drivers/block/drbd/drbd_state.h index ea58301d0895..b2a390ba73a0 100644 --- a/drivers/block/drbd/drbd_state.h +++ b/drivers/block/drbd/drbd_state.h @@ -131,7 +131,7 @@ extern enum drbd_state_rv _drbd_set_state(struct drbd_device *, union drbd_state enum chg_state_flags, struct completion *done); extern void print_st_err(struct drbd_device *, union drbd_state, - union drbd_state, int); + union drbd_state, enum drbd_state_rv); enum drbd_state_rv _conn_request_state(struct drbd_connection *connection, union drbd_state mask, union drbd_state val, From patchwork Thu Dec 20 16:23:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739169 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C760413B5 for ; Thu, 20 Dec 2018 16:40:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B360E29240 for ; Thu, 20 Dec 2018 16:40:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A3D2129241; Thu, 20 Dec 2018 16:40:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 611822923E for ; Thu, 20 Dec 2018 16:40:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730566AbeLTQjj (ORCPT ); Thu, 20 Dec 2018 11:39:39 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50125 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729140AbeLTQji (ORCPT ); Thu, 20 Dec 2018 11:39:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 7DF341045C34; Thu, 20 Dec 2018 17:23:49 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 13/17] drbd: don't retry connection if peers do not agree on "authentication" settings Date: Thu, 20 Dec 2018 17:23:40 +0100 Message-Id: <20181220162344.8430-14-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP emma: "Unexpected data packet AuthChallenge (0x0010)" ava: "expected AuthChallenge packet, received: ReportProtocol (0x000b)" "Authentication of peer failed, trying again." Pattern repeats. There is no point in retrying the handshake, if we expect to receive an AuthChallenge, but the peer is not even configured to expect or use a shared secret. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_receiver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 3a0fe357b68b..02a327891568 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -5332,7 +5332,7 @@ static int drbd_do_auth(struct drbd_connection *connection) if (pi.cmd != P_AUTH_CHALLENGE) { drbd_err(connection, "expected AuthChallenge packet, received: %s (0x%04x)\n", cmdname(pi.cmd), pi.cmd); - rv = 0; + rv = -1; goto fail; } From patchwork Thu Dec 20 16:23:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739163 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1AD9013B5 for ; Thu, 20 Dec 2018 16:39:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 09B7E2923E for ; Thu, 20 Dec 2018 16:39:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EFF9129241; Thu, 20 Dec 2018 16:39:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2C7522923E for ; Thu, 20 Dec 2018 16:39:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726604AbeLTQjo (ORCPT ); Thu, 20 Dec 2018 11:39:44 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50124 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730787AbeLTQjk (ORCPT ); Thu, 20 Dec 2018 11:39:40 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id E15731045C35; Thu, 20 Dec 2018 17:23:49 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 14/17] drbd: skip spurious timeout (ping-timeo) when failing promote Date: Thu, 20 Dec 2018 17:23:41 +0100 Message-Id: <20181220162344.8430-15-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP If you try to promote a Secondary while connected to a Primary and allow-two-primaries is NOT set, we will wait for "ping-timeout" to give this node a chance to detect a dead primary, in case the cluster manager noticed faster than we did. But if we then are *still* connected to a Primary, we fail (after an additional timeout of ping-timout). This change skips the spurious second timeout. Most people won't notice really, since "ping-timeout" by default is half a second. But in some installations, ping-timeout may be 10 or 20 seconds or more, and spuriously delaying the error return becomes annoying. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index 82915880c5e9..bfe1b0062d62 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -697,14 +697,15 @@ drbd_set_role(struct drbd_device *const device, enum drbd_role new_role, int for if (rv == SS_TWO_PRIMARIES) { /* Maybe the peer is detected as dead very soon... retry at most once more in this case. */ - int timeo; - rcu_read_lock(); - nc = rcu_dereference(connection->net_conf); - timeo = nc ? (nc->ping_timeo + 1) * HZ / 10 : 1; - rcu_read_unlock(); - schedule_timeout_interruptible(timeo); - if (try < max_tries) + if (try < max_tries) { + int timeo; try = max_tries - 1; + rcu_read_lock(); + nc = rcu_dereference(connection->net_conf); + timeo = nc ? (nc->ping_timeo + 1) * HZ / 10 : 1; + rcu_read_unlock(); + schedule_timeout_interruptible(timeo); + } continue; } if (rv < SS_SUCCESS) { From patchwork Thu Dec 20 16:23:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739167 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C7CA9746 for ; Thu, 20 Dec 2018 16:40:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B4F252923E for ; Thu, 20 Dec 2018 16:40:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A804B29241; Thu, 20 Dec 2018 16:40:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D5A922923E for ; Thu, 20 Dec 2018 16:39:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730787AbeLTQjw (ORCPT ); Thu, 20 Dec 2018 11:39:52 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50127 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729937AbeLTQjj (ORCPT ); Thu, 20 Dec 2018 11:39:39 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 480611045C36; Thu, 20 Dec 2018 17:23:50 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 15/17] drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire") Date: Thu, 20 Dec 2018 17:23:42 +0100 Message-Id: <20181220162344.8430-16-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP And also re-enable partial-zero-out + discard aligned. With the introduction of REQ_OP_WRITE_ZEROES, we started to use that for both WRITE_ZEROES and DISCARDS, hoping that WRITE_ZEROES would "do what we want", UNMAP if possible, zero-out the rest. The example scenario is some LVM "thin" backend. While an un-allocated block on dm-thin reads as zeroes, on a dm-thin with "skip_block_zeroing=true", after a partial block write allocated that block, that same block may well map "undefined old garbage" from the backends on LBAs that have not yet been written to. If we cannot distinguish between zero-out and discard on the receiving side, to avoid "undefined old garbage" to pop up randomly at later times on supposedly zero-initialized blocks, we'd need to map all discards to zero-out on the receiving side. But that would potentially do a full alloc on thinly provisioned backends, even when the expectation was to unmap/trim/discard/de-allocate. We need to distinguish on the protocol level, whether we need to guarantee zeroes (and thus use zero-out, potentially doing the mentioned full-alloc), or if we want to put the emphasis on discard, and only do a "best effort zeroing" (by "discarding" blocks aligned to discard-granularity, and zeroing only potential unaligned head and tail clippings to at least *try* to avoid "false positives" in an online-verify later), hoping that someone set skip_block_zeroing=false. For some discussion regarding this on dm-devel, see also https://www.mail-archive.com/dm-devel%40redhat.com/msg07965.html https://www.redhat.com/archives/dm-devel/2018-January/msg00271.html For backward compatibility, P_TRIM means zero-out, unless the DRBD_FF_WZEROES feature flag is agreed upon during handshake. To have upper layers even try to submit WRITE ZEROES requests, we need to announce "efficient zeroout" independently. We need to fixup max_write_zeroes_sectors after blk_queue_stack_limits(): if we can handle "zeroes" efficiently on the protocol, we want to do that, even if our backend does not announce max_write_zeroes_sectors itself. Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_debugfs.c | 2 + drivers/block/drbd/drbd_int.h | 11 +- drivers/block/drbd/drbd_main.c | 11 +- drivers/block/drbd/drbd_nl.c | 16 +++ drivers/block/drbd/drbd_protocol.h | 47 ++++++++ drivers/block/drbd/drbd_receiver.c | 171 ++++++++++++++++++++++++++--- drivers/block/drbd/drbd_req.c | 19 ++-- drivers/block/drbd/drbd_req.h | 2 + drivers/block/drbd/drbd_worker.c | 2 +- include/linux/drbd.h | 2 +- 10 files changed, 252 insertions(+), 31 deletions(-) diff --git a/drivers/block/drbd/drbd_debugfs.c b/drivers/block/drbd/drbd_debugfs.c index 5d5e8d6a8a56..f13b48ff5f43 100644 --- a/drivers/block/drbd/drbd_debugfs.c +++ b/drivers/block/drbd/drbd_debugfs.c @@ -237,6 +237,8 @@ static void seq_print_peer_request_flags(struct seq_file *m, struct drbd_peer_re seq_print_rq_state_bit(m, f & EE_CALL_AL_COMPLETE_IO, &sep, "in-AL"); seq_print_rq_state_bit(m, f & EE_SEND_WRITE_ACK, &sep, "C"); seq_print_rq_state_bit(m, f & EE_MAY_SET_IN_SYNC, &sep, "set-in-sync"); + seq_print_rq_state_bit(m, f & EE_TRIM, &sep, "trim"); + seq_print_rq_state_bit(m, f & EE_ZEROOUT, &sep, "zero-out"); seq_print_rq_state_bit(m, f & EE_WRITE_SAME, &sep, "write-same"); seq_putc(m, '\n'); } diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index ab718582a092..000a2f4c0e92 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -430,7 +430,11 @@ enum { __EE_MAY_SET_IN_SYNC, /* is this a TRIM aka REQ_OP_DISCARD? */ - __EE_IS_TRIM, + __EE_TRIM, + /* explicit zero-out requested, or + * our lower level cannot handle trim, + * and we want to fall back to zeroout instead */ + __EE_ZEROOUT, /* In case a barrier failed, * we need to resubmit without the barrier flag. */ @@ -472,7 +476,8 @@ enum { }; #define EE_CALL_AL_COMPLETE_IO (1<<__EE_CALL_AL_COMPLETE_IO) #define EE_MAY_SET_IN_SYNC (1<<__EE_MAY_SET_IN_SYNC) -#define EE_IS_TRIM (1<<__EE_IS_TRIM) +#define EE_TRIM (1<<__EE_TRIM) +#define EE_ZEROOUT (1<<__EE_ZEROOUT) #define EE_RESUBMITTED (1<<__EE_RESUBMITTED) #define EE_WAS_ERROR (1<<__EE_WAS_ERROR) #define EE_HAS_DIGEST (1<<__EE_HAS_DIGEST) @@ -1556,6 +1561,8 @@ extern void start_resync_timer_fn(struct timer_list *t); extern void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req); /* drbd_receiver.c */ +extern int drbd_issue_discard_or_zero_out(struct drbd_device *device, + sector_t start, unsigned int nr_sectors, int flags); extern int drbd_receiver(struct drbd_thread *thi); extern int drbd_ack_receiver(struct drbd_thread *thi); extern void drbd_send_ping_wf(struct work_struct *ws); diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index f9b4228cc2d9..714eb64fabfd 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -1668,7 +1668,11 @@ static u32 bio_flags_to_wire(struct drbd_connection *connection, (bio->bi_opf & REQ_PREFLUSH ? DP_FLUSH : 0) | (bio_op(bio) == REQ_OP_WRITE_SAME ? DP_WSAME : 0) | (bio_op(bio) == REQ_OP_DISCARD ? DP_DISCARD : 0) | - (bio_op(bio) == REQ_OP_WRITE_ZEROES ? DP_DISCARD : 0); + (bio_op(bio) == REQ_OP_WRITE_ZEROES ? + ((connection->agreed_features & DRBD_FF_WZEROES) ? + (DP_ZEROES |(!(bio->bi_opf & REQ_NOUNMAP) ? DP_DISCARD : 0)) + : DP_DISCARD) + : 0); else return bio->bi_opf & REQ_SYNC ? DP_RW_SYNC : 0; } @@ -1712,10 +1716,11 @@ int drbd_send_dblock(struct drbd_peer_device *peer_device, struct drbd_request * } p->dp_flags = cpu_to_be32(dp_flags); - if (dp_flags & DP_DISCARD) { + if (dp_flags & (DP_DISCARD|DP_ZEROES)) { + enum drbd_packet cmd = (dp_flags & DP_ZEROES) ? P_ZEROES : P_TRIM; struct p_trim *t = (struct p_trim*)p; t->size = cpu_to_be32(req->i.size); - err = __send_command(peer_device->connection, device->vnr, sock, P_TRIM, sizeof(*t), NULL, 0); + err = __send_command(peer_device->connection, device->vnr, sock, cmd, sizeof(*t), NULL, 0); goto out; } if (dp_flags & DP_WSAME) { diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index bfe1b0062d62..f2471172a961 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -1261,6 +1261,21 @@ static void fixup_discard_if_not_supported(struct request_queue *q) } } +static void fixup_write_zeroes(struct drbd_device *device, struct request_queue *q) +{ + /* Fixup max_write_zeroes_sectors after blk_queue_stack_limits(): + * if we can handle "zeroes" efficiently on the protocol, + * we want to do that, even if our backend does not announce + * max_write_zeroes_sectors itself. */ + struct drbd_connection *connection = first_peer_device(device)->connection; + /* If the peer announces WZEROES support, use it. Otherwise, rather + * send explicit zeroes than rely on some discard-zeroes-data magic. */ + if (connection->agreed_features & DRBD_FF_WZEROES) + q->limits.max_write_zeroes_sectors = DRBD_MAX_BBIO_SECTORS; + else + q->limits.max_write_zeroes_sectors = 0; +} + static void decide_on_write_same_support(struct drbd_device *device, struct request_queue *q, struct request_queue *b, struct o_qlim *o, @@ -1371,6 +1386,7 @@ static void drbd_setup_queue_param(struct drbd_device *device, struct drbd_backi } } fixup_discard_if_not_supported(q); + fixup_write_zeroes(device, q); } void drbd_reconsider_queue_parameters(struct drbd_device *device, struct drbd_backing_dev *bdev, struct o_qlim *o) diff --git a/drivers/block/drbd/drbd_protocol.h b/drivers/block/drbd/drbd_protocol.h index 48dabbb21e11..e6fc5ad72501 100644 --- a/drivers/block/drbd/drbd_protocol.h +++ b/drivers/block/drbd/drbd_protocol.h @@ -70,6 +70,11 @@ enum drbd_packet { * we may fall back to an opencoded loop instead. */ P_WSAME = 0x34, + /* 0x35 already claimed in DRBD 9 */ + P_ZEROES = 0x36, /* data sock: zero-out, WRITE_ZEROES */ + + /* 0x40 .. 0x48 already claimed in DRBD 9 */ + P_MAY_IGNORE = 0x100, /* Flag to test if (cmd > P_MAY_IGNORE) ... */ P_MAX_OPT_CMD = 0x101, @@ -130,6 +135,12 @@ struct p_header100 { #define DP_SEND_RECEIVE_ACK 128 /* This is a proto B write request */ #define DP_SEND_WRITE_ACK 256 /* This is a proto C write request */ #define DP_WSAME 512 /* equiv. REQ_WRITE_SAME */ +#define DP_ZEROES 1024 /* equiv. REQ_OP_WRITE_ZEROES */ + +/* possible combinations: + * REQ_OP_WRITE_ZEROES: DP_DISCARD | DP_ZEROES + * REQ_OP_WRITE_ZEROES + REQ_NOUNMAP: DP_ZEROES + */ struct p_data { u64 sector; /* 64 bits sector number */ @@ -197,6 +208,42 @@ struct p_block_req { */ #define DRBD_FF_WSAME 4 +/* supports REQ_OP_WRITE_ZEROES on the "wire" protocol. + * + * We used to map that to "discard" on the sending side, and if we cannot + * guarantee that discard zeroes data, the receiving side would map discard + * back to zero-out. + * + * With the introduction of REQ_OP_WRITE_ZEROES, + * we started to use that for both WRITE_ZEROES and DISCARDS, + * hoping that WRITE_ZEROES would "do what we want", + * UNMAP if possible, zero-out the rest. + * + * The example scenario is some LVM "thin" backend. + * + * While an un-allocated block on dm-thin reads as zeroes, on a dm-thin + * with "skip_block_zeroing=true", after a partial block write allocated + * that block, that same block may well map "undefined old garbage" from + * the backends on LBAs that have not yet been written to. + * + * If we cannot distinguish between zero-out and discard on the receiving + * side, to avoid "undefined old garbage" to pop up randomly at later times + * on supposedly zero-initialized blocks, we'd need to map all discards to + * zero-out on the receiving side. But that would potentially do a full + * alloc on thinly provisioned backends, even when the expectation was to + * unmap/trim/discard/de-allocate. + * + * We need to distinguish on the protocol level, whether we need to guarantee + * zeroes (and thus use zero-out, potentially doing the mentioned full-alloc), + * or if we want to put the emphasis on discard, and only do a "best effort + * zeroing" (by "discarding" blocks aligned to discard-granularity, and zeroing + * only potential unaligned head and tail clippings), to at least *try* to + * avoid "false positives" in an online-verify later, hoping that someone + * set skip_block_zeroing=false. + */ +#define DRBD_FF_WZEROES 8 + + struct p_connection_features { u32 protocol_min; u32 feature_flags; diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 02a327891568..47d2d6f87c2c 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -50,7 +50,7 @@ #include "drbd_req.h" #include "drbd_vli.h" -#define PRO_FEATURES (DRBD_FF_TRIM|DRBD_FF_THIN_RESYNC|DRBD_FF_WSAME) +#define PRO_FEATURES (DRBD_FF_TRIM|DRBD_FF_THIN_RESYNC|DRBD_FF_WSAME|DRBD_FF_WZEROES) struct packet_info { enum drbd_packet cmd; @@ -1490,14 +1490,129 @@ void drbd_bump_write_ordering(struct drbd_resource *resource, struct drbd_backin drbd_info(resource, "Method to ensure write ordering: %s\n", write_ordering_str[resource->write_ordering]); } -static void drbd_issue_peer_discard(struct drbd_device *device, struct drbd_peer_request *peer_req) +/* + * Mapping "discard" to ZEROOUT with UNMAP does not work for us: + * Drivers have to "announce" q->limits.max_write_zeroes_sectors, or it + * will directly go to fallback mode, submitting normal writes, and + * never even try to UNMAP. + * + * And dm-thin does not do this (yet), mostly because in general it has + * to assume that "skip_block_zeroing" is set. See also: + * https://www.mail-archive.com/dm-devel%40redhat.com/msg07965.html + * https://www.redhat.com/archives/dm-devel/2018-January/msg00271.html + * + * We *may* ignore the discard-zeroes-data setting, if so configured. + * + * Assumption is that this "discard_zeroes_data=0" is only because the backend + * may ignore partial unaligned discards. + * + * LVM/DM thin as of at least + * LVM version: 2.02.115(2)-RHEL7 (2015-01-28) + * Library version: 1.02.93-RHEL7 (2015-01-28) + * Driver version: 4.29.0 + * still behaves this way. + * + * For unaligned (wrt. alignment and granularity) or too small discards, + * we zero-out the initial (and/or) trailing unaligned partial chunks, + * but discard all the aligned full chunks. + * + * At least for LVM/DM thin, with skip_block_zeroing=false, + * the result is effectively "discard_zeroes_data=1". + */ +/* flags: EE_TRIM|EE_ZEROOUT */ +int drbd_issue_discard_or_zero_out(struct drbd_device *device, sector_t start, unsigned int nr_sectors, int flags) { struct block_device *bdev = device->ldev->backing_bdev; + struct request_queue *q = bdev_get_queue(bdev); + sector_t tmp, nr; + unsigned int max_discard_sectors, granularity; + int alignment; + int err = 0; - if (blkdev_issue_zeroout(bdev, peer_req->i.sector, peer_req->i.size >> 9, - GFP_NOIO, 0)) - peer_req->flags |= EE_WAS_ERROR; + if ((flags & EE_ZEROOUT) || !(flags & EE_TRIM)) + goto zero_out; + + /* Zero-sector (unknown) and one-sector granularities are the same. */ + granularity = max(q->limits.discard_granularity >> 9, 1U); + alignment = (bdev_discard_alignment(bdev) >> 9) % granularity; + + max_discard_sectors = min(q->limits.max_discard_sectors, (1U << 22)); + max_discard_sectors -= max_discard_sectors % granularity; + if (unlikely(!max_discard_sectors)) + goto zero_out; + + if (nr_sectors < granularity) + goto zero_out; + + tmp = start; + if (sector_div(tmp, granularity) != alignment) { + if (nr_sectors < 2*granularity) + goto zero_out; + /* start + gran - (start + gran - align) % gran */ + tmp = start + granularity - alignment; + tmp = start + granularity - sector_div(tmp, granularity); + + nr = tmp - start; + /* don't flag BLKDEV_ZERO_NOUNMAP, we don't know how many + * layers are below us, some may have smaller granularity */ + err |= blkdev_issue_zeroout(bdev, start, nr, GFP_NOIO, 0); + nr_sectors -= nr; + start = tmp; + } + while (nr_sectors >= max_discard_sectors) { + err |= blkdev_issue_discard(bdev, start, max_discard_sectors, GFP_NOIO, 0); + nr_sectors -= max_discard_sectors; + start += max_discard_sectors; + } + if (nr_sectors) { + /* max_discard_sectors is unsigned int (and a multiple of + * granularity, we made sure of that above already); + * nr is < max_discard_sectors; + * I don't need sector_div here, even though nr is sector_t */ + nr = nr_sectors; + nr -= (unsigned int)nr % granularity; + if (nr) { + err |= blkdev_issue_discard(bdev, start, nr, GFP_NOIO, 0); + nr_sectors -= nr; + start += nr; + } + } + zero_out: + if (nr_sectors) { + err |= blkdev_issue_zeroout(bdev, start, nr_sectors, GFP_NOIO, + (flags & EE_TRIM) ? 0 : BLKDEV_ZERO_NOUNMAP); + } + return err != 0; +} + +static bool can_do_reliable_discards(struct drbd_device *device) +{ + struct request_queue *q = bdev_get_queue(device->ldev->backing_bdev); + struct disk_conf *dc; + bool can_do; + if (!blk_queue_discard(q)) + return false; + + rcu_read_lock(); + dc = rcu_dereference(device->ldev->disk_conf); + can_do = dc->discard_zeroes_if_aligned; + rcu_read_unlock(); + return can_do; +} + +static void drbd_issue_peer_discard_or_zero_out(struct drbd_device *device, struct drbd_peer_request *peer_req) +{ + /* If the backend cannot discard, or does not guarantee + * read-back zeroes in discarded ranges, we fall back to + * zero-out. Unless configuration specifically requested + * otherwise. */ + if (!can_do_reliable_discards(device)) + peer_req->flags |= EE_ZEROOUT; + + if (drbd_issue_discard_or_zero_out(device, peer_req->i.sector, + peer_req->i.size >> 9, peer_req->flags & (EE_ZEROOUT|EE_TRIM))) + peer_req->flags |= EE_WAS_ERROR; drbd_endio_write_sec_final(peer_req); } @@ -1550,7 +1665,7 @@ int drbd_submit_peer_request(struct drbd_device *device, * Correctness first, performance later. Next step is to code an * asynchronous variant of the same. */ - if (peer_req->flags & (EE_IS_TRIM|EE_WRITE_SAME)) { + if (peer_req->flags & (EE_TRIM|EE_WRITE_SAME|EE_ZEROOUT)) { /* wait for all pending IO completions, before we start * zeroing things out. */ conn_wait_active_ee_empty(peer_req->peer_device->connection); @@ -1567,8 +1682,8 @@ int drbd_submit_peer_request(struct drbd_device *device, spin_unlock_irq(&device->resource->req_lock); } - if (peer_req->flags & EE_IS_TRIM) - drbd_issue_peer_discard(device, peer_req); + if (peer_req->flags & (EE_TRIM|EE_ZEROOUT)) + drbd_issue_peer_discard_or_zero_out(device, peer_req); else /* EE_WRITE_SAME */ drbd_issue_peer_wsame(device, peer_req); return 0; @@ -1765,6 +1880,7 @@ read_in_block(struct drbd_peer_device *peer_device, u64 id, sector_t sector, void *dig_vv = peer_device->connection->int_dig_vv; unsigned long *data; struct p_trim *trim = (pi->cmd == P_TRIM) ? pi->data : NULL; + struct p_trim *zeroes = (pi->cmd == P_ZEROES) ? pi->data : NULL; struct p_trim *wsame = (pi->cmd == P_WSAME) ? pi->data : NULL; digest_size = 0; @@ -1786,6 +1902,10 @@ read_in_block(struct drbd_peer_device *peer_device, u64 id, sector_t sector, if (!expect(data_size == 0)) return NULL; ds = be32_to_cpu(trim->size); + } else if (zeroes) { + if (!expect(data_size == 0)) + return NULL; + ds = be32_to_cpu(zeroes->size); } else if (wsame) { if (data_size != queue_logical_block_size(device->rq_queue)) { drbd_err(peer_device, "data size (%u) != drbd logical block size (%u)\n", @@ -1802,7 +1922,7 @@ read_in_block(struct drbd_peer_device *peer_device, u64 id, sector_t sector, if (!expect(IS_ALIGNED(ds, 512))) return NULL; - if (trim || wsame) { + if (trim || wsame || zeroes) { if (!expect(ds <= (DRBD_MAX_BBIO_SECTORS << 9))) return NULL; } else if (!expect(ds <= DRBD_MAX_BIO_SIZE)) @@ -1827,7 +1947,11 @@ read_in_block(struct drbd_peer_device *peer_device, u64 id, sector_t sector, peer_req->flags |= EE_WRITE; if (trim) { - peer_req->flags |= EE_IS_TRIM; + peer_req->flags |= EE_TRIM; + return peer_req; + } + if (zeroes) { + peer_req->flags |= EE_ZEROOUT; return peer_req; } if (wsame) @@ -2326,8 +2450,12 @@ static unsigned long wire_flags_to_bio_flags(u32 dpf) static unsigned long wire_flags_to_bio_op(u32 dpf) { - if (dpf & DP_DISCARD) + if (dpf & DP_ZEROES) return REQ_OP_WRITE_ZEROES; + if (dpf & DP_DISCARD) + return REQ_OP_DISCARD; + if (dpf & DP_WSAME) + return REQ_OP_WRITE_SAME; else return REQ_OP_WRITE; } @@ -2517,9 +2645,20 @@ static int receive_Data(struct drbd_connection *connection, struct packet_info * op = wire_flags_to_bio_op(dp_flags); op_flags = wire_flags_to_bio_flags(dp_flags); if (pi->cmd == P_TRIM) { + D_ASSERT(peer_device, peer_req->i.size > 0); + D_ASSERT(peer_device, op == REQ_OP_DISCARD); + D_ASSERT(peer_device, peer_req->pages == NULL); + /* need to play safe: an older DRBD sender + * may mean zero-out while sending P_TRIM. */ + if (0 == (connection->agreed_features & DRBD_FF_WZEROES)) + peer_req->flags |= EE_ZEROOUT; + } else if (pi->cmd == P_ZEROES) { D_ASSERT(peer_device, peer_req->i.size > 0); D_ASSERT(peer_device, op == REQ_OP_WRITE_ZEROES); D_ASSERT(peer_device, peer_req->pages == NULL); + /* Do (not) pass down BLKDEV_ZERO_NOUNMAP? */ + if (dp_flags & DP_DISCARD) + peer_req->flags |= EE_TRIM; } else if (peer_req->pages == NULL) { D_ASSERT(device, peer_req->i.size == 0); D_ASSERT(device, dp_flags & DP_FLUSH); @@ -2587,7 +2726,7 @@ static int receive_Data(struct drbd_connection *connection, struct packet_info * * we wait for all pending requests, respectively wait for * active_ee to become empty in drbd_submit_peer_request(); * better not add ourselves here. */ - if ((peer_req->flags & (EE_IS_TRIM|EE_WRITE_SAME)) == 0) + if ((peer_req->flags & (EE_TRIM|EE_WRITE_SAME|EE_ZEROOUT)) == 0) list_add_tail(&peer_req->w.list, &device->active_ee); spin_unlock_irq(&device->resource->req_lock); @@ -4893,7 +5032,7 @@ static int receive_rs_deallocated(struct drbd_connection *connection, struct pac peer_req->w.cb = e_end_resync_block; peer_req->submit_jif = jiffies; - peer_req->flags |= EE_IS_TRIM; + peer_req->flags |= EE_TRIM; spin_lock_irq(&device->resource->req_lock); list_add_tail(&peer_req->w.list, &device->sync_ee); @@ -4961,6 +5100,7 @@ static struct data_cmd drbd_cmd_handler[] = { [P_CONN_ST_CHG_REQ] = { 0, sizeof(struct p_req_state), receive_req_conn_state }, [P_PROTOCOL_UPDATE] = { 1, sizeof(struct p_protocol), receive_protocol }, [P_TRIM] = { 0, sizeof(struct p_trim), receive_Data }, + [P_ZEROES] = { 0, sizeof(struct p_trim), receive_Data }, [P_RS_DEALLOCATED] = { 0, sizeof(struct p_block_desc), receive_rs_deallocated }, [P_WSAME] = { 1, sizeof(struct p_wsame), receive_Data }, }; @@ -5245,11 +5385,12 @@ static int drbd_do_features(struct drbd_connection *connection) drbd_info(connection, "Handshake successful: " "Agreed network protocol version %d\n", connection->agreed_pro_version); - drbd_info(connection, "Feature flags enabled on protocol level: 0x%x%s%s%s.\n", + drbd_info(connection, "Feature flags enabled on protocol level: 0x%x%s%s%s%s.\n", connection->agreed_features, connection->agreed_features & DRBD_FF_TRIM ? " TRIM" : "", connection->agreed_features & DRBD_FF_THIN_RESYNC ? " THIN_RESYNC" : "", - connection->agreed_features & DRBD_FF_WSAME ? " WRITE_SAME" : + connection->agreed_features & DRBD_FF_WSAME ? " WRITE_SAME" : "", + connection->agreed_features & DRBD_FF_WZEROES ? " WRITE_ZEROES" : connection->agreed_features ? "" : " none"); return 1; diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c index 1c4da17e902e..643a04af213b 100644 --- a/drivers/block/drbd/drbd_req.c +++ b/drivers/block/drbd/drbd_req.c @@ -63,7 +63,7 @@ static struct drbd_request *drbd_req_new(struct drbd_device *device, struct bio drbd_req_make_private_bio(req, bio_src); req->rq_state = (bio_data_dir(bio_src) == WRITE ? RQ_WRITE : 0) | (bio_op(bio_src) == REQ_OP_WRITE_SAME ? RQ_WSAME : 0) - | (bio_op(bio_src) == REQ_OP_WRITE_ZEROES ? RQ_UNMAP : 0) + | (bio_op(bio_src) == REQ_OP_WRITE_ZEROES ? RQ_ZEROES : 0) | (bio_op(bio_src) == REQ_OP_DISCARD ? RQ_UNMAP : 0); req->device = device; req->master_bio = bio_src; @@ -1155,12 +1155,11 @@ static int drbd_process_write_request(struct drbd_request *req) return remote; } -static void drbd_process_discard_req(struct drbd_request *req) +static void drbd_process_discard_or_zeroes_req(struct drbd_request *req, int flags) { - struct block_device *bdev = req->device->ldev->backing_bdev; - - if (blkdev_issue_zeroout(bdev, req->i.sector, req->i.size >> 9, - GFP_NOIO, 0)) + int err = drbd_issue_discard_or_zero_out(req->device, + req->i.sector, req->i.size >> 9, flags); + if (err) req->private_bio->bi_status = BLK_STS_IOERR; bio_endio(req->private_bio); } @@ -1189,9 +1188,11 @@ drbd_submit_req_private_bio(struct drbd_request *req) if (get_ldev(device)) { if (drbd_insert_fault(device, type)) bio_io_error(bio); - else if (bio_op(bio) == REQ_OP_WRITE_ZEROES || - bio_op(bio) == REQ_OP_DISCARD) - drbd_process_discard_req(req); + else if (bio_op(bio) == REQ_OP_WRITE_ZEROES) + drbd_process_discard_or_zeroes_req(req, EE_ZEROOUT | + ((bio->bi_opf & REQ_NOUNMAP) ? 0 : EE_TRIM)); + else if (bio_op(bio) == REQ_OP_DISCARD) + drbd_process_discard_or_zeroes_req(req, EE_TRIM); else generic_make_request(bio); put_ldev(device); diff --git a/drivers/block/drbd/drbd_req.h b/drivers/block/drbd/drbd_req.h index 94c654020f0f..c2f569d2661b 100644 --- a/drivers/block/drbd/drbd_req.h +++ b/drivers/block/drbd/drbd_req.h @@ -208,6 +208,7 @@ enum drbd_req_state_bits { __RQ_WRITE, __RQ_WSAME, __RQ_UNMAP, + __RQ_ZEROES, /* Should call drbd_al_complete_io() for this request... */ __RQ_IN_ACT_LOG, @@ -253,6 +254,7 @@ enum drbd_req_state_bits { #define RQ_WRITE (1UL << __RQ_WRITE) #define RQ_WSAME (1UL << __RQ_WSAME) #define RQ_UNMAP (1UL << __RQ_UNMAP) +#define RQ_ZEROES (1UL << __RQ_ZEROES) #define RQ_IN_ACT_LOG (1UL << __RQ_IN_ACT_LOG) #define RQ_UNPLUG (1UL << __RQ_UNPLUG) #define RQ_POSTPONED (1UL << __RQ_POSTPONED) diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c index 99255d0c9e2f..268ef0c5d4ab 100644 --- a/drivers/block/drbd/drbd_worker.c +++ b/drivers/block/drbd/drbd_worker.c @@ -153,7 +153,7 @@ void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req) __releases(l do_wake = list_empty(block_id == ID_SYNCER ? &device->sync_ee : &device->active_ee); /* FIXME do we want to detach for failed REQ_OP_DISCARD? - * ((peer_req->flags & (EE_WAS_ERROR|EE_IS_TRIM)) == EE_WAS_ERROR) */ + * ((peer_req->flags & (EE_WAS_ERROR|EE_TRIM)) == EE_WAS_ERROR) */ if (peer_req->flags & EE_WAS_ERROR) __drbd_chk_io_error(device, DRBD_WRITE_ERROR); diff --git a/include/linux/drbd.h b/include/linux/drbd.h index 2d0259327721..a19d98367f08 100644 --- a/include/linux/drbd.h +++ b/include/linux/drbd.h @@ -51,7 +51,7 @@ #endif extern const char *drbd_buildtag(void); -#define REL_VERSION "8.4.10" +#define REL_VERSION "8.4.11" #define API_VERSION 1 #define PRO_VERSION_MIN 86 #define PRO_VERSION_MAX 101 From patchwork Thu Dec 20 16:23:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739177 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DC943746 for ; Thu, 20 Dec 2018 16:44:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C645C28E22 for ; Thu, 20 Dec 2018 16:44:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BA57E28F22; Thu, 20 Dec 2018 16:44:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5DAB228E22 for ; Thu, 20 Dec 2018 16:44:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731522AbeLTQoi (ORCPT ); Thu, 20 Dec 2018 11:44:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50160 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730086AbeLTQoi (ORCPT ); Thu, 20 Dec 2018 11:44:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id AB7301045C39; Thu, 20 Dec 2018 17:23:50 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 16/17] drbd: Avoid Clang warning about pointless switch statment Date: Thu, 20 Dec 2018 17:23:43 +0100 Message-Id: <20181220162344.8430-17-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Nathan Chancellor There are several warnings from Clang about no case statement matching the constant 0: In file included from drivers/block/drbd/drbd_receiver.c:48: In file included from drivers/block/drbd/drbd_int.h:48: In file included from ./include/linux/drbd_genl_api.h:54: In file included from ./include/linux/genl_magic_struct.h:236: ./include/linux/drbd_genl.h:321:1: warning: no case matching constant switch condition '0' GENL_struct(DRBD_NLA_HELPER, 24, drbd_helper_info, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/genl_magic_struct.h:220:10: note: expanded from macro 'GENL_struct' switch (0) { ^ Silence this warning by adding a 'case 0:' statement. Additionally, adjust the alignment of the statements in the ct_assert_unique macro to avoid a checkpatch warning. This solution was originally sent by Arnd Bergmann with a default case statement: https://lore.kernel.org/patchwork/patch/756723/ Link: https://github.com/ClangBuiltLinux/linux/issues/43 Suggested-by: Lars Ellenberg Signed-off-by: Nathan Chancellor --- include/linux/genl_magic_struct.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/include/linux/genl_magic_struct.h b/include/linux/genl_magic_struct.h index 5972e4969197..eeae59d3ceb7 100644 --- a/include/linux/genl_magic_struct.h +++ b/include/linux/genl_magic_struct.h @@ -191,6 +191,7 @@ static inline void ct_assert_unique_operations(void) { switch (0) { #include GENL_MAGIC_INCLUDE_FILE + case 0: ; } } @@ -209,6 +210,7 @@ static inline void ct_assert_unique_top_level_attributes(void) { switch (0) { #include GENL_MAGIC_INCLUDE_FILE + case 0: ; } } @@ -218,7 +220,8 @@ static inline void ct_assert_unique_top_level_attributes(void) static inline void ct_assert_unique_ ## s_name ## _attributes(void) \ { \ switch (0) { \ - s_fields \ + s_fields \ + case 0: \ ; \ } \ } From patchwork Thu Dec 20 16:23:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lars Ellenberg X-Patchwork-Id: 10739179 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A8AF9746 for ; Thu, 20 Dec 2018 16:44:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 96EBD28E22 for ; Thu, 20 Dec 2018 16:44:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 88D8C28F22; Thu, 20 Dec 2018 16:44:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 488EB28E22 for ; Thu, 20 Dec 2018 16:44:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731566AbeLTQoi (ORCPT ); Thu, 20 Dec 2018 11:44:38 -0500 Received: from mail09.linbit.com ([212.69.161.110]:50159 "EHLO mail09.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729603AbeLTQoi (ORCPT ); Thu, 20 Dec 2018 11:44:38 -0500 Received: from soda.linbit (212-186-191-219.static.upcbusiness.at [212.186.191.219]) by mail09.linbit.com (LINBIT Mail Daemon) with ESMTP id 112E310476FF; Thu, 20 Dec 2018 17:23:51 +0100 (CET) From: Lars Ellenberg To: Jens Axboe , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org Cc: drbd-dev@lists.linbit.com Subject: [PATCH 17/17] drbd: Change drbd_request_detach_interruptible's return type to int Date: Thu, 20 Dec 2018 17:23:44 +0100 Message-Id: <20181220162344.8430-18-lars.ellenberg@linbit.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181220162344.8430-1-lars.ellenberg@linbit.com> References: <20181220162344.8430-1-lars.ellenberg@linbit.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Nathan Chancellor Clang warns when an implicit conversion is done between enumerated types: drivers/block/drbd/drbd_state.c:708:8: warning: implicit conversion from enumeration type 'enum drbd_ret_code' to different enumeration type 'enum drbd_state_rv' [-Wenum-conversion] rv = ERR_INTR; ~ ^~~~~~~~ drbd_request_detach_interruptible's only call site is in the return statement of adm_detach, which returns an int. Change the return type of drbd_request_detach_interruptible to match, silencing Clang's warning. Reported-by: Nick Desaulniers Reviewed-by: Nick Desaulniers Signed-off-by: Nathan Chancellor --- drivers/block/drbd/drbd_state.c | 6 ++---- drivers/block/drbd/drbd_state.h | 3 +-- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c index 883155657273..2b4c0db5d867 100644 --- a/drivers/block/drbd/drbd_state.c +++ b/drivers/block/drbd/drbd_state.c @@ -688,11 +688,9 @@ request_detach(struct drbd_device *device) CS_VERBOSE | CS_ORDERED | CS_INHIBIT_MD_IO); } -enum drbd_state_rv -drbd_request_detach_interruptible(struct drbd_device *device) +int drbd_request_detach_interruptible(struct drbd_device *device) { - enum drbd_state_rv rv; - int ret; + int ret, rv; drbd_suspend_io(device); /* so no-one is stuck in drbd_al_begin_io */ wait_event_interruptible(device->state_wait, diff --git a/drivers/block/drbd/drbd_state.h b/drivers/block/drbd/drbd_state.h index b2a390ba73a0..f87371e55e68 100644 --- a/drivers/block/drbd/drbd_state.h +++ b/drivers/block/drbd/drbd_state.h @@ -162,8 +162,7 @@ static inline int drbd_request_state(struct drbd_device *device, } /* for use in adm_detach() (drbd_adm_detach(), drbd_adm_down()) */ -enum drbd_state_rv -drbd_request_detach_interruptible(struct drbd_device *device); +int drbd_request_detach_interruptible(struct drbd_device *device); enum drbd_role conn_highest_role(struct drbd_connection *connection); enum drbd_role conn_highest_peer(struct drbd_connection *connection);