From patchwork Mon Nov 14 22:51:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042958 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 21CEBC4332F for ; Mon, 14 Nov 2022 23:46:01 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouie1-0002A4-Af; Mon, 14 Nov 2022 18:13:37 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidp-0001df-1z for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:25 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJE-000344-0b for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:12 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466327; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L7tAutWYxsCeeX40NRb/7ZHQ/rU/csbg541R8KjQ+DU=; b=b5u1Qc6Vmc7LPW/FP3z4EBe1IYhIEYaWP4OkrGN+BP39HlTR0GYG1Et1NOkt+Q938IRh9O QuevlvcpY7FzbjEBrG9BujmK2/MBsStkxlW+n3F19UAjymQaYf1B9QvYUVKNX2E186/RwS Kt2+fIEDpeQEpE1M7iotne1OpQiJv6Y= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-189-p1pO5aOMP_O9Ow7IFKOVBg-1; Mon, 14 Nov 2022 17:52:02 -0500 X-MC-Unique: p1pO5aOMP_O9Ow7IFKOVBg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 67CC6811E7A; Mon, 14 Nov 2022 22:52:02 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 04B9340E9786; Mon, 14 Nov 2022 22:52:01 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 01/23] block_status: Refactor array storage Date: Mon, 14 Nov 2022 16:51:36 -0600 Message-Id: <20221114225158.2186742-2-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org For 32-bit block status, we were able to cheat and use an array with an odd number of elements, with array[0] holding the context id, and passing &array[1] to the user's callback. But once we have 64-bit extents, we can no longer abuse array element 0 like that, for two reasons: 64-bit extents contain uint64_t which might not be alignment-compatible with an array of uint32_t on all architectures, and the new NBD_REPLY_TYPE_BLOCK_STATUS_EXT adds an additional count field before the array. Split out a new state STRUCTURED_REPLY.BS_HEADER to receive the context id (and eventually the new count field for 64-bit replies) separately from the extents array, and add another structured_reply type in the payload section for tracking it. No behavioral change, other than the rare possibility of landing in the new state. --- lib/internal.h | 1 + lib/nbd-protocol.h | 19 ++++++---- generator/state_machine.ml | 9 ++++- generator/states-reply-structured.c | 56 ++++++++++++++++++++--------- 4 files changed, 61 insertions(+), 24 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index bbbd2639..fe81f1a0 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -241,6 +241,7 @@ struct nbd_handle { union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; + struct nbd_structured_reply_block_status_hdr bs_hdr; struct { struct nbd_structured_reply_error error; char msg[NBD_MAX_STRING]; /* Common to all error types */ diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index e5d6404b..4400d3ab 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013-2020 Red Hat Inc. + * Copyright (C) 2013-2022 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -182,12 +182,6 @@ struct nbd_fixed_new_option_reply_meta_context { /* followed by a string */ } NBD_ATTRIBUTE_PACKED; -/* NBD_REPLY_TYPE_BLOCK_STATUS block descriptor. */ -struct nbd_block_descriptor { - uint32_t length; /* length of block */ - uint32_t status_flags; /* block type (hole etc) */ -} NBD_ATTRIBUTE_PACKED; - /* Request (client -> server). */ struct nbd_request { uint32_t magic; /* NBD_REQUEST_MAGIC. */ @@ -224,6 +218,17 @@ struct nbd_structured_reply_offset_hole { uint32_t length; /* Length of hole. */ } NBD_ATTRIBUTE_PACKED; +/* NBD_REPLY_TYPE_BLOCK_STATUS block descriptor. */ +struct nbd_block_descriptor { + uint32_t length; /* length of block */ + uint32_t status_flags; /* block type (hole etc) */ +} NBD_ATTRIBUTE_PACKED; + +struct nbd_structured_reply_block_status_hdr { + uint32_t context_id; /* metadata context ID */ + /* followed by array of nbd_block_descriptor extents */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_error { uint32_t error; /* NBD_E* error number */ uint16_t len; /* Length of human readable error. */ diff --git a/generator/state_machine.ml b/generator/state_machine.ml index 257cb4f4..d2c326f3 100644 --- a/generator/state_machine.ml +++ b/generator/state_machine.ml @@ -871,10 +871,17 @@ and external_events = []; }; + State { + default_state with + name = "RECV_BS_HEADER"; + comment = "Receive header of structured reply block-status payload"; + external_events = []; + }; + State { default_state with name = "RECV_BS_ENTRIES"; - comment = "Receive a structured reply block-status payload"; + comment = "Receive entries array of structured reply block-status payload"; external_events = []; }; diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 2456e6da..bbd3de0c 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -126,19 +126,10 @@ REPLY.STRUCTURED_REPLY.CHECK: length < 12 || ((length-4) & 7) != 0) goto resync; assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); - /* We read the context ID followed by all the entries into a - * single array and deal with it at the end. - */ - free (h->bs_entries); - h->bs_entries = malloc (length); - if (h->bs_entries == NULL) { - SET_NEXT_STATE (%.DEAD); - set_error (errno, "malloc"); - break; - } - h->rbuf = h->bs_entries; - h->rlen = length; - SET_NEXT_STATE (%RECV_BS_ENTRIES); + /* Start by reading the context ID. */ + h->rbuf = &h->sbuf.sr.payload.bs_hdr; + h->rlen = sizeof h->sbuf.sr.payload.bs_hdr; + SET_NEXT_STATE (%RECV_BS_HEADER); break; default: @@ -424,9 +415,41 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: } return 0; + REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: + struct command *cmd = h->reply_cmd; + uint32_t length; + + switch (recv_into_rbuf (h)) { + case -1: SET_NEXT_STATE (%.DEAD); return 0; + case 1: + save_reply_state (h); + SET_NEXT_STATE (%.READY); + return 0; + case 0: + length = be32toh (h->sbuf.sr.structured_reply.length); + + assert (cmd); /* guaranteed by CHECK */ + assert (cmd->type == NBD_CMD_BLOCK_STATUS); + assert (length >= 12); + length -= sizeof h->sbuf.sr.payload.bs_hdr; + + free (h->bs_entries); + h->bs_entries = malloc (length); + if (h->bs_entries == NULL) { + SET_NEXT_STATE (%.DEAD); + set_error (errno, "malloc"); + return 0; + } + h->rbuf = h->bs_entries; + h->rlen = length; + SET_NEXT_STATE (%RECV_BS_ENTRIES); + } + return 0; + REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: struct command *cmd = h->reply_cmd; uint32_t length; + uint32_t count; size_t i; uint32_t context_id; @@ -445,15 +468,16 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (h->bs_entries); assert (length >= 12); assert (h->meta_valid); + count = (length - sizeof h->sbuf.sr.payload.bs_hdr) / sizeof *h->bs_entries; /* Need to byte-swap the entries returned, but apart from that we * don't validate them. */ - for (i = 0; i < length/4; ++i) + for (i = 0; i < count; ++i) h->bs_entries[i] = be32toh (h->bs_entries[i]); /* Look up the context ID. */ - context_id = h->bs_entries[0]; + context_id = be32toh (h->sbuf.sr.payload.bs_hdr.context_id); for (i = 0; i < h->meta_contexts.len; ++i) if (context_id == h->meta_contexts.ptr[i].context_id) break; @@ -464,7 +488,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: if (CALL_CALLBACK (cmd->cb.fn.extent, h->meta_contexts.ptr[i].name, cmd->offset, - &h->bs_entries[1], (length-4) / 4, + h->bs_entries, count, &error) == -1) if (cmd->error == 0) cmd->error = error ? error : EPROTO; From patchwork Mon Nov 14 22:51:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042937 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 923A0C433FE for ; Mon, 14 Nov 2022 23:37:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouie4-0002K0-GP; Mon, 14 Nov 2022 18:13:40 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidp-0001df-Cd for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:25 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJD-00033o-C3 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:10 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466326; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=itFgoH2I2cGNg7DbBiWhL/3P3BGjy13hUa55jZr1NR0=; b=JSeA/9zxOHOeExPVXW5/jN07KJRgrO92g9iOlwIQNIgSV7eNDaVE1qAAXZUy62SvKLwxZN KU6G8KFe1rwRnxFldHmphTqs7F3h8aQgEJW4iA9Y4DH5tsxkuIElAZ2qyuPFtC/910hmpY s8qmrpnpnKUh+9hvgV++DKepFdnBwzM= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-29-RAA1NospNMKeMPDEcxcqIg-1; Mon, 14 Nov 2022 17:52:03 -0500 X-MC-Unique: RAA1NospNMKeMPDEcxcqIg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0B3DF1C075A6; Mon, 14 Nov 2022 22:52:03 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8C14840E9786; Mon, 14 Nov 2022 22:52:02 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 02/23] internal: Refactor layout of replies in sbuf Date: Mon, 14 Nov 2022 16:51:37 -0600 Message-Id: <20221114225158.2186742-3-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org In order to more easily add a third reply type with an even larger header, but where the payload will look the same for both structured and extended replies, it is nicer if simple and structured replies are nested inside the same layer of sbuf.reply.hdr. While at it, note that while .or and .sr are structs declared within the overall sbuf union, we never read into both halves of those structs at the same time, so it does not matter if their two halves are consecutive. Dropping the packed notation on those structs means the compiler can align .payload more naturally, which may slightly improve performance on some platforms, even if it makes the overall union a few bytes larger due to padding. Visually, this patch changes the layout from: offset simple structured +------------------------------------------------------------+ | union sbuf | | +---------------------+------------------------------+ | | | struct simple_reply | union sr | | | | +-----------------+ | +--------------------------+ | | | | | | | | struct structured_reply | | | | | | | | | +----------------------+ | | | | 0 | | uint32_t magic | | | | uint32_t magic | | | | | 4 | | uint32_t error | | | | uint16_t flags | | | | | 6 | | | | | | uint16_t type | | | | | 8 | | uint64_t handle | | | | uint64_t handle | | | | | | +-----------------+ | | | | | | | | 16 | [padding] | | | uint32_t length | | | | | | | | +----------------------+ | | | | | | | union payload | | | | | | | +-----------+----------+ | | | | 20 | | | | ... | ... | | | | | | | | +-----------+----------+ | | | | | | +--------------------------+ | | | +---------------------+------------------------------+ | +------------------------------------------------------------+ to: offset simple structured +-------------------------------------------------------------+ | union sbuf | | +-----------------------------------------------------+ | | | struct reply | | | | +-------------------------------------------------+ | | | | | union hdr | | | | | | +--------------------+------------------------+ | | | | | | | struct simple | struct structured | | | | | | | | +----------------+ | +--------------------+ | | | | | 0 | | | | uint32_t magic | | | uint32_t magic | | | | | | 4 | | | | uint32_t error | | | uint16_t flags | | | | | | 6 | | | | | | | uint16_t type | | | | | | 8 | | | | uint64_t handle| | | uint64_t handle | | | | | | | | | +----------------+ | | | | | | | | 16 | | | [padding] | | uint32_t length | | | | | | | | | | +--------------------+ | | | | | 20 | | | | [padding] | | | | | | | +--------------------+------------------------+ | | | | | | union payload | | | | | | +--------------------+------------------------+ | | | | 24 | | | ... | ... | | | | | | | +--------------------+------------------------+ | | | | | +-------------------------------------------------+ | | | +-----------------------------------------------------+ | +-------------------------------------------------------------+ Technically, whether the payload union offset moves to byte 24 (with 20-23 now padding) or stays at 20 depends on compiler ABI; but many systems prefer that any struct with a uint64_t provide 8-byte alignment to its containing union. The commit is largely mechanical, and there should be no semantic change. --- lib/internal.h | 12 ++-- generator/states-reply-simple.c | 4 +- generator/states-reply-structured.c | 103 ++++++++++++++-------------- generator/states-reply.c | 10 +-- 4 files changed, 66 insertions(+), 63 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index fe81f1a0..f81c41ba 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -230,14 +230,16 @@ struct nbd_handle { struct { struct nbd_fixed_new_option_reply_meta_context context; char str[NBD_MAX_STRING]; - } __attribute__((packed)) context; + } __attribute__((packed)) context; char err_msg[NBD_MAX_STRING]; } payload; - } __attribute__((packed)) or; + } or; struct nbd_export_name_option_reply export_name_reply; - struct nbd_simple_reply simple_reply; struct { - struct nbd_structured_reply structured_reply; + union { + struct nbd_simple_reply simple; + struct nbd_structured_reply structured; + } hdr; union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; @@ -248,7 +250,7 @@ struct nbd_handle { uint64_t offset; /* Only used for NBD_REPLY_TYPE_ERROR_OFFSET */ } __attribute__((packed)) error; } payload; - } __attribute__((packed)) sr; + } reply; uint16_t gflags; uint32_t cflags; uint32_t len; diff --git a/generator/states-reply-simple.c b/generator/states-reply-simple.c index a43d61dd..f390b5a5 100644 --- a/generator/states-reply-simple.c +++ b/generator/states-reply-simple.c @@ -23,7 +23,7 @@ REPLY.SIMPLE_REPLY.START: struct command *cmd = h->reply_cmd; uint32_t error; - error = be32toh (h->sbuf.simple_reply.error); + error = be32toh (h->sbuf.reply.hdr.simple.error); if (cmd == NULL) { /* Unexpected reply. If error was set or we have structured @@ -39,7 +39,7 @@ REPLY.SIMPLE_REPLY.START: if (error || h->structured_replies) SET_NEXT_STATE (%^FINISH_COMMAND); else { - uint64_t cookie = be64toh (h->sbuf.simple_reply.handle); + uint64_t cookie = be64toh (h->sbuf.reply.hdr.simple.handle); SET_NEXT_STATE (%.DEAD); set_error (EPROTO, "no matching cookie %" PRIu64 " found for server reply, " diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index bbd3de0c..7587d856 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -49,9 +49,9 @@ REPLY.STRUCTURED_REPLY.START: * so read the remaining part. */ h->rbuf = &h->sbuf; - h->rbuf = (char *) h->rbuf + sizeof h->sbuf.simple_reply; - h->rlen = sizeof h->sbuf.sr.structured_reply; - h->rlen -= sizeof h->sbuf.simple_reply; + h->rbuf = (char *) h->rbuf + sizeof h->sbuf.reply.hdr.simple; + h->rlen = sizeof h->sbuf.reply.hdr.structured; + h->rlen -= sizeof h->sbuf.reply.hdr.simple; SET_NEXT_STATE (%RECV_REMAINING); return 0; @@ -71,9 +71,9 @@ REPLY.STRUCTURED_REPLY.CHECK: uint16_t flags, type; uint32_t length; - flags = be16toh (h->sbuf.sr.structured_reply.flags); - type = be16toh (h->sbuf.sr.structured_reply.type); - length = be32toh (h->sbuf.sr.structured_reply.length); + flags = be16toh (h->sbuf.reply.hdr.structured.flags); + type = be16toh (h->sbuf.reply.hdr.structured.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); /* Reject a server that replies with too much information, but don't * reject a single structured reply to NBD_CMD_READ on the largest @@ -82,7 +82,7 @@ REPLY.STRUCTURED_REPLY.CHECK: * oversized reply is going to take long enough to resync that it is * not worth keeping the connection alive. */ - if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.sr.payload.offset_data) { + if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.reply.payload.offset_data) { set_error (0, "invalid server reply length %" PRIu32, length); SET_NEXT_STATE (%.DEAD); return 0; @@ -105,19 +105,19 @@ REPLY.STRUCTURED_REPLY.CHECK: * them as an extension, so we use < instead of <=. */ if (cmd->type != NBD_CMD_READ || - length < sizeof h->sbuf.sr.payload.offset_data) + length < sizeof h->sbuf.reply.payload.offset_data) goto resync; - h->rbuf = &h->sbuf.sr.payload.offset_data; - h->rlen = sizeof h->sbuf.sr.payload.offset_data; + h->rbuf = &h->sbuf.reply.payload.offset_data; + h->rlen = sizeof h->sbuf.reply.payload.offset_data; SET_NEXT_STATE (%RECV_OFFSET_DATA); break; case NBD_REPLY_TYPE_OFFSET_HOLE: if (cmd->type != NBD_CMD_READ || - length != sizeof h->sbuf.sr.payload.offset_hole) + length != sizeof h->sbuf.reply.payload.offset_hole) goto resync; - h->rbuf = &h->sbuf.sr.payload.offset_hole; - h->rlen = sizeof h->sbuf.sr.payload.offset_hole; + h->rbuf = &h->sbuf.reply.payload.offset_hole; + h->rlen = sizeof h->sbuf.reply.payload.offset_hole; SET_NEXT_STATE (%RECV_OFFSET_HOLE); break; @@ -127,8 +127,8 @@ REPLY.STRUCTURED_REPLY.CHECK: goto resync; assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); /* Start by reading the context ID. */ - h->rbuf = &h->sbuf.sr.payload.bs_hdr; - h->rlen = sizeof h->sbuf.sr.payload.bs_hdr; + h->rbuf = &h->sbuf.reply.payload.bs_hdr; + h->rlen = sizeof h->sbuf.reply.payload.bs_hdr; SET_NEXT_STATE (%RECV_BS_HEADER); break; @@ -139,10 +139,10 @@ REPLY.STRUCTURED_REPLY.CHECK: * compliant, will favor the wire error over EPROTO during more * length checks in RECV_ERROR_MESSAGE and RECV_ERROR_TAIL. */ - if (length < sizeof h->sbuf.sr.payload.error.error.error) + if (length < sizeof h->sbuf.reply.payload.error.error.error) goto resync; - h->rbuf = &h->sbuf.sr.payload.error.error; - h->rlen = MIN (length, sizeof h->sbuf.sr.payload.error.error); + h->rbuf = &h->sbuf.reply.payload.error.error; + h->rlen = MIN (length, sizeof h->sbuf.reply.payload.error.error); SET_NEXT_STATE (%RECV_ERROR); } else @@ -168,19 +168,19 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - assert (length >= sizeof h->sbuf.sr.payload.error.error.error); + length = be32toh (h->sbuf.reply.hdr.structured.length); + assert (length >= sizeof h->sbuf.reply.payload.error.error.error); assert (cmd); - if (length < sizeof h->sbuf.sr.payload.error.error) + if (length < sizeof h->sbuf.reply.payload.error.error) goto resync; - msglen = be16toh (h->sbuf.sr.payload.error.error.len); - if (msglen > length - sizeof h->sbuf.sr.payload.error.error || - msglen > sizeof h->sbuf.sr.payload.error.msg) + msglen = be16toh (h->sbuf.reply.payload.error.error.len); + if (msglen > length - sizeof h->sbuf.reply.payload.error.error || + msglen > sizeof h->sbuf.reply.payload.error.msg) goto resync; - h->rbuf = h->sbuf.sr.payload.error.msg; + h->rbuf = h->sbuf.reply.payload.error.msg; h->rlen = msglen; SET_NEXT_STATE (%RECV_ERROR_MESSAGE); } @@ -188,11 +188,11 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: resync: /* Favor the error packet's errno over RESYNC's EPROTO. */ - error = be32toh (h->sbuf.sr.payload.error.error.error); + error = be32toh (h->sbuf.reply.payload.error.error.error); if (cmd->error == 0) cmd->error = nbd_internal_errno_of_nbd_error (error); h->rbuf = NULL; - h->rlen = length - MIN (length, sizeof h->sbuf.sr.payload.error.error); + h->rlen = length - MIN (length, sizeof h->sbuf.reply.payload.error.error); SET_NEXT_STATE (%RESYNC); return 0; @@ -207,15 +207,15 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - msglen = be16toh (h->sbuf.sr.payload.error.error.len); - type = be16toh (h->sbuf.sr.structured_reply.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); + msglen = be16toh (h->sbuf.reply.payload.error.error.len); + type = be16toh (h->sbuf.reply.hdr.structured.type); - length -= sizeof h->sbuf.sr.payload.error.error + msglen; + length -= sizeof h->sbuf.reply.payload.error.error + msglen; if (msglen) debug (h, "structured error server message: %.*s", (int) msglen, - h->sbuf.sr.payload.error.msg); + h->sbuf.reply.payload.error.msg); /* Special case two specific errors; silently ignore tail for all others */ h->rbuf = NULL; @@ -227,11 +227,11 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: "the server may have a bug"); break; case NBD_REPLY_TYPE_ERROR_OFFSET: - if (length != sizeof h->sbuf.sr.payload.error.offset) + if (length != sizeof h->sbuf.reply.payload.error.offset) debug (h, "unable to safely extract error offset, " "the server may have a bug"); else - h->rbuf = &h->sbuf.sr.payload.error.offset; + h->rbuf = &h->sbuf.reply.payload.error.offset; break; } SET_NEXT_STATE (%RECV_ERROR_TAIL); @@ -250,8 +250,8 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_TAIL: SET_NEXT_STATE (%.READY); return 0; case 0: - error = be32toh (h->sbuf.sr.payload.error.error.error); - type = be16toh (h->sbuf.sr.structured_reply.type); + error = be32toh (h->sbuf.reply.payload.error.error.error); + type = be16toh (h->sbuf.reply.hdr.structured.type); assert (cmd); /* guaranteed by CHECK */ @@ -266,7 +266,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_TAIL: * user callback if present. Ignore the offset if it was bogus. */ if (type == NBD_REPLY_TYPE_ERROR_OFFSET && h->rbuf) { - uint64_t offset = be64toh (h->sbuf.sr.payload.error.offset); + uint64_t offset = be64toh (h->sbuf.reply.payload.error.offset); if (structured_reply_in_bounds (offset, 0, cmd) && cmd->type == NBD_CMD_READ && CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) { @@ -307,8 +307,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - offset = be64toh (h->sbuf.sr.payload.offset_data.offset); + length = be32toh (h->sbuf.reply.hdr.structured.length); + offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -346,8 +346,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - offset = be64toh (h->sbuf.sr.payload.offset_data.offset); + length = be32toh (h->sbuf.reply.hdr.structured.length); + offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ if (CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) { @@ -377,8 +377,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: SET_NEXT_STATE (%.READY); return 0; case 0: - offset = be64toh (h->sbuf.sr.payload.offset_hole.offset); - length = be32toh (h->sbuf.sr.payload.offset_hole.length); + offset = be64toh (h->sbuf.reply.payload.offset_hole.offset); + length = be32toh (h->sbuf.reply.payload.offset_hole.length); assert (cmd); /* guaranteed by CHECK */ @@ -426,12 +426,12 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); + length = be32toh (h->sbuf.reply.hdr.structured.length); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (length >= 12); - length -= sizeof h->sbuf.sr.payload.bs_hdr; + length -= sizeof h->sbuf.reply.payload.bs_hdr; free (h->bs_entries); h->bs_entries = malloc (length); @@ -460,7 +460,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); + length = be32toh (h->sbuf.reply.hdr.structured.length); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); @@ -468,7 +468,8 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (h->bs_entries); assert (length >= 12); assert (h->meta_valid); - count = (length - sizeof h->sbuf.sr.payload.bs_hdr) / sizeof *h->bs_entries; + count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / + sizeof *h->bs_entries; /* Need to byte-swap the entries returned, but apart from that we * don't validate them. @@ -477,7 +478,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: h->bs_entries[i] = be32toh (h->bs_entries[i]); /* Look up the context ID. */ - context_id = be32toh (h->sbuf.sr.payload.bs_hdr.context_id); + context_id = be32toh (h->sbuf.reply.payload.bs_hdr.context_id); for (i = 0; i < h->meta_contexts.len; ++i) if (context_id == h->meta_contexts.ptr[i].context_id) break; @@ -524,8 +525,8 @@ REPLY.STRUCTURED_REPLY.RESYNC: SET_NEXT_STATE (%^FINISH_COMMAND); return 0; } - type = be16toh (h->sbuf.sr.structured_reply.type); - length = be32toh (h->sbuf.sr.structured_reply.length); + type = be16toh (h->sbuf.reply.hdr.structured.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); debug (h, "unexpected reply type %u or payload length %" PRIu32 " for cookie %" PRIu64 " and command %" PRIu32 ", this is probably a server bug", @@ -539,7 +540,7 @@ REPLY.STRUCTURED_REPLY.RESYNC: REPLY.STRUCTURED_REPLY.FINISH: uint16_t flags; - flags = be16toh (h->sbuf.sr.structured_reply.flags); + flags = be16toh (h->sbuf.reply.hdr.structured.flags); if (flags & NBD_REPLY_FLAG_DONE) { SET_NEXT_STATE (%^FINISH_COMMAND); } diff --git a/generator/states-reply.c b/generator/states-reply.c index b53409e4..845973a4 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -69,8 +69,8 @@ REPLY.START: assert (h->reply_cmd == NULL); assert (h->rlen == 0); - h->rbuf = &h->sbuf; - h->rlen = sizeof h->sbuf.simple_reply; + h->rbuf = &h->sbuf.reply.hdr; + h->rlen = sizeof h->sbuf.reply.hdr.simple; r = h->sock->ops->recv (h, h->sock, h->rbuf, h->rlen); if (r == -1) { @@ -115,7 +115,7 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: uint32_t magic; uint64_t cookie; - magic = be32toh (h->sbuf.simple_reply.magic); + magic = be32toh (h->sbuf.reply.hdr.simple.magic); if (magic == NBD_SIMPLE_REPLY_MAGIC) { SET_NEXT_STATE (%SIMPLE_REPLY.START); } @@ -132,7 +132,7 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: * handle (our cookie) is stored at the same offset. */ h->chunks_received++; - cookie = be64toh (h->sbuf.simple_reply.handle); + cookie = be64toh (h->sbuf.reply.hdr.simple.handle); /* Find the command amongst the commands in flight. If the server sends * a reply for an unknown cookie, FINISH will diagnose that later. */ @@ -151,7 +151,7 @@ REPLY.FINISH_COMMAND: /* NB: This works for both simple and structured replies because the * handle (our cookie) is stored at the same offset. */ - cookie = be64toh (h->sbuf.simple_reply.handle); + cookie = be64toh (h->sbuf.reply.hdr.simple.handle); /* Find the command amongst the commands in flight. */ for (cmd = h->cmds_in_flight, prev_cmd = NULL; cmd != NULL; From patchwork Mon Nov 14 22:51:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042941 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 945ACC4332F for ; Mon, 14 Nov 2022 23:40:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouie1-00029X-6A; Mon, 14 Nov 2022 18:13:37 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidp-0001hU-E4 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:25 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJD-00033h-2u for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466326; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ke/C9mzfSwkCq8UrXJ5GbnBHrV042sMz47vbHjQG3zw=; b=fO9hMp/fnwXMVf7yYUsRanRudG1NnW1dzJAJs/aB92/5/2PFfM7QoPTe81xmL/AsM6gQES ++NrmDpusGqamUszwEAOvcy2/tHFFd5jZlBdL8otMcDa3smHwzg2TYtuHO0HGUGJNBYuxt f4niCxhl5N4C/89O6fhTRafXdHgvg7o= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-85-KsX07whzO0-Ztc0ZUNGaqA-1; Mon, 14 Nov 2022 17:52:03 -0500 X-MC-Unique: KsX07whzO0-Ztc0ZUNGaqA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9465B87B2A4; Mon, 14 Nov 2022 22:52:03 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 302DE40E9786; Mon, 14 Nov 2022 22:52:03 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 03/23] protocol: Add definitions for extended headers Date: Mon, 14 Nov 2022 16:51:38 -0600 Message-Id: <20221114225158.2186742-4-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add the magic numbers and new structs necessary to implement the NBD protocol extension of extended headers providing 64-bit lengths. This corresponds to upstream nbd commits XXX-XXX[*]. --- [*] FIXME update commit ids before pushing --- lib/nbd-protocol.h | 66 ++++++++++++++++++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 14 deletions(-) diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index 4400d3ab..ac569a11 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -124,6 +124,7 @@ struct nbd_fixed_new_option_reply { #define NBD_OPT_STRUCTURED_REPLY 8 #define NBD_OPT_LIST_META_CONTEXT 9 #define NBD_OPT_SET_META_CONTEXT 10 +#define NBD_OPT_EXTENDED_HEADERS 11 #define NBD_REP_ERR(val) (0x80000000 | (val)) #define NBD_REP_IS_ERR(val) (!!((val) & 0x80000000)) @@ -141,6 +142,7 @@ struct nbd_fixed_new_option_reply { #define NBD_REP_ERR_SHUTDOWN NBD_REP_ERR (7) #define NBD_REP_ERR_BLOCK_SIZE_REQD NBD_REP_ERR (8) #define NBD_REP_ERR_TOO_BIG NBD_REP_ERR (9) +#define NBD_REP_ERR_EXT_HEADER_REQD NBD_REP_ERR (10) #define NBD_INFO_EXPORT 0 #define NBD_INFO_NAME 1 @@ -182,16 +184,26 @@ struct nbd_fixed_new_option_reply_meta_context { /* followed by a string */ } NBD_ATTRIBUTE_PACKED; -/* Request (client -> server). */ +/* Compact request (client -> server). */ struct nbd_request { uint32_t magic; /* NBD_REQUEST_MAGIC. */ - uint16_t flags; /* Request flags. */ - uint16_t type; /* Request type. */ + uint16_t flags; /* Request flags: NBD_CMD_FLAG_*. */ + uint16_t type; /* Request type: NBD_CMD_*. */ uint64_t handle; /* Opaque handle. */ uint64_t offset; /* Request offset. */ uint32_t count; /* Request length. */ } NBD_ATTRIBUTE_PACKED; +/* Extended request (client -> server). */ +struct nbd_request_ext { + uint32_t magic; /* NBD_EXTENDED_REQUEST_MAGIC. */ + uint16_t flags; /* Request flags: NBD_CMD_FLAG_*. */ + uint16_t type; /* Request type: NBD_CMD_*. */ + uint64_t handle; /* Opaque handle. */ + uint64_t offset; /* Request offset. */ + uint64_t count; /* Request effect or payload length. */ +} NBD_ATTRIBUTE_PACKED; + /* Simple reply (server -> client). */ struct nbd_simple_reply { uint32_t magic; /* NBD_SIMPLE_REPLY_MAGIC. */ @@ -208,6 +220,16 @@ struct nbd_structured_reply { uint32_t length; /* Length of payload which follows. */ } NBD_ATTRIBUTE_PACKED; +/* Extended reply (server -> client). */ +struct nbd_extended_reply { + uint32_t magic; /* NBD_EXTENDED_REPLY_MAGIC. */ + uint16_t flags; /* NBD_REPLY_FLAG_* */ + uint16_t type; /* NBD_REPLY_TYPE_* */ + uint64_t handle; /* Opaque handle. */ + uint64_t offset; /* Client's offset. */ + uint64_t length; /* Length of payload which follows. */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_offset_data { uint64_t offset; /* offset */ /* Followed by data. */ @@ -224,11 +246,23 @@ struct nbd_block_descriptor { uint32_t status_flags; /* block type (hole etc) */ } NBD_ATTRIBUTE_PACKED; +/* NBD_REPLY_TYPE_BLOCK_STATUS_EXT block descriptor. */ +struct nbd_block_descriptor_ext { + uint64_t length; /* length of block */ + uint64_t status_flags; /* block type (hole etc) */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_block_status_hdr { uint32_t context_id; /* metadata context ID */ /* followed by array of nbd_block_descriptor extents */ } NBD_ATTRIBUTE_PACKED; +struct nbd_structured_reply_block_status_ext_hdr { + uint32_t context_id; /* metadata context ID */ + uint32_t count; /* 0, or length of following array */ + /* followed by array of nbd_block_descriptor_ext extents */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_error { uint32_t error; /* NBD_E* error number */ uint16_t len; /* Length of human readable error. */ @@ -236,8 +270,10 @@ struct nbd_structured_reply_error { } NBD_ATTRIBUTE_PACKED; #define NBD_REQUEST_MAGIC 0x25609513 +#define NBD_EXTENDED_REQUEST_MAGIC 0x21e41c71 #define NBD_SIMPLE_REPLY_MAGIC 0x67446698 #define NBD_STRUCTURED_REPLY_MAGIC 0x668e33ef +#define NBD_EXTENDED_REPLY_MAGIC 0x6e8a278c /* Structured reply flags. */ #define NBD_REPLY_FLAG_DONE (1<<0) @@ -246,12 +282,13 @@ struct nbd_structured_reply_error { #define NBD_REPLY_TYPE_IS_ERR(val) (!!((val) & (1<<15))) /* Structured reply types. */ -#define NBD_REPLY_TYPE_NONE 0 -#define NBD_REPLY_TYPE_OFFSET_DATA 1 -#define NBD_REPLY_TYPE_OFFSET_HOLE 2 -#define NBD_REPLY_TYPE_BLOCK_STATUS 5 -#define NBD_REPLY_TYPE_ERROR NBD_REPLY_TYPE_ERR (1) -#define NBD_REPLY_TYPE_ERROR_OFFSET NBD_REPLY_TYPE_ERR (2) +#define NBD_REPLY_TYPE_NONE 0 +#define NBD_REPLY_TYPE_OFFSET_DATA 1 +#define NBD_REPLY_TYPE_OFFSET_HOLE 2 +#define NBD_REPLY_TYPE_BLOCK_STATUS 5 +#define NBD_REPLY_TYPE_BLOCK_STATUS_EXT 6 +#define NBD_REPLY_TYPE_ERROR NBD_REPLY_TYPE_ERR (1) +#define NBD_REPLY_TYPE_ERROR_OFFSET NBD_REPLY_TYPE_ERR (2) /* NBD commands. */ #define NBD_CMD_READ 0 @@ -263,11 +300,12 @@ struct nbd_structured_reply_error { #define NBD_CMD_WRITE_ZEROES 6 #define NBD_CMD_BLOCK_STATUS 7 -#define NBD_CMD_FLAG_FUA (1<<0) -#define NBD_CMD_FLAG_NO_HOLE (1<<1) -#define NBD_CMD_FLAG_DF (1<<2) -#define NBD_CMD_FLAG_REQ_ONE (1<<3) -#define NBD_CMD_FLAG_FAST_ZERO (1<<4) +#define NBD_CMD_FLAG_FUA (1<<0) +#define NBD_CMD_FLAG_NO_HOLE (1<<1) +#define NBD_CMD_FLAG_DF (1<<2) +#define NBD_CMD_FLAG_REQ_ONE (1<<3) +#define NBD_CMD_FLAG_FAST_ZERO (1<<4) +#define NBD_CMD_FLAG_PAYLOAD_LEN (1<<5) /* NBD error codes. */ #define NBD_SUCCESS 0 From patchwork Mon Nov 14 22:51:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042996 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6E167C4332F for ; Tue, 15 Nov 2022 00:14:21 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouie0-00026t-4s; Mon, 14 Nov 2022 18:13:36 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouido-0001df-LN for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:24 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJF-00034b-CE for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:13 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466328; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=21mFIb8VsGicmXJWSaDFueEflwSnP7lyaVvp7ihlkLY=; b=WjahNLRDGCx3FbDiitCLdODWkEFe+xDARU4fLCCvSF63aEeZKGM11aWVkFlq5MPsJXg+Ol EGokp4Jp4knG13Yhhoiax1FL07zlAJ81EaKRxWK2pK8okAC1Kjm79BmzaACbZkJYPCKxqt lX14yc4WZtDgdFG67Sm/Knd9fJ8Vd3M= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-389-xBx4mP6XPoO6Zj4ZmWTqIw-1; Mon, 14 Nov 2022 17:52:04 -0500 X-MC-Unique: xBx4mP6XPoO6Zj4ZmWTqIw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4770F185A7AC; Mon, 14 Nov 2022 22:52:04 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id B951640E9787; Mon, 14 Nov 2022 22:52:03 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 04/23] states: Prepare to send 64-bit requests Date: Mon, 14 Nov 2022 16:51:39 -0600 Message-Id: <20221114225158.2186742-5-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Support sending 64-bit requests if extended headers were negotiated. This includes setting NBD_CMD_FLAG_PAYLOAD_LEN any time we send an extended NBD_CMD_WRITE; this is such a fundamental part of the protocol that for now it is easier to silently ignore whatever value the user passes in for that bit in the flags parameter of nbd_pwrite regardless of the current settings in set_strict_mode, rather than trying to force the user to pass in the correct value to match whether extended mode is negotiated. However, when we later add APIs to give the user more control for interoperability testing, it may be worth adding a new set_strict_mode control knob to explicitly enable the client to intentionally violate the protocol (the testsuite added in this patch would then be updated to match). At this point, h->extended_headers is permanently false (we can't enable it until all other aspects of the protocol have likewise been converted). Support for using FLAG_PAYLOAD_LEN with NBD_CMD_BLOCK_STATUS is less fundamental, and deserves to be in its own patch. --- lib/internal.h | 10 ++- generator/API.ml | 20 +++-- generator/states-issue-command.c | 29 ++++--- generator/states-reply-structured.c | 2 +- lib/rw.c | 17 +++-- tests/Makefile.am | 4 + tests/pwrite-extended.c | 112 ++++++++++++++++++++++++++++ .gitignore | 1 + 8 files changed, 169 insertions(+), 26 deletions(-) create mode 100644 tests/pwrite-extended.c diff --git a/lib/internal.h b/lib/internal.h index f81c41ba..e900eca3 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -109,6 +109,9 @@ struct nbd_handle { char *tls_username; /* Username, NULL = use current username */ char *tls_psk_file; /* PSK filename, NULL = no PSK */ + /* Extended headers. */ + bool extended_headers; /* If we negotiated NBD_OPT_EXTENDED_HEADERS */ + /* Desired metadata contexts. */ bool request_sr; bool request_meta; @@ -262,7 +265,10 @@ struct nbd_handle { /* Issuing a command must use a buffer separate from sbuf, for the * case when we interrupt a request to service a reply. */ - struct nbd_request request; + union { + struct nbd_request compact; + struct nbd_request_ext extended; + } req; bool in_write_payload; bool in_write_shutdown; @@ -363,7 +369,7 @@ struct command { uint16_t type; uint64_t cookie; uint64_t offset; - uint32_t count; + uint64_t count; void *data; /* Buffer for read/write */ struct command_cb cb; bool initialized; /* For read, true if getting a hole may skip memset */ diff --git a/generator/API.ml b/generator/API.ml index 25a612a2..beb7a2b4 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -198,11 +198,12 @@ let cmd_flags = flag_prefix = "CMD_FLAG"; guard = Some "((h->strict & LIBNBD_STRICT_FLAGS) || flags > UINT16_MAX)"; flags = [ - "FUA", 1 lsl 0; - "NO_HOLE", 1 lsl 1; - "DF", 1 lsl 2; - "REQ_ONE", 1 lsl 3; - "FAST_ZERO", 1 lsl 4; + "FUA", 1 lsl 0; + "NO_HOLE", 1 lsl 1; + "DF", 1 lsl 2; + "REQ_ONE", 1 lsl 3; + "FAST_ZERO", 1 lsl 4; + "PAYLOAD_LEN", 1 lsl 5; ] } let handshake_flags = { @@ -2459,7 +2460,7 @@ "pread_structured", { "pwrite", { default_call with args = [ BytesIn ("buf", "count"); UInt64 "offset" ]; - optargs = [ OFlags ("flags", cmd_flags, Some ["FUA"]) ]; + optargs = [ OFlags ("flags", cmd_flags, Some ["FUA"; "PAYLOAD_LEN"]) ]; ret = RErr; permitted_states = [ Connected ]; shortdesc = "write to the NBD server"; @@ -2482,7 +2483,10 @@ "pwrite", { C meaning that the server should not return until the data has been committed to permanent storage (if that is supported - some servers cannot do this, see -L)." +L). For convenience, libnbd ignores the presence +or absence of the flag C in C, +while correctly using the flag over the wire according to whether +extended headers were negotiated." ^ strict_call_description; see_also = [Link "can_fua"; Link "is_read_only"; Link "aio_pwrite"; Link "get_block_size"; @@ -3172,7 +3176,7 @@ "aio_pwrite", { default_call with args = [ BytesPersistIn ("buf", "count"); UInt64 "offset" ]; optargs = [ OClosure completion_closure; - OFlags ("flags", cmd_flags, Some ["FUA"]) ]; + OFlags ("flags", cmd_flags, Some ["FUA"; "PAYLOAD_LEN"]) ]; ret = RCookie; permitted_states = [ Connected ]; shortdesc = "write to the NBD server"; diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c index df9295b5..feea2672 100644 --- a/generator/states-issue-command.c +++ b/generator/states-issue-command.c @@ -41,15 +41,24 @@ ISSUE_COMMAND.START: return 0; } - h->request.magic = htobe32 (NBD_REQUEST_MAGIC); - h->request.flags = htobe16 (cmd->flags); - h->request.type = htobe16 (cmd->type); - h->request.handle = htobe64 (cmd->cookie); - h->request.offset = htobe64 (cmd->offset); - h->request.count = htobe32 ((uint32_t) cmd->count); + /* These fields are coincident between req.compact and req.extended */ + h->req.compact.flags = htobe16 (cmd->flags); + h->req.compact.type = htobe16 (cmd->type); + h->req.compact.handle = htobe64 (cmd->cookie); + h->req.compact.offset = htobe64 (cmd->offset); + if (h->extended_headers) { + h->req.extended.magic = htobe32 (NBD_EXTENDED_REQUEST_MAGIC); + h->req.extended.count = htobe64 (cmd->count); + h->wlen = sizeof (h->req.extended); + } + else { + assert (cmd->count <= UINT32_MAX); + h->req.compact.magic = htobe32 (NBD_REQUEST_MAGIC); + h->req.compact.count = htobe32 (cmd->count); + h->wlen = sizeof (h->req.compact); + } h->chunks_sent++; - h->wbuf = &h->request; - h->wlen = sizeof (h->request); + h->wbuf = &h->req; if (cmd->type == NBD_CMD_WRITE || cmd->next) h->wflags = MSG_MORE; SET_NEXT_STATE (%SEND_REQUEST); @@ -74,7 +83,7 @@ ISSUE_COMMAND.PREPARE_WRITE_PAYLOAD: assert (h->cmds_to_issue != NULL); cmd = h->cmds_to_issue; - assert (cmd->cookie == be64toh (h->request.handle)); + assert (cmd->cookie == be64toh (h->req.compact.handle)); if (cmd->type == NBD_CMD_WRITE) { h->wbuf = cmd->data; h->wlen = cmd->count; @@ -120,7 +129,7 @@ ISSUE_COMMAND.FINISH: assert (!h->wlen); assert (h->cmds_to_issue != NULL); cmd = h->cmds_to_issue; - assert (cmd->cookie == be64toh (h->request.handle)); + assert (cmd->cookie == be64toh (h->req.compact.handle)); h->cmds_to_issue = cmd->next; if (h->cmds_to_issue_tail == cmd) h->cmds_to_issue_tail = NULL; diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 7587d856..6f187f14 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -34,7 +34,7 @@ structured_reply_in_bounds (uint64_t offset, uint32_t length, offset + length > cmd->offset + cmd->count) { set_error (0, "range of structured reply is out of bounds, " "offset=%" PRIu64 ", cmd->offset=%" PRIu64 ", " - "length=%" PRIu32 ", cmd->count=%" PRIu32 ": " + "length=%" PRIu32 ", cmd->count=%" PRIu64 ": " "this is likely to be a bug in the NBD server", offset, cmd->offset, length, cmd->count); return false; diff --git a/lib/rw.c b/lib/rw.c index 104d0fb0..81dded3f 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -223,13 +223,11 @@ nbd_internal_command_common (struct nbd_handle *h, } break; - /* Other commands are currently limited by the 32 bit field in the - * command structure on the wire, but in future we hope to support - * 64 bit values here with a change to the NBD protocol which is - * being discussed upstream. + /* Other commands are limited by the 32 bit field in the command + * structure on the wire, unless extended headers were negotiated. */ default: - if (count > UINT32_MAX) { + if (!h->extended_headers && count > UINT32_MAX) { set_error (ERANGE, "request too large: maximum request size is %" PRIu32, UINT32_MAX); goto err; @@ -358,6 +356,15 @@ nbd_unlocked_aio_pwrite (struct nbd_handle *h, const void *buf, return -1; } } + /* It is more convenient to manage PAYLOAD_LEN by what was negotiated + * than to require the user to have to set it correctly. + * TODO: Add new h->strict bit to allow intentional protocol violation + * for interoperability testing. + */ + if (h->extended_headers) + flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN; + else + flags &= ~LIBNBD_CMD_FLAG_PAYLOAD_LEN; SET_CALLBACK_TO_NULL (*completion); return nbd_internal_command_common (h, flags, NBD_CMD_WRITE, offset, count, diff --git a/tests/Makefile.am b/tests/Makefile.am index 0b9c454e..cb99309c 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -231,6 +231,7 @@ check_PROGRAMS += \ meta-base-allocation \ closure-lifetimes \ pread-initialize \ + pwrite-extended \ $(NULL) TESTS += \ @@ -642,6 +643,9 @@ closure_lifetimes_LDADD = $(top_builddir)/lib/libnbd.la pread_initialize_SOURCES = pread-initialize.c pread_initialize_LDADD = $(top_builddir)/lib/libnbd.la +pwrite_extended_SOURCES = pwrite-extended.c +pwrite_extended_LDADD = $(top_builddir)/lib/libnbd.la + #---------------------------------------------------------------------- # Testing TLS support. diff --git a/tests/pwrite-extended.c b/tests/pwrite-extended.c new file mode 100644 index 00000000..eeedf0a1 --- /dev/null +++ b/tests/pwrite-extended.c @@ -0,0 +1,112 @@ +/* NBD client library in userspace + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Check behavior of pwrite with PAYLOAD_LEN flag for extended headers. */ + +#include + +#include +#include +#include +#include +#include +#include + +#include + +static char *progname; +static char buf[512]; + +static void +check (int experr, const char *prefix) +{ + const char *msg = nbd_get_error (); + int errnum = nbd_get_errno (); + + fprintf (stderr, "error: \"%s\"\n", msg); + fprintf (stderr, "errno: %d (%s)\n", errnum, strerror (errnum)); + if (strncmp (msg, prefix, strlen (prefix)) != 0) { + fprintf (stderr, "%s: test failed: missing context prefix: %s\n", + progname, msg); + exit (EXIT_FAILURE); + } + if (errnum != experr) { + fprintf (stderr, "%s: test failed: " + "expected errno = %d (%s), but got %d\n", + progname, experr, strerror (experr), errnum); + exit (EXIT_FAILURE); + } +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + const char *cmd[] = { + "nbdkit", "-s", "-v", "--exit-with-parent", "memory", "1048576", NULL + }; + uint32_t strict; + + progname = argv[0]; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* Connect to the server. */ + if (nbd_connect_command (nbd, (char **) cmd) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* FIXME: future API addition to test if server negotiated extended mode. + * Until then, strict flags must ignore the PAYLOAD_LEN flag for pwrite, + * even though it is rejected for other commands. + */ + strict = nbd_get_strict_mode (nbd); + if (!(strict & LIBNBD_STRICT_FLAGS)) { + fprintf (stderr, "%s: test failed: " + "nbd_get_strict_mode did not have expected flag set\n", + argv[0]); + exit (EXIT_FAILURE); + } + if (nbd_aio_pread (nbd, buf, 512, 0, NBD_NULL_COMPLETION, + LIBNBD_CMD_FLAG_PAYLOAD_LEN) != -1) { + fprintf (stderr, "%s: test failed: " + "nbd_aio_pread did not fail with unexpected flag\n", + argv[0]); + exit (EXIT_FAILURE); + } + check (EINVAL, "nbd_aio_pread: "); + + if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, + LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, 0) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + exit (EXIT_SUCCESS); +} diff --git a/.gitignore b/.gitignore index f4273713..bd002522 100644 --- a/.gitignore +++ b/.gitignore @@ -242,6 +242,7 @@ Makefile.in /tests/pki/ /tests/pread-initialize /tests/private-data +/tests/pwrite-extended /tests/read-only-flag /tests/read-write-flag /tests/server-death From patchwork Mon Nov 14 22:51:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042966 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1D9F2C433FE for ; Mon, 14 Nov 2022 23:48:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouieE-0002tw-VL; Mon, 14 Nov 2022 18:13:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouido-0001m0-OI for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:24 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJF-00034a-BQ for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:13 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466328; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kWDF5Wsuz6zyFA0zKG8H8k3mlK2VookbMr7BHQU/T78=; b=SAcDzO8qnH8DCLaUXgiIW9xan/Dt52XB5dyuHZyzTHRJh/BEFzgPJNkMsZvBQ6Xl7/higg VP6yKjUByO94kuFUyPeaIq2Kn9njkjuG9lcUuL8meqYyaX6Q7/Cn4OMCoj8FWa4oiLLMG9 rOKLHMm6FN7WXdljh3LYGmCUlArA2Hs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-624-8mg6OTf-PMmfdH1Agz96Fg-1; Mon, 14 Nov 2022 17:52:05 -0500 X-MC-Unique: 8mg6OTf-PMmfdH1Agz96Fg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D017687B2B2; Mon, 14 Nov 2022 22:52:04 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6B37340E9787; Mon, 14 Nov 2022 22:52:04 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 05/23] states: Prepare to receive 64-bit replies Date: Mon, 14 Nov 2022 16:51:40 -0600 Message-Id: <20221114225158.2186742-6-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Support receiving headers for 64-bit replies if extended headers were negotiated. We already insist that the server not send us too much payload in one reply, so we can exploit that and merge the 64-bit length back into a normalized 32-bit field for the rest of the payload length calculations. The NBD protocol specifically documents that extended mode takes precedence over structured replies, and that there are no simple replies in extended mode. We can also take advantage that the handle field is in the same offset in all the various reply types. Note that if we negotiate extended headers, but a non-compliant server replies with a non-extended header, this patch will stall waiting for the server to send more bytes rather than noticing that the magic number is wrong (for aio operations, you'll get a magic number mismatch once you send a second command that elicits a reply; but for blocking operations, we basically deadlock). The easy alternative would be to read just the first 4 bytes of magic, then determine how many more bytes to expect; but that would require more states and syscalls, and not worth it since the typical server will be compliant. The other alternative is what the next patch implements: teaching REPLY.RECV_REPLY to handle short reads that were at least long enough to transmit magic to specifically look for magic number mismatch. At this point, h->extended_headers is permanently false (we can't enable it until all other aspects of the protocol have likewise been converted). --- lib/internal.h | 1 + generator/states-reply-structured.c | 52 +++++++++++++++++++---------- generator/states-reply.c | 31 +++++++++++------ 3 files changed, 57 insertions(+), 27 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index e900eca3..73fd24c0 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -242,6 +242,7 @@ struct nbd_handle { union { struct nbd_simple_reply simple; struct nbd_structured_reply structured; + struct nbd_extended_reply extended; } hdr; union { struct nbd_structured_reply_offset_data offset_data; diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 6f187f14..da9894c6 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -45,14 +45,20 @@ structured_reply_in_bounds (uint64_t offset, uint32_t length, STATE_MACHINE { REPLY.STRUCTURED_REPLY.START: - /* We've only read the simple_reply. The structured_reply is longer, - * so read the remaining part. + /* If we have extended headers, we've already read the entire header. + * Otherwise, we've only read enough for a simple_reply; since structured + * replies are longer, read the remaining part. */ - h->rbuf = &h->sbuf; - h->rbuf = (char *) h->rbuf + sizeof h->sbuf.reply.hdr.simple; - h->rlen = sizeof h->sbuf.reply.hdr.structured; - h->rlen -= sizeof h->sbuf.reply.hdr.simple; - SET_NEXT_STATE (%RECV_REMAINING); + if (h->extended_headers) { + assert (h->rbuf == sizeof h->sbuf.reply.hdr.extended + (char*) &h->sbuf); + SET_NEXT_STATE (%CHECK); + } + else { + assert (h->rbuf == sizeof h->sbuf.reply.hdr.simple + (char*) &h->sbuf); + h->rlen = sizeof h->sbuf.reply.hdr.structured - + sizeof h->sbuf.reply.hdr.simple; + SET_NEXT_STATE (%RECV_REMAINING); + } return 0; REPLY.STRUCTURED_REPLY.RECV_REMAINING: @@ -69,11 +75,18 @@ REPLY.STRUCTURED_REPLY.RECV_REMAINING: REPLY.STRUCTURED_REPLY.CHECK: struct command *cmd = h->reply_cmd; uint16_t flags, type; - uint32_t length; + uint64_t length; + uint64_t offset = -1; + assert (cmd); flags = be16toh (h->sbuf.reply.hdr.structured.flags); type = be16toh (h->sbuf.reply.hdr.structured.type); - length = be32toh (h->sbuf.reply.hdr.structured.length); + if (h->extended_headers) { + length = be64toh (h->sbuf.reply.hdr.extended.length); + offset = be64toh (h->sbuf.reply.hdr.extended.offset); + } + else + length = be32toh (h->sbuf.reply.hdr.structured.length); /* Reject a server that replies with too much information, but don't * reject a single structured reply to NBD_CMD_READ on the largest @@ -83,13 +96,18 @@ REPLY.STRUCTURED_REPLY.CHECK: * not worth keeping the connection alive. */ if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.reply.payload.offset_data) { - set_error (0, "invalid server reply length %" PRIu32, length); + set_error (0, "invalid server reply length %" PRIu64, length); SET_NEXT_STATE (%.DEAD); return 0; } + /* For convenience, we now normalize extended replies into compact, + * doable since we validated length fits in 32 bits. + */ + h->sbuf.reply.hdr.structured.length = length; /* Skip an unexpected structured reply, including to an unknown cookie. */ - if (cmd == NULL || !h->structured_replies) + if (cmd == NULL || !h->structured_replies || + (h->extended_headers && offset != cmd->offset)) goto resync; switch (type) { @@ -168,7 +186,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ assert (length >= sizeof h->sbuf.reply.payload.error.error.error); assert (cmd); @@ -207,7 +225,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ msglen = be16toh (h->sbuf.reply.payload.error.error.len); type = be16toh (h->sbuf.reply.hdr.structured.type); @@ -307,7 +325,7 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -346,7 +364,7 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -426,7 +444,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); @@ -460,7 +478,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); diff --git a/generator/states-reply.c b/generator/states-reply.c index 845973a4..dde23b39 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -62,15 +62,19 @@ REPLY.START: */ ssize_t r; - /* We read all replies initially as if they are simple replies, but - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. - * This works because the structured_reply header is larger. + /* With extended headers, there is only one size to read, so we can do it + * all in one syscall. But for older structured replies, we don't know if + * we have a simple or structured reply until we read the magic number, + * requiring a two-part read with CHECK_SIMPLE_OR_STRUCTURED_REPLY below. */ assert (h->reply_cmd == NULL); assert (h->rlen == 0); h->rbuf = &h->sbuf.reply.hdr; - h->rlen = sizeof h->sbuf.reply.hdr.simple; + if (h->extended_headers) + h->rlen = sizeof h->sbuf.reply.hdr.extended; + else + h->rlen = sizeof h->sbuf.reply.hdr.simple; r = h->sock->ops->recv (h, h->sock, h->rbuf, h->rlen); if (r == -1) { @@ -116,15 +120,22 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: uint64_t cookie; magic = be32toh (h->sbuf.reply.hdr.simple.magic); - if (magic == NBD_SIMPLE_REPLY_MAGIC) { + switch (magic) { + case NBD_SIMPLE_REPLY_MAGIC: + if (h->extended_headers) + goto invalid; SET_NEXT_STATE (%SIMPLE_REPLY.START); - } - else if (magic == NBD_STRUCTURED_REPLY_MAGIC) { + break; + case NBD_STRUCTURED_REPLY_MAGIC: + case NBD_EXTENDED_REPLY_MAGIC: + if ((magic == NBD_STRUCTURED_REPLY_MAGIC) == h->extended_headers) + goto invalid; SET_NEXT_STATE (%STRUCTURED_REPLY.START); - } - else { + break; + default: + invalid: SET_NEXT_STATE (%.DEAD); /* We've probably lost synchronization. */ - set_error (0, "invalid reply magic 0x%" PRIx32, magic); + set_error (0, "invalid or unexpected reply magic 0x%" PRIx32, magic); return 0; } From patchwork Mon Nov 14 22:51:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13043005 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51D10C433FE for ; Tue, 15 Nov 2022 00:21:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouie0-00026u-4s; Mon, 14 Nov 2022 18:13:36 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidn-0001hU-Ud for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJF-00034i-OW for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466329; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KDtXC+vARe66jRIcdES2jWaXwJ20uxlRPIx8ZWr1N8M=; b=F2SnMrJSkJMpqMWEzfDZmEhNnF6GLxOtdsLr3Q+EW3AIm1boZivATCpGavuupj6+vXydee BXRB/IUZPhlCaArebofEqYeMbhvtE9+lOIMo9HtyhKyuza5hycIqWoxIzlm4oy8tpoEv9d vweROzqS8IglJN4aR9407Tx9mrtAozw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-79-q_XoddxBOPalLCO1acSQvw-1; Mon, 14 Nov 2022 17:52:05 -0500 X-MC-Unique: q_XoddxBOPalLCO1acSQvw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 63D4C101CC6B; Mon, 14 Nov 2022 22:52:05 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0050940E9786; Mon, 14 Nov 2022 22:52:04 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 06/23] states: Break deadlock if server goofs on extended replies Date: Mon, 14 Nov 2022 16:51:41 -0600 Message-Id: <20221114225158.2186742-7-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org One of the benefits of extended replies is that we can do a fixed-length read for the entire header of every server reply, which is fewer syscalls than the split-read approach required by structured replies. But one of the drawbacks of doing a large read is that if the server is non-compliant (not a problem for normal servers, but something I hit rather more than I'd like to admit while developing extended header support in servers), nbd_pwrite() and friends will deadlock if the server replies with the wrong header. Add in some code to catch that failure mode and move the state machine to DEAD sooner, to make it easier to diagnose the fault in the server. Unlike in the case of an unexpected simply reply from a structured server (where we never over-read, and can therefore commit b31e7bac can merely fail the command with EPROTO and successfully move on to the next server reply), in this case we really do have to move to DEAD: in addition to having already read the 16 or 20 bytes that the server sent in its (short) reply for this command, we may have already read the initial bytes of the server's next reply, but we have no way to push those extra bytes back onto our read stream for parsing on our next pass through the state machine. --- generator/states-reply.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/generator/states-reply.c b/generator/states-reply.c index dde23b39..e89e9019 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -109,7 +109,23 @@ REPLY.START: REPLY.RECV_REPLY: switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; - case 1: SET_NEXT_STATE (%.READY); return 0; + case 1: SET_NEXT_STATE (%.READY); + /* Special case: if we have a short read, but got at least far + * enough to decode the magic number, we can check if the server + * is matching our expectations. This lets us avoid deadlocking if + * a buggy server sends only 16 bytes of a simple reply, and is + * waiting for our next command, while we are blocked waiting for + * the server to send 32 bytes of an extended reply. + */ + if (h->extended_headers && + (char *) h->rbuf >= (char *) &h->sbuf.reply.hdr.extended.flags) { + uint32_t magic = be32toh (h->sbuf.reply.hdr.extended.magic); + if (magic != NBD_EXTENDED_REPLY_MAGIC) { + SET_NEXT_STATE (%.DEAD); /* We've probably lost synchronization. */ + set_error (0, "invalid or unexpected reply magic 0x%" PRIx32, magic); + } + } + return 0; case 0: SET_NEXT_STATE (%CHECK_SIMPLE_OR_STRUCTURED_REPLY); } return 0; From patchwork Mon Nov 14 22:51:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13043006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D14F2C433FE for ; Tue, 15 Nov 2022 00:21:51 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidy-00021A-Ge; Mon, 14 Nov 2022 18:13:34 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidm-0001eb-KN for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:22 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJG-00034m-7c for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:27 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466329; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Jcl1g0NTgJdaOT8hk77pigXhAVIVBMnr6sGj9adZyjQ=; b=L813RCQTQYVnZmnauGhXGu35XIJHRrygJnIxvq0uL5CnDt6IgiBPi5mjP8QAFIiYAl2N9x x1lMEhSzamfjcVdS6MkD2Q/bXZqXOvEYTuvFSWt8rqbqU1rJasQQTsHuBu/X2SdvklM6DK Cpyoy859cOVV+/X9ISVnTRMA4fvmzKs= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-76-f1ZHxNNnOTSRmxXqOC1qHw-1; Mon, 14 Nov 2022 17:52:06 -0500 X-MC-Unique: f1ZHxNNnOTSRmxXqOC1qHw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 095E838164CB; Mon, 14 Nov 2022 22:52:06 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8779640E978A; Mon, 14 Nov 2022 22:52:05 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 07/23] generator: Add struct nbd_extent in prep for 64-bit extents Date: Mon, 14 Nov 2022 16:51:42 -0600 Message-Id: <20221114225158.2186742-8-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The existing nbd_block_status() callback is permanently stuck with an array of uint32_t pairs (len/2 extents), which is both constrained on maximum extent size (no larger than 4G) and on the status flags (must fit in 32 bits). While the "base:allocation" metacontext will never exceed 32 bits, it is not hard to envision other extension metacontexts where a 64-bit status would be useful (for example, Zoned Block Devices expressing a 64-bit offset[1]). Exposing 64-bit extents will require a new API; we now have the decision of whether to copy the existing API practice of returning a bare array containing len/2 extent pairs, or with a saner idea of an array of structs with len extents. Returning an array of structs is actually easier to map to various language bindings, particularly since we know the length field can fit in 63-bits (fallout from the fact that NBD exports are capped in size by signed off_t), but where the status field must represent a full 64 bits (assuming that the user wants to connect to a metadata extension that utilizes 64 bits, rather than existing metadata contexts that only expose 32 bits). This patch introduces the struct we plan to use in the new API, along with language bindings. The bindings for Python and OCaml were relatively straightforward; the Golang bindings took a bit more effort for me to write. Temporary unused attributes are needed to keep the compiler happy until a later patch exposes a new API using the new callback type. Note that 'struct nbd_block_descriptor_ext' in lib/nbd-protocol.h is exactly the same as what we want to use in C. But it is easier to stick a new public type in than to figure out how to expose just part of a header we only want to use internally. [1] https://zonedstorage.io/docs/linux/zbd-api --- generator/API.ml | 12 +++++++++++- generator/API.mli | 1 + generator/C.ml | 24 +++++++++++++++++++++--- generator/GoLang.ml | 24 ++++++++++++++++++++++++ generator/OCaml.ml | 19 ++++++++++++++++--- generator/Python.ml | 21 ++++++++++++++++++++- ocaml/helpers.c | 20 ++++++++++++++++++++ ocaml/nbd-c.h | 3 ++- golang/handle.go | 8 +++++++- 9 files changed, 122 insertions(+), 10 deletions(-) diff --git a/generator/API.ml b/generator/API.ml index beb7a2b4..85509867 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -42,6 +42,7 @@ | BytesPersistOut of string * string | Closure of closure | Enum of string * enum +| Extent64 of string | Fd of string | Flags of string * flags | Int of string @@ -157,6 +158,14 @@ let extent_closure = "nr_entries"); CBMutable (Int "error") ] } +let extent64_closure = { + cbname = "extent64"; + cbargs = [ CBString "metacontext"; + CBUInt64 "offset"; + CBArrayAndLen (Extent64 "entries", + "nr_entries"); + CBMutable (Int "error") ] +} let list_closure = { cbname = "list"; cbargs = [ CBString "name"; CBString "description" ] @@ -166,7 +175,8 @@ let context_closure = cbargs = [ CBString "name" ] } let all_closures = [ chunk_closure; completion_closure; - debug_closure; extent_closure; list_closure; + debug_closure; extent_closure; extent64_closure; + list_closure; context_closure ] (* Enums. *) diff --git a/generator/API.mli b/generator/API.mli index b0267705..12ad1fb4 100644 --- a/generator/API.mli +++ b/generator/API.mli @@ -52,6 +52,7 @@ and | BytesPersistOut of string * string | Closure of closure (** function pointer + void *opaque *) | Enum of string * enum (** enum/union type, int in C *) +| Extent64 of string (** extent descriptor, struct nbd_extent in C *) | Fd of string (** file descriptor *) | Flags of string * flags (** flags, uint32_t in C *) | Int of string (** small int *) diff --git a/generator/C.ml b/generator/C.ml index f9171996..51a4df65 100644 --- a/generator/C.ml +++ b/generator/C.ml @@ -93,6 +93,7 @@ let | Closure { cbname } -> [ sprintf "%s_callback" cbname; sprintf "%s_user_data" cbname ] | Enum (n, _) -> [n] + | Extent64 n -> [n] | Fd n -> [n] | Flags (n, _) -> [n] | Int n -> [n] @@ -128,7 +129,7 @@ let (* list of strings should be marked as non-null *) | StringList n -> [ true ] (* numeric and other non-pointer types are not able to be null *) - | Bool _ | Closure _ | Enum _ | Fd _ | Flags _ + | Bool _ | Closure _ | Enum _ | Extent64 _ | Fd _ | Flags _ | Int _ | Int64 _ | SizeT _ | UInt _ | UInt32 _ | UInt64 _ | UIntPtr _ -> [ false ] @@ -182,6 +183,9 @@ and | Enum (n, _) -> if types then pr "int "; pr "%s" n + | Extent64 n -> + if types then pr "nbd_extent "; + pr "%s" n | Flags (n, _) -> if types then pr "uint32_t "; pr "%s" n @@ -292,6 +296,11 @@ and pr "%s, " n; if types then pr "size_t "; pr "%s" len + | CBArrayAndLen (Extent64 n, len) -> + if types then pr "nbd_extent *"; + pr "%s, " n; + if types then pr "size_t "; + pr "%s" len | CBArrayAndLen _ -> assert false | CBBytesIn (n, len) -> if types then pr "const void *"; @@ -463,6 +472,13 @@ let pr "extern int nbd_get_errno (void);\n"; pr "#define LIBNBD_HAVE_NBD_GET_ERRNO 1\n"; pr "\n"; + pr "/* This is used in the callback for nbd_block_status_64.\n"; + pr " */\n"; + pr "typedef struct {\n"; + pr " uint64_t length;\n"; + pr " uint64_t flags;\n"; + pr "} nbd_extent;\n"; + pr "\n"; print_closure_structs (); List.iter ( fun (name, { args; optargs; ret }) -> @@ -716,7 +732,7 @@ let pr " char *%s_printable =\n" n; pr " nbd_internal_printable_string_list (%s);\n" n | BytesOut _ | BytesPersistOut _ - | Bool _ | Closure _ | Enum _ | Flags _ | Fd _ | Int _ + | Bool _ | Closure _ | Enum _ | Extent64 _ | Flags _ | Fd _ | Int _ | Int64 _ | SizeT _ | SockAddrAndLen _ | UInt _ | UInt32 _ | UInt64 _ | UIntPtr _ -> () ) args; @@ -735,6 +751,7 @@ let pr " %s=\\\"%%s\\\" %s=%%zu" n count | Closure { cbname } -> pr " %s=" cbname | Enum (n, _) -> pr " %s=%%d" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> pr " %s=0x%%x" n | Fd n | Int n -> pr " %s=%%d" n | Int64 n -> pr " %s=%%\" PRIi64 \"" n @@ -764,6 +781,7 @@ let pr ", %s_printable ? %s_printable : \"\", %s" n n count | Closure { cbname } -> () | Enum (n, _) -> pr ", %s" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> pr ", %s" n | Fd n | Int n | Int64 n | SizeT n -> pr ", %s" n | SockAddrAndLen (_, len) -> pr ", (int) %s" len @@ -787,7 +805,7 @@ let | StringList n -> pr " free (%s_printable);\n" n | BytesOut _ | BytesPersistOut _ - | Bool _ | Closure _ | Enum _ | Flags _ | Fd _ | Int _ + | Bool _ | Closure _ | Enum _ | Extent64 _ | Flags _ | Fd _ | Int _ | Int64 _ | SizeT _ | SockAddrAndLen _ | UInt _ | UInt32 _ | UInt64 _ | UIntPtr _ -> () ) args; diff --git a/generator/GoLang.ml b/generator/GoLang.ml index eb7279a4..39b6cc64 100644 --- a/generator/GoLang.ml +++ b/generator/GoLang.ml @@ -49,6 +49,7 @@ let | BytesPersistOut (n, len) -> n | Closure { cbname } -> cbname | Enum (n, _) -> n + | Extent64 n -> n | Fd n -> n | Flags (n, _) -> n | Int n -> n @@ -71,6 +72,7 @@ let | BytesPersistOut _ -> "AioBuffer" | Closure { cbname } -> sprintf "%sCallback" (camel_case cbname) | Enum (_, { enum_prefix }) -> camel_case enum_prefix + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd _ -> "int" | Flags (_, { flag_prefix }) -> camel_case flag_prefix | Int _ -> "int" @@ -262,6 +264,7 @@ let pr " c_%s.user_data = C.alloc_cbid(C.long(%s_cbid))\n" cbname cbname | Enum (n, _) -> pr " c_%s := C.int (%s)\n" n n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd n -> pr " c_%s := C.int (%s)\n" n n | Flags (n, _) -> @@ -334,6 +337,7 @@ let | BytesPersistOut (n, len) -> pr ", c_%s, c_%s" n len | Closure { cbname } -> pr ", c_%s" cbname | Enum (n, _) -> pr ", c_%s" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd n -> pr ", c_%s" n | Flags (n, _) -> pr ", c_%s" n | Int n -> pr ", c_%s" n @@ -524,6 +528,18 @@ let copy(ret, s) return ret } + +func copy_extent_array (entries *C.nbd_extent, count C.size_t) []LibnbdExtent { + unsafePtr := unsafe.Pointer(entries) + arrayPtr := (*[1 << 20]C.nbd_extent)(unsafePtr) + slice := arrayPtr[:count:count] + ret := make([]LibnbdExtent, count) + for i := 0; i < int (count); i++ { + ret[i].Length = uint64 (slice[i].length) + ret[i].Flags = uint64 (slice[i].flags) + } + return ret +} "; List.iter ( @@ -537,6 +553,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, _) -> pr "%s []uint32" n; + | CBArrayAndLen (Extent64 n, _) -> + pr "%s []LibnbdExtent" n; | CBBytesIn (n, len) -> pr "%s []byte" n; | CBInt n -> @@ -563,6 +581,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, count) -> pr "%s *C.uint32_t, %s C.size_t" n count + | CBArrayAndLen (Extent64 n, count) -> + pr "%s *C.nbd_extent, %s C.size_t" n count | CBBytesIn (n, len) -> pr "%s unsafe.Pointer, %s C.size_t" n len | CBInt n -> @@ -605,6 +625,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, count) -> pr "copy_uint32_array (%s, %s)" n count + | CBArrayAndLen (Extent64 n, count) -> + pr "copy_extent_array (%s, %s)" n count | CBBytesIn (n, len) -> pr "C.GoBytes (%s, C.int (%s))" n len | CBInt n -> @@ -756,6 +778,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, count) -> pr "uint32_t *%s, size_t %s" n count + | CBArrayAndLen (Extent64 n, count) -> + pr "nbd_extent *%s, size_t %s" n count | CBBytesIn (n, len) -> pr "void *%s, size_t %s" n len | CBInt n -> diff --git a/generator/OCaml.ml b/generator/OCaml.ml index 6a280b67..66dc6cad 100644 --- a/generator/OCaml.ml +++ b/generator/OCaml.ml @@ -44,6 +44,7 @@ and | Closure { cbargs } -> sprintf "(%s)" (ocaml_closuredecl_to_string cbargs) | Enum (_, { enum_prefix }) -> sprintf "%s.t" enum_prefix + | Extent64 _ -> "extent" | Fd _ -> "Unix.file_descr" | Flags (_, { flag_prefix }) -> sprintf "%s.t list" flag_prefix | Int _ -> "int" @@ -102,6 +103,7 @@ let | BytesPersistOut (n, len) -> n | Closure { cbname } -> cbname | Enum (n, _) -> n + | Extent64 n -> n | Fd n -> n | Flags (n, _) -> n | Int n -> n @@ -147,6 +149,9 @@ let type cookie = int64 +type extent = int64 * int64 +(** Length and flags of an extent in block_status_64 callback. *) + "; List.iter ( @@ -268,6 +273,7 @@ let exception Error of string * Unix.error option exception Closed of string type cookie = int64 +type extent = int64 * int64 (* Give the exceptions names so that they can be raised from the C code. *) let () = @@ -498,7 +504,8 @@ let let argnames = List.map ( function - | CBArrayAndLen (UInt32 n, _) | CBBytesIn (n, _) + | CBArrayAndLen (UInt32 n, _) | CBArrayAndLen (Extent64 n, _) + | CBBytesIn (n, _) | CBInt n | CBInt64 n | CBMutable (Int n) | CBString n | CBUInt n | CBUInt64 n -> n ^ "v" @@ -526,6 +533,9 @@ let | CBArrayAndLen (UInt32 n, count) -> pr " %sv = nbd_internal_ocaml_alloc_int64_from_uint32_array (%s, %s);\n" n n count; + | CBArrayAndLen (Extent64 n, count) -> + pr " %sv = nbd_internal_ocaml_alloc_extent64_array (%s, %s);\n" + n n count; | CBBytesIn (n, len) -> pr " %sv = caml_alloc_initialized_string (%s, %s);\n" n len n | CBInt n | CBUInt n -> @@ -549,7 +559,7 @@ let List.iter ( function - | CBArrayAndLen (UInt32 _, _) + | CBArrayAndLen (_, _) | CBBytesIn _ | CBInt _ | CBInt64 _ @@ -558,7 +568,7 @@ let | CBUInt64 _ -> () | CBMutable (Int n) -> pr " *%s = Int_val (Field (%sv, 0));\n" n n - | CBArrayAndLen _ | CBMutable _ -> assert false + | CBMutable _ -> assert false ) cbargs; pr " if (Is_exception_result (rv)) {\n"; @@ -573,6 +583,7 @@ let pr "}\n"; pr "\n"; pr "static int\n"; + pr "__attribute__((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; @@ -688,6 +699,7 @@ let pr " %s_callback.free = free_user_data;\n" cbname | Enum (n, { enum_prefix }) -> pr " int %s = %s_val (%sv);\n" n enum_prefix n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd n -> pr " /* OCaml Unix.file_descr is just an int, at least on Unix. */\n"; pr " int %s = Int_val (%sv);\n" n n @@ -773,6 +785,7 @@ let | BytesPersistOut _ | Closure _ | Enum _ + | Extent64 _ | Fd _ | Flags _ | Int _ diff --git a/generator/Python.ml b/generator/Python.ml index 3585955b..980fdcea 100644 --- a/generator/Python.ml +++ b/generator/Python.ml @@ -153,6 +153,7 @@ let let print_python_closure_wrapper { cbname; cbargs } = pr "/* Wrapper for %s callback. */\n" cbname; pr "static int\n"; + pr "__attribute__((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; @@ -165,6 +166,7 @@ let List.iter ( function | CBArrayAndLen (UInt32 n, _) + | CBArrayAndLen (Extent64 n, _) | CBBytesIn (n, _) | CBMutable (Int n) -> pr " PyObject *py_%s = NULL;\n" n @@ -182,6 +184,16 @@ let pr " if (!py_e_%s) { PyErr_PrintEx (0); goto out; }\n" n; pr " PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n; pr " }\n" + | CBArrayAndLen (Extent64 n, len) -> + pr " py_%s = PyList_New (%s);\n" n len; + pr " size_t i_%s;\n" n; + pr " for (i_%s = 0; i_%s < %s; ++i_%s) {\n" n n len n; + pr " PyObject *py_e_%s = Py_BuildValue (\"OO\",\n" n; + pr " PyLong_FromUnsignedLong (%s[i_%s].length),\n" n n; + pr " PyLong_FromUnsignedLong (%s[i_%s].flags));\n" n n; + pr " if (!py_e_%s) { PyErr_PrintEx (0); goto out; }\n" n; + pr " PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n; + pr " }\n" | CBBytesIn (n, len) -> pr " py_%s = nbd_internal_py_get_subview (data->view, %s, %s);\n" n n len; pr " if (!py_%s) { PyErr_PrintEx (0); goto out; }\n" n @@ -201,6 +213,7 @@ let List.map ( function | CBArrayAndLen (UInt32 n, _) -> "O", sprintf "py_%s" n + | CBArrayAndLen (Extent64 n, _) -> "O", sprintf "py_%s" n | CBBytesIn (n, _) -> "O", sprintf "py_%s" n | CBInt n -> "i", n | CBInt64 n -> "L", n @@ -249,7 +262,8 @@ let pr " out:\n"; List.iter ( function - | CBArrayAndLen (UInt32 n, _) -> + | CBArrayAndLen (UInt32 n, _) + | CBArrayAndLen (Extent64 n, _) -> pr " Py_XDECREF (py_%s);\n" n | CBBytesIn (n, _) -> pr " Py_XDECREF (py_%s);\n" n @@ -302,6 +316,7 @@ let pr ".callback = %s_wrapper, .free = free_user_data" cbname); pr " };\n" | Enum (n, _) -> pr " int %s;\n" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> pr " uint32_t %s_u32;\n" n; pr " unsigned int %s; /* really uint32_t */\n" n @@ -358,6 +373,7 @@ let "O", sprintf "&%s" n, sprintf "py_%s->buf, py_%s->len" n n | Closure { cbname } -> "O", sprintf "&py_%s_fn" cbname, cbname | Enum (n, _) -> "i", sprintf "&%s" n, n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> "I", sprintf "&%s" n, sprintf "%s_u32" n | Fd n | Int n -> "i", sprintf "&%s" n, n | Int64 n -> "L", sprintf "&%s" n, sprintf "%s_i64" n @@ -449,6 +465,7 @@ let pr " if (!chunk_user_data->view) goto out;\n" ) | Enum _ -> () + | Extent64 _ -> () | Flags (n, _) -> pr " %s_u32 = %s;\n" n n | Fd _ | Int _ -> () | Int64 n -> pr " %s_i64 = %s;\n" n n @@ -547,6 +564,7 @@ let | Closure { cbname } -> pr " free_user_data (%s_user_data);\n" cbname | Enum _ -> () + | Extent64 _ -> () | Flags _ -> () | Fd _ | Int _ -> () | Int64 _ -> () @@ -834,6 +852,7 @@ let | BytesPersistOut (n, _) -> n, None | Closure { cbname } -> cbname, None | Enum (n, _) -> n, None + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> n, None | Fd n | Int n -> n, None | Int64 n -> n, None diff --git a/ocaml/helpers.c b/ocaml/helpers.c index 6568755b..ac8bb0e2 100644 --- a/ocaml/helpers.c +++ b/ocaml/helpers.c @@ -133,6 +133,26 @@ nbd_internal_ocaml_alloc_int64_from_uint32_array (uint32_t *a, size_t len) CAMLreturn (rv); } +value +nbd_internal_ocaml_alloc_extent64_array (nbd_extent *a, size_t len) +{ + CAMLparam0 (); + CAMLlocal3 (s, v, rv); + size_t i; + + rv = caml_alloc (len, 0); + for (i = 0; i < len; ++i) { + s = caml_alloc (2, 0); + v = caml_copy_int64 (a[i].length); + Store_field (s, 0, v); + v = caml_copy_int64 (a[i].flags); + Store_field (s, 1, v); + Store_field (rv, i, s); + } + + CAMLreturn (rv); +} + /* Convert a Unix.sockaddr to a C struct sockaddr. */ void nbd_internal_unix_sockaddr_to_sa (value sockaddrv, diff --git a/ocaml/nbd-c.h b/ocaml/nbd-c.h index 8b0c088d..9a5a80a2 100644 --- a/ocaml/nbd-c.h +++ b/ocaml/nbd-c.h @@ -1,5 +1,5 @@ /* NBD client library in userspace - * Copyright (C) 2013-2019 Red Hat Inc. + * Copyright (C) 2013-2022 Red Hat Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -63,6 +63,7 @@ extern void nbd_internal_ocaml_raise_closed (const char *func) Noreturn; extern const char **nbd_internal_ocaml_string_list (value); extern value nbd_internal_ocaml_alloc_int64_from_uint32_array (uint32_t *, size_t); +extern value nbd_internal_ocaml_alloc_extent64_array (nbd_extent *, size_t); extern void nbd_internal_unix_sockaddr_to_sa (value, struct sockaddr_storage *, socklen_t *); extern void nbd_internal_ocaml_exception_in_wrapper (const char *, value); diff --git a/golang/handle.go b/golang/handle.go index c8d9485d..b8388b61 100644 --- a/golang/handle.go +++ b/golang/handle.go @@ -1,5 +1,5 @@ /* libnbd golang handle. - * Copyright (C) 2013-2021 Red Hat Inc. + * Copyright (C) 2013-2022 Red Hat Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -58,6 +58,12 @@ func (h *Libnbd) String() string { return "&Libnbd{}" } +/* Used for block status callback. */ +type LibnbdExtent struct { + Length uint64 // length of the extent + Flags uint64 // flags describing properties of the extent +} + /* All functions (except Close) return ([result,] LibnbdError). */ type LibnbdError struct { Op string // operation which failed From patchwork Mon Nov 14 22:51:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042992 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CEB1FC433FE for ; Tue, 15 Nov 2022 00:08:39 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidv-0001wv-JM; Mon, 14 Nov 2022 18:13:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouido-0001X6-79 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:24 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJH-00035H-J4 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466330; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1cemSvuHYPXOo5uQfvR6U2aeVGuLoDft6zMTDXWL10Y=; b=AI8Tc9OsDU+MzHVr0lT72n5dO8JLwqS1gLVmEp63MFI5zDpE7jgcxFVHJl4JJRPX5GgkTM YEKXJH0MZLNfj23CSRDmS+8acC/JqwdxDqm/+OQEFeuoLOd4olmlQ/I9UXWRZ5yl/qen6B gyJhe7Y/kRAODGhaNPNreN6BgLom6vA= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-373-ua_psouvN3WsrSJyGMDkFw-1; Mon, 14 Nov 2022 17:52:06 -0500 X-MC-Unique: ua_psouvN3WsrSJyGMDkFw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 91B84196EF8B; Mon, 14 Nov 2022 22:52:06 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2D68440E9786; Mon, 14 Nov 2022 22:52:06 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 08/23] block_status: Track 64-bit extents internally Date: Mon, 14 Nov 2022 16:51:43 -0600 Message-Id: <20221114225158.2186742-9-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org When extended headers are in use, the server can send us 64-bit extents, even for a 32-bit query (if the server knows the entire image is data, for example, or if the metacontext has a status definition that uses more than 32 bits). Also, while most contexts only have 32-bit flags, a server is allowed to negotiate contexts with 64-bit flags when extended headers are in use. Thus, for maximum flexibility, we are better off storing 64-bit data internally, with a goal of letting the client's 32-bit interface work as much as possible, and for a future API addition of a 64-bit interface to work even when the server only gave 32-bit results. For backwards compatibility, a client that never negotiates a 64-bit status context can be handled without errors by truncating any 64-bit lengths down to just under 4G; so the old 32-bit interface will continue to work in most cases. But we can't truncate status down; if a user requests an extended status, the 32-bit interface can now report EOVERFLOW for that context (although that situation can't happen until a later patch actually turns on the use of extended headers). Note that the existing 32-bit nbd_block_status() API is now slightly slower, particularly when talking with a server that lacks extended headers: we are doing double size conversions. But this speed penalty is likely in the noise compared to the network delays, and ideally clients will switch over to the new 64-bit interfaces as more and more servers start supporting extended headers. One of the trickier aspects of this patch is auditing that both the user's extent and our malloc'd shim get cleaned up once on all possible paths, so that there is neither a leak nor a double free. --- lib/internal.h | 8 +++- generator/states-reply-structured.c | 30 ++++++++----- lib/handle.c | 2 +- lib/rw.c | 70 ++++++++++++++++++++++++++++- 4 files changed, 95 insertions(+), 15 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 73fd24c0..0c23f882 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -80,7 +80,7 @@ struct export { struct command_cb { union { - nbd_extent_callback extent; + nbd_extent64_callback extent; nbd_chunk_callback chunk; nbd_list_callback list; nbd_context_callback context; @@ -302,7 +302,11 @@ struct nbd_handle { size_t querynum; /* When receiving block status, this is used. */ - uint32_t *bs_entries; + union { + char *storage; /* malloc's view */ + nbd_extent *normal; /* Our 64-bit preferred internal form; n slots */ + uint32_t *narrow; /* 32-bit NBD_REPLY_TYPE_BLOCK_STATUS form; n*2 slots */ + } bs_entries; /* Commands which are waiting to be issued [meaning the request * packet is sent to the server]. This is used as a simple linked diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index da9894c6..d23e56a9 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -436,6 +436,7 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: struct command *cmd = h->reply_cmd; uint32_t length; + uint32_t count; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -450,15 +451,19 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (length >= 12); length -= sizeof h->sbuf.reply.payload.bs_hdr; + count = length / (2 * sizeof (uint32_t)); - free (h->bs_entries); - h->bs_entries = malloc (length); - if (h->bs_entries == NULL) { + /* Read raw data into a subset of h->bs_entries, then expand it + * into place later during byte-swapping. + */ + free (h->bs_entries.storage); + h->bs_entries.storage = malloc (count * sizeof *h->bs_entries.normal); + if (h->bs_entries.storage == NULL) { SET_NEXT_STATE (%.DEAD); set_error (errno, "malloc"); return 0; } - h->rbuf = h->bs_entries; + h->rbuf = h->bs_entries.narrow; h->rlen = length; SET_NEXT_STATE (%RECV_BS_ENTRIES); } @@ -470,6 +475,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: uint32_t count; size_t i; uint32_t context_id; + uint32_t *raw; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -483,17 +489,21 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); - assert (h->bs_entries); + assert (h->bs_entries.normal); assert (length >= 12); assert (h->meta_valid); count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / - sizeof *h->bs_entries; + (2 * sizeof (uint32_t)); /* Need to byte-swap the entries returned, but apart from that we - * don't validate them. + * don't validate them. Reverse order is essential, since we are + * expanding in-place from narrow to wider type. */ - for (i = 0; i < count; ++i) - h->bs_entries[i] = be32toh (h->bs_entries[i]); + raw = h->bs_entries.narrow; + for (i = count; i-- > 0; ) { + h->bs_entries.normal[i].flags = be32toh (raw[i * 2 + 1]); + h->bs_entries.normal[i].length = be32toh (raw[i * 2]); + } /* Look up the context ID. */ context_id = be32toh (h->sbuf.reply.payload.bs_hdr.context_id); @@ -507,7 +517,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: if (CALL_CALLBACK (cmd->cb.fn.extent, h->meta_contexts.ptr[i].name, cmd->offset, - h->bs_entries, count, + h->bs_entries.normal, count, &error) == -1) if (cmd->error == 0) cmd->error = error ? error : EPROTO; diff --git a/lib/handle.c b/lib/handle.c index 4a186f8f..41e442e7 100644 --- a/lib/handle.c +++ b/lib/handle.c @@ -130,7 +130,7 @@ nbd_close (struct nbd_handle *h) string_vector_iter (&h->querylist, (void *) free); free (h->querylist.ptr); - free (h->bs_entries); + free (h->bs_entries.storage); nbd_internal_reset_size_and_flags (h); for (i = 0; i < h->meta_contexts.len; ++i) free (h->meta_contexts.ptr[i].name); diff --git a/lib/rw.c b/lib/rw.c index 81dded3f..6212bf4c 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -42,6 +42,61 @@ wait_for_command (struct nbd_handle *h, int64_t cookie) return r == -1 ? -1 : 0; } +/* Convert from 64-bit to 32-bit extent callback. */ +static int +nbd_convert_extent (void *data, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error) +{ + nbd_extent_callback *cb = data; + uint32_t *array = malloc (nr_entries * 2 * sizeof *array); + size_t i; + int ret; + bool fail = false; + + if (array == NULL) { + set_error (*error = errno, "malloc"); + return -1; + } + + for (i = 0; i < nr_entries; i++) { + array[i * 2] = entries[i].length; + array[i * 2 + 1] = entries[i].flags; + /* If an extent is larger than 32 bits, silently truncate the rest + * of the server's response; the client can then make progress + * instead of needing to see failure. Rather than track the + * connection's alignment, just clamp the large extent to 4G-64M. + * However, if flags doesn't fit in 32 bits, it's better to inform + * the caller of an EOVERFLOW failure. + * + * Technically, a server with 64-bit answers is non-compliant if + * the client did not negotiate extended headers - contexts that + * include 64-bit flags should not have been negotiated in that + * case. + */ + if (entries[i].length > UINT32_MAX) { + array[i++ * 2] = -MAX_REQUEST_SIZE; + break; + } + if (entries[i].flags > UINT32_MAX) { + *error = EOVERFLOW; + fail = true; + break; + } + } + + ret = CALL_CALLBACK (*cb, metacontext, offset, array, i * 2, error); + free (array); + return fail ? -1 : ret; +} + +static void +nbd_convert_extent_free (void *data) +{ + nbd_extent_callback *cb = data; + FREE_CALLBACK (*cb); + free (cb); +} + /* Issue a read command and wait for the reply. */ int nbd_unlocked_pread (struct nbd_handle *h, void *buf, @@ -487,12 +542,23 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, nbd_completion_callback *completion, uint32_t flags) { - struct command_cb cb = { .fn.extent = *extent, + nbd_extent_callback *shim = malloc (sizeof *shim); + struct command_cb cb = { .fn.extent.callback = nbd_convert_extent, + .fn.extent.user_data = shim, + .fn.extent.free = nbd_convert_extent_free, .completion = *completion }; + if (shim == NULL) { + set_error (errno, "malloc"); + return -1; + } + *shim = *extent; + SET_CALLBACK_TO_NULL (*extent); + if (h->strict & LIBNBD_STRICT_COMMANDS) { if (!h->structured_replies) { set_error (ENOTSUP, "server does not support structured replies"); + FREE_CALLBACK (cb.fn.extent); return -1; } @@ -500,11 +566,11 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, set_error (ENOTSUP, "did not negotiate any metadata contexts, " "either you did not call nbd_add_meta_context before " "connecting or the server does not support it"); + FREE_CALLBACK (cb.fn.extent); return -1; } } - SET_CALLBACK_TO_NULL (*extent); SET_CALLBACK_TO_NULL (*completion); return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, count, EINVAL, NULL, &cb); From patchwork Mon Nov 14 22:51:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2E024C433FE for ; Tue, 15 Nov 2022 00:00:00 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidt-0001uM-Pb; Mon, 14 Nov 2022 18:13:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouido-0001eb-9m for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:24 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJH-00035O-Ll for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466331; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qoAWbegsBTfEv47rBxOGcuvfh/pdMnVmqyolAT0+VnU=; b=ZFURzNBBah6c+cEc+xL/2ZKwZZZKgQThNY7ZPrecwwUvnt7+UCZDw6275nwGbAVfs7XEEa VXFIBqpZK3liXfu7a4sb1pSO8R4ULYirf7Z+oA1BNaWGCuhxJ0/Nph/1PdeqHI5ytJqZ/p BqbXttSbRkpf9OByMWSyiQ8x0VE1zBE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-433-0cOpsB4XOQ2s_ZGD88tnzQ-1; Mon, 14 Nov 2022 17:52:07 -0500 X-MC-Unique: 0cOpsB4XOQ2s_ZGD88tnzQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2A8EE802556; Mon, 14 Nov 2022 22:52:07 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id BBB9840E9794; Mon, 14 Nov 2022 22:52:06 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 09/23] block_status: Accept 64-bit extents during block status Date: Mon, 14 Nov 2022 16:51:44 -0600 Message-Id: <20221114225158.2186742-10-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Support a server giving us a 64-bit extent. Note that the protocol says a server should not give a 64-bit answer when extended headers are not negotiated; we can handle that by reporting EPROTO but otherwise accepting the information. Meanwhile, when extended headers are in effect, even a 32-bit original query can produce a 64-bit answer; and likewise, a 64-bit query may have so much information that the server truncates it to a 32-bit answer, so we must be prepared for either type of response. Since we already store 64-bit extents internally, the user's 32-bit callback doesn't have to care which reply chunk the server uses (the shim takes care of that, and an upcoming patch adds new APIs to let the client use a 64-bit callback). Of course, until a later patch enables extended headers negotiation, no compliant server will trigger the new code here. Implementation-wise, 'normal' and 'wide' are two different types but have the same underlying size; keeping the two names makes it easier to reason about when values are still in network byte order from the server or native endian for local processing. --- lib/internal.h | 3 + generator/states-reply-structured.c | 88 ++++++++++++++++++++++------- 2 files changed, 72 insertions(+), 19 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 0c23f882..b91fe6f6 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -248,6 +248,7 @@ struct nbd_handle { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; struct nbd_structured_reply_block_status_hdr bs_hdr; + struct nbd_structured_reply_block_status_ext_hdr bs_ext_hdr; struct { struct nbd_structured_reply_error error; char msg[NBD_MAX_STRING]; /* Common to all error types */ @@ -306,6 +307,8 @@ struct nbd_handle { char *storage; /* malloc's view */ nbd_extent *normal; /* Our 64-bit preferred internal form; n slots */ uint32_t *narrow; /* 32-bit NBD_REPLY_TYPE_BLOCK_STATUS form; n*2 slots */ + struct nbd_block_descriptor_ext *wide; + /* 64-bit NBD_REPLY_TYPE_BLOCK_STATUS_EXT; n slots */ } bs_entries; /* Commands which are waiting to be issued [meaning the request diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index d23e56a9..7e313b5a 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -22,6 +22,8 @@ #include #include +#include "minmax.h" + /* Structured reply must be completely inside the bounds of the * requesting command. */ @@ -147,6 +149,24 @@ REPLY.STRUCTURED_REPLY.CHECK: /* Start by reading the context ID. */ h->rbuf = &h->sbuf.reply.payload.bs_hdr; h->rlen = sizeof h->sbuf.reply.payload.bs_hdr; + h->sbuf.reply.payload.bs_ext_hdr.count = 0; + SET_NEXT_STATE (%RECV_BS_HEADER); + break; + + case NBD_REPLY_TYPE_BLOCK_STATUS_EXT: + if (cmd->type != NBD_CMD_BLOCK_STATUS || + length < 24 || + (length-8) % sizeof(struct nbd_block_descriptor_ext)) + goto resync; + if (!h->extended_headers) { + debug (h, "unexpected 64-bit block status without extended headers, " + "this is probably a server bug"); + if (cmd->error == 0) + cmd->error = EPROTO; + } + /* Start by reading the context ID. */ + h->rbuf = &h->sbuf.reply.payload.bs_ext_hdr; + h->rlen = sizeof h->sbuf.reply.payload.bs_ext_hdr; SET_NEXT_STATE (%RECV_BS_HEADER); break; @@ -437,6 +457,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: struct command *cmd = h->reply_cmd; uint32_t length; uint32_t count; + uint16_t type; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -446,24 +467,44 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: return 0; case 0: length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ + type = be16toh (h->sbuf.reply.hdr.structured.type); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); - assert (length >= 12); - length -= sizeof h->sbuf.reply.payload.bs_hdr; - count = length / (2 * sizeof (uint32_t)); - /* Read raw data into a subset of h->bs_entries, then expand it + if (type == NBD_REPLY_TYPE_BLOCK_STATUS) { + assert (length >= 12); + length -= sizeof h->sbuf.reply.payload.bs_hdr; + count = length / (2 * sizeof (uint32_t)); + } + else { + assert (type == NBD_REPLY_TYPE_BLOCK_STATUS_EXT); + assert (length >= 24); + length -= sizeof h->sbuf.reply.payload.bs_ext_hdr; + count = length / sizeof (struct nbd_block_descriptor_ext); + if (h->sbuf.reply.payload.bs_ext_hdr.count && + count != be32toh (h->sbuf.reply.payload.bs_ext_hdr.count)) { + h->rbuf = NULL; + h->rlen = length; + SET_NEXT_STATE (%RESYNC); + return 0; + } + } + /* Normalize count for later use. */ + h->sbuf.reply.payload.bs_ext_hdr.count = count; + + /* Read raw data into an overlap of h->bs_entries, then move it * into place later during byte-swapping. */ free (h->bs_entries.storage); - h->bs_entries.storage = malloc (count * sizeof *h->bs_entries.normal); + h->bs_entries.storage = malloc (MAX (count * sizeof *h->bs_entries.normal, + length)); if (h->bs_entries.storage == NULL) { SET_NEXT_STATE (%.DEAD); set_error (errno, "malloc"); return 0; } - h->rbuf = h->bs_entries.narrow; + h->rbuf = h->bs_entries.storage; h->rlen = length; SET_NEXT_STATE (%RECV_BS_ENTRIES); } @@ -471,11 +512,10 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: struct command *cmd = h->reply_cmd; - uint32_t length; uint32_t count; size_t i; uint32_t context_id; - uint32_t *raw; + uint16_t type; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -484,25 +524,35 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ + type = be16toh (h->sbuf.reply.hdr.structured.type); + count = h->sbuf.reply.payload.bs_ext_hdr.count; /* normalized in BS_HEADER */ assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); - assert (h->bs_entries.normal); - assert (length >= 12); + assert (h->bs_entries.normal && count); assert (h->meta_valid); - count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / - (2 * sizeof (uint32_t)); /* Need to byte-swap the entries returned, but apart from that we - * don't validate them. Reverse order is essential, since we are - * expanding in-place from narrow to wider type. + * don't validate them. */ - raw = h->bs_entries.narrow; - for (i = count; i-- > 0; ) { - h->bs_entries.normal[i].flags = be32toh (raw[i * 2 + 1]); - h->bs_entries.normal[i].length = be32toh (raw[i * 2]); + if (type == NBD_REPLY_TYPE_BLOCK_STATUS) { + uint32_t *raw = h->bs_entries.narrow; + + /* Expanding in-place from narrow to wide, must use reverse order. */ + for (i = count; i-- > 0; ) { + h->bs_entries.normal[i].flags = be32toh (raw[i * 2 + 1]); + h->bs_entries.normal[i].length = be32toh (raw[i * 2]); + } + } + else { + struct nbd_block_descriptor_ext *wide = h->bs_entries.wide; + + assert (type == NBD_REPLY_TYPE_BLOCK_STATUS_EXT); + for (i = 0; i < count; i++) { + h->bs_entries.normal[i].length = be64toh (wide[i].length); + h->bs_entries.normal[i].flags = be64toh (wide[i].status_flags); + } } /* Look up the context ID. */ From patchwork Mon Nov 14 22:51:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042976 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DF888C4332F for ; Mon, 14 Nov 2022 23:59:07 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidt-0001t2-4o; Mon, 14 Nov 2022 18:13:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidn-0001eb-DU for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJJ-000361-2K for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466332; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=seDiZqQnIf3SgjsUnP5k3NnNxv90qCNsxvo5UmDgIhI=; b=EycBA16d0Ew6929JuhLg4EPZo0+AiYemPmj8tuXfN0i+2ADj8HEYF2MVRLgeaupouzIUbV I2s4rgh5K5aHgyVhNrm2NQGpoJZSrtkHymbTkEjMa8QBlQOgP7rbcS93HgA/LN27ceAN0E 5ujX3rLraZThcqFVuZCoFbOF2GWKQ9Y= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-25-HbXOJNWKNHa6t5LeGSZ_Mg-1; Mon, 14 Nov 2022 17:52:08 -0500 X-MC-Unique: HbXOJNWKNHa6t5LeGSZ_Mg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B5E5387B2A0; Mon, 14 Nov 2022 22:52:07 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5037C40E9787; Mon, 14 Nov 2022 22:52:07 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 10/23] api: Add [aio_]nbd_block_status_64 Date: Mon, 14 Nov 2022 16:51:45 -0600 Message-Id: <20221114225158.2186742-11-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Overcome the inherent 32-bit limitation of our existing nbd_block_status command by adding a 64-bit variant. The command sent to the server does not change, but the user's callback is now handed 64-bit information regardless of whether the server replies with 32- or 64-bit extents. Unit tests prove that the new API works in each of C, Python, OCaml, and Go bindings. We can also get rid of the temporary hack added to appease the compiler in an earlier patch. --- docs/libnbd.pod | 18 +-- sh/nbdsh.pod | 2 +- generator/API.ml | 144 +++++++++++++++++++--- generator/OCaml.ml | 1 - generator/Python.ml | 1 - lib/rw.c | 48 ++++++-- python/t/465-block-status-64.py | 56 +++++++++ ocaml/tests/Makefile.am | 1 + ocaml/tests/test_465_block_status_64.ml | 58 +++++++++ tests/meta-base-allocation.c | 111 +++++++++++++++-- fuzzing/libnbd-fuzz-wrapper.c | 22 +++- golang/Makefile.am | 1 + golang/libnbd_465_block_status_64_test.go | 119 ++++++++++++++++++ 13 files changed, 534 insertions(+), 48 deletions(-) create mode 100644 python/t/465-block-status-64.py create mode 100644 ocaml/tests/test_465_block_status_64.ml create mode 100644 golang/libnbd_465_block_status_64_test.go diff --git a/docs/libnbd.pod b/docs/libnbd.pod index 170fc1fa..b57cfa46 100644 --- a/docs/libnbd.pod +++ b/docs/libnbd.pod @@ -661,14 +661,14 @@ In order to utilize block status, the client must call L prior to connecting, for each meta context in which it is interested, then check L after connection to see which contexts the server actually supports. If a -context is supported, the client can then use L -with a callback function that will receive an array of 32-bit integer -pairs describing consecutive extents within a context. In each pair, -the first integer is the length of the extent, the second is a bitmask -description of that extent (for the "base:allocation" context, the -bitmask may include C for unallocated portions of -the file, and/or C for portions of the file known -to read as zero). +context is supported, the client can then use +L with a callback function that will receive +an array of structs describing consecutive extents within a context. +Each struct gives the length of the extent, then a bitmask description +of that extent (for the "base:allocation" context, the bitmask may +include C for unallocated portions of the file, +and/or C for portions of the file known to read as +zero). There is a full example of requesting meta context and using block status available at @@ -917,7 +917,7 @@ will tie up resources until L is eventually reached). =head2 Callbacks with C parameter Some of the high-level commands (L, -L) involve the use of a callback function invoked +L) involve the use of a callback function invoked by the state machine at appropriate points in the server's reply before the overall command is complete. These callback functions, along with all of the completion callbacks, include a parameter diff --git a/sh/nbdsh.pod b/sh/nbdsh.pod index 4d14118c..666a31b6 100644 --- a/sh/nbdsh.pod +++ b/sh/nbdsh.pod @@ -59,7 +59,7 @@ Display brief command line help and exit. =item B<--base-allocation> Request the use of the "base:allocation" meta context, which is the -most common context used with L. This is +most common context used with L. This is equivalent to calling S> in the shell prior to connecting, and works even when combined with C<--uri> (while diff --git a/generator/API.ml b/generator/API.ml index 85509867..aeee41fb 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -1565,7 +1565,7 @@ "add_meta_context", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L command (for C<\"base:allocation\"> +by the L command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This call adds one metadata context to the list to be negotiated. @@ -1592,7 +1592,7 @@ "add_meta_context", { Other metadata contexts are server-specific, but include C<\"qemu:dirty-bitmap:...\"> and C<\"qemu:allocation-depth\"> for qemu-nbd (see qemu-nbd I<-B> and I<-A> options)."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "get_nr_meta_contexts"; Link "get_meta_context"; Link "clear_meta_contexts"]; }; @@ -1605,14 +1605,14 @@ "get_nr_meta_contexts", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L command (for C<\"base:allocation\"> +by the L command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This command returns how many meta contexts have been added to the list to request from the server via L. The server is not obligated to honor all of the requests; to see what it actually supports, see L."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "add_meta_context"; Link "get_meta_context"; Link "clear_meta_contexts"]; }; @@ -1625,13 +1625,13 @@ "get_meta_context", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L command (for C<\"base:allocation\"> +by the L command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This command returns the i'th meta context request, as added by L, and bounded by L."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "add_meta_context"; Link "get_nr_meta_contexts"; Link "clear_meta_contexts"]; }; @@ -1645,7 +1645,7 @@ "clear_meta_contexts", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L command (for C<\"base:allocation\"> +by the L command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This command resets the list of meta contexts to request back to @@ -1654,7 +1654,7 @@ "clear_meta_contexts", { negotiation mode is selected (see L), for altering the list of attempted contexts between subsequent export queries."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "add_meta_context"; Link "get_nr_meta_contexts"; Link "get_meta_context"; Link "set_opt_mode"]; }; @@ -2233,7 +2233,7 @@ "can_meta_context", { ^ non_blocking_test_call_description; see_also = [SectionLink "Flag calls"; Link "opt_info"; Link "add_meta_context"; - Link "block_status"; Link "aio_block_status"; + Link "block_status_64"; Link "aio_block_status_64"; Link "set_request_meta_context"; Link "opt_set_meta_context"]; }; @@ -2664,7 +2664,7 @@ "block_status", { optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; ret = RErr; permitted_states = [ Connected ]; - shortdesc = "send block status command to the NBD server"; + shortdesc = "send block status command to the NBD server, with 32-bit callback"; longdesc = "\ Issue the block status command to the NBD server. If supported by the server, this causes metadata context @@ -2679,7 +2679,15 @@ "block_status", { The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum block status size, although a future extension may add a constraint visible in -L. +L. Furthermore, this function is inherently +limited to 32-bit values. If the server replies with a larger +extent, the length of that extent will be truncated to just +below 32 bits and any further extents from the server will be +ignored. If the server replies with a status value larger than +32 bits (only possible when extended headers are in use), the +callback function will be passed an C error. To get +the full extent information from a server that supports 64-bit +extents, you must use L. Depending on which metadata contexts were enabled before connecting (see L) and which are @@ -2699,7 +2707,7 @@ "block_status", { array is an array of pairs of integers with the first entry in each pair being the length (in bytes) of the block and the second entry being a status/flags field which is specific to the metadata context. -(The number of pairs passed to the function is C.) The +The number of pairs passed to the function is C. The NBD protocol document in the section about C describes the meaning of this array; for contexts known to libnbd, Blibnbd.hE> contains constants @@ -2723,10 +2731,79 @@ "block_status", { does not exceed C bytes; however, libnbd does not validate that the server obeyed the flag." ^ strict_call_description; - see_also = [Link "add_meta_context"; Link "can_meta_context"; + see_also = [Link "block_status_64"; + Link "add_meta_context"; Link "can_meta_context"; Link "aio_block_status"; Link "set_strict_mode"]; }; + "block_status_64", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; Closure extent64_closure ]; + optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; + ret = RErr; + permitted_states = [ Connected ]; + shortdesc = "send block status command to the NBD server, with 64-bit callback"; + longdesc = "\ +Issue the block status command to the NBD server. If +supported by the server, this causes metadata context +information about blocks beginning from the specified +offset to be returned. The C parameter is a hint: the +server may choose to return less status, or the final block +may extend beyond the requested range. If multiple contexts +are supported, the number of blocks and cumulative length +of those blocks need not be identical between contexts. + +Note that not all servers can support a C of 4GiB or larger. +The NBD protocol does not yet have a way for a client to learn if +the server will enforce an even smaller maximum block status size, +although a future extension may add a constraint visible in +L. + +Depending on which metadata contexts were enabled before +connecting (see L) and which are +supported by the server (see L) this call +returns information about extents by calling back to the +C function. The callback cannot call C APIs on the +same handle since it holds the handle lock and will +cause a deadlock. If the callback returns C<-1>, and no earlier +error has been detected, then the overall block status command +will fail with any non-zero value stored into the callback's +C parameter (with a default of C); but any further +contexts will still invoke the callback. + +The C function is called once per type of metadata available, +with the C passed to this function. The C +parameter is a string such as C<\"base:allocation\">. The C +array is an array of B structs, containing length (in bytes) +of the block and a status/flags field which is specific to the metadata +context. The number of array entries passed to the function is +C. The NBD protocol document in the section about +C describes the meaning of this array; +for contexts known to libnbd, Blibnbd.hE> contains constants +beginning with C that may help decipher the values. +On entry to the callback, the C parameter contains the errno +value of any previously detected error. + +It is possible for the extent function to be called +more times than you expect (if the server is buggy), +so always check the C field to ensure you +are receiving the data you expect. It is also possible +that the extent function is not called at all, even for +metadata contexts that you requested. This indicates +either that the server doesn't support the context +or for some other reason cannot return the data. + +The C parameter may be C<0> for no flags, or may contain +C meaning that the server should +return only one extent per metadata context where that extent +does not exceed C bytes; however, libnbd does not +validate that the server obeyed the flag." +^ strict_call_description; + see_also = [Link "block_status"; + Link "add_meta_context"; Link "can_meta_context"; + Link "aio_block_status_64"; Link "set_strict_mode"]; + }; + "poll", { default_call with args = [ Int "timeout" ]; ret = RInt; @@ -3321,7 +3398,7 @@ "aio_block_status", { OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; ret = RCookie; permitted_states = [ Connected ]; - shortdesc = "send block status command to the NBD server"; + shortdesc = "send block status command to the NBD server, with 32-bit callback"; longdesc = "\ Send the block status command to the NBD server. @@ -3329,13 +3406,48 @@ "aio_block_status", { Or supply the optional C which will be invoked as described in L. -Other parameters behave as documented in L." +Other parameters behave as documented in L. + +This function is inherently limited to 32-bit values. If the +server replies with a larger extent, the length of that extent +will be truncated to just below 32 bits and any further extents +from the server will be ignored. If the server replies with a +status value larger than 32 bits (only possible when extended +headers are in use), the callback function will be passed an +C error. To get the full extent information from a +server that supports 64-bit extents, you must use +L. +" ^ strict_call_description; see_also = [SectionLink "Issuing asynchronous commands"; + Link "aio_block_status_64"; Link "can_meta_context"; Link "block_status"; Link "set_strict_mode"]; }; + "aio_block_status_64", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; Closure extent64_closure ]; + optargs = [ OClosure completion_closure; + OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; + ret = RCookie; + permitted_states = [ Connected ]; + shortdesc = "send block status command to the NBD server, with 64-bit callback"; + longdesc = "\ +Send the block status command to the NBD server. + +To check if the command completed, call L. +Or supply the optional C which will be invoked +as described in L. + +Other parameters behave as documented in L." +^ strict_call_description; + see_also = [SectionLink "Issuing asynchronous commands"; + Link "aio_block_status"; + Link "can_meta_context"; Link "block_status_64"; + Link "set_strict_mode"]; + }; + "aio_get_fd", { default_call with args = []; ret = RFd; @@ -3858,6 +3970,8 @@ let first_version = "aio_opt_structured_reply", (1, 16); "opt_starttls", (1, 16); "aio_opt_starttls", (1, 16); + "block_status_64", (1, 16); + "aio_block_status_64", (1, 16); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/generator/OCaml.ml b/generator/OCaml.ml index 66dc6cad..360a75a9 100644 --- a/generator/OCaml.ml +++ b/generator/OCaml.ml @@ -583,7 +583,6 @@ let pr "}\n"; pr "\n"; pr "static int\n"; - pr "__attribute__((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; diff --git a/generator/Python.ml b/generator/Python.ml index 980fdcea..482cdbda 100644 --- a/generator/Python.ml +++ b/generator/Python.ml @@ -153,7 +153,6 @@ let let print_python_closure_wrapper { cbname; cbargs } = pr "/* Wrapper for %s callback. */\n" cbname; pr "static int\n"; - pr "__attribute__((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; diff --git a/lib/rw.c b/lib/rw.c index 6212bf4c..26533836 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -205,7 +205,7 @@ nbd_unlocked_zero (struct nbd_handle *h, return wait_for_command (h, cookie); } -/* Issue a block status command and wait for the reply. */ +/* Issue a block status command and wait for the reply, 32-bit callback. */ int nbd_unlocked_block_status (struct nbd_handle *h, uint64_t count, uint64_t offset, @@ -223,6 +223,25 @@ nbd_unlocked_block_status (struct nbd_handle *h, return wait_for_command (h, cookie); } +/* Issue a block status command and wait for the reply, 64-bit callback. */ +int +nbd_unlocked_block_status_64 (struct nbd_handle *h, + uint64_t count, uint64_t offset, + nbd_extent64_callback *extent64, + uint32_t flags) +{ + int64_t cookie; + nbd_completion_callback c = NBD_NULL_COMPLETION; + + cookie = nbd_unlocked_aio_block_status_64 (h, count, offset, extent64, &c, + flags); + if (cookie == -1) + return -1; + + assert (CALLBACK_IS_NULL (*extent64)); + return wait_for_command (h, cookie); +} + /* count_err represents the errno to return if bounds check fail */ int64_t nbd_internal_command_common (struct nbd_handle *h, @@ -543,10 +562,10 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, uint32_t flags) { nbd_extent_callback *shim = malloc (sizeof *shim); - struct command_cb cb = { .fn.extent.callback = nbd_convert_extent, - .fn.extent.user_data = shim, - .fn.extent.free = nbd_convert_extent_free, - .completion = *completion }; + nbd_extent64_callback wrapper = { .callback = nbd_convert_extent, + .user_data = shim, + .free = nbd_convert_extent_free, }; + int ret; if (shim == NULL) { set_error (errno, "malloc"); @@ -555,10 +574,25 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, *shim = *extent; SET_CALLBACK_TO_NULL (*extent); + ret = nbd_unlocked_aio_block_status_64 (h, count, offset, &wrapper, + completion, flags); + FREE_CALLBACK (wrapper); + return ret; +} + +int64_t +nbd_unlocked_aio_block_status_64 (struct nbd_handle *h, + uint64_t count, uint64_t offset, + nbd_extent64_callback *extent64, + nbd_completion_callback *completion, + uint32_t flags) +{ + struct command_cb cb = { .fn.extent = *extent64, + .completion = *completion }; + if (h->strict & LIBNBD_STRICT_COMMANDS) { if (!h->structured_replies) { set_error (ENOTSUP, "server does not support structured replies"); - FREE_CALLBACK (cb.fn.extent); return -1; } @@ -566,11 +600,11 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, set_error (ENOTSUP, "did not negotiate any metadata contexts, " "either you did not call nbd_add_meta_context before " "connecting or the server does not support it"); - FREE_CALLBACK (cb.fn.extent); return -1; } } + SET_CALLBACK_TO_NULL (*extent64); SET_CALLBACK_TO_NULL (*completion); return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, count, EINVAL, NULL, &cb); diff --git a/python/t/465-block-status-64.py b/python/t/465-block-status-64.py new file mode 100644 index 00000000..6fba9c3e --- /dev/null +++ b/python/t/465-block-status-64.py @@ -0,0 +1,56 @@ +# libnbd Python bindings +# Copyright (C) 2010-2022 Red Hat Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +import os + +import nbd + +script = "%s/../tests/meta-base-allocation.sh" % os.getenv("srcdir", ".") + +h = nbd.NBD() +h.add_meta_context("base:allocation") +h.connect_command(["nbdkit", "-s", "--exit-with-parent", "-v", "sh", script]) + +entries = [] + + +def f(user_data, metacontext, offset, e, err): + global entries + assert user_data == 42 + assert err.value == 0 + if metacontext != "base:allocation": + return + entries = e + + +h.block_status_64(65536, 0, lambda *args: f(42, *args)) +print("entries = %r" % entries) +assert entries == [(8192, 0), + (8192, 1), + (16384, 3), + (16384, 2), + (16384, 0)] + +h.block_status_64(1024, 32256, lambda *args: f(42, *args)) +print("entries = %r" % entries) +assert entries == [(512, 3), + (16384, 2)] + +h.block_status_64(1024, 32256, lambda *args: f(42, *args), + nbd.CMD_FLAG_REQ_ONE) +print("entries = %r" % entries) +assert entries == [(512, 3)] diff --git a/ocaml/tests/Makefile.am b/ocaml/tests/Makefile.am index 2cd36eb0..fbd53ac9 100644 --- a/ocaml/tests/Makefile.am +++ b/ocaml/tests/Makefile.am @@ -39,6 +39,7 @@ ML_TESTS = \ test_405_pread_structured.ml \ test_410_pwrite.ml \ test_460_block_status.ml \ + test_465_block_status_64.ml \ test_500_aio_pread.ml \ test_505_aio_pread_structured_callback.ml \ test_510_aio_pwrite.ml \ diff --git a/ocaml/tests/test_465_block_status_64.ml b/ocaml/tests/test_465_block_status_64.ml new file mode 100644 index 00000000..2f36a692 --- /dev/null +++ b/ocaml/tests/test_465_block_status_64.ml @@ -0,0 +1,58 @@ +(* hey emacs, this is OCaml code: -*- tuareg -*- *) +(* libnbd OCaml test case + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + *) + +open Printf + +let script = + try + let srcdir = Sys.getenv "srcdir" in + sprintf "%s/../../tests/meta-base-allocation.sh" srcdir + with + Not_found -> failwith "error: srcdir is not defined" + +let entries = ref [||] +let f user_data metacontext offset e err = + assert (user_data = 42); + assert (!err = 0); + if metacontext = "base:allocation" then + entries := e; + 0 + +let () = + let nbd = NBD.create () in + NBD.add_meta_context nbd "base:allocation"; + NBD.connect_command nbd ["nbdkit"; "-s"; "--exit-with-parent"; "-v"; + "sh"; script]; + + NBD.block_status_64 nbd 65536_L 0_L (f 42); + assert (!entries = [| 8192_L, 0_L; + 8192_L, 1_L; + 16384_L, 3_L; + 16384_L, 2_L; + 16384_L, 0_L |]); + + NBD.block_status_64 nbd 1024_L 32256_L (f 42); + assert (!entries = [| 512_L, 3_L; + 16384_L, 2_L |]); + + let flags = let open NBD.CMD_FLAG in [REQ_ONE] in + NBD.block_status_64 nbd 1024_L 32256_L (f 42) ~flags; + assert (!entries = [| 512_L, 3_L |]) + +let () = Gc.compact () diff --git a/tests/meta-base-allocation.c b/tests/meta-base-allocation.c index 401c0c88..c8f14929 100644 --- a/tests/meta-base-allocation.c +++ b/tests/meta-base-allocation.c @@ -1,5 +1,5 @@ /* NBD client library in userspace - * Copyright (C) 2013-2020 Red Hat Inc. + * Copyright (C) 2013-2022 Red Hat Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -32,10 +32,13 @@ #define BOGUS_CONTEXT "x-libnbd:nosuch" -static int check_extent (void *data, - const char *metacontext, - uint64_t offset, - uint32_t *entries, size_t nr_entries, int *error); +static int check_extent32 (void *data, const char *metacontext, + uint64_t offset, + uint32_t *entries, size_t nr_entries, int *error); + +static int check_extent64 (void *data, const char *metacontext, + uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error); int main (int argc, char *argv[]) @@ -149,27 +152,51 @@ main (int argc, char *argv[]) /* Read the block status. */ id = 1; if (nbd_block_status (nbd, 65536, 0, - (nbd_extent_callback) { .callback = check_extent, .user_data = &id }, + (nbd_extent_callback) { .callback = check_extent32, + .user_data = &id }, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } + if (nbd_block_status_64 (nbd, 65536, 0, + (nbd_extent64_callback) { .callback = check_extent64, + .user_data = &id }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } id = 2; if (nbd_block_status (nbd, 1024, 32768-512, - (nbd_extent_callback) { .callback = check_extent, .user_data = &id }, + (nbd_extent_callback) { .callback = check_extent32, + .user_data = &id }, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } + if (nbd_block_status_64 (nbd, 1024, 32768-512, + (nbd_extent64_callback) { .callback = check_extent64, + .user_data = &id }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } id = 3; if (nbd_block_status (nbd, 1024, 32768-512, - (nbd_extent_callback) { .callback = check_extent, .user_data = &id }, + (nbd_extent_callback) { .callback = check_extent32, + .user_data = &id }, LIBNBD_CMD_FLAG_REQ_ONE) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } + if (nbd_block_status_64 (nbd, 1024, 32768-512, + (nbd_extent64_callback) { .callback = check_extent64, + .user_data = &id }, + LIBNBD_CMD_FLAG_REQ_ONE) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } if (nbd_shutdown (nbd, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); @@ -181,10 +208,8 @@ main (int argc, char *argv[]) } static int -check_extent (void *data, - const char *metacontext, - uint64_t offset, - uint32_t *entries, size_t nr_entries, int *error) +check_extent32 (void *data, const char *metacontext, uint64_t offset, + uint32_t *entries, size_t nr_entries, int *error) { size_t i; int id; @@ -238,3 +263,65 @@ check_extent (void *data, return 0; } + +static int +check_extent64 (void *data, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error) +{ + size_t i; + int id; + + id = * (int *)data; + + printf ("extent: id=%d, metacontext=%s, offset=%" PRIu64 ", " + "nr_entries=%zu, error=%d\n", + id, metacontext, offset, nr_entries, *error); + + assert (*error == 0); + if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) { + for (i = 0; i < nr_entries; i++) { + printf ("\t%zu\tlength=%" PRIu64 ", status=%" PRIu64 "\n", + i, entries[i].length, entries[i].flags); + } + fflush (stdout); + + switch (id) { + case 1: + assert (nr_entries == 5); + assert (entries[0].length == 8192); + assert (entries[0].flags == 0); + assert (entries[1].length == 8192); + assert (entries[1].flags == LIBNBD_STATE_HOLE); + assert (entries[2].length == 16384); + assert (entries[2].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + assert (entries[3].length == 16384); + assert (entries[3].flags == LIBNBD_STATE_ZERO); + assert (entries[4].length == 16384); + assert (entries[4].flags == 0); + break; + + case 2: + assert (nr_entries == 2); + assert (entries[0].length == 512); + assert (entries[0].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + assert (entries[1].length == 16384); + assert (entries[1].flags == LIBNBD_STATE_ZERO); + break; + + case 3: + assert (nr_entries == 1); + assert (entries[0].length == 512); + assert (entries[0].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + break; + + default: + abort (); + } + + } + else + fprintf (stderr, "warning: ignored unexpected meta context %s\n", + metacontext); + + return 0; +} diff --git a/fuzzing/libnbd-fuzz-wrapper.c b/fuzzing/libnbd-fuzz-wrapper.c index d8cb7cb0..e7cf7fe9 100644 --- a/fuzzing/libnbd-fuzz-wrapper.c +++ b/fuzzing/libnbd-fuzz-wrapper.c @@ -1,5 +1,5 @@ /* NBD client library in userspace - * Copyright (C) 2013-2019 Red Hat Inc. + * Copyright (C) 2013-2022 Red Hat Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -165,6 +165,17 @@ extent_callback (void *user_data, return 0; } +/* Block status (extents64) callback, does nothing. */ +static int +extent64_callback (void *user_data, + const char *metacontext, + uint64_t offset, nbd_extent *entries, + size_t nr_entries, int *error) +{ + //fprintf (stderr, "extent64 called, nr_entries = %zu\n", nr_entries); + return 0; +} + /* This is the client (parent process) running libnbd. */ static char buf[512]; static char prbuf[65536]; @@ -213,7 +224,7 @@ client (int sock) }, 0); - /* Test block status. */ + /* Test both sizes of block status. */ nbd_block_status (nbd, length, 0, (nbd_extent_callback) { .callback = extent_callback, @@ -221,6 +232,13 @@ client (int sock) .free = NULL }, 0); + nbd_block_status_64 (nbd, length, 0, + (nbd_extent64_callback) { + .callback = extent64_callback, + .user_data = NULL, + .free = NULL + }, + 0); nbd_shutdown (nbd, 0); } diff --git a/golang/Makefile.am b/golang/Makefile.am index 0808a639..d342b51c 100644 --- a/golang/Makefile.am +++ b/golang/Makefile.am @@ -43,6 +43,7 @@ source_files = \ libnbd_405_pread_structured_test.go \ libnbd_410_pwrite_test.go \ libnbd_460_block_status_test.go \ + libnbd_465_block_status_64_test.go \ libnbd_500_aio_pread_test.go \ libnbd_510_aio_pwrite_test.go \ libnbd_590_aio_copy_test.go \ diff --git a/golang/libnbd_465_block_status_64_test.go b/golang/libnbd_465_block_status_64_test.go new file mode 100644 index 00000000..066cccbf --- /dev/null +++ b/golang/libnbd_465_block_status_64_test.go @@ -0,0 +1,119 @@ +/* libnbd golang tests + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +package libnbd + +import ( + "fmt" + "os" + "strings" + "testing" +) + +var entries64 []LibnbdExtent + +func mcf64(metacontext string, offset uint64, e []LibnbdExtent, error *int) int { + if *error != 0 { + panic("expected *error == 0") + } + if metacontext == "base:allocation" { + entries64 = e + } + return 0 +} + +// Seriously WTF? +func mc64_compare(a1 []LibnbdExtent, a2 []LibnbdExtent) bool { + if len(a1) != len(a2) { + return false + } + for i := 0; i < len(a1); i++ { + if a1[i] != a2[i] { + return false + } + } + return true +} + +func mc64_to_string(a []LibnbdExtent) string { + ss := make([]string, len(a)) + for i := 0; i < len(a); i++ { + ss[i] = fmt.Sprintf("%#v", a[i]) + } + return strings.Join(ss, ", ") +} + +func Test465BlockStatus64(t *testing.T) { + srcdir := os.Getenv("abs_top_srcdir") + script := srcdir + "/tests/meta-base-allocation.sh" + + h, err := Create() + if err != nil { + t.Fatalf("could not create handle: %s", err) + } + defer h.Close() + + err = h.AddMetaContext("base:allocation") + if err != nil { + t.Fatalf("%s", err) + } + err = h.ConnectCommand([]string{ + "nbdkit", "-s", "--exit-with-parent", "-v", + "sh", script, + }) + if err != nil { + t.Fatalf("%s", err) + } + + err = h.BlockStatus64(65536, 0, mcf64, nil) + if err != nil { + t.Fatalf("%s", err) + } + if !mc64_compare(entries64, []LibnbdExtent{ + {8192, 0}, + {8192, 1}, + {16384, 3}, + {16384, 2}, + {16384, 0}, + }) { + t.Fatalf("unexpected entries (1): %s", mc64_to_string(entries64)) + } + + err = h.BlockStatus64(1024, 32256, mcf64, nil) + if err != nil { + t.Fatalf("%s", err) + } + if !mc64_compare(entries64, []LibnbdExtent{ + {512, 3}, + {16384, 2}, + }) { + t.Fatalf("unexpected entries (2): %s", mc64_to_string(entries64)) + } + + var optargs BlockStatus64Optargs + optargs.FlagsSet = true + optargs.Flags = CMD_FLAG_REQ_ONE + err = h.BlockStatus64(1024, 32256, mcf64, &optargs) + if err != nil { + t.Fatalf("%s", err) + } + if !mc64_compare(entries64, []LibnbdExtent{{512, 3}}) { + t.Fatalf("unexpected entries (3): %s", mc64_to_string(entries64)) + } + +} From patchwork Mon Nov 14 22:51:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042929 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3FF7BC4332F for ; Mon, 14 Nov 2022 23:32:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidn-0001ji-7w; Mon, 14 Nov 2022 18:13:23 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidl-0001eb-0S for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:21 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJR-00038O-JG for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466339; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2UZ3+3XiNyWGBAYAGfP+9zc44+qr/0xwK7n95Q82oNs=; b=ASCOpRNSwWI6+SRzzlQ/2YymNHbFQXELcKtBOG9Cu5M2LNEeBGstUJzYukWYaiYrbA3fZi s4gtQgOC12Q7n6+ziS/gDUN+X8g2p+EaFjjD5LnmRxVvEg3ZAXp7bwIlHHAi0XbN+UkvQH XtdZcMhMaJImwpUV8BqgdqAVWoPIm0Q= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-564-4q8RnqxQMPGUta7j2eXmuw-1; Mon, 14 Nov 2022 17:52:15 -0500 X-MC-Unique: 4q8RnqxQMPGUta7j2eXmuw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CED82101E14C; Mon, 14 Nov 2022 22:52:08 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6CA3440E9786; Mon, 14 Nov 2022 22:52:08 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 12/23] copy: Update nbdcopy to use 64-bit block status Date: Mon, 14 Nov 2022 16:51:47 -0600 Message-Id: <20221114225158.2186742-13-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Although our use of "base:allocation" doesn't require the use of the 64-bit API for flags, we might perform slightly faster for a server that does give us 64-bit extent lengths and honors larger nbd_zero lengths. --- copy/nbd-ops.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c index 34ab4857..dad78ea9 100644 --- a/copy/nbd-ops.c +++ b/copy/nbd-ops.c @@ -428,7 +428,7 @@ nbd_ops_asynch_notify_write (struct rw *rw, size_t index) * request for extents in a single round trip. */ static int add_extent (void *vp, const char *metacontext, - uint64_t offset, uint32_t *entries, size_t nr_entries, + uint64_t offset, nbd_extent *entries, size_t nr_entries, int *error); static void @@ -449,11 +449,11 @@ nbd_ops_get_extents (struct rw *rw, size_t index, size_t i; exts.len = 0; - if (nbd_block_status (nbd, count, offset, - (nbd_extent_callback) { - .user_data = &exts, - .callback = add_extent - }, 0) == -1) { + if (nbd_block_status_64 (nbd, count, offset, + (nbd_extent64_callback) { + .user_data = &exts, + .callback = add_extent + }, 0) == -1) { /* XXX We could call default_get_extents, but unclear if it's * the right thing to do if the server is returning errors. */ @@ -493,7 +493,7 @@ nbd_ops_get_extents (struct rw *rw, size_t index, static int add_extent (void *vp, const char *metacontext, - uint64_t offset, uint32_t *entries, size_t nr_entries, + uint64_t offset, nbd_extent *entries, size_t nr_entries, int *error) { extent_list *ret = vp; @@ -502,25 +502,25 @@ add_extent (void *vp, const char *metacontext, if (strcmp (metacontext, "base:allocation") != 0 || *error) return 0; - for (i = 0; i < nr_entries; i += 2) { + for (i = 0; i < nr_entries; i++) { struct extent e; e.offset = offset; - e.length = entries[i]; + e.length = entries[i].length; /* Note we deliberately don't care about the HOLE flag. There is * no need to read extent that reads as zeroes. We will convert * to it to a hole or allocated extents based on the command line * arguments. */ - e.zero = (entries[i+1] & LIBNBD_STATE_ZERO) != 0; + e.zero = (entries[i].flags & LIBNBD_STATE_ZERO) != 0; if (extent_list_append (ret, e) == -1) { perror ("realloc"); exit (EXIT_FAILURE); } - offset += entries[i]; + offset += entries[i].length; } return 0; From patchwork Mon Nov 14 22:51:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042982 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4A5B9C4332F for ; Tue, 15 Nov 2022 00:06:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidv-0001vj-GA; Mon, 14 Nov 2022 18:13:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidn-0001df-Ug for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJH-00035d-Uz for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:20 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466331; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1kFZ7Yv3HNPzLl2w5aNu6KklNLzOrQIqVYaO4PqAHyk=; b=aQELNAqPbdgiF9H6uTOvhcrrVxqjlsRDYl4r3/dBOOD+JepKxOm1DSISi4uUkwn/tv1jb4 qNIl4yCXbbvM8Kj0nlOd16jz556xu4sHspFzEIHamOQrSjo2zefeB5UeuMVITtPOsGolZF nnyeGmX2TGnmr43mbYu8lCUNBU6lu3w= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-142-SYJVdRt5NNWZYYQuCjmh_Q-1; Mon, 14 Nov 2022 17:52:09 -0500 X-MC-Unique: SYJVdRt5NNWZYYQuCjmh_Q-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6164A2A5955B; Mon, 14 Nov 2022 22:52:09 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id F29C340E9786; Mon, 14 Nov 2022 22:52:08 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 13/23] dump: Update nbddump to use 64-bit block status Date: Mon, 14 Nov 2022 16:51:48 -0600 Message-Id: <20221114225158.2186742-14-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Although our use of "base:allocation" doesn't require the use of the 64-bit API for flags, we might perform slightly faster for a server that does give us 64-bit extent lengths. --- dump/dump.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/dump/dump.c b/dump/dump.c index bdbc9040..0427ab86 100644 --- a/dump/dump.c +++ b/dump/dump.c @@ -38,7 +38,7 @@ #include "version.h" #include "vector.h" -DEFINE_VECTOR_TYPE (uint32_vector, uint32_t) +DEFINE_VECTOR_TYPE (uint64_vector, uint64_t) static const char *progname; static struct nbd_handle *nbd; @@ -262,10 +262,10 @@ catch_signal (int sig) static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, + nbd_extent *entries, size_t nr_entries, int *error) { - uint32_vector *list = user_data; + uint64_vector *list = user_data; size_t i; if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) != 0) @@ -273,7 +273,8 @@ extent_callback (void *user_data, const char *metacontext, /* Just append the entries we got to the list. */ for (i = 0; i < nr_entries; ++i) { - if (uint32_vector_append (list, entries[i]) == -1) { + if (uint64_vector_append (list, entries[i].length) == -1 || + uint64_vector_append (list, entries[i].flags) == -1) { perror ("realloc"); exit (EXIT_FAILURE); } @@ -284,7 +285,7 @@ extent_callback (void *user_data, const char *metacontext, static bool test_all_zeroes (uint64_t offset, size_t count) { - uint32_vector entries = empty_vector; + uint64_vector entries = empty_vector; size_t i; uint64_t count_read; @@ -296,22 +297,22 @@ test_all_zeroes (uint64_t offset, size_t count) * false, causing the main code to do a full read. We could be * smarter and keep asking the server (XXX). */ - if (nbd_block_status (nbd, count, offset, - (nbd_extent_callback) { - .callback = extent_callback, - .user_data = &entries }, - 0) == -1) { + if (nbd_block_status_64 (nbd, count, offset, + (nbd_extent64_callback) { + .callback = extent_callback, + .user_data = &entries }, + 0) == -1) { fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } count_read = 0; for (i = 0; i < entries.len; i += 2) { - uint32_t len = entries.ptr[i]; - uint32_t type = entries.ptr[i+1]; + uint64_t len = entries.ptr[i]; + uint64_t type = entries.ptr[i+1]; count_read += len; - if (!(type & 2)) /* not zero */ + if (!(type & LIBNBD_STATE_ZERO)) /* not zero */ return false; } From patchwork Mon Nov 14 22:51:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 793BCC433FE for ; Mon, 14 Nov 2022 23:31:29 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouids-0001s8-LP; Mon, 14 Nov 2022 18:13:28 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidn-0001eb-U3 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJJ-00035i-ES for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:21 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466331; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dn4u5c7k0NDmH+fmkZZ9nPfDWOXKvhaM7RFY3LonV6A=; b=bzi2uBHoR4qBFJuZZU0amtG1HsRB54Q3APCPPzu3XkNGwzhqics0YBGNVBaj+a6As0PGwb 5Jdww+hl49xkYBwuAcgpyFK3258IuW1iq6P3mgsYX2Fqm/oQw/cChvdXvKuNsJ8qt8AF6J 2Ij5oycwpuSMKPMwlwn6qWJR1J3QkKc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-383-eKppC75POrCF_C0b4JsMYg-1; Mon, 14 Nov 2022 17:52:10 -0500 X-MC-Unique: eKppC75POrCF_C0b4JsMYg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id EA35F858F13; Mon, 14 Nov 2022 22:52:09 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 84FCF40E9786; Mon, 14 Nov 2022 22:52:09 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 14/23] info: Expose extended-headers support through nbdinfo Date: Mon, 14 Nov 2022 16:51:49 -0600 Message-Id: <20221114225158.2186742-15-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Add another bit of overall server information, as well as a '--can extended-headers' silent query. For now, the testsuite is written assuming that when nbdkit finally adds extended headers support, it will also add a --no-eh kill switch comparable to its existing --no-sr switch. --- info/nbdinfo.pod | 11 ++++++++++- info/can.c | 9 +++++++++ info/info-can.sh | 27 +++++++++++++++++++++++++++ info/info-packets.sh | 17 ++++++++++++++++- info/main.c | 7 ++++++- 5 files changed, 68 insertions(+), 3 deletions(-) diff --git a/info/nbdinfo.pod b/info/nbdinfo.pod index c47e5175..2455e1c0 100644 --- a/info/nbdinfo.pod +++ b/info/nbdinfo.pod @@ -86,6 +86,7 @@ the I<--json> parameter: "protocol": "newstyle-fixed", "TLS": false, "structured": true, + "extended": false, "exports": [ { "export-name": "", @@ -165,6 +166,11 @@ Test if the NBD URI connection is using TLS. Test if server can respond with structured replies (a prerequisite for supporting block status commands). +=item nbdinfo --can extended-headers URI + +Test if server supports extended headers (a prerequisite for +supporting 64-bit commands; implies structured replies as well). + =item nbdinfo --is rotational URI Test if the server export is backed by something which behaves like a @@ -312,6 +318,8 @@ Display brief command line help and exit. =item B<--can df> +=item B<--can extended-headers> + =item B<--can fast-zero> =item B<--can flush> @@ -341,7 +349,8 @@ and the following libnbd functions: L, L, L, L, L, L, L, L, L, -L. +L, +L. =item B<--color> diff --git a/info/can.c b/info/can.c index 08d6bcd5..f602ffce 100644 --- a/info/can.c +++ b/info/can.c @@ -50,6 +50,15 @@ do_can (void) strcasecmp (can, "structured_replies") == 0) feature = nbd_get_structured_replies_negotiated (nbd); + else if (strcasecmp (can, "eh") == 0 || + strcasecmp (can, "extended header") == 0 || + strcasecmp (can, "extended-header") == 0 || + strcasecmp (can, "extended_header") == 0 || + strcasecmp (can, "extended headers") == 0 || + strcasecmp (can, "extended-headers") == 0 || + strcasecmp (can, "extended_headers") == 0) + feature = nbd_get_extended_headers_negotiated (nbd); + else if (strcasecmp (can, "readonly") == 0 || strcasecmp (can, "read-only") == 0 || strcasecmp (can, "read_only") == 0) diff --git a/info/info-can.sh b/info/info-can.sh index 3edc3948..e5f6a44b 100755 --- a/info/info-can.sh +++ b/info/info-can.sh @@ -61,6 +61,33 @@ esac EOF test $st = 2 +# --can extended-headers cannot be positively tested until nbdkit gains +# --no-eh support. Otherwise, it is similar to --can structured-reply. + +no_eh= +if nbdkit --no-eh --help >/dev/null 2>/dev/null; then + no_eh=--no-eh + nbdkit -v -U - sh - \ + --run '$VG nbdinfo --can extended-headers "nbd+unix:///?socket=$unixsocket"' <<'EOF' +case "$1" in + get_size) echo 1024 ;; + pread) ;; + *) exit 2 ;; +esac +EOF +fi + +st=0 +nbdkit -v -U - $no_eh sh - \ + --run '$VG nbdinfo --can extended-headers "nbd+unix:///?socket=$unixsocket"' <<'EOF' || st=$? +case "$1" in + get_size) echo 1024 ;; + pread) ;; + *) exit 2 ;; +esac +EOF +test $st = 2 + # --can cache and --can fua require special handling because in # nbdkit-sh-plugin we must print "native" or "none". Also the can_fua # flag is only sent if the export is writable (hence can_write below). diff --git a/info/info-packets.sh b/info/info-packets.sh index 82bb526c..a6b307a0 100755 --- a/info/info-packets.sh +++ b/info/info-packets.sh @@ -27,12 +27,27 @@ requires nbdkit --no-sr memory --version out=info-packets.out cleanup_fn rm -f $out +# Older nbdkit does not support extended headers; --no-eh is a reliable +# witness of whether nbdkit is new enough. + +no_eh= +if nbdkit --no-eh --help >/dev/null 2>/dev/null; then + no_eh=--no-eh +fi + nbdkit --no-sr -U - memory size=1M \ --run '$VG nbdinfo "nbd+unix:///?socket=$unixsocket"' > $out cat $out grep "protocol: .*using simple packets" $out -nbdkit -U - memory size=1M \ +nbdkit $no_eh -U - memory size=1M \ --run '$VG nbdinfo "nbd+unix:///?socket=$unixsocket"' > $out cat $out grep "protocol: .*using structured packets" $out + +if test x != "x$no_eh"; then + nbdkit -U - memory size=1M \ + --run '$VG nbdinfo "nbd+unix:///?socket=$unixsocket"' > $out + cat $out + grep "protocol: .*using extended packets" $out +fi diff --git a/info/main.c b/info/main.c index 5cd91fe1..9794c109 100644 --- a/info/main.c +++ b/info/main.c @@ -302,11 +302,13 @@ main (int argc, char *argv[]) const char *protocol; int tls_negotiated; int sr_negotiated; + int eh_negotiated; /* Print per-connection fields. */ protocol = nbd_get_protocol (nbd); tls_negotiated = nbd_get_tls_negotiated (nbd); sr_negotiated = nbd_get_structured_replies_negotiated (nbd); + eh_negotiated = nbd_get_extended_headers_negotiated (nbd); if (!json_output) { if (protocol) { @@ -314,8 +316,9 @@ main (int argc, char *argv[]) fprintf (fp, "protocol: %s", protocol); if (tls_negotiated >= 0) fprintf (fp, " %s TLS", tls_negotiated ? "with" : "without"); - if (sr_negotiated >= 0) + if (eh_negotiated >= 0 && sr_negotiated >= 0) fprintf (fp, ", using %s packets", + eh_negotiated ? "extended" : sr_negotiated ? "structured" : "simple"); fprintf (fp, "\n"); ansi_restore (fp); @@ -333,6 +336,8 @@ main (int argc, char *argv[]) fprintf (fp, "\"TLS\": %s,\n", tls_negotiated ? "true" : "false"); if (sr_negotiated >= 0) fprintf (fp, "\"structured\": %s,\n", sr_negotiated ? "true" : "false"); + if (eh_negotiated >= 0) + fprintf (fp, "\"extended\": %s,\n", eh_negotiated ? "true" : "false"); } if (!list_all) From patchwork Mon Nov 14 22:51:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042930 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 063F8C433FE for ; Mon, 14 Nov 2022 23:34:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidn-0001jg-6D; Mon, 14 Nov 2022 18:13:23 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidk-0001eb-Ms for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:20 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJS-00038u-Nr for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466342; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w2cD56WaAJ/IvhRPrOPQ2JyhwwZRuIKdCJRpznOMdqs=; b=FZlsDZrGm9yJisr0X18TOHW4Fy261oaQqRvAnvd3wgZm0hBS97Mj47afeD4WojLff9//MK 0maeYVIkv4R96BrQN6za2Z2DPUuPn2dARAZ4p5NnVcVahrqDsZSz5ITPHwywqUx+M1DWVa YOMmbqp9ZVMjUq/xRleWesGdB1GozQc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-262-pPoxq3Q7M4KEOuKG6EuL0w-1; Mon, 14 Nov 2022 17:52:10 -0500 X-MC-Unique: pPoxq3Q7M4KEOuKG6EuL0w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 7E0CD101E148; Mon, 14 Nov 2022 22:52:10 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1A2CA40E9786; Mon, 14 Nov 2022 22:52:10 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 15/23] info: Update nbdinfo --map to use 64-bit block status Date: Mon, 14 Nov 2022 16:51:50 -0600 Message-Id: <20221114225158.2186742-16-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Although we usually map "base:allocation" which doesn't require the use of the 64-bit API for flags, this application IS intended to map out other metacontexts that might have 64-bit flags. And when extended headers are in use, we might as well ask for the server to give us extents as large as it wants, rather than breaking things up at 4G boundaries. --- info/map.c | 67 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 35 insertions(+), 32 deletions(-) diff --git a/info/map.c b/info/map.c index a5aad955..ffa53b81 100644 --- a/info/map.c +++ b/info/map.c @@ -1,5 +1,5 @@ /* NBD client library in userspace - * Copyright (C) 2020-2021 Red Hat Inc. + * Copyright (C) 2020-2022 Red Hat Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -36,13 +36,13 @@ #include "nbdinfo.h" -DEFINE_VECTOR_TYPE (uint32_vector, uint32_t) +DEFINE_VECTOR_TYPE (uint64_vector, uint64_t) -static void print_extents (uint32_vector *entries); -static void print_totals (uint32_vector *entries, int64_t size); +static void print_extents (uint64_vector *entries); +static void print_totals (uint64_vector *entries, int64_t size); static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, + nbd_extent *entries, size_t nr_entries, int *error); void @@ -50,7 +50,7 @@ do_map (void) { size_t i; int64_t size; - uint32_vector entries = empty_vector; + uint64_vector entries = empty_vector; uint64_t offset, align, max_len; size_t prev_entries_size; @@ -69,14 +69,16 @@ do_map (void) fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } + if (nbd_get_extended_headers_negotiated (nbd) == 1) + max_len = size; for (offset = 0; offset < size;) { prev_entries_size = entries.len; - if (nbd_block_status (nbd, MIN (size - offset, max_len), offset, - (nbd_extent_callback) { - .callback = extent_callback, - .user_data = &entries }, - 0) == -1) { + if (nbd_block_status_64 (nbd, MIN (size - offset, max_len), offset, + (nbd_extent64_callback) { + .callback = extent_callback, + .user_data = &entries }, + 0) == -1) { fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } @@ -99,18 +101,18 @@ do_map (void) } /* Callback handling --map. */ -static void print_one_extent (uint64_t offset, uint64_t len, uint32_t type); -static void extent_description (const char *metacontext, uint32_t type, +static void print_one_extent (uint64_t offset, uint64_t len, uint64_t type); +static void extent_description (const char *metacontext, uint64_t type, char **descr, bool *free_descr, const char **fg, const char **bg); static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, + nbd_extent *entries, size_t nr_entries, int *error) { - uint32_vector *list = user_data; + uint64_vector *list = user_data; size_t i; if (strcmp (metacontext, map) != 0) @@ -120,7 +122,8 @@ extent_callback (void *user_data, const char *metacontext, * print_extents below. */ for (i = 0; i < nr_entries; ++i) { - if (uint32_vector_append (list, entries[i]) == -1) { + if (uint64_vector_append (list, entries[i].length) == -1 || + uint64_vector_append (list, entries[i].flags) == -1) { perror ("realloc"); exit (EXIT_FAILURE); } @@ -129,7 +132,7 @@ extent_callback (void *user_data, const char *metacontext, } static void -print_extents (uint32_vector *entries) +print_extents (uint64_vector *entries) { size_t i, j; uint64_t offset = 0; /* end of last extent printed + 1 */ @@ -138,7 +141,7 @@ print_extents (uint32_vector *entries) if (json_output) fprintf (fp, "[\n"); for (i = 0; i < entries->len; i += 2) { - uint32_t type = entries->ptr[last+1]; + uint64_t type = entries->ptr[last+1]; /* If we're coalescing and the current type is different from the * previous one then we should print everything up to this entry. @@ -157,7 +160,7 @@ print_extents (uint32_vector *entries) /* Print the last extent if there is one. */ if (last != i) { - uint32_t type = entries->ptr[last+1]; + uint64_t type = entries->ptr[last+1]; uint64_t len; for (j = last, len = 0; j < i; j += 2) @@ -169,7 +172,7 @@ print_extents (uint32_vector *entries) } static void -print_one_extent (uint64_t offset, uint64_t len, uint32_t type) +print_one_extent (uint64_t offset, uint64_t len, uint64_t type) { static bool comma = false; char *descr; @@ -185,7 +188,7 @@ print_one_extent (uint64_t offset, uint64_t len, uint32_t type) ansi_colour (bg, fp); fprintf (fp, "%10" PRIu64 " " "%10" PRIu64 " " - "%3" PRIu32, + "%3" PRIu64, offset, len, type); if (descr) fprintf (fp, " %s", descr); @@ -199,7 +202,7 @@ print_one_extent (uint64_t offset, uint64_t len, uint32_t type) fprintf (fp, "{ \"offset\": %" PRIu64 ", " "\"length\": %" PRIu64 ", " - "\"type\": %" PRIu32, + "\"type\": %" PRIu64, offset, len, type); if (descr) { fprintf (fp, ", \"description\": "); @@ -215,9 +218,9 @@ print_one_extent (uint64_t offset, uint64_t len, uint32_t type) /* --map --totals suboption */ static void -print_totals (uint32_vector *entries, int64_t size) +print_totals (uint64_vector *entries, int64_t size) { - uint32_t type; + uint64_t type; bool comma = false; /* This is necessary to avoid a divide by zero below, but if the @@ -237,16 +240,16 @@ print_totals (uint32_vector *entries, int64_t size) */ type = 0; for (;;) { - uint64_t next_type = (uint64_t)UINT32_MAX + 1; + uint64_t next_type = 0; uint64_t c = 0; size_t i; for (i = 0; i < entries->len; i += 2) { - uint32_t t = entries->ptr[i+1]; + uint64_t t = entries->ptr[i+1]; if (t == type) c += entries->ptr[i]; - else if (type < t && t < next_type) + else if (type < t && (next_type == 0 || t < next_type)) next_type = t; } @@ -263,7 +266,7 @@ print_totals (uint32_vector *entries, int64_t size) ansi_colour (fg, fp); if (bg) ansi_colour (bg, fp); - fprintf (fp, "%10" PRIu64 " %5.1f%% %3" PRIu32, + fprintf (fp, "%10" PRIu64 " %5.1f%% %3" PRIu64, c, percent, type); if (descr) fprintf (fp, " %s", descr); @@ -278,7 +281,7 @@ print_totals (uint32_vector *entries, int64_t size) fprintf (fp, "{ \"size\": %" PRIu64 ", " "\"percent\": %g, " - "\"type\": %" PRIu32, + "\"type\": %" PRIu64, c, percent, type); if (descr) { fprintf (fp, ", \"description\": "); @@ -292,7 +295,7 @@ print_totals (uint32_vector *entries, int64_t size) free (descr); } - if (next_type == (uint64_t)UINT32_MAX + 1) + if (next_type == 0) break; type = next_type; } @@ -301,7 +304,7 @@ print_totals (uint32_vector *entries, int64_t size) } static void -extent_description (const char *metacontext, uint32_t type, +extent_description (const char *metacontext, uint64_t type, char **descr, bool *free_descr, const char **fg, const char **bg) { @@ -348,7 +351,7 @@ extent_description (const char *metacontext, uint32_t type, *fg = ANSI_FG_BRIGHT_WHITE; *bg = ANSI_BG_BLACK; return; default: - if (asprintf (descr, "backing depth %u", type) == -1) { + if (asprintf (descr, "backing depth %" PRIu64, type) == -1) { perror ("asprintf"); exit (EXIT_FAILURE); } From patchwork Mon Nov 14 22:51:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13043068 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 683D0C43219 for ; Tue, 15 Nov 2022 01:28:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouido-0001lr-HG; Mon, 14 Nov 2022 18:13:24 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidl-0001X6-Oh for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:21 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-00036i-E0 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:30 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466334; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5xwarWq7IS97eQgNFs7g0Lw+/taGHjpDs4Ete5qDg2U=; b=QEMnzOSrNnDXTyylgMiWUilA9Tzy1CMsFzogC9U9n2GhlXor0K/yX5/jnn63dv/Zau2HmR h1zTYvFmNHRqMxQ/on9ZTAHrAMvlQqua4DAaR885f/9PytYOxEfElRRVFvc/Bkyfd88o+K rli9FC3mVWglRgY1FfNVOmQ2Wx2p1L0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-300-RlTzK5bCNRemScTiMHoUDw-1; Mon, 14 Nov 2022 17:52:11 -0500 X-MC-Unique: RlTzK5bCNRemScTiMHoUDw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1216F857D0E; Mon, 14 Nov 2022 22:52:11 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id A259940E9786; Mon, 14 Nov 2022 22:52:10 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 16/23] examples: Update copy-libev to use 64-bit block status Date: Mon, 14 Nov 2022 16:51:51 -0600 Message-Id: <20221114225158.2186742-17-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Although our use of "base:allocation" doesn't require the use of the 64-bit API for flags, we might perform slightly faster for a server that does give us 64-bit extent lengths and honors larger nbd_zero lengths. --- examples/copy-libev.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/examples/copy-libev.c b/examples/copy-libev.c index 418d99f1..d8b45d87 100644 --- a/examples/copy-libev.c +++ b/examples/copy-libev.c @@ -94,7 +94,7 @@ struct request { }; struct extent { - uint32_t length; + uint64_t length; bool zero; }; @@ -182,7 +182,7 @@ get_events(struct connection *c) static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, int *error) + nbd_extent *entries, size_t nr_entries, int *error) { struct request *r = user_data; @@ -197,22 +197,21 @@ extent_callback (void *user_data, const char *metacontext, uint64_t offset, return 1; } - /* Libnbd returns uint32_t pair (length, flags) for each extent. */ - extents_len = nr_entries / 2; + extents_len = nr_entries; extents = malloc (extents_len * sizeof *extents); if (extents == NULL) FAIL ("Cannot allocated extents: %s", strerror (errno)); /* Copy libnbd entries to extents array. */ - for (int i = 0, j = 0; i < extents_len; i++, j=i*2) { - extents[i].length = entries[j]; + for (int i = 0; i < extents_len; i++) { + extents[i].length = entries[i].length; /* Libnbd exposes both ZERO and HOLE flags. We care only about * ZERO status, meaning we can copy this extent using efficinet * zero method. */ - extents[i].zero = (entries[j + 1] & LIBNBD_STATE_ZERO) != 0; + extents[i].zero = (entries[i].flags & LIBNBD_STATE_ZERO) != 0; } DEBUG ("r%zu: received %zu extents for %s", @@ -284,10 +283,10 @@ start_extents (struct request *r) DEBUG ("r%zu: start extents offset=%" PRIi64 " count=%zu", r->index, offset, count); - cookie = nbd_aio_block_status ( + cookie = nbd_aio_block_status_64 ( src.nbd, count, offset, - (nbd_extent_callback) { .callback=extent_callback, - .user_data=r }, + (nbd_extent64_callback) { .callback=extent_callback, + .user_data=r }, (nbd_completion_callback) { .callback=extents_completed, .user_data=r }, 0); @@ -322,7 +321,7 @@ next_extent (struct request *r) limit = MIN (REQUEST_SIZE, size - offset); while (length < limit) { - DEBUG ("e%zu: offset=%" PRIi64 " len=%" PRIu32 " zero=%d", + DEBUG ("e%zu: offset=%" PRIi64 " len=%" PRIu64 " zero=%d", extents_pos, offset, extents[extents_pos].length, is_zero); /* If this extent is too large, steal some data from it to From patchwork Mon Nov 14 22:51:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042969 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8075AC433FE for ; Mon, 14 Nov 2022 23:53:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidt-0001tg-HM; Mon, 14 Nov 2022 18:13:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidn-0001eb-1Z for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-000379-B0 for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466336; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VoOH8h5zJxs1trCxHkur4lW8NowOm3VRrSmQjkq7wik=; b=KcyxL5mDLAo7E9ia4ur07zr+uT/r57TjSQzhlrVpdq7i0Y2D4IWG+7Kq4056Po2zW4EsCe q0S5YEpwZF6AKeiPD+dnHEiTCtdnbXVnPCHDyUGUGfx9uWZpUiyhsI6HFHZPJsW2VRj2sw oLIxmMQhpqHM4CDNQiPQiQXX7UI+eRM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-621-gafmUrvGONm7ECI5Pdy_jw-1; Mon, 14 Nov 2022 17:52:12 -0500 X-MC-Unique: gafmUrvGONm7ECI5Pdy_jw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9726B185A7A3; Mon, 14 Nov 2022 22:52:11 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 34F6740E9786; Mon, 14 Nov 2022 22:52:11 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 17/23] ocaml: Add example for 64-bit extents Date: Mon, 14 Nov 2022 16:51:52 -0600 Message-Id: <20221114225158.2186742-18-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Since our example program for 32-bit extents is inherently limited to 32-bit lengths, it is also worth demonstrating the 64-bit extent API, including the difference in the array indexing being saner. --- ocaml/examples/Makefile.am | 3 ++- ocaml/examples/extents64.ml | 42 +++++++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 ocaml/examples/extents64.ml diff --git a/ocaml/examples/Makefile.am b/ocaml/examples/Makefile.am index 5ee6dd63..c6f4989d 100644 --- a/ocaml/examples/Makefile.am +++ b/ocaml/examples/Makefile.am @@ -1,5 +1,5 @@ # nbd client library in userspace -# Copyright (C) 2013-2019 Red Hat Inc. +# Copyright (C) 2013-2022 Red Hat Inc. # # This library is free software; you can redistribute it and/or # modify it under the terms of the GNU Lesser General Public @@ -20,6 +20,7 @@ include $(top_srcdir)/subdir-rules.mk ml_examples = \ asynch_copy.ml \ extents.ml \ + extents64.ml \ get_size.ml \ open_qcow2.ml \ server_flags.ml \ diff --git a/ocaml/examples/extents64.ml b/ocaml/examples/extents64.ml new file mode 100644 index 00000000..8ee7e218 --- /dev/null +++ b/ocaml/examples/extents64.ml @@ -0,0 +1,42 @@ +open Printf + +let () = + NBD.with_handle ( + fun nbd -> + NBD.add_meta_context nbd "base:allocation"; + NBD.connect_command nbd + ["nbdkit"; "-s"; "--exit-with-parent"; "-r"; + "sparse-random"; "8G"]; + + (* Read the extents and print them. *) + let size = NBD.get_size nbd in + let cap = + match NBD.get_extended_headers_negotiated nbd with + | true -> size + | false -> 0x8000_0000_L in + let fetch_offset = ref 0_L in + while !fetch_offset < size do + let remaining = Int64.sub size !fetch_offset in + let fetch_size = min remaining cap in + NBD.block_status_64 nbd fetch_size !fetch_offset ( + fun meta _ entries err -> + printf "nbd_block_status callback: meta=%s err=%d\n" meta !err; + if meta = "base:allocation" then ( + printf "index\t%16s %16s %s\n" "offset" "length" "flags"; + for i = 0 to Array.length entries - 1 do + let len = fst entries.(i) + and flags = + match snd entries.(i) with + | 0_L -> "data" + | 1_L -> "hole" + | 2_L -> "zero" + | 3_L -> "hole+zero" + | unknown -> sprintf "unknown (%Ld)" unknown in + printf "%d:\t%16Ld %16Ld %s\n" i !fetch_offset len flags; + fetch_offset := Int64.add !fetch_offset len + done; + ); + 0 + ) (* NBD.block_status *) + done + ) From patchwork Mon Nov 14 22:51:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042933 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CCC19C433FE for ; Mon, 14 Nov 2022 23:34:50 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouids-0001rN-4r; Mon, 14 Nov 2022 18:13:28 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidn-0001X6-1U for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:23 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-00036n-AA for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:24 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466334; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VvBB30A7zggPuWnJggKPmR+K5etynzvU9euhLFtV6Bk=; b=Q3pP0Sk3zNEOivq10gnPzpbcFY6s8E13I/MwWbTUm+31rSaZTfz7pZMRMu4Ucr+gNgi7r2 NEmLl7nLPhazqI+u6CTuG8ILRYwbP34CuDqLQS+W8VxCLAdNlW5I5sN+HwBDzFQA3TeuHN tBo9Zwo332wcjd70DBUPA2G/E1ISFxY= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-53-IIoAmcm2OP2PZM2bvsak8g-1; Mon, 14 Nov 2022 17:52:13 -0500 X-MC-Unique: IIoAmcm2OP2PZM2bvsak8g-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id B27913C0E467; Mon, 14 Nov 2022 22:52:12 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4F52840E9786; Mon, 14 Nov 2022 22:52:12 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 19/23] api: Add nbd_[aio_]opt_extended_headers() Date: Mon, 14 Nov 2022 16:51:54 -0600 Message-Id: <20221114225158.2186742-20-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Very similar to the recent addition of nbd_opt_structured_reply, giving us fine-grained control over an extended headers request. Because nbdkit does not yet support extended headers, testsuite coverage is limited to interop testing with qemu-nbd. It shows that extended headers imply structured replies, regardless of which order the two are manually negotiated in. --- generator/API.ml | 79 +++++++-- .../states-newstyle-opt-extended-headers.c | 30 +++- generator/states-newstyle.c | 3 + lib/opt.c | 44 +++++ interop/Makefile.am | 6 + interop/opt-extended-headers.c | 153 ++++++++++++++++++ interop/opt-extended-headers.sh | 29 ++++ .gitignore | 1 + 8 files changed, 329 insertions(+), 16 deletions(-) create mode 100644 interop/opt-extended-headers.c create mode 100755 interop/opt-extended-headers.sh diff --git a/generator/API.ml b/generator/API.ml index 570f15f4..c4d15e3a 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -825,6 +825,7 @@ "set_request_extended_headers", { if L is also set to false, since the use of extended headers implies structured replies."; see_also = [Link "get_request_extended_headers"; + Link "opt_extended_headers"; Link "set_handshake_flags"; Link "set_strict_mode"; Link "get_extended_headers_negotiated"; Link "zero"; Link "trim"; Link "cache"; @@ -856,7 +857,9 @@ "get_extended_headers_negotiated", { shortdesc = "see if extended headers are in use"; longdesc = "\ After connecting you may call this to find out if the connection is -using extended headers. +using extended headers. Note that this setting is sticky; this +can return true even after a second L +returns false because the server detected a duplicate request. When extended headers are not in use, commands are limited to a 32-bit length, even when the libnbd API uses a 64-bit parameter @@ -938,7 +941,7 @@ "get_structured_replies_negotiated", { attempted."; see_also = [Link "set_request_structured_replies"; Link "get_request_structured_replies"; - Link "opt_structured_reply"; + Link "opt_structured_reply"; Link "opt_extended_headers"; Link "get_protocol"; Link "get_extended_headers_negotiated"]; }; @@ -1211,12 +1214,13 @@ "set_opt_mode", { newstyle server. This setting has no effect when connecting to an oldstyle server. -Note that libnbd defaults to attempting C and -C before letting you control remaining -negotiation steps; if you need control over these steps as well, -first set L to C and -L to false before starting -the connection attempt. +Note that libnbd defaults to attempting C, +C, and C +before letting you control remaining negotiation steps; if you +need control over these steps as well, first set L +to C, and L +or L to false, before +starting the connection attempt. When option mode is enabled, you have fine-grained control over which options are negotiated, compared to the default of the server @@ -1324,6 +1328,35 @@ "opt_starttls", { Link "supports_tls"] }; + "opt_extended_headers", { + default_call with + args = []; ret = RBool; + permitted_states = [ Negotiating ]; + shortdesc = "request the server to enable extended headers"; + longdesc = "\ +Request that the server use extended headers, by sending +C. This can only be used if +L enabled option mode; furthermore, libnbd +defaults to automatically requesting this unless you use +L or +L prior to connecting. +This function is mainly useful for integration testing of corner +cases in server handling. + +This function returns true if the server replies with success, +false if the server replies with an error, and fails only if +the server does not reply (such as for a loss of connection). +Note that some servers fail a second request as redundant; +libnbd assumes that once one request has succeeded, then +extended headers are supported (as visible by +L) regardless if +later calls to this function return false. If this function +returns true, the use of structured replies is implied."; + see_also = [Link "set_opt_mode"; Link "aio_opt_extended_headers"; + Link "opt_go"; Link "set_request_extended_headers"; + Link "set_request_structured_replies"] + }; + "opt_structured_reply", { default_call with args = []; ret = RBool; @@ -1345,7 +1378,9 @@ "opt_structured_reply", { libnbd assumes that once one request has succeeded, then structured replies are supported (as visible by L) regardless if -later calls to this function return false."; +later calls to this function return false. Similarly, a +server may fail this request if extended headers are already +negotiated, since extended headers take priority."; see_also = [Link "set_opt_mode"; Link "aio_opt_structured_reply"; Link "opt_go"; Link "set_request_structured_replies"] }; @@ -3098,6 +3133,30 @@ "aio_opt_starttls", { see_also = [Link "set_opt_mode"; Link "opt_starttls"]; }; + "aio_opt_extended_headers", { + default_call with + args = []; + optargs = [ OClosure completion_closure ]; + ret = RErr; + permitted_states = [ Negotiating ]; + shortdesc = "request the server to enable extended headers"; + longdesc = "\ +Request that the server use extended headers, by sending +C. This behaves like the synchronous +counterpart L, except that it does +not wait for the server's response. + +To determine when the request completes, wait for +L to return false. Or supply the optional +C which will be invoked as described in +L, except that it is automatically +retired regardless of return value. Note that detecting whether the +server returns an error (as is done by the return value of the +synchronous counterpart) is only possible with a completion +callback."; + see_also = [Link "set_opt_mode"; Link "opt_extended_headers"]; + }; + "aio_opt_structured_reply", { default_call with args = []; @@ -4070,6 +4129,8 @@ let first_version = "set_request_extended_headers", (1, 16); "get_request_extended_headers", (1, 16); "get_extended_headers_negotiated", (1, 16); + "opt_extended_headers", (1, 16); + "aio_opt_extended_headers", (1, 16); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/generator/states-newstyle-opt-extended-headers.c b/generator/states-newstyle-opt-extended-headers.c index 151787bf..d73d1b5f 100644 --- a/generator/states-newstyle-opt-extended-headers.c +++ b/generator/states-newstyle-opt-extended-headers.c @@ -21,11 +21,14 @@ STATE_MACHINE { NEWSTYLE.OPT_EXTENDED_HEADERS.START: assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE); - assert (h->opt_current != NBD_OPT_EXTENDED_HEADERS); - assert (CALLBACK_IS_NULL (h->opt_cb.completion)); - if (!h->request_eh || !h->request_sr) { - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); - return 0; + if (h->opt_current == NBD_OPT_EXTENDED_HEADERS) + assert (h->opt_mode); + else { + assert (CALLBACK_IS_NULL (h->opt_cb.completion)); + if (!h->request_eh || !h->request_sr) { + SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + return 0; + } } h->sbuf.option.version = htobe64 (NBD_NEW_VERSION); @@ -68,6 +71,7 @@ NEWSTYLE.OPT_EXTENDED_HEADERS.RECV_REPLY_PAYLOAD: NEWSTYLE.OPT_EXTENDED_HEADERS.CHECK_REPLY: uint32_t reply; + int err = ENOTSUP; reply = be32toh (h->sbuf.or.option_reply.reply); switch (reply) { @@ -76,19 +80,31 @@ NEWSTYLE.OPT_EXTENDED_HEADERS.CHECK_REPLY: h->extended_headers = true; /* Extended headers trump structured replies, so skip ahead. */ h->structured_replies = true; + err = 0; break; + case NBD_REP_ERR_INVALID: + err = EINVAL; + /* fallthrough */ default: if (handle_reply_error (h) == -1) { SET_NEXT_STATE (%.DEAD); return 0; } - debug (h, "extended headers are not supported by this server"); + if (h->extended_headers) + debug (h, "extended headers already negotiated"); + else + debug (h, "extended headers are not supported by this server"); break; } /* Next option. */ - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + if (h->opt_current == NBD_OPT_EXTENDED_HEADERS) + SET_NEXT_STATE (%.NEGOTIATING); + else + SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + CALL_CALLBACK (h->opt_cb.completion, &err); + nbd_internal_free_option (h); return 0; } /* END STATE MACHINE */ diff --git a/generator/states-newstyle.c b/generator/states-newstyle.c index c88430e2..66bbd86a 100644 --- a/generator/states-newstyle.c +++ b/generator/states-newstyle.c @@ -146,6 +146,9 @@ NEWSTYLE.START: case NBD_OPT_STRUCTURED_REPLY: SET_NEXT_STATE (%OPT_STRUCTURED_REPLY.START); return 0; + case NBD_OPT_EXTENDED_HEADERS: + SET_NEXT_STATE (%OPT_EXTENDED_HEADERS.START); + return 0; case NBD_OPT_STARTTLS: SET_NEXT_STATE (%OPT_STARTTLS.START); return 0; diff --git a/lib/opt.c b/lib/opt.c index 776a3612..bcdd5741 100644 --- a/lib/opt.c +++ b/lib/opt.c @@ -164,6 +164,31 @@ nbd_unlocked_opt_starttls (struct nbd_handle *h) return r; } +/* Issue NBD_OPT_EXTENDED_HEADERS and wait for the reply. */ +int +nbd_unlocked_opt_extended_headers (struct nbd_handle *h) +{ + int err; + nbd_completion_callback c = { .callback = go_complete, .user_data = &err }; + int r = nbd_unlocked_aio_opt_extended_headers (h, &c); + + if (r == -1) + return r; + + r = wait_for_option (h); + if (r == 0) { + if (nbd_internal_is_state_negotiating (get_next_state (h))) + r = err == 0; + else { + assert (nbd_internal_is_state_dead (get_next_state (h))); + set_error (err, + "failed to get response to opt_extended_headers request"); + r = -1; + } + } + return r; +} + /* Issue NBD_OPT_STRUCTURED_REPLY and wait for the reply. */ int nbd_unlocked_opt_structured_reply (struct nbd_handle *h) @@ -386,6 +411,25 @@ nbd_unlocked_aio_opt_starttls (struct nbd_handle *h, #endif } +/* Issue NBD_OPT_EXTENDED_HEADERS without waiting. */ +int +nbd_unlocked_aio_opt_extended_headers (struct nbd_handle *h, + nbd_completion_callback *complete) +{ + if ((h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE) == 0) { + set_error (ENOTSUP, "server is not using fixed newstyle protocol"); + return -1; + } + + h->opt_current = NBD_OPT_EXTENDED_HEADERS; + h->opt_cb.completion = *complete; + SET_CALLBACK_TO_NULL (*complete); + + if (nbd_internal_run (h, cmd_issue) == -1) + debug (h, "option queued, ignoring state machine failure"); + return 0; +} + /* Issue NBD_OPT_STRUCTURED_REPLY without waiting. */ int nbd_unlocked_aio_opt_structured_reply (struct nbd_handle *h, diff --git a/interop/Makefile.am b/interop/Makefile.am index 75716eb9..5e0ea1ed 100644 --- a/interop/Makefile.am +++ b/interop/Makefile.am @@ -25,6 +25,7 @@ EXTRA_DIST = \ list-exports-test-dir/disk1 \ list-exports-test-dir/disk2 \ structured-read.sh \ + opt-extended-headers.sh \ $(NULL) TESTS_ENVIRONMENT = \ @@ -134,6 +135,7 @@ check_PROGRAMS += \ socket-activation-qemu-nbd \ dirty-bitmap \ structured-read \ + opt-extended-headers \ $(NULL) TESTS += \ interop-qemu-nbd \ @@ -144,6 +146,7 @@ TESTS += \ dirty-bitmap.sh \ structured-read.sh \ interop-qemu-block-size.sh \ + opt-extended-headers.sh \ $(NULL) interop_qemu_nbd_SOURCES = \ @@ -235,6 +238,9 @@ dirty_bitmap_LDADD = $(top_builddir)/lib/libnbd.la structured_read_SOURCES = structured-read.c structured_read_LDADD = $(top_builddir)/lib/libnbd.la +opt_extended_headers_SOURCES = opt-extended-headers.c +opt_extended_headers_LDADD = $(top_builddir)/lib/libnbd.la + endif HAVE_QEMU_NBD #---------------------------------------------------------------------- diff --git a/interop/opt-extended-headers.c b/interop/opt-extended-headers.c new file mode 100644 index 00000000..d8ca102d --- /dev/null +++ b/interop/opt-extended-headers.c @@ -0,0 +1,153 @@ +/* NBD client library in userspace + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Demonstrate low-level use of nbd_opt_extended_headers(). */ + +#include + +#include +#include +#include +#include +#include +#include +#include + +#include + +#define check(got, exp) do_check (#got, got, exp) + +static void +do_check (const char *act, int64_t got, int64_t exp) +{ + fprintf (stderr, "trying %s\n", act); + if (got == -1) + fprintf (stderr, "%s\n", nbd_get_error ()); + else + fprintf (stderr, "succeeded, result %" PRId64 "\n", got); + if (got != exp) { + fprintf (stderr, "got %" PRId64 ", but expected %" PRId64 "\n", got, exp); + exit (EXIT_FAILURE); + } +} + +static int +cb (void *data, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error) +{ + /* If we got here, extents worked, implying at least structured replies */ + bool *seen = data; + + *seen = true; + return 0; +} + +struct nbd_handle * +prep (bool sr, bool eh, char **cmd) +{ + struct nbd_handle *nbd; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* Connect to the server in opt mode, disable client-side failsafes so + * that we are testing server response even when client breaks protocol. + */ + check (nbd_set_opt_mode (nbd, true), 0); + check (nbd_set_strict_mode (nbd, 0), 0); + check (nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION), 0); + check (nbd_set_request_structured_replies (nbd, sr), 0); + check (nbd_set_request_extended_headers (nbd, eh), 0); + check (nbd_connect_systemd_socket_activation (nbd, cmd), 0); + + return nbd; +} + +void +cleanup (struct nbd_handle *nbd, bool extents_exp) +{ + bool extents = false; + + check (nbd_opt_go (nbd), 0); + check (nbd_can_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION), + extents_exp); + check (nbd_block_status_64 (nbd, 512, 0, + (nbd_extent64_callback) { .callback = cb, + .user_data = &extents }, + 0), extents_exp ? 0 : -1); + check (extents, extents_exp); + nbd_close (nbd); +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + int64_t bytes_sent; + + if (argc < 2) { + fprintf (stderr, "%s qemu-nbd [args ...]\n", argv[0]); + exit (EXIT_FAILURE); + } + + /* Default setup tries eh first, and skips sr request when eh works... */ + nbd = prep (true, true, &argv[1]); + bytes_sent = nbd_stats_bytes_sent (nbd); + check (nbd_get_extended_headers_negotiated (nbd), true); + check (nbd_get_structured_replies_negotiated (nbd), true); + /* Duplicate eh request is no-op as redundant, but does not change state */ + check (nbd_opt_extended_headers (nbd), false); + /* Trying sr after eh is no-op as redundant, but does not change state */ + check (nbd_opt_structured_reply (nbd), false); + check (nbd_get_extended_headers_negotiated (nbd), true); + check (nbd_get_structured_replies_negotiated (nbd), true); + cleanup (nbd, true); + + /* ...which should result in the same amount of initial negotiation + * traffic as explicitly requesting just structured replies, albeit + * with different results on what got negotiated. + */ + nbd = prep (true, false, &argv[1]); + check (nbd_stats_bytes_sent (nbd), bytes_sent); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), true); + cleanup (nbd, true); + + /* request_eh is ignored if request_sr is false. */ + nbd = prep (false, true, &argv[1]); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), false); + cleanup (nbd, false); + + /* Swap order, requesting structured replies before extended headers */ + nbd = prep (false, false, &argv[1]); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), false); + check (nbd_opt_structured_reply (nbd), true); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), true); + check (nbd_opt_extended_headers (nbd), true); + check (nbd_get_extended_headers_negotiated (nbd), true); + check (nbd_get_structured_replies_negotiated (nbd), true); + cleanup (nbd, true); + + exit (EXIT_SUCCESS); +} diff --git a/interop/opt-extended-headers.sh b/interop/opt-extended-headers.sh new file mode 100755 index 00000000..491770d9 --- /dev/null +++ b/interop/opt-extended-headers.sh @@ -0,0 +1,29 @@ +#!/usr/bin/env bash +# nbd client library in userspace +# Copyright (C) 2019-2022 Red Hat Inc. +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +# Test low-level nbd_opt_extended_headers() details with qemu-nbd + +source ../tests/functions.sh +set -e +set -x + +requires qemu-nbd --version +requires nbdinfo --can extended-headers -- [ qemu-nbd -r -f raw "$0" ] + +# Run the test. +$VG ./opt-extended-headers qemu-nbd -r -f raw "$0" diff --git a/.gitignore b/.gitignore index bd002522..35e65335 100644 --- a/.gitignore +++ b/.gitignore @@ -113,6 +113,7 @@ Makefile.in /interop/list-exports-nbdkit /interop/list-exports-qemu-nbd /interop/nbd-server-tls.conf +/interop/opt-extended-headers /interop/requires.c /interop/socket-activation-nbdkit /interop/socket-activation-qemu-nbd From patchwork Mon Nov 14 22:51:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042928 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2BE71C4332F for ; Mon, 14 Nov 2022 23:32:02 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidw-0001y3-1Z; Mon, 14 Nov 2022 18:13:32 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidm-0001df-Aj for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:22 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-00037V-Cy for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466338; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iCUt/O3CrxGt8AYzN+fJa4btlP/Hy09Ns1XAwz9bEQg=; b=Hg5bxphW3PtokJ3wV5UnbVXbV+0olVSuQ0r7CP3hVQy5/yz/wGo/jT/JrLR9dQmSXW1TXx cmFCWByhh5WbFHU1D+41Jt7qrYBMl9EjJVTmH51p7iwrd+xE4ND9/PKEWfqhrQNjZymKhH vmUsFABQ8RDyv6ffcYu9M/EUmvzJ+Mc= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-654-2WADOspJPHOgnCajqkSLzw-1; Mon, 14 Nov 2022 17:52:13 -0500 X-MC-Unique: 2WADOspJPHOgnCajqkSLzw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4C47F1C08982; Mon, 14 Nov 2022 22:52:13 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id DCE4540E9789; Mon, 14 Nov 2022 22:52:12 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 20/23] interop: Add test of 64-bit block status Date: Mon, 14 Nov 2022 16:51:55 -0600 Message-Id: <20221114225158.2186742-21-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Prove that we can round-trip a block status request larger than 4G through a new-enough qemu-nbd. Also serves as a unit test of our shim for converting internal 64-bit representation back to the older 32-bit nbd_block_status callback interface. --- interop/Makefile.am | 6 ++ interop/large-status.c | 186 ++++++++++++++++++++++++++++++++++++++++ interop/large-status.sh | 49 +++++++++++ .gitignore | 1 + 4 files changed, 242 insertions(+) create mode 100644 interop/large-status.c create mode 100755 interop/large-status.sh diff --git a/interop/Makefile.am b/interop/Makefile.am index 5e0ea1ed..84c8fbbb 100644 --- a/interop/Makefile.am +++ b/interop/Makefile.am @@ -21,6 +21,7 @@ EXTRA_DIST = \ dirty-bitmap.sh \ interop-qemu-storage-daemon.sh \ interop-qemu-block-size.sh \ + large-status.sh \ list-exports-nbd-config \ list-exports-test-dir/disk1 \ list-exports-test-dir/disk2 \ @@ -134,6 +135,7 @@ check_PROGRAMS += \ list-exports-qemu-nbd \ socket-activation-qemu-nbd \ dirty-bitmap \ + large-status \ structured-read \ opt-extended-headers \ $(NULL) @@ -144,6 +146,7 @@ TESTS += \ list-exports-qemu-nbd \ socket-activation-qemu-nbd \ dirty-bitmap.sh \ + large-status.sh \ structured-read.sh \ interop-qemu-block-size.sh \ opt-extended-headers.sh \ @@ -235,6 +238,9 @@ socket_activation_qemu_nbd_LDADD = $(top_builddir)/lib/libnbd.la dirty_bitmap_SOURCES = dirty-bitmap.c dirty_bitmap_LDADD = $(top_builddir)/lib/libnbd.la +large_status_SOURCES = large-status.c +large_status_LDADD = $(top_builddir)/lib/libnbd.la + structured_read_SOURCES = structured-read.c structured_read_LDADD = $(top_builddir)/lib/libnbd.la diff --git a/interop/large-status.c b/interop/large-status.c new file mode 100644 index 00000000..44a911d7 --- /dev/null +++ b/interop/large-status.c @@ -0,0 +1,186 @@ +/* NBD client library in userspace + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Test 64-bit block status with qemu. */ + +#include + +#include +#include +#include +#include +#include +#include +#include + +#include + +static const char *bitmap; + +struct data { + bool req_one; /* input: true if req_one was passed to request */ + int count; /* input: count of expected remaining calls */ + bool seen_base; /* output: true if base:allocation encountered */ + bool seen_dirty; /* output: true if qemu:dirty-bitmap encountered */ +}; + +static int +cb32 (void *opaque, const char *metacontext, uint64_t offset, + uint32_t *entries, size_t len, int *error) +{ + struct data *data = opaque; + + assert (offset == 0); + assert (data->count-- > 0); + + if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) { + assert (!data->seen_base); + data->seen_base = true; + + /* Data block offset 0 size 64k, remainder is hole */ + assert (len == 4); + assert (entries[0] == 65536); + assert (entries[1] == 0); + /* libnbd had to truncate qemu's >4G answer */ + assert (entries[2] == 4227858432); + assert (entries[3] == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + } + else if (strcmp (metacontext, bitmap) == 0) { + assert (!data->seen_dirty); + data->seen_dirty = true; + + /* Dirty block at offset 5G-64k, remainder is clean */ + /* libnbd had to truncate qemu's >4G answer */ + assert (len == 2); + assert (entries[0] == 4227858432); + assert (entries[1] == 0); + } + else { + fprintf (stderr, "unexpected context %s\n", metacontext); + exit (EXIT_FAILURE); + } + return 0; +} + +static int +cb64 (void *opaque, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t len, int *error) +{ + struct data *data = opaque; + + assert (offset == 0); + assert (data->count-- > 0); + + if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) { + assert (!data->seen_base); + data->seen_base = true; + + /* Data block offset 0 size 64k, remainder is hole */ + assert (len == 2); + assert (entries[0].length == 65536); + assert (entries[0].flags == 0); + assert (entries[1].length == 5368643584ULL); + assert (entries[1].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + } + else if (strcmp (metacontext, bitmap) == 0) { + assert (!data->seen_dirty); + data->seen_dirty = true; + + /* Dirty block at offset 5G-64k, remainder is clean */ + assert (len == 2); + assert (entries[0].length == 5368643584ULL); + assert (entries[0].flags == 0); + assert (entries[1].length == 65536); + assert (entries[1].flags == 1); + } + else { + fprintf (stderr, "unexpected context %s\n", metacontext); + exit (EXIT_FAILURE); + } + return 0; +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + int64_t exportsize; + struct data data; + + if (argc < 3) { + fprintf (stderr, "%s bitmap qemu-nbd [args ...]\n", argv[0]); + exit (EXIT_FAILURE); + } + bitmap = argv[1]; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION); + nbd_add_meta_context (nbd, bitmap); + + if (nbd_connect_systemd_socket_activation (nbd, &argv[2]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + exportsize = nbd_get_size (nbd); + if (exportsize == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + if (nbd_get_extended_headers_negotiated (nbd) != 1) { + fprintf (stderr, "skipping: qemu-nbd lacks extended headers\n"); + exit (77); + } + + /* Prove that we can round-trip a >4G block status request */ + data = (struct data) { .count = 2, }; + if (nbd_block_status_64 (nbd, exportsize, 0, + (nbd_extent64_callback) { .callback = cb64, + .user_data = &data }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (data.seen_base && data.seen_dirty); + + /* Check libnbd's handling of a >4G response through older interface */ + data = (struct data) { .count = 2, }; + if (nbd_block_status (nbd, exportsize, 0, + (nbd_extent_callback) { .callback = cb32, + .user_data = &data }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (data.seen_base && data.seen_dirty); + + if (nbd_shutdown (nbd, 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + + exit (EXIT_SUCCESS); +} diff --git a/interop/large-status.sh b/interop/large-status.sh new file mode 100755 index 00000000..86271a41 --- /dev/null +++ b/interop/large-status.sh @@ -0,0 +1,49 @@ +#!/usr/bin/env bash +# nbd client library in userspace +# Copyright (C) 2019-2022 Red Hat Inc. +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +# Test reading qemu dirty-bitmap. + +source ../tests/functions.sh +set -e +set -x + +requires qemu-img bitmap --help +requires qemu-nbd --version + +# This test uses the qemu-nbd -B option. +if ! qemu-nbd --help | grep -sq -- -B; then + echo "$0: skipping because qemu-nbd does not support the -B option" + exit 77 +fi + +files="large-status.qcow2" +rm -f $files +cleanup_fn rm -f $files + +# Create mostly-sparse file with intentionally different data vs. dirty areas +# (64k data, 5G-64k hole,zero; 5G-64k clean, 64k dirty) +qemu-img create -f qcow2 large-status.qcow2 5G +qemu-img bitmap --add --enable -f qcow2 large-status.qcow2 bitmap0 +qemu-io -f qcow2 -c "w -z $((5*1024*1024*1024 - 64*1024)) 64k" \ + large-status.qcow2 +qemu-img bitmap --disable -f qcow2 large-status.qcow2 bitmap0 +qemu-io -f qcow2 -c 'w 0 64k' large-status.qcow2 + +# Run the test. +$VG ./large-status qemu:dirty-bitmap:bitmap0 \ + qemu-nbd -f qcow2 -B bitmap0 large-status.qcow2 diff --git a/.gitignore b/.gitignore index 35e65335..67bcee58 100644 --- a/.gitignore +++ b/.gitignore @@ -109,6 +109,7 @@ Makefile.in /interop/interop-qemu-nbd /interop/interop-qemu-nbd-tls-certs /interop/interop-qemu-nbd-tls-psk +/interop/large-status /interop/list-exports-nbd-server /interop/list-exports-nbdkit /interop/list-exports-qemu-nbd From patchwork Mon Nov 14 22:51:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13043042 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 46010C4332F for ; Tue, 15 Nov 2022 01:09:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouidq-0001ow-TD; Mon, 14 Nov 2022 18:13:26 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidm-0001hU-8R for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:22 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-000376-CW for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466336; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oqtLdpC7jr3RPefbhddIZHtvto4i7zggFcSyTawLmP8=; b=fd37LhhUXJh3UbIuFjlAH8lczURf+3PuB1tUS8hzqir2PhqIz4nGZNeIyWI75WXokvjtT5 f3So9uYmq5Wb1Cu4tChSdI3W/u/CSHM6+BX/lofN5lLl0K1LZzGoTOx1AisyLA27jA1hYM sr5ZwniD8T4nM3pQe5ki8WuqzIUaWbQ= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-231-zaxqkZMiOoyig7r9wsVf2w-1; Mon, 14 Nov 2022 17:52:14 -0500 X-MC-Unique: zaxqkZMiOoyig7r9wsVf2w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D6DBB1C08985; Mon, 14 Nov 2022 22:52:13 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 711B240E9786; Mon, 14 Nov 2022 22:52:13 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 21/23] api: Add nbd_can_block_status_payload() Date: Mon, 14 Nov 2022 16:51:56 -0600 Message-Id: <20221114225158.2186742-22-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org In the recent NBD protocol extensions to add 64-bit commands, an additional option was added to allow NBD_CMD_BLOCK_STATUS pass a client payload instructing the server to filter its answers (mainly useful when the client requests more than one meta context with NBD_OPT_SET_META_CONTEXT). This patch lays the groundwork by exposing servers that advertise this capability, although libnbd does not yet actually utilize it until the next patch. At the time this patch was written, qemu-nbd was also patched to provide such support; hence, an interop/ test shows the API in action. --- info/nbdinfo.pod | 10 ++- lib/nbd-protocol.h | 29 +++++--- generator/API.ml | 18 +++++ lib/flags.c | 11 +++ examples/server-flags.c | 7 +- interop/Makefile.am | 6 ++ interop/block-status-payload.c | 126 ++++++++++++++++++++++++++++++++ interop/block-status-payload.sh | 68 +++++++++++++++++ .gitignore | 1 + info/can.c | 5 ++ info/show.c | 9 ++- 11 files changed, 273 insertions(+), 17 deletions(-) create mode 100644 interop/block-status-payload.c create mode 100755 interop/block-status-payload.sh diff --git a/info/nbdinfo.pod b/info/nbdinfo.pod index 2455e1c0..c3a97c3b 100644 --- a/info/nbdinfo.pod +++ b/info/nbdinfo.pod @@ -178,6 +178,8 @@ rotating disk: accessing nearby blocks may be faster than random access and requests should be sorted to improve performance. Many servers do not or cannot report this accurately. +=item nbdinfo --can block-status-payload URI + =item nbdinfo --can cache URI =item nbdinfo --can df URI @@ -345,10 +347,10 @@ The command does not print anything. Instead it exits with success For further information see the L -and the following libnbd functions: L, -L, L, L, -L, L, L, -L, L, +and the following libnbd functions: L, +L, L, L, +L, L, L, +L, L, L, L, L. diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index ac569a11..2d1fabd0 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -102,17 +102,18 @@ struct nbd_fixed_new_option_reply { #define NBD_FLAG_NO_ZEROES (1 << 1) /* Per-export flags. */ -#define NBD_FLAG_HAS_FLAGS (1 << 0) -#define NBD_FLAG_READ_ONLY (1 << 1) -#define NBD_FLAG_SEND_FLUSH (1 << 2) -#define NBD_FLAG_SEND_FUA (1 << 3) -#define NBD_FLAG_ROTATIONAL (1 << 4) -#define NBD_FLAG_SEND_TRIM (1 << 5) -#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) -#define NBD_FLAG_SEND_DF (1 << 7) -#define NBD_FLAG_CAN_MULTI_CONN (1 << 8) -#define NBD_FLAG_SEND_CACHE (1 << 10) -#define NBD_FLAG_SEND_FAST_ZERO (1 << 11) +#define NBD_FLAG_HAS_FLAGS (1 << 0) +#define NBD_FLAG_READ_ONLY (1 << 1) +#define NBD_FLAG_SEND_FLUSH (1 << 2) +#define NBD_FLAG_SEND_FUA (1 << 3) +#define NBD_FLAG_ROTATIONAL (1 << 4) +#define NBD_FLAG_SEND_TRIM (1 << 5) +#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) +#define NBD_FLAG_SEND_DF (1 << 7) +#define NBD_FLAG_CAN_MULTI_CONN (1 << 8) +#define NBD_FLAG_SEND_CACHE (1 << 10) +#define NBD_FLAG_SEND_FAST_ZERO (1 << 11) +#define NBD_FLAG_BLOCK_STATUS_PAYLOAD (1 << 12) /* NBD options (new style handshake only). */ #define NBD_OPT_EXPORT_NAME 1 @@ -204,6 +205,12 @@ struct nbd_request_ext { uint64_t count; /* Request effect or payload length. */ } NBD_ATTRIBUTE_PACKED; +/* Extended request payload for NBD_CMD_BLOCK_STATUS, when supported. */ +struct nbd_block_status_payload { + uint64_t length; /* Effective length of client request */ + /* followed by array of uint32_t ids */ +} NBD_ATTRIBUTE_PACKED; + /* Simple reply (server -> client). */ struct nbd_simple_reply { uint32_t magic; /* NBD_SIMPLE_REPLY_MAGIC. */ diff --git a/generator/API.ml b/generator/API.ml index c4d15e3a..bbf7c0bb 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -2279,6 +2279,23 @@ "can_fast_zero", { example = Some "examples/server-flags.c"; }; + "can_block_status_payload", { + default_call with + args = []; ret = RBool; + permitted_states = [ Negotiating; Connected; Closed ]; + shortdesc = "does the server support the block status payload flag?"; + longdesc = "\ +Returns true if the server supports the use of the +C flag to allow filtering of the +block status command. Returns +false if the server does not. Note that this will never return +true if L is false." +^ non_blocking_test_call_description; + see_also = [SectionLink "Flag calls"; Link "opt_info"; + Link "get_extended_headers_negotiated"]; + example = Some "examples/server-flags.c"; + }; + "can_df", { default_call with args = []; ret = RBool; @@ -4131,6 +4148,7 @@ let first_version = "get_extended_headers_negotiated", (1, 16); "opt_extended_headers", (1, 16); "aio_opt_extended_headers", (1, 16); + "can_block_status_payload", (1, 16); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/lib/flags.c b/lib/flags.c index eaf4ff85..7d3ce4c7 100644 --- a/lib/flags.c +++ b/lib/flags.c @@ -66,6 +66,11 @@ nbd_internal_set_size_and_flags (struct nbd_handle *h, eflags &= ~NBD_FLAG_SEND_DF; } + if (eflags & NBD_FLAG_BLOCK_STATUS_PAYLOAD && !h->extended_headers) { + debug (h, "server lacks extended headers, ignoring claim of block status payload"); + eflags &= ~NBD_FLAG_BLOCK_STATUS_PAYLOAD; + } + if (eflags & NBD_FLAG_SEND_FAST_ZERO && !(eflags & NBD_FLAG_SEND_WRITE_ZEROES)) { debug (h, "server lacks write zeroes, ignoring claim of fast zero"); @@ -213,6 +218,12 @@ nbd_unlocked_can_cache (struct nbd_handle *h) return get_flag (h, NBD_FLAG_SEND_CACHE); } +int +nbd_unlocked_can_block_status_payload (struct nbd_handle *h) +{ + return get_flag (h, NBD_FLAG_BLOCK_STATUS_PAYLOAD); +} + int nbd_unlocked_can_meta_context (struct nbd_handle *h, const char *name) { diff --git a/examples/server-flags.c b/examples/server-flags.c index d156aced..c826f146 100644 --- a/examples/server-flags.c +++ b/examples/server-flags.c @@ -78,8 +78,13 @@ main (int argc, char *argv[]) PRINT_FLAG (nbd_can_multi_conn); PRINT_FLAG (nbd_can_trim); PRINT_FLAG (nbd_can_zero); -#if LIBNBD_HAVE_NBD_CAN_FAST_ZERO /* Added in 1.2 */ +#if LIBNBD_HAVE_NBD_CAN_FAST_ZERO + /* Added in 1.2 */ PRINT_FLAG (nbd_can_fast_zero); +#endif +#if LIBNBD_HAVE_NBD_CAN_BLOCK_STATUS_PAYLOAD + /* Added in 1.16 */ + PRINT_FLAG (nbd_can_block_status_payload); #endif PRINT_FLAG (nbd_is_read_only); PRINT_FLAG (nbd_is_rotational); diff --git a/interop/Makefile.am b/interop/Makefile.am index 84c8fbbb..8a7bb17e 100644 --- a/interop/Makefile.am +++ b/interop/Makefile.am @@ -27,6 +27,7 @@ EXTRA_DIST = \ list-exports-test-dir/disk2 \ structured-read.sh \ opt-extended-headers.sh \ + block-status-payload.sh \ $(NULL) TESTS_ENVIRONMENT = \ @@ -138,6 +139,7 @@ check_PROGRAMS += \ large-status \ structured-read \ opt-extended-headers \ + block-status-payload \ $(NULL) TESTS += \ interop-qemu-nbd \ @@ -150,6 +152,7 @@ TESTS += \ structured-read.sh \ interop-qemu-block-size.sh \ opt-extended-headers.sh \ + block-status-payload.sh \ $(NULL) interop_qemu_nbd_SOURCES = \ @@ -247,6 +250,9 @@ structured_read_LDADD = $(top_builddir)/lib/libnbd.la opt_extended_headers_SOURCES = opt-extended-headers.c opt_extended_headers_LDADD = $(top_builddir)/lib/libnbd.la +block_status_payload_SOURCES = block-status-payload.c +block_status_payload_LDADD = $(top_builddir)/lib/libnbd.la + endif HAVE_QEMU_NBD #---------------------------------------------------------------------- diff --git a/interop/block-status-payload.c b/interop/block-status-payload.c new file mode 100644 index 00000000..cdb0de7c --- /dev/null +++ b/interop/block-status-payload.c @@ -0,0 +1,126 @@ +/* NBD client library in userspace + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Test interaction with qemu using block status payload filtering. */ + +#include + +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "array-size.h" + +static const char *contexts[] = { + "base:allocation", + "qemu:allocation-depth", + "qemu:dirty-bitmap:bitmap0", + "qemu:dirty-bitmap:bitmap1", +}; + +static int +cb (void *opaque, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t len, int *error) +{ + /* Adjust seen according to which context was visited */ + unsigned int *seen = opaque; + size_t i; + + for (i = 0; i < ARRAY_SIZE (contexts); i++) + if (strcmp (contexts[i], metacontext) == 0) + break; + *seen |= 1 << i; + return 0; +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + int64_t exportsize; + unsigned int seen; + size_t i; + int r; + + if (argc < 2) { + fprintf (stderr, "%s qemu-nbd [args ...]\n", argv[0]); + exit (EXIT_FAILURE); + } + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + assert (ARRAY_SIZE (contexts) == 4); + for (i = 0; i < ARRAY_SIZE (contexts); i++) { + if (nbd_add_meta_context (nbd, contexts[i]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + } + + if (nbd_connect_systemd_socket_activation (nbd, &argv[1]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + r = nbd_can_block_status_payload (nbd); + if (r == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (r != 1) { + fprintf (stderr, "expecting block status payload support from qemu\n"); + exit (EXIT_FAILURE); + } + + exportsize = nbd_get_size (nbd); + if (exportsize == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* An unfiltered call should see all four contexts */ + seen = 0; + if (nbd_block_status_64 (nbd, exportsize, 0, + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0xf); + + /* FIXME: Test filtered calls once the API is added */ + if (nbd_shutdown (nbd, 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + + exit (EXIT_SUCCESS); +} diff --git a/interop/block-status-payload.sh b/interop/block-status-payload.sh new file mode 100755 index 00000000..a178583b --- /dev/null +++ b/interop/block-status-payload.sh @@ -0,0 +1,68 @@ +#!/usr/bin/env bash +# nbd client library in userspace +# Copyright (C) 2019-2022 Red Hat Inc. +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +# Test use of block status payload for server filtering + +source ../tests/functions.sh +set -e +set -x + +requires qemu-img bitmap --help +# This test uses the qemu-nbd -A and -B options. +requires qemu-nbd -A -BA --version + +file="block-status-payload.qcow2" +rm -f $file +cleanup_fn rm -f $file + +# Create sparse file with two bitmaps. +qemu-img create -f qcow2 $file 1M +qemu-img bitmap --add --enable -f qcow2 $file bitmap0 +qemu-img bitmap --add --enable -f qcow2 $file bitmap1 + +# Unconditional part of test: qemu should not advertise block status payload +# support if extended headers are not in use +nbdsh -c ' +h.set_request_extended_headers(False) +h.add_meta_context("base:allocation") +h.add_meta_context("qemu:allocation-depth") +h.add_meta_context("qemu:dirty-bitmap:bitmap0") +h.add_meta_context("qemu:dirty-bitmap:bitmap1") +h.set_opt_mode(True) +args = ["qemu-nbd", "-f", "qcow2", "-A", "-B", "bitmap0", "-B", "bitmap1", + "'"$file"'"] +h.connect_systemd_socket_activation(args) +assert h.aio_is_negotiating() is True +assert h.get_extended_headers_negotiated() is False +# Flag not available until info or go +try: + h.can_block_status_payload() + assert False +except nbd.Error: + pass +h.opt_info() +assert h.can_block_status_payload() is False +assert h.can_meta_context("base:allocation") is True +h.opt_abort() +' + +# Conditional part of test: if qemu is new enough to support extended +# headers, we assume it can also support block status payload. +requires nbdinfo --can extended-headers -- [ qemu-nbd -r -f qcow2 "$file" ] +$VG ./block-status-payload \ + qemu-nbd -f qcow2 -A -B bitmap0 -B bitmap1 $file diff --git a/.gitignore b/.gitignore index 67bcee58..a61aa00e 100644 --- a/.gitignore +++ b/.gitignore @@ -96,6 +96,7 @@ Makefile.in /info/nbdinfo /info/nbdinfo.1 /install-sh +/interop/block-status-payload /interop/dirty-bitmap /interop/interop-nbd-server /interop/interop-nbd-server-tls diff --git a/info/can.c b/info/can.c index f602ffce..daa7d2af 100644 --- a/info/can.c +++ b/info/can.c @@ -72,6 +72,11 @@ do_can (void) else if (strcasecmp (can, "rotational") == 0) feature = nbd_is_rotational (nbd); + else if (strcasecmp (can, "block status payload") == 0 || + strcasecmp (can, "block-status-payload") == 0 || + strcasecmp (can, "block_status_payload") == 0) + feature = nbd_can_block_status_payload (nbd); + else if (strcasecmp (can, "cache") == 0) feature = nbd_can_cache (nbd); diff --git a/info/show.c b/info/show.c index 4bf59671..113b1b50 100644 --- a/info/show.c +++ b/info/show.c @@ -54,7 +54,7 @@ show_one_export (struct nbd_handle *nbd, const char *desc, char *uri = NULL; int is_rotational, is_read_only; int can_cache, can_df, can_fast_zero, can_flush, can_fua, - can_multi_conn, can_trim, can_zero; + can_multi_conn, can_trim, can_zero, can_block_status_payload; int64_t block_minimum, block_preferred, block_maximum; string_vector contexts = empty_vector; bool show_context = false; @@ -120,6 +120,7 @@ show_one_export (struct nbd_handle *nbd, const char *desc, can_multi_conn = nbd_can_multi_conn (nbd); can_trim = nbd_can_trim (nbd); can_zero = nbd_can_zero (nbd); + can_block_status_payload = nbd_can_block_status_payload (nbd); block_minimum = nbd_get_block_size (nbd, LIBNBD_SIZE_MINIMUM); block_preferred = nbd_get_block_size (nbd, LIBNBD_SIZE_PREFERRED); block_maximum = nbd_get_block_size (nbd, LIBNBD_SIZE_MAXIMUM); @@ -161,6 +162,8 @@ show_one_export (struct nbd_handle *nbd, const char *desc, if (is_read_only >= 0) fprintf (fp, "\t%s: %s\n", "is_read_only", is_read_only ? "true" : "false"); + if (can_block_status_payload >= 0) + show_boolean ("can_block_status_payload", can_block_status_payload); if (can_cache >= 0) show_boolean ("can_cache", can_cache); if (can_df >= 0) @@ -230,6 +233,10 @@ show_one_export (struct nbd_handle *nbd, const char *desc, if (is_read_only >= 0) fprintf (fp, "\t\"%s\": %s,\n", "is_read_only", is_read_only ? "true" : "false"); + if (can_block_status_payload >= 0) + fprintf (fp, "\t\"%s\": %s,\n", + "can_block_status_payload", + can_block_status_payload ? "true" : "false"); if (can_cache >= 0) fprintf (fp, "\t\"%s\": %s,\n", "can_cache", can_cache ? "true" : "false"); From patchwork Mon Nov 14 22:51:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13042935 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B59BAC433FE for ; Mon, 14 Nov 2022 23:36:09 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouids-0001s6-DS; Mon, 14 Nov 2022 18:13:28 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidm-0001eb-8o for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:22 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-00037R-CR for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466338; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=StorDyubBALrANG8hGrEVg7Xtd/kvhy4GHUimD5k1FU=; b=I4V9RfYgLcwsEP5RC9r620cRr5RtK/ig9rGrvOkqVxbMyH22NjI9xMxxdNnARYqRFfCH8X +HobOSmU80BmCKWm2OB60HANgaqOgULoBr5mJ7JfnhCy2tVXqPJIHwIg2HnXw21vocJ5Dk z3yzHDQdXioqYmx4SaCkTMj8MDSfJjo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-605-zW9DR_UNP32LNKnhAWP3jA-1; Mon, 14 Nov 2022 17:52:14 -0500 X-MC-Unique: zW9DR_UNP32LNKnhAWP3jA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6A8CF101E9B5; Mon, 14 Nov 2022 22:52:14 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 076C840E9786; Mon, 14 Nov 2022 22:52:13 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 22/23] api: Add nbd_[aio_]block_status_filter() Date: Mon, 14 Nov 2022 16:51:57 -0600 Message-Id: <20221114225158.2186742-23-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org As part of extending NBD to support 64-bit lengths, the protocol also added an option for servers to allow clients to request filtered responses to NBD_CMD_BLOCK_STATUS when more than one meta-context is negotiated (see NBD commit XXX[*]). At the same time as this patch, qemu-nbd was taught to support and advertise this feature as a server, but does not utilize it as a client (qemu doesn't yet need to connect to multiple contexts at once). Thus, addding generic client support and enhancing the interop/ test in libnbd is needed to prove that the feature is viable and worth standardizing. --- [*] FIXME with actual commit id --- lib/internal.h | 5 +- generator/API.ml | 71 +++++++++++++++-- generator/states-issue-command.c | 4 +- lib/aio.c | 7 +- lib/rw.c | 127 ++++++++++++++++++++++++++++++- interop/block-status-payload.c | 117 +++++++++++++++++++++++++++- interop/block-status-payload.sh | 14 +++- info/info-can.sh | 3 + 8 files changed, 336 insertions(+), 12 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 1abb21cb..ac8d99c4 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -73,6 +73,8 @@ struct meta_context { }; DEFINE_VECTOR_TYPE(meta_vector, struct meta_context); +DEFINE_VECTOR_TYPE(uint32_vector, uint32_t); + struct export { char *name; char *description; @@ -379,7 +381,8 @@ struct command { uint64_t cookie; uint64_t offset; uint64_t count; - void *data; /* Buffer for read/write */ + void *data; /* Buffer for read/write, uint32_vector* for status payload */ + uint32_vector *ids; /* For block status with payload */ struct command_cb cb; bool initialized; /* For read, true if getting a hole may skip memset */ uint32_t data_seen; /* For read, cumulative size of data chunks seen */ diff --git a/generator/API.ml b/generator/API.ml index bbf7c0bb..6bf67de0 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -2287,12 +2287,13 @@ "can_block_status_payload", { longdesc = "\ Returns true if the server supports the use of the C flag to allow filtering of the -block status command. Returns +block status command (see L). Returns false if the server does not. Note that this will never return true if L is false." ^ non_blocking_test_call_description; see_also = [SectionLink "Flag calls"; Link "opt_info"; - Link "get_extended_headers_negotiated"]; + Link "get_extended_headers_negotiated"; + Link "block_status_filter"]; example = Some "examples/server-flags.c"; }; @@ -2361,6 +2362,10 @@ "can_meta_context", { meta contexts were requested but there is a missing or failed attempt at NBD_OPT_SET_META_CONTEXT during option negotiation. +If the server supports block status filtering (see +L, this function must return +true for any filter name passed to L. + The single parameter is the name of the metadata context, for example C. Blibnbd.hE> includes defined constants for well-known @@ -2893,9 +2898,12 @@ "block_status_64", { information about blocks beginning from the specified offset to be returned. The C parameter is a hint: the server may choose to return less status, or the final block -may extend beyond the requested range. If multiple contexts +may extend beyond the requested range. When multiple contexts are supported, the number of blocks and cumulative length -of those blocks need not be identical between contexts. +of those blocks need not be identical between contexts; this +command generally returns the status of all negotiated contexts, +while some servers also support a filtered request (see +L, L). Note that not all servers can support a C of 4GiB or larger; L indicates which servers @@ -2945,11 +2953,38 @@ "block_status_64", { does not exceed C bytes; however, libnbd does not validate that the server obeyed the flag." ^ strict_call_description; - see_also = [Link "block_status"; + see_also = [Link "block_status"; Link "block_status_filter"; Link "add_meta_context"; Link "can_meta_context"; Link "aio_block_status_64"; Link "set_strict_mode"]; }; + "block_status_filter", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; StringList "contexts"; + Closure extent64_closure ]; + optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"; "PAYLOAD_LEN"]) ]; + ret = RErr; + permitted_states = [ Connected ]; + shortdesc = "send filtered block status command, with 64-bit callback"; + longdesc = "\ +Issue a filtered block status command to the NBD server. If +supported by the server (see L), +this causes metadata context information about blocks beginning +from the specified offset to be returned, and with the result +limited to just the contexts specified in C. Note that +all strings in C must be supported by +L. + +All other parameters to this function have the same semantics +as in L; except that for convenience, +the C parameter may additionally contain or omit +C." +^ strict_call_description; + see_also = [Link "block_status_64"; + Link "can_block_status_payload"; Link "can_meta_context"; + Link "aio_block_status_filter"; Link "set_strict_mode"]; + }; + "poll", { default_call with args = [ Int "timeout" ]; ret = RInt; @@ -3619,6 +3654,30 @@ "aio_block_status_64", { Link "set_strict_mode"]; }; + "aio_block_status_filter", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; StringList "contexts"; + Closure extent64_closure ]; + optargs = [ OClosure completion_closure; + OFlags ("flags", cmd_flags, Some ["REQ_ONE"; "PAYLOAD_LEN"]) ]; + ret = RCookie; + permitted_states = [ Connected ]; + shortdesc = "send filtered block status command to the NBD server"; + longdesc = "\ +Send a filtered block status command to the NBD server. + +To check if the command completed, call L. +Or supply the optional C which will be invoked +as described in L. + +Other parameters behave as documented in L." +^ strict_call_description; + see_also = [SectionLink "Issuing asynchronous commands"; + Link "aio_block_status_64"; Link "block_status_filter"; + Link "can_meta_context"; Link "can_block_status_payload"; + Link "set_strict_mode"]; + }; + "aio_get_fd", { default_call with args = []; ret = RFd; @@ -4149,6 +4208,8 @@ let first_version = "opt_extended_headers", (1, 16); "aio_opt_extended_headers", (1, 16); "can_block_status_payload", (1, 16); + "block_status_filter", (1, 16); + "aio_block_status_filter", (1, 16); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c index feea2672..da898aef 100644 --- a/generator/states-issue-command.c +++ b/generator/states-issue-command.c @@ -84,7 +84,9 @@ ISSUE_COMMAND.PREPARE_WRITE_PAYLOAD: assert (h->cmds_to_issue != NULL); cmd = h->cmds_to_issue; assert (cmd->cookie == be64toh (h->req.compact.handle)); - if (cmd->type == NBD_CMD_WRITE) { + if (cmd->type == NBD_CMD_WRITE || + (h->extended_headers && cmd->type == NBD_CMD_BLOCK_STATUS && + cmd->flags & NBD_CMD_FLAG_PAYLOAD_LEN)) { h->wbuf = cmd->data; h->wlen = cmd->count; if (cmd->next && cmd->count < 64 * 1024) diff --git a/lib/aio.c b/lib/aio.c index 1188a328..fa142881 100644 --- a/lib/aio.c +++ b/lib/aio.c @@ -32,8 +32,13 @@ void nbd_internal_retire_and_free_command (struct command *cmd) { /* Free the callbacks. */ - if (cmd->type == NBD_CMD_BLOCK_STATUS) + if (cmd->type == NBD_CMD_BLOCK_STATUS) { + if (cmd->ids) { + uint32_vector_reset (cmd->ids); + free (cmd->ids); + } FREE_CALLBACK (cmd->cb.fn.extent); + } if (cmd->type == NBD_CMD_READ) FREE_CALLBACK (cmd->cb.fn.chunk); FREE_CALLBACK (cmd->cb.completion); diff --git a/lib/rw.c b/lib/rw.c index 26533836..310c1285 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -242,6 +242,26 @@ nbd_unlocked_block_status_64 (struct nbd_handle *h, return wait_for_command (h, cookie); } +/* Issue a filtered block status command and wait for the reply. */ +int +nbd_unlocked_block_status_filter (struct nbd_handle *h, + uint64_t count, uint64_t offset, + char **filter, + nbd_extent64_callback *extent64, + uint32_t flags) +{ + int64_t cookie; + nbd_completion_callback c = NBD_NULL_COMPLETION; + + cookie = nbd_unlocked_aio_block_status_filter (h, count, offset, filter, + extent64, &c, flags); + if (cookie == -1) + return -1; + + assert (CALLBACK_IS_NULL (*extent64)); + return wait_for_command (h, cookie); +} + /* count_err represents the errno to return if bounds check fail */ int64_t nbd_internal_command_common (struct nbd_handle *h, @@ -250,6 +270,7 @@ nbd_internal_command_common (struct nbd_handle *h, void *data, struct command_cb *cb) { struct command *cmd; + uint32_vector *ids = NULL; if (h->disconnect_request) { set_error (EINVAL, "cannot request more commands after NBD_CMD_DISC"); @@ -297,10 +318,23 @@ nbd_internal_command_common (struct nbd_handle *h, } break; + case NBD_CMD_BLOCK_STATUS: + if (data) { + ids = data; + count = ids->len * sizeof (uint32_t); + data = ids->ptr; + if (count > MAX_REQUEST_SIZE || + (h->strict & LIBNBD_STRICT_PAYLOAD && count > h->payload_maximum)) { + set_error (ERANGE, "filter set too large"); + goto err; + } + break; + } + /* fallthrough */ + default: /* Other commands are limited by the 32 bit field in the command * structure on the wire, unless extended headers were negotiated. */ - default: if (!h->extended_headers && count > UINT32_MAX) { set_error (ERANGE, "request too large: maximum request size is %" PRIu32, UINT32_MAX); @@ -320,6 +354,7 @@ nbd_internal_command_common (struct nbd_handle *h, cmd->offset = offset; cmd->count = count; cmd->data = data; + cmd->ids = ids; if (cb) cmd->cb = *cb; @@ -364,8 +399,13 @@ nbd_internal_command_common (struct nbd_handle *h, err: /* Since we did not queue the command, we must free the callbacks. */ if (cb) { - if (type == NBD_CMD_BLOCK_STATUS) + if (type == NBD_CMD_BLOCK_STATUS) { + if (ids) { + uint32_vector_reset (ids); + free (ids); + } FREE_CALLBACK (cb->fn.extent); + } if (type == NBD_CMD_READ) FREE_CALLBACK (cb->fn.chunk); FREE_CALLBACK (cb->completion); @@ -609,3 +649,86 @@ nbd_unlocked_aio_block_status_64 (struct nbd_handle *h, return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, count, EINVAL, NULL, &cb); } + +int64_t +nbd_unlocked_aio_block_status_filter (struct nbd_handle *h, + uint64_t count, uint64_t offset, + char **filter, + nbd_extent64_callback *extent64, + nbd_completion_callback *completion, + uint32_t flags) +{ + struct command_cb cb = { .fn.extent = *extent64, + .completion = *completion }; + uint32_vector *ids; + char *name; + size_t i; + + /* Because this affects wire format, it is more convenient to manage + * PAYLOAD_LEN by what was negotiated than to require the user to + * have to set it correctly. + */ + if (!h->extended_headers) { + set_error (ENOTSUP, "server does not support extended headers"); + return -1; + } + flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN; + + if (h->strict & LIBNBD_STRICT_COMMANDS) { + if (nbd_unlocked_can_block_status_payload (h) != 1) { + set_error (EINVAL, + "server does not support the block status payload flag"); + return -1; + } + + if (!h->meta_valid || h->meta_contexts.len == 0) { + set_error (ENOTSUP, "did not negotiate any metadata contexts, " + "either you did not call nbd_add_meta_context before " + "connecting or the server does not support it"); + return -1; + } + } + + ids = calloc (1, sizeof *ids); + if (ids == NULL) { + set_error (errno, "calloc"); + return -1; + } + if (uint32_vector_append (ids, htobe32 (count >> 32)) == -1 || + uint32_vector_append (ids, htobe32 (count)) == -1) { + set_error (errno, "realloc"); + goto fail; + } + + /* O(n^2) search - hopefully filter and negotiated contexts are both small */ + for ( ; (name = *filter) != NULL; filter++) { + if (!h->meta_valid) { + set_error (EINVAL, "context %s not negotiated", name); + goto fail; + } + for (i = 0; i < h->meta_contexts.len; i++) { + struct meta_context *meta = &h->meta_contexts.ptr[i]; + if (strcmp (name, meta->name) == 0) { + if (uint32_vector_append (ids, htobe32 (meta->context_id)) == -1) { + set_error (errno, "realloc"); + goto fail; + } + break; + } + } + if (i == h->meta_contexts.len) { + set_error (EINVAL, "context %s not negotiated", name); + goto fail; + } + } + + SET_CALLBACK_TO_NULL (*extent64); + SET_CALLBACK_TO_NULL (*completion); + return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, + count, EINVAL, ids, &cb); + + fail: + uint32_vector_reset (ids); + free (ids); + return -1; +} diff --git a/interop/block-status-payload.c b/interop/block-status-payload.c index cdb0de7c..0c0348ba 100644 --- a/interop/block-status-payload.c +++ b/interop/block-status-payload.c @@ -54,11 +54,26 @@ cb (void *opaque, const char *metacontext, uint64_t offset, return 0; } +static char ** +list (unsigned int use) +{ + static const char *array[ARRAY_SIZE (contexts) + 1]; + size_t i, j; + + assert (use < 1 << ARRAY_SIZE (contexts)); + for (i = j = 0; i < ARRAY_SIZE (contexts); i++) + if (use & (1 << i)) + array[j++] = contexts[i]; + array[j] = NULL; + return (char **) array; +} + int main (int argc, char *argv[]) { struct nbd_handle *nbd; int64_t exportsize; + uint64_t bytes_sent; unsigned int seen; size_t i; int r; @@ -114,7 +129,107 @@ main (int argc, char *argv[]) } assert (seen == 0xf); - /* FIXME: Test filtered calls once the API is added */ + /* Filtering with all contexts listed, same effect as unfilitered call */ + seen = 0; + if (nbd_block_status_filter (nbd, exportsize, 0, list (0xf), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0xf); + + /* Filtering with just two out of four contexts; test optional flag */ + seen = 0; + if (nbd_block_status_filter (nbd, exportsize, 0, list (0x5), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0x5); + + /* Filtering with one context, near end of file (to make sure the + * payload length isn't confused with the effect length) + */ + seen = 0; + if (nbd_block_status_filter (nbd, 1, exportsize - 1, list (0x2), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0x2); + + /* Filtering with no contexts - pointless, so qemu rejects it */ + bytes_sent = nbd_stats_bytes_sent (nbd); + seen = 0; + if (nbd_block_status_filter (nbd, exportsize, 0, list (0x0), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) != -1) { + fprintf (stderr, "expecting block status failure\n"); + exit (EXIT_FAILURE); + } + assert (seen == 0x0); + if (nbd_get_errno () != EINVAL) { + fprintf (stderr, "expecting EINVAL after block status failure\n"); + exit (EXIT_FAILURE); + } + if (nbd_stats_bytes_sent (nbd) <= bytes_sent) { + fprintf (stderr, "expecting server-side rejection of bad request\n"); + exit (EXIT_FAILURE); + } + + /* Giving unknown string triggers EINVAL from libnbd */ + bytes_sent = nbd_stats_bytes_sent (nbd); + seen = 0; + { + const char *bogus[] = { "qemu:dirty-bitmap:bitmap2", NULL }; + if (nbd_block_status_filter (nbd, exportsize, 0, (char **) bogus, + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) != -1) { + fprintf (stderr, "expecting block status failure\n"); + exit (EXIT_FAILURE); + } + } + if (nbd_get_errno () != EINVAL) { + fprintf (stderr, "expecting EINVAL after block status failure\n"); + exit (EXIT_FAILURE); + } + assert (seen == 0x0); + if (nbd_stats_bytes_sent (nbd) != bytes_sent) { + fprintf (stderr, "expecting client-side rejection of bad request\n"); + exit (EXIT_FAILURE); + } + + /* Giving same string twice triggers EINVAL from qemu */ + seen = 0; + { + const char *dupes[] = { "base:allocation", "base:allocation", NULL }; + if (nbd_block_status_filter (nbd, exportsize, 0, (char **) dupes, + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) != -1) { + fprintf (stderr, "expecting block status failure\n"); + exit (EXIT_FAILURE); + } + } + if (nbd_get_errno () != EINVAL) { + fprintf (stderr, "expecting EINVAL after block status failure\n"); + exit (EXIT_FAILURE); + } + assert (seen == 0x0); + if (nbd_stats_bytes_sent (nbd) <= bytes_sent) { + fprintf (stderr, "expecting server-side rejection of bad request\n"); + exit (EXIT_FAILURE); + } + + /* Done */ if (nbd_shutdown (nbd, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); diff --git a/interop/block-status-payload.sh b/interop/block-status-payload.sh index a178583b..e42e30e9 100755 --- a/interop/block-status-payload.sh +++ b/interop/block-status-payload.sh @@ -49,6 +49,7 @@ args = ["qemu-nbd", "-f", "qcow2", "-A", "-B", "bitmap0", "-B", "bitmap1", h.connect_systemd_socket_activation(args) assert h.aio_is_negotiating() is True assert h.get_extended_headers_negotiated() is False + # Flag not available until info or go try: h.can_block_status_payload() @@ -58,7 +59,18 @@ except nbd.Error: h.opt_info() assert h.can_block_status_payload() is False assert h.can_meta_context("base:allocation") is True -h.opt_abort() + +# Filter request not allowed if not advertised +def f(): + assert False +h.opt_go() +assert h.can_block_status_payload() is False +try: + h.block_status_filter(0, 512, ["base:allocation"], f) + assert False +except nbd.Error: + pass +h.shutdown() ' # Conditional part of test: if qemu is new enough to support extended diff --git a/info/info-can.sh b/info/info-can.sh index e5f6a44b..111b0be2 100755 --- a/info/info-can.sh +++ b/info/info-can.sh @@ -38,6 +38,9 @@ requires bash -c "nbdkit sh --dump-plugin | grep has_can_cache=1" # and oldstyle never, but that feels like depending a bit too much on # the implementation. +# --can block-status-payload is not supported by nbdkit yet. Testing +# is done during interop with new-enough qemu. + # --can structured-reply is not a per-export setting, but rather # something set on the server as a whole. From patchwork Mon Nov 14 22:51:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Blake X-Patchwork-Id: 13043039 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C1F48C433FE for ; Tue, 15 Nov 2022 01:06:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ouido-0001lq-Gm; Mon, 14 Nov 2022 18:13:24 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouidl-0001X6-AU for qemu-devel@nongnu.org; Mon, 14 Nov 2022 18:13:21 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ouiJP-00037n-LI for qemu-devel@nongnu.org; Mon, 14 Nov 2022 17:52:30 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668466339; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rcnm5aMG/LUSV6x9rNqQu6bpIiyneV26ZG6lhU9ZMh4=; b=CRDAnxw0S/pi0R0B+/5L91ot8F2rRy30m5lZM+d19Ld1bFDbshJkv6CkKn/FL/v/5gMDpB SW6n6wxA2otf0jF9Yp2Xm2IwldsVCp+2riLYuVO5av+zWJAejOTr/R8BkcnUmQ3X0J4VD3 CKYWIVhlYwOMnAryK6hEHfa2EFfc7IE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-148-YsK2a6_mOtqnkpn016mR8Q-1; Mon, 14 Nov 2022 17:52:15 -0500 X-MC-Unique: YsK2a6_mOtqnkpn016mR8Q-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F29D2811E81; Mon, 14 Nov 2022 22:52:14 +0000 (UTC) Received: from green.redhat.com (unknown [10.2.16.240]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8E02540E9786; Mon, 14 Nov 2022 22:52:14 +0000 (UTC) From: Eric Blake To: libguestfs@redhat.com Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, nbd@other.debian.org Subject: [libnbd PATCH v2 23/23] RFC: pread: Accept 64-bit holes Date: Mon, 14 Nov 2022 16:51:58 -0600 Message-Id: <20221114225158.2186742-24-eblake@redhat.com> In-Reply-To: <20221114225158.2186742-1-eblake@redhat.com> References: <20221114224141.cm5jgyxfmvie5xb5@redhat.com> <20221114225158.2186742-1-eblake@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.133.124; envelope-from=eblake@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Even though we don't currently allow the user to request NBD_CMD_READ with more than 64M (and even if we did, our API signature caps us at SIZE_MAX, which is 32 bits on a 32-bit machine), upstream NBD commit XXX[*] states that for symmetry with 64-bit requests, extended header clients must be prepared for a server response with a 64-bit hole, even if the client never makes a read request that large. Note that we don't have to change the signature of the callback for nbd_pread_structured; nor is it worth adding a 64-bit counterpart to LIBNBD_READ_HOLE, because it is unlikely that a user callback will ever need to distinguish between which size was sent over the wire, when the value is always less than 32 bits. Also note that the recent NBD spec changes to add 64-bits did state that servers may allow clients to request a read of larger than the max block size, but if such read is not rejected with EOVERFLOW or EINVAL, then the reply will be divided into chunks so that no chunk sends more than a max block size payload. While we cannot guarantee which size structured reply the server will use, it is easy enough to handle both sizes, but tag the read with EPROTO for a non-compliant server that sends wide replies when extended headers were not negotiated. A more likely reason that a server may choose to send 64-bit hole chunks even for a 32-bit hole is because the extended hole payload has nicer power-of-2 sizing. --- [*] FIXME: Update this with the actual commit id, if upstream NBD even goes with this option. --- lib/internal.h | 1 + lib/nbd-protocol.h | 6 +++++ generator/states-reply-structured.c | 36 ++++++++++++++++++++++++----- 3 files changed, 37 insertions(+), 6 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index ac8d99c4..64d1941c 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -250,6 +250,7 @@ struct nbd_handle { union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; + struct nbd_structured_reply_offset_hole_ext offset_hole_ext; struct nbd_structured_reply_block_status_hdr bs_hdr; struct nbd_structured_reply_block_status_ext_hdr bs_ext_hdr; struct { diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index 2d1fabd0..2d1a3202 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -247,6 +247,11 @@ struct nbd_structured_reply_offset_hole { uint32_t length; /* Length of hole. */ } NBD_ATTRIBUTE_PACKED; +struct nbd_structured_reply_offset_hole_ext { + uint64_t offset; + uint64_t length; /* Length of hole. */ +} NBD_ATTRIBUTE_PACKED; + /* NBD_REPLY_TYPE_BLOCK_STATUS block descriptor. */ struct nbd_block_descriptor { uint32_t length; /* length of block */ @@ -292,6 +297,7 @@ struct nbd_structured_reply_error { #define NBD_REPLY_TYPE_NONE 0 #define NBD_REPLY_TYPE_OFFSET_DATA 1 #define NBD_REPLY_TYPE_OFFSET_HOLE 2 +#define NBD_REPLY_TYPE_OFFSET_HOLE_EXT 3 #define NBD_REPLY_TYPE_BLOCK_STATUS 5 #define NBD_REPLY_TYPE_BLOCK_STATUS_EXT 6 #define NBD_REPLY_TYPE_ERROR NBD_REPLY_TYPE_ERR (1) diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 7e313b5a..e338bf74 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -28,15 +28,16 @@ * requesting command. */ static bool -structured_reply_in_bounds (uint64_t offset, uint32_t length, +structured_reply_in_bounds (uint64_t offset, uint64_t length, const struct command *cmd) { if (offset < cmd->offset || offset >= cmd->offset + cmd->count || - offset + length > cmd->offset + cmd->count) { + length > cmd->offset + cmd->count || + offset > cmd->offset + cmd->count - length) { set_error (0, "range of structured reply is out of bounds, " "offset=%" PRIu64 ", cmd->offset=%" PRIu64 ", " - "length=%" PRIu32 ", cmd->count=%" PRIu64 ": " + "length=%" PRIu64 ", cmd->count=%" PRIu64 ": " "this is likely to be a bug in the NBD server", offset, cmd->offset, length, cmd->count); return false; @@ -141,6 +142,21 @@ REPLY.STRUCTURED_REPLY.CHECK: SET_NEXT_STATE (%RECV_OFFSET_HOLE); break; + case NBD_REPLY_TYPE_OFFSET_HOLE_EXT: + if (cmd->type != NBD_CMD_READ || + length != sizeof h->sbuf.reply.payload.offset_hole_ext) + goto resync; + if (!h->extended_headers) { + debug (h, "unexpected 64-bit hole without extended headers, " + "this is probably a server bug"); + if (cmd->error == 0) + cmd->error = EPROTO; + } + h->rbuf = &h->sbuf.reply.payload.offset_hole_ext; + h->rlen = sizeof h->sbuf.reply.payload.offset_hole_ext; + SET_NEXT_STATE (%RECV_OFFSET_HOLE); + break; + case NBD_REPLY_TYPE_BLOCK_STATUS: if (cmd->type != NBD_CMD_BLOCK_STATUS || length < 12 || ((length-4) & 7) != 0) @@ -406,7 +422,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA_DATA: REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: struct command *cmd = h->reply_cmd; uint64_t offset; - uint32_t length; + uint64_t length; + uint16_t type; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -416,10 +433,14 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: return 0; case 0: offset = be64toh (h->sbuf.reply.payload.offset_hole.offset); - length = be32toh (h->sbuf.reply.payload.offset_hole.length); + type = be16toh (h->sbuf.reply.hdr.structured.type); + + if (type == NBD_REPLY_TYPE_OFFSET_HOLE) + length = be32toh (h->sbuf.reply.payload.offset_hole.length); + else + length = be64toh (h->sbuf.reply.payload.offset_hole_ext.length); assert (cmd); /* guaranteed by CHECK */ - assert (cmd->data && cmd->type == NBD_CMD_READ); /* Is the data within bounds? */ @@ -435,7 +456,10 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: /* The spec states that 0-length requests are unspecified, but * 0-length replies are broken. Still, it's easy enough to support * them as an extension, and this works even when length == 0. + * Although length is 64 bits, the bounds check above ensures that + * it is no larger than the 64M cap we put on NBD_CMD_READ. */ + assert (length <= SIZE_MAX); if (!cmd->initialized) memset ((char *) cmd->data + offset, 0, length); if (CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) {