From patchwork Thu Mar 31 13:02:05 2016
X-Patchwork-Submitter: "Denis V. Lunev"
X-Patchwork-Id: 8712801
From: "Denis V. Lunev"
To: nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org
Cc: Kevin Wolf, Alex Bligh, Pavel Borzenkov, Stefan Hajnoczi,
 Paolo Bonzini, Wouter Verhelst, den@openvz.org
Date: Thu, 31 Mar 2016 16:02:05 +0300
Message-Id: <1459429325-16350-1-git-send-email-den@openvz.org>
Subject: [Qemu-devel] [PATCH v2 1/1] NBD proto: add WRITE_ZEROES extension

From: Pavel Borzenkov

There exist some cases when a client knows that the data it is going to
write is all zeroes. Such cases include mirroring or backing up a device
implemented by a sparse file. With the current NBD command set, the client
has to issue an NBD_CMD_WRITE command with a zeroed payload and transfer
these zero bytes over the wire. The server has to write the data onto
disk, effectively losing the sparseness.

To remedy this, the patch adds a WRITE_ZEROES extension with one new
NBD_CMD_WRITE_ZEROES command.

Signed-off-by: Pavel Borzenkov
Signed-off-by: Denis V. Lunev
CC: Wouter Verhelst
CC: Paolo Bonzini
CC: Kevin Wolf
CC: Stefan Hajnoczi
CC: Alex Bligh
CC: Eric Blake
---
v2:
- rebased on master
- explicitly state that the client must not send NBD_CMD_WRITE_ZEROES if
  support for it wasn't negotiated with the server;
- add the new command flag's description in a format suitable for moving
  to the "Command flags" section.
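For illustration only (not part of the patch): a minimal Python sketch of how a client might build an `NBD_CMD_WRITE_ZEROES` request under the rules proposed here. The helper name `build_write_zeroes` is hypothetical. The 28-byte request header layout (magic, 16-bit command flags, 16-bit type, handle, offset, length) and the bit position of `NBD_FLAG_SEND_FUA` are assumed from the rest of doc/proto.md, not from this patch. The remaining constants use the values proposed in the diff below.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513      # assumed from doc/proto.md
NBD_CMD_WRITE_ZEROES = 6            # request type proposed by this patch

# Transmission flags (bit positions).
NBD_FLAG_SEND_FUA = 1 << 3          # assumed from doc/proto.md
NBD_FLAG_SEND_WRITE_ZEROES = 1 << 6 # proposed by this patch

# Command flags.
NBD_CMD_FLAG_FUA = 1 << 0
NBD_CMD_FLAG_MAY_TRIM = 1 << 1      # proposed by this patch


def build_write_zeroes(transmission_flags, handle, offset, length,
                       fua=False, may_trim=False):
    """Return the request header bytes; the command carries no payload."""
    # The client MUST NOT send the command unless it was negotiated.
    if not transmission_flags & NBD_FLAG_SEND_WRITE_ZEROES:
        raise ValueError("server did not set NBD_FLAG_SEND_WRITE_ZEROES")

    cmd_flags = 0
    if fua:
        # FUA MUST NOT be set unless the server advertised it.
        if not transmission_flags & NBD_FLAG_SEND_FUA:
            raise ValueError("server did not set NBD_FLAG_SEND_FUA")
        cmd_flags |= NBD_CMD_FLAG_FUA
    if may_trim:
        # Allowed even if NBD_FLAG_SEND_TRIM was not advertised.
        cmd_flags |= NBD_CMD_FLAG_MAY_TRIM

    # Big-endian: magic, command flags, type, handle, offset, length.
    return struct.pack(">IHHQQI", NBD_REQUEST_MAGIC, cmd_flags,
                       NBD_CMD_WRITE_ZEROES, handle, offset, length)
```

A caller would send the returned bytes as the entire request and then wait for the reply. Unlike `NBD_CMD_WRITE`, no `length` bytes of data follow the header.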
 doc/proto.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 60 insertions(+), 4 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index c1e05c5..a574563 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -261,6 +261,8 @@ immediately after the handshake flags field in oldstyle negotiation:
   schedule I/O accesses as for a rotational medium
 - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
   `NBD_CMD_TRIM` commands
+- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
+  supports `NBD_CMD_WRITE_ZEROES` commands
 
 Clients SHOULD ignore unknown flags.
 
@@ -444,10 +446,13 @@ affects a particular command. Clients MUST NOT set a command flag bit
 that is not documented for the particular command; and whether a flag is
 valid may depend on negotiation during the handshake phase.
 
-- bit 0, `NBD_CMD_FLAG_FUA`; valid during `NBD_CMD_WRITE`. SHOULD be
-  set to 1 if the client requires "Force Unit Access" mode of
-  operation. MUST NOT be set unless transmission flags included
-  `NBD_FLAG_SEND_FUA`.
+- bit 0, `NBD_CMD_FLAG_FUA`; valid during `NBD_CMD_WRITE` and
+  `NBD_CMD_WRITE_ZEROES` commands. SHOULD be set to 1 if the client requires
+  "Force Unit Access" mode of operation. MUST NOT be set unless transmission
+  flags included `NBD_FLAG_SEND_FUA`.
+
+- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
+  extension; see below.
 
 #### Request types
 
@@ -523,6 +528,10 @@ The following request types exist:
     A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
     was set in the transmission flags field.
 
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+    Defined by the experimental `WRITE_ZEROES` extension; see below.
+
 * Other requests
 
     Some third-party implementations may require additional protocol
@@ -654,6 +663,53 @@ option reply type.
     message if they do not also send it as a reply to the
     `NBD_OPT_SELECT` message.
 
+### `WRITE_ZEROES` extension
+
+There exist some cases when a client knows that the data it is going to write
+is all zeroes. Such cases include mirroring or backing up a device implemented
+by a sparse file. With the current NBD command set, the client has to issue an
+`NBD_CMD_WRITE` command with a zeroed payload and transfer these zero bytes
+over the wire. The server has to write the data onto disk, effectively
+losing the sparseness.
+
+To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
+one new command and one new command flag.
+
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+    A write request with no payload. Length and offset define the location
+    and amount of data to be zeroed.
+
+    The server MUST zero out the data on disk, and then send the reply
+    message. The server MAY send the reply message before the data has
+    reached permanent storage.
+
+    A client MUST NOT send a write zeroes request unless
+    `NBD_FLAG_SEND_WRITE_ZEROES` was set in the transmission flags field.
+
+    If the `NBD_FLAG_SEND_FUA` flag was set in the transmission flags field,
+    the client MAY set the flag `NBD_CMD_FLAG_FUA` in the command flags field.
+    If this flag was set, the server MUST NOT send the reply until it has
+    ensured that the newly-zeroed data has reached permanent storage.
+
+    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
+    flags field, the server MAY use trimming to zero out the area, but it
+    MUST ensure that the data reads back as zero.
+
+    If an error occurs, the server SHOULD set the appropriate error code
+    in the error field. The server MAY then close the connection.
+
+The server SHOULD return `ENOSPC` if it receives a write zeroes request
+including one or more sectors beyond the size of the device. It SHOULD
+return `EPERM` if it receives a write zeroes request on a read-only export.
+
+The extension adds the following new command flag:
+
+- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; valid during `NBD_CMD_WRITE_ZEROES`.
+  SHOULD be set to 1 if the client allows the server to use trim to perform
+  the requested operation. The client MAY send `NBD_CMD_FLAG_MAY_TRIM` even
+  if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
+
 ## About this file
 
 This file tries to document the NBD protocol as it is currently