diff mbox series

[v2,03/18] cutils: Add qemu_strdup_hexlify() and qemu_strdup_unhexlify()

Message ID 20190308013222.12524-4-philmd@redhat.com (mailing list archive)
State New, archived
Headers show
Series fw_cfg: reduce memleaks, add QMP/HMP info + edk2_add_host_crypto_policy | expand

Commit Message

Philippe Mathieu-Daudé March 8, 2019, 1:32 a.m. UTC
Add two helpers: one to represent a binary data as a string of
hexadecimal values, and one to restore a such string into its
original binary data.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 include/qemu/cutils.h | 33 ++++++++++++++++++++++++++
 util/cutils.c         | 55 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+)

Comments

Laszlo Ersek March 8, 2019, 9:48 a.m. UTC | #1
Hi Phil,

most important comment at the bottom.

On 03/08/19 02:32, Philippe Mathieu-Daudé wrote:
> Add two helpers: one to represent a binary data as a string of
> hexadecimal values, and one to restore a such string into its
> original binary data.
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
>  include/qemu/cutils.h | 33 ++++++++++++++++++++++++++
>  util/cutils.c         | 55 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 88 insertions(+)
> 
> diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
> index d2dad3057c..375a5508b0 100644
> --- a/include/qemu/cutils.h
> +++ b/include/qemu/cutils.h
> @@ -171,6 +171,39 @@ bool test_buffer_is_zero_next_accel(void);
>  int uleb128_encode_small(uint8_t *out, uint32_t n);
>  int uleb128_decode_small(const uint8_t *in, uint32_t *n);
>  
> +/**
> + * qemu_strdup_hexlify:

(1) I think the name "hexlify" is unusual. I think we should use
encode/decode terminology, or hex/unhex, or, if we want to stick with
the "stringify" pattern, hexify/unhexify. (No "l".)

> + *
> + * Encode a sequence of binary data into its hexadecimal stringified
> + * representation.
> + *
> + * @ptr: Buffer to hexlify
> + * @size: Length of the buffer
> + *
> + * Use qemu_strdup_unhexlify() to convert the hex string to original data.
> + *
> + * Returns: A newly allocated, zero-terminated hex encoded string representing
> + * the data. The returned string must be freed with g_free().
> + */
> +gchar *qemu_strdup_hexlify(gconstpointer ptr, gsize size);
> +
> +/**
> + * qemu_strdup_unhexlify:
> + *
> + * Decode a sequence of hexadecimal encoded text into binary data.
> + *
> + * @hex_string: String to unhexlify
> + * @out_size: if not NULL: gsize to be written with the data length
> + *
> + * This function is the opposite of qemu_strdup_hexlify().
> + *
> + * Returns: A newly allocated buffer containing the binary data that text
> + * represents. The returned buffer must be freed with g_free().
> + * Note that the returned binary data is not necessarily zero-terminated,
> + * so it should not be used as a character string.
> + */
> +gpointer qemu_strdup_unhexlify(const gchar *hex_string, gsize *out_size);
> +
>  /**
>   * qemu_pstrcmp0:
>   * @str1: a non-NULL pointer to a C string (*str1 can be NULL)
> diff --git a/util/cutils.c b/util/cutils.c
> index e098debdc0..bf324c0d8b 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -779,6 +779,61 @@ int uleb128_decode_small(const uint8_t *in, uint32_t *n)
>      }
>  }
>  
> +static guchar hexval(const gchar v)
> +{
> +    switch (v) {
> +    case '0' ... '9':
> +        return v - '0';
> +    case 'A' ... 'F':
> +        return v - 'A' + 10;
> +    case 'a' ... 'f':
> +        return v - 'a' + 10;
> +    default:
> +        return 0;
> +    }
> +}

(2) I don't think that we should silently translate invalid characters
to zero, in any hexadecimal decoder.

> +
> +gchar *qemu_strdup_hexlify(gconstpointer ptr, gsize len)
> +{
> +    guchar *data = (guchar *)ptr;
> +    gchar *hex_string;
> +
> +    if (!ptr || !len) {
> +        return g_strdup("");
> +    }
> +
> +    hex_string = g_malloc(2 * len + 1);

(3) Should check against integer overflow in the g_malloc() argument
(multiplication and addition).

> +    for (gsize i = 0; i < len; i++) {
> +        g_snprintf(&hex_string[2 * i], 3, "%02x", data[i]);
> +    }
> +
> +    return hex_string;
> +}
> +
> +gpointer qemu_strdup_unhexlify(const gchar *hex_string, gsize *out_size)
> +{
> +    size_t size = 0;
> +    guchar *data = NULL;
> +
> +    if (hex_string) {
> +        size = strlen(hex_string) / 2;

(4) Should likely check that the length of the string is an even integer.

> +        if (size) {
> +            size_t i;
> +
> +            data = g_new(guchar, size + 1);
> +            for (i = 0; i < size; i++) {
> +                data[i]  = hexval(*hex_string++) << 4;
> +                data[i] |= hexval(*hex_string++);
> +            }
> +            data[i] = '\0';
> +        }
> +    }
> +    if (out_size) {
> +        *out_size = size;
> +    }
> +    return data;
> +}
> +
>  /*
>   * helper to parse debug environment variables
>   */
> 

(5) Most importantly: I don't think we need this patch.

First, AFAICS, the unhex function is never used in the series, and no
unit test is being added for it. That makes it a bad candidate for
"include/qemu/cutils.h".

Second, while the hex function is used in PATCH v2 13/18
("hw/nvram/fw_cfg: Add QMP 'info fw_cfg' command"), the documentation in
that patch and the logic in the patch are inconsistent. The
documentation -- i.e. both the commit message and the "misc.json" change
-- say that "FirmwareConfigurationItem.data" is unused (not populated).
However, that's exactly what create_qmp_fw_cfg_item() uses the hex
function for.

Third, if we do decide that the QMP command should output the fw_cfg
binary data, then the QMP tradition (to my knowledge) has been to use
base64 encoding. GLib provides helpers for base64:

https://developer.gnome.org/glib/stable/glib-Base64-Encoding.html

and you can see examples of it being used in e.g.

(a) qmp_ringbuf_read() [chardev/char-ringbuf.c] -- the  @ringbuf-read
command is defined in "qapi/char.json"

(b) qmp_guest_exec_status() [qga/commands.c] -- the @guest-exec-status
command is defined in "qga/qapi-schema.json".

Thanks
Laszlo
Markus Armbruster March 9, 2019, 2:32 p.m. UTC | #2
Laszlo Ersek <lersek@redhat.com> writes:

> Hi Phil,
>
> most important comment at the bottom.
>
> On 03/08/19 02:32, Philippe Mathieu-Daudé wrote:
>> Add two helpers: one to represent a binary data as a string of
>> hexadecimal values, and one to restore a such string into its
>> original binary data.
>> 
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>>  include/qemu/cutils.h | 33 ++++++++++++++++++++++++++
>>  util/cutils.c         | 55 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 88 insertions(+)
>> 
>> diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
>> index d2dad3057c..375a5508b0 100644
>> --- a/include/qemu/cutils.h
>> +++ b/include/qemu/cutils.h
>> @@ -171,6 +171,39 @@ bool test_buffer_is_zero_next_accel(void);
>>  int uleb128_encode_small(uint8_t *out, uint32_t n);
>>  int uleb128_decode_small(const uint8_t *in, uint32_t *n);
>>  
>> +/**
>> + * qemu_strdup_hexlify:
>
> (1) I think the name "hexlify" is unusual.

hexlify-buffer is an interactive autoloaded Lisp function in
‘hexl.el’.

(hexlify-buffer)

Convert a binary buffer to hexl format.
This discards the buffer’s undo information.

;-P

>                                            I think we should use
> encode/decode terminology, or hex/unhex, or, if we want to stick with
> the "stringify" pattern, hexify/unhexify. (No "l".)
>
>> + *
>> + * Encode a sequence of binary data into its hexadecimal stringified
>> + * representation.
>> + *
>> + * @ptr: Buffer to hexlify

Similar parameters elsewhere in this header are called @buf.

>> + * @size: Length of the buffer
>> + *
>> + * Use qemu_strdup_unhexlify() to convert the hex string to original data.
>> + *
>> + * Returns: A newly allocated, zero-terminated hex encoded string representing
>> + * the data. The returned string must be freed with g_free().
>> + */
>> +gchar *qemu_strdup_hexlify(gconstpointer ptr, gsize size);

Avoid the silly GLib types, please.

>> +
>> +/**
>> + * qemu_strdup_unhexlify:
>> + *
>> + * Decode a sequence of hexadecimal encoded text into binary data.
>> + *
>> + * @hex_string: String to unhexlify
>> + * @out_size: if not NULL: gsize to be written with the data length
>> + *
>> + * This function is the opposite of qemu_strdup_hexlify().
>> + *
>> + * Returns: A newly allocated buffer containing the binary data that text
>> + * represents. The returned buffer must be freed with g_free().
>> + * Note that the returned binary data is not necessarily zero-terminated,
>> + * so it should not be used as a character string.
>> + */
>> +gpointer qemu_strdup_unhexlify(const gchar *hex_string, gsize *out_size);
>> +
>>  /**
>>   * qemu_pstrcmp0:
>>   * @str1: a non-NULL pointer to a C string (*str1 can be NULL)
>> diff --git a/util/cutils.c b/util/cutils.c
>> index e098debdc0..bf324c0d8b 100644
>> --- a/util/cutils.c
>> +++ b/util/cutils.c
>> @@ -779,6 +779,61 @@ int uleb128_decode_small(const uint8_t *in, uint32_t *n)
>>      }
>>  }
>>  
>> +static guchar hexval(const gchar v)

Naming the parameter @ch would be more idiomatic, I think.

>> +{
>> +    switch (v) {
>> +    case '0' ... '9':
>> +        return v - '0';
>> +    case 'A' ... 'F':
>> +        return v - 'A' + 10;
>> +    case 'a' ... 'f':
>> +        return v - 'a' + 10;
>> +    default:
>> +        return 0;
>> +    }
>> +}
>
> (2) I don't think that we should silently translate invalid characters
> to zero, in any hexadecimal decoder.

Yup.  Let's abort().

>> +
>> +gchar *qemu_strdup_hexlify(gconstpointer ptr, gsize len)
>> +{
>> +    guchar *data = (guchar *)ptr;
>> +    gchar *hex_string;
>> +
>> +    if (!ptr || !len) {
>> +        return g_strdup("");
>> +    }

A null pointer is not the same as the empty string.  Replace this by

        assert(ptr);

and ...

>> +
>> +    hex_string = g_malloc(2 * len + 1);
>
> (3) Should check against integer overflow in the g_malloc() argument
> (multiplication and addition).

E.g.
        assert(len <= (SIZE_MAX - 1) / 2);

>> +    for (gsize i = 0; i < len; i++) {
>> +        g_snprintf(&hex_string[2 * i], 3, "%02x", data[i]);
>> +    }
>> +

... esnure termination here

        hex_string[2 * i] = 0;

What does g_snprintf() buy us over plain snprintf()?  I count 400+ uses
of the latter, and none of the former.

>> +    return hex_string;
>> +}
>> +
>> +gpointer qemu_strdup_unhexlify(const gchar *hex_string, gsize *out_size)
>> +{
>> +    size_t size = 0;
>> +    guchar *data = NULL;
>> +
>> +    if (hex_string) {

A null pointer is not the same as the empty string.  assert(hex_string)
and make the conversion unconditional.


>> +        size = strlen(hex_string) / 2;
>
> (4) Should likely check that the length of the string is an even integer.
>
>> +        if (size) {
>> +            size_t i;
>> +
>> +            data = g_new(guchar, size + 1);
>> +            for (i = 0; i < size; i++) {
>> +                data[i]  = hexval(*hex_string++) << 4;
>> +                data[i] |= hexval(*hex_string++);
>> +            }
>> +            data[i] = '\0';
>> +        }
>> +    }
>> +    if (out_size) {
>> +        *out_size = size;
>> +    }
>> +    return data;

This maps "" to null.  I think it shold return "".  It naturally does if
you make the if (size) code unconditional.

>> +}
>> +
>>  /*
>>   * helper to parse debug environment variables
>>   */
>> 
>
> (5) Most importantly: I don't think we need this patch.
>
> First, AFAICS, the unhex function is never used in the series, and no
> unit test is being added for it. That makes it a bad candidate for
> "include/qemu/cutils.h".
>
> Second, while the hex function is used in PATCH v2 13/18
> ("hw/nvram/fw_cfg: Add QMP 'info fw_cfg' command"), the documentation in
> that patch and the logic in the patch are inconsistent. The
> documentation -- i.e. both the commit message and the "misc.json" change
> -- say that "FirmwareConfigurationItem.data" is unused (not populated).
> However, that's exactly what create_qmp_fw_cfg_item() uses the hex
> function for.
>
> Third, if we do decide that the QMP command should output the fw_cfg
> binary data, then the QMP tradition (to my knowledge) has been to use
> base64 encoding. GLib provides helpers for base64:
>
> https://developer.gnome.org/glib/stable/glib-Base64-Encoding.html
>
> and you can see examples of it being used in e.g.
>
> (a) qmp_ringbuf_read() [chardev/char-ringbuf.c] -- the  @ringbuf-read
> command is defined in "qapi/char.json"
>
> (b) qmp_guest_exec_status() [qga/commands.c] -- the @guest-exec-status
> command is defined in "qga/qapi-schema.json".

Yes.  I wish you had wrote that first, saving me the trouble of looking
at the patch.
Laszlo Ersek March 12, 2019, 2:04 p.m. UTC | #3
On 03/09/19 15:32, Markus Armbruster wrote:
> Laszlo Ersek <lersek@redhat.com> writes:

>> (5) Most importantly: I don't think we need this patch.
>>
>> First, AFAICS, the unhex function is never used in the series, and no
>> unit test is being added for it. That makes it a bad candidate for
>> "include/qemu/cutils.h".
>>
>> Second, while the hex function is used in PATCH v2 13/18
>> ("hw/nvram/fw_cfg: Add QMP 'info fw_cfg' command"), the documentation in
>> that patch and the logic in the patch are inconsistent. The
>> documentation -- i.e. both the commit message and the "misc.json" change
>> -- say that "FirmwareConfigurationItem.data" is unused (not populated).
>> However, that's exactly what create_qmp_fw_cfg_item() uses the hex
>> function for.
>>
>> Third, if we do decide that the QMP command should output the fw_cfg
>> binary data, then the QMP tradition (to my knowledge) has been to use
>> base64 encoding. GLib provides helpers for base64:
>>
>> https://developer.gnome.org/glib/stable/glib-Base64-Encoding.html
>>
>> and you can see examples of it being used in e.g.
>>
>> (a) qmp_ringbuf_read() [chardev/char-ringbuf.c] -- the  @ringbuf-read
>> command is defined in "qapi/char.json"
>>
>> (b) qmp_guest_exec_status() [qga/commands.c] -- the @guest-exec-status
>> command is defined in "qga/qapi-schema.json".
> 
> Yes.  I wish you had wrote that first, saving me the trouble of looking
> at the patch.
> 

You are right, I'm sorry! :(
Laszlo
diff mbox series

Patch

diff --git a/include/qemu/cutils.h b/include/qemu/cutils.h
index d2dad3057c..375a5508b0 100644
--- a/include/qemu/cutils.h
+++ b/include/qemu/cutils.h
@@ -171,6 +171,39 @@  bool test_buffer_is_zero_next_accel(void);
 int uleb128_encode_small(uint8_t *out, uint32_t n);
 int uleb128_decode_small(const uint8_t *in, uint32_t *n);
 
+/**
+ * qemu_strdup_hexlify:
+ *
+ * Encode a sequence of binary data into its hexadecimal stringified
+ * representation.
+ *
+ * @ptr: Buffer to hexlify
+ * @size: Length of the buffer
+ *
+ * Use qemu_strdup_unhexlify() to convert the hex string to original data.
+ *
+ * Returns: A newly allocated, zero-terminated hex encoded string representing
+ * the data. The returned string must be freed with g_free().
+ */
+gchar *qemu_strdup_hexlify(gconstpointer ptr, gsize size);
+
+/**
+ * qemu_strdup_unhexlify:
+ *
+ * Decode a sequence of hexadecimal encoded text into binary data.
+ *
+ * @hex_string: String to unhexlify
+ * @out_size: if not NULL: gsize to be written with the data length
+ *
+ * This function is the opposite of qemu_strdup_hexlify().
+ *
+ * Returns: A newly allocated buffer containing the binary data that text
+ * represents. The returned buffer must be freed with g_free().
+ * Note that the returned binary data is not necessarily zero-terminated,
+ * so it should not be used as a character string.
+ */
+gpointer qemu_strdup_unhexlify(const gchar *hex_string, gsize *out_size);
+
 /**
  * qemu_pstrcmp0:
  * @str1: a non-NULL pointer to a C string (*str1 can be NULL)
diff --git a/util/cutils.c b/util/cutils.c
index e098debdc0..bf324c0d8b 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -779,6 +779,61 @@  int uleb128_decode_small(const uint8_t *in, uint32_t *n)
     }
 }
 
+static guchar hexval(const gchar v)
+{
+    switch (v) {
+    case '0' ... '9':
+        return v - '0';
+    case 'A' ... 'F':
+        return v - 'A' + 10;
+    case 'a' ... 'f':
+        return v - 'a' + 10;
+    default:
+        return 0;
+    }
+}
+
+gchar *qemu_strdup_hexlify(gconstpointer ptr, gsize len)
+{
+    guchar *data = (guchar *)ptr;
+    gchar *hex_string;
+
+    if (!ptr || !len) {
+        return g_strdup("");
+    }
+
+    hex_string = g_malloc(2 * len + 1);
+    for (gsize i = 0; i < len; i++) {
+        g_snprintf(&hex_string[2 * i], 3, "%02x", data[i]);
+    }
+
+    return hex_string;
+}
+
+gpointer qemu_strdup_unhexlify(const gchar *hex_string, gsize *out_size)
+{
+    size_t size = 0;
+    guchar *data = NULL;
+
+    if (hex_string) {
+        size = strlen(hex_string) / 2;
+        if (size) {
+            size_t i;
+
+            data = g_new(guchar, size + 1);
+            for (i = 0; i < size; i++) {
+                data[i]  = hexval(*hex_string++) << 4;
+                data[i] |= hexval(*hex_string++);
+            }
+            data[i] = '\0';
+        }
+    }
+    if (out_size) {
+        *out_size = size;
+    }
+    return data;
+}
+
 /*
  * helper to parse debug environment variables
  */