diff mbox series

[v3,16/25] tools/xenstore: move copying of node data out of db_fetch()

Message ID 20230724110247.10520-17-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series tools/xenstore: drop TDB | expand

Commit Message

Jürgen Groß July 24, 2023, 11:02 a.m. UTC
Today the node data is copied in db_fetch() on each data base read in
order to avoid accidental data base modifications when working on a
node.

read_node() is the only caller of db_fetch() which isn't freeing the
returned data area immediately after using it. The other callers don't
modify the returned data, so they don't need the data to be copied.

Move copying of the data into read_node(), resulting in a speedup of
the other callers due to no memory allocation and no copying being
needed anymore.

This allows to let db_fetch() return a pointer to const data.

As db_fetch() can't return any error other than ENOENT now, error
handling for the callers can be simplified.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- new patch
V3:
- modify return type of db_fetch() to return a pointer to const
  (Julien Grall)
- drop stale comment (Julien Grall)
- fix transaction handling
---
 tools/xenstore/xenstored_core.c        | 45 +++++++++++---------------
 tools/xenstore/xenstored_core.h        |  2 +-
 tools/xenstore/xenstored_transaction.c | 23 +++++++++----
 3 files changed, 35 insertions(+), 35 deletions(-)

Comments

Julien Grall July 27, 2023, 9:33 p.m. UTC | #1
Hi Juergen,

On 24/07/2023 12:02, Juergen Gross wrote:
> Today the node data is copied in db_fetch() on each data base read in
> order to avoid accidental data base modifications when working on a
> node.
> 
> read_node() is the only caller of db_fetch() which isn't freeing the
> returned data area immediately after using it. The other callers don't
> modify the returned data, so they don't need the data to be copied.
> 
> Move copying of the data into read_node(), resulting in a speedup of
> the other callers due to no memory allocation and no copying being
> needed anymore.
> 
> This allows to let db_fetch() return a pointer to const data.
> 
> As db_fetch() can't return any error other than ENOENT now, error
> handling for the callers can be simplified.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V2:
> - new patch
> V3:
> - modify return type of db_fetch() to return a pointer to const
>    (Julien Grall)
> - drop stale comment (Julien Grall)
> - fix transaction handling
> ---
>   tools/xenstore/xenstored_core.c        | 45 +++++++++++---------------
>   tools/xenstore/xenstored_core.h        |  2 +-
>   tools/xenstore/xenstored_transaction.c | 23 +++++++++----
>   3 files changed, 35 insertions(+), 35 deletions(-)
> 
> diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
> index 9f88914149..1f5f118f1c 100644
> --- a/tools/xenstore/xenstored_core.c
> +++ b/tools/xenstore/xenstored_core.c
> @@ -555,10 +555,9 @@ static void initialize_fds(int *p_sock_pollfd_idx, int *ptimeout)
>   	}
>   }
>   
> -struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
> +const struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
>   {
> -	const struct xs_tdb_record_hdr *hdr;
> -	struct xs_tdb_record_hdr *p;
> +	struct xs_tdb_record_hdr *hdr;

Can't 'hdr' stay const?

>   
>   	hdr = hashtable_search(nodes, db_name);
>   	if (!hdr) {

[...]

> @@ -388,14 +385,26 @@ static int finalize_transaction(struct connection *conn,
>   		if (i->ta_node) {
>   			hdr = db_fetch(i->trans_name, &size);
>   			if (hdr) {
> +				/*
> +				 * Delete transaction entry and write it as
> +				 * no-TA entry. As we only hold a reference
> +				 * to the data, increment its ref count, then
> +				 * delete it from the DB. Now we own it and can
> +				 * drop the const attribute for changing the

It is not great, but I understand this is another necessary evil. So I 
am ok with this cast-away const. It is also well documented.

> +				 * generation count.
> +				 */
>   				enum write_node_mode mode;
> +				struct xs_tdb_record_hdr *own;
>   
> -				hdr->generation = ++generation;
> +				talloc_increase_ref_count(hdr);
> +				db_delete(conn, i->trans_name, NULL);
> +
> +				own = (struct xs_tdb_record_hdr *)hdr;
> +				own->generation = ++generation;
>   				mode = (i->generation == NO_GENERATION)
>   				       ? NODE_CREATE : NODE_MODIFY;
> -				*is_corrupt |= db_write(conn, i->node, hdr,
> +				*is_corrupt |= db_write(conn, i->node, own,
>   							size, NULL, mode, true);
> -				db_delete(conn, i->trans_name, NULL);
>   			} else {
>   				*is_corrupt = true;
>   			}

Cheers,
Jürgen Groß July 28, 2023, 6:18 a.m. UTC | #2
On 27.07.23 23:33, Julien Grall wrote:
> Hi Juergen,
> 
> On 24/07/2023 12:02, Juergen Gross wrote:
>> Today the node data is copied in db_fetch() on each data base read in
>> order to avoid accidental data base modifications when working on a
>> node.
>>
>> read_node() is the only caller of db_fetch() which isn't freeing the
>> returned data area immediately after using it. The other callers don't
>> modify the returned data, so they don't need the data to be copied.
>>
>> Move copying of the data into read_node(), resulting in a speedup of
>> the other callers due to no memory allocation and no copying being
>> needed anymore.
>>
>> This allows to let db_fetch() return a pointer to const data.
>>
>> As db_fetch() can't return any error other than ENOENT now, error
>> handling for the callers can be simplified.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V2:
>> - new patch
>> V3:
>> - modify return type of db_fetch() to return a pointer to const
>>    (Julien Grall)
>> - drop stale comment (Julien Grall)
>> - fix transaction handling
>> ---
>>   tools/xenstore/xenstored_core.c        | 45 +++++++++++---------------
>>   tools/xenstore/xenstored_core.h        |  2 +-
>>   tools/xenstore/xenstored_transaction.c | 23 +++++++++----
>>   3 files changed, 35 insertions(+), 35 deletions(-)
>>
>> diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
>> index 9f88914149..1f5f118f1c 100644
>> --- a/tools/xenstore/xenstored_core.c
>> +++ b/tools/xenstore/xenstored_core.c
>> @@ -555,10 +555,9 @@ static void initialize_fds(int *p_sock_pollfd_idx, int 
>> *ptimeout)
>>       }
>>   }
>> -struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
>> +const struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
>>   {
>> -    const struct xs_tdb_record_hdr *hdr;
>> -    struct xs_tdb_record_hdr *p;
>> +    struct xs_tdb_record_hdr *hdr;
> 
> Can't 'hdr' stay const?

Oh yes, I think so.

> 
>>       hdr = hashtable_search(nodes, db_name);
>>       if (!hdr) {
> 
> [...]
> 
>> @@ -388,14 +385,26 @@ static int finalize_transaction(struct connection *conn,
>>           if (i->ta_node) {
>>               hdr = db_fetch(i->trans_name, &size);
>>               if (hdr) {
>> +                /*
>> +                 * Delete transaction entry and write it as
>> +                 * no-TA entry. As we only hold a reference
>> +                 * to the data, increment its ref count, then
>> +                 * delete it from the DB. Now we own it and can
>> +                 * drop the const attribute for changing the
> 
> It is not great, but I understand this is another necessary evil. So I am ok 
> with this cast-away const. It is also well documented.

Thanks. In fact this is a fix resulting from letting db_fetch() return a const
pointer. So the const attribute already paid off.


Juergen
diff mbox series

Patch

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 9f88914149..1f5f118f1c 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -555,10 +555,9 @@  static void initialize_fds(int *p_sock_pollfd_idx, int *ptimeout)
 	}
 }
 
-struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
+const struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
 {
-	const struct xs_tdb_record_hdr *hdr;
-	struct xs_tdb_record_hdr *p;
+	struct xs_tdb_record_hdr *hdr;
 
 	hdr = hashtable_search(nodes, db_name);
 	if (!hdr) {
@@ -569,22 +568,15 @@  struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size)
 	*size = sizeof(*hdr) + hdr->num_perms * sizeof(hdr->perms[0]) +
 		hdr->datalen + hdr->childlen;
 
-	/* Return a copy, avoiding a potential modification in the DB. */
-	p = talloc_memdup(NULL, hdr, *size);
-	if (!p) {
-		errno = ENOMEM;
-		return NULL;
-	}
-
 	trace_tdb("read %s size %zu\n", db_name, *size + strlen(db_name));
 
-	return p;
+	return hdr;
 }
 
 static void get_acc_data(const char *name, struct node_account_data *acc)
 {
 	size_t size;
-	struct xs_tdb_record_hdr *hdr;
+	const struct xs_tdb_record_hdr *hdr;
 
 	if (acc->memory < 0) {
 		hdr = db_fetch(name, &size);
@@ -595,7 +587,6 @@  static void get_acc_data(const char *name, struct node_account_data *acc)
 			acc->memory = size;
 			acc->domid = hdr->perms[0].id;
 		}
-		talloc_free(hdr);
 	}
 }
 
@@ -708,7 +699,7 @@  struct node *read_node(struct connection *conn, const void *ctx,
 		       const char *name)
 {
 	size_t size;
-	struct xs_tdb_record_hdr *hdr;
+	const struct xs_tdb_record_hdr *hdr;
 	struct node *node;
 	const char *db_name;
 	int err;
@@ -729,30 +720,30 @@  struct node *read_node(struct connection *conn, const void *ctx,
 	hdr = db_fetch(db_name, &size);
 
 	if (hdr == NULL) {
-		if (errno == ENOENT) {
-			node->generation = NO_GENERATION;
-			err = access_node(conn, node, NODE_ACCESS_READ, NULL);
-			errno = err ? : ENOENT;
-		} else {
-			log("DB error on read: %s", strerror(errno));
-			errno = EIO;
-		}
+		node->generation = NO_GENERATION;
+		err = access_node(conn, node, NODE_ACCESS_READ, NULL);
+		errno = err ? : ENOENT;
 		goto error;
 	}
 
 	node->parent = NULL;
-	talloc_steal(node, hdr);
 
 	/* Datalen, childlen, number of permissions */
 	node->generation = hdr->generation;
 	node->perms.num = hdr->num_perms;
 	node->datalen = hdr->datalen;
 	node->childlen = hdr->childlen;
-
-	/* Permissions are struct xs_permissions. */
-	node->perms.p = hdr->perms;
-	node->acc.domid = get_node_owner(node);
+	node->acc.domid = hdr->perms[0].id;
 	node->acc.memory = size;
+
+	/* Copy node data to new memory area, starting with permissions. */
+	size -= sizeof(*hdr);
+	node->perms.p = talloc_memdup(node, hdr->perms, size);
+	if (node->perms.p == NULL) {
+		errno = ENOMEM;
+		goto error;
+	}
+
 	if (domain_adjust_node_perms(node))
 		goto error;
 
diff --git a/tools/xenstore/xenstored_core.h b/tools/xenstore/xenstored_core.h
index 1aa3cc0936..6d1578ce97 100644
--- a/tools/xenstore/xenstored_core.h
+++ b/tools/xenstore/xenstored_core.h
@@ -362,7 +362,7 @@  extern xengnttab_handle **xgt_handle;
 int remember_string(struct hashtable *hash, const char *str);
 
 /* Data base access functions. */
-struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size);
+const struct xs_tdb_record_hdr *db_fetch(const char *db_name, size_t *size);
 int db_write(struct connection *conn, const char *db_name, const void *data,
 	     size_t size, struct node_account_data *acc,
 	     enum write_node_mode mode, bool no_quota_check);
diff --git a/tools/xenstore/xenstored_transaction.c b/tools/xenstore/xenstored_transaction.c
index fbcea3663e..a90283dcc5 100644
--- a/tools/xenstore/xenstored_transaction.c
+++ b/tools/xenstore/xenstored_transaction.c
@@ -357,20 +357,17 @@  static int finalize_transaction(struct connection *conn,
 {
 	struct accessed_node *i, *n;
 	size_t size;
-	struct xs_tdb_record_hdr *hdr;
+	const struct xs_tdb_record_hdr *hdr;
 	uint64_t gen;
 
 	list_for_each_entry_safe(i, n, &trans->accessed, list) {
 		if (i->check_gen) {
 			hdr = db_fetch(i->node, &size);
 			if (!hdr) {
-				if (errno != ENOENT)
-					return errno;
 				gen = NO_GENERATION;
 			} else {
 				gen = hdr->generation;
 			}
-			talloc_free(hdr);
 			if (i->generation != gen)
 				return EAGAIN;
 		}
@@ -388,14 +385,26 @@  static int finalize_transaction(struct connection *conn,
 		if (i->ta_node) {
 			hdr = db_fetch(i->trans_name, &size);
 			if (hdr) {
+				/*
+				 * Delete transaction entry and write it as
+				 * no-TA entry. As we only hold a reference
+				 * to the data, increment its ref count, then
+				 * delete it from the DB. Now we own it and can
+				 * drop the const attribute for changing the
+				 * generation count.
+				 */
 				enum write_node_mode mode;
+				struct xs_tdb_record_hdr *own;
 
-				hdr->generation = ++generation;
+				talloc_increase_ref_count(hdr);
+				db_delete(conn, i->trans_name, NULL);
+
+				own = (struct xs_tdb_record_hdr *)hdr;
+				own->generation = ++generation;
 				mode = (i->generation == NO_GENERATION)
 				       ? NODE_CREATE : NODE_MODIFY;
-				*is_corrupt |= db_write(conn, i->node, hdr,
+				*is_corrupt |= db_write(conn, i->node, own,
 							size, NULL, mode, true);
-				db_delete(conn, i->trans_name, NULL);
 			} else {
 				*is_corrupt = true;
 			}