diff mbox

[v3,05/12] xenstore: add support for reading directory with many children

Message ID 1478851210-6251-6-git-send-email-jgross@suse.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jürgen Groß Nov. 11, 2016, 8 a.m. UTC
As the payload size for one xenstore wire command is limited to 4096
bytes it is impossible to read the children names of a node with a
large number of children (e.g. /local/domain in case of a host with
more than about 2000 domains). This effectively limits the maximum
number of domains a host can support.

In order to support such long directory outputs add a new wire command
XS_DIRECTORY_PART which will return only some entries in each call and
can be called in a loop to get all entries.

Input data are the path of the node and the byte offset into the child
list where returned data should start.

Output is the generation count of the node (which will change each time
the node is being modified) and a list of child names starting with
the specified index. The end of the list is indicated by an empty
child name. It is the responsibility of the caller to check for data
consistency by comparing the generation counts of all returned data
sets to be the same for one node.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V3: use genlen, memcpy instead of strcpy as requested by Jan Beulich
    add XS_NEXT_ENTRY to xs_wire.h
    add XS_DIRECTORY_PART to sockmsg_string()

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstore/xenstored_core.c | 67 +++++++++++++++++++++++++++++++++++++++++
 xen/include/public/io/xs_wire.h |  3 ++
 2 files changed, 70 insertions(+)

Comments

Jan Beulich Nov. 11, 2016, 10:09 a.m. UTC | #1
>>> On 11.11.16 at 09:00, <JGross@suse.com> wrote:
> --- a/xen/include/public/io/xs_wire.h
> +++ b/xen/include/public/io/xs_wire.h
> @@ -50,6 +50,9 @@ enum xsd_sockmsg_type
>      XS_SET_TARGET,
>      XS_RESTRICT,
>      XS_RESET_WATCHES,
> +    XS_DIRECTORY_PART,
> +
> +    XS_NEXT_ENTRY,      /* First unused type. */

What is this needed for?

Jan
Jürgen Groß Nov. 11, 2016, 10:43 a.m. UTC | #2
On 11/11/16 11:09, Jan Beulich wrote:
>>>> On 11.11.16 at 09:00, <JGross@suse.com> wrote:
>> --- a/xen/include/public/io/xs_wire.h
>> +++ b/xen/include/public/io/xs_wire.h
>> @@ -50,6 +50,9 @@ enum xsd_sockmsg_type
>>      XS_SET_TARGET,
>>      XS_RESTRICT,
>>      XS_RESET_WATCHES,
>> +    XS_DIRECTORY_PART,
>> +
>> +    XS_NEXT_ENTRY,      /* First unused type. */
> 
> What is this needed for?

Patch 7. I didn't want to modify the same enum twice in the same series.
In case you'd prefer to add XS_NEXT_ENTRY in patch 7 I'd be happy to
move it.


Juergen
Jan Beulich Nov. 11, 2016, 11:12 a.m. UTC | #3
>>> On 11.11.16 at 11:43, <JGross@suse.com> wrote:
> On 11/11/16 11:09, Jan Beulich wrote:
>>>>> On 11.11.16 at 09:00, <JGross@suse.com> wrote:
>>> --- a/xen/include/public/io/xs_wire.h
>>> +++ b/xen/include/public/io/xs_wire.h
>>> @@ -50,6 +50,9 @@ enum xsd_sockmsg_type
>>>      XS_SET_TARGET,
>>>      XS_RESTRICT,
>>>      XS_RESET_WATCHES,
>>> +    XS_DIRECTORY_PART,
>>> +
>>> +    XS_NEXT_ENTRY,      /* First unused type. */
>> 
>> What is this needed for?
> 
> Patch 7. I didn't want to modify the same enum twice in the same series.
> In case you'd prefer to add XS_NEXT_ENTRY in patch 7 I'd be happy to
> move it.

Yes please. Additions should be made where they are needed;
please don't forget that series' may be committed in pieces. Also
judging about the chosen name (which I consider somewhat odd -
I'd have expected something like XS_TYPE_COUNT) is easier when
one sees the intended use.

Jan
Jürgen Groß Nov. 11, 2016, 11:15 a.m. UTC | #4
On 11/11/16 12:12, Jan Beulich wrote:
>>>> On 11.11.16 at 11:43, <JGross@suse.com> wrote:
>> On 11/11/16 11:09, Jan Beulich wrote:
>>>>>> On 11.11.16 at 09:00, <JGross@suse.com> wrote:
>>>> --- a/xen/include/public/io/xs_wire.h
>>>> +++ b/xen/include/public/io/xs_wire.h
>>>> @@ -50,6 +50,9 @@ enum xsd_sockmsg_type
>>>>      XS_SET_TARGET,
>>>>      XS_RESTRICT,
>>>>      XS_RESET_WATCHES,
>>>> +    XS_DIRECTORY_PART,
>>>> +
>>>> +    XS_NEXT_ENTRY,      /* First unused type. */
>>>
>>> What is this needed for?
>>
>> Patch 7. I didn't want to modify the same enum twice in the same series.
>> In case you'd prefer to add XS_NEXT_ENTRY in patch 7 I'd be happy to
>> move it.
> 
> Yes please. Additions should be made where they are needed;
> please don't forget that series' may be committed in pieces. Also
> judging about the chosen name (which I consider somewhat odd -
> I'd have expected something like XS_TYPE_COUNT) is easier when
> one sees the intended use.

Okay to both (moving and renaming to your suggested name).


Juergen
Wei Liu Nov. 12, 2016, 3:11 p.m. UTC | #5
On Fri, Nov 11, 2016 at 09:00:03AM +0100, Juergen Gross wrote:
> As the payload size for one xenstore wire command is limited to 4096
> bytes it is impossible to read the children names of a node with a
> large number of children (e.g. /local/domain in case of a host with
> more than about 2000 domains). This effectively limits the maximum
> number of domains a host can support.
> 
> In order to support such long directory outputs add a new wire command
> XS_DIRECTORY_PART which will return only some entries in each call and
> can be called in a loop to get all entries.
> 
> Input data are the path of the node and the byte offset into the child
> list where returned data should start.
> 
> Output is the generation count of the node (which will change each time
> the node is being modified) and a list of child names starting with
> the specified index. The end of the list is indicated by an empty
> child name. It is the responsibility of the caller to check for data
> consistency by comparing the generation counts of all returned data
> sets to be the same for one node.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

(with Jan's comment addressed)
diff mbox

Patch

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 95d6d7d..e4e09fa 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -16,6 +16,7 @@ 
     along with this program; If not, see <http://www.gnu.org/licenses/>.
 */
 
+#include <inttypes.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <poll.h>
@@ -147,6 +148,7 @@  static char *sockmsg_string(enum xsd_sockmsg_type type)
 	case XS_RESUME: return "RESUME";
 	case XS_SET_TARGET: return "SET_TARGET";
 	case XS_RESET_WATCHES: return "RESET_WATCHES";
+	case XS_DIRECTORY_PART: return "DIRECTORY_PART";
 	default:
 		return "**UNKNOWN**";
 	}
@@ -812,6 +814,67 @@  static void send_directory(struct connection *conn, struct buffered_data *in)
 	send_reply(conn, XS_DIRECTORY, node->children, node->childlen);
 }
 
+static void send_directory_part(struct connection *conn,
+				struct buffered_data *in)
+{
+	unsigned int off, len, maxlen, genlen;
+	char *name, *child, *data;
+	struct node *node;
+	char gen[24];
+
+	if (xs_count_strings(in->buffer, in->used) != 2) {
+		send_error(conn, EINVAL);
+		return;
+	}
+
+	/* First arg is node name. */
+	name = canonicalize(conn, in->buffer);
+
+	/* Second arg is childlist offset. */
+	off = atoi(in->buffer + strlen(in->buffer) + 1);
+
+	node = get_node(conn, in, name, XS_PERM_READ);
+	if (!node) {
+		send_error(conn, errno);
+		return;
+	}
+
+	genlen = snprintf(gen, sizeof(gen), "%"PRIu64, node->generation) + 1;
+
+	/* Offset behind list: just return a list with an empty string. */
+	if (off >= node->childlen) {
+		gen[genlen] = 0;
+		send_reply(conn, XS_DIRECTORY_PART, gen, genlen + 1);
+		return;
+	}
+
+	len = 0;
+	maxlen = XENSTORE_PAYLOAD_MAX - genlen - 1;
+	child = node->children + off;
+
+	while (len + strlen(child) < maxlen) {
+		len += strlen(child) + 1;
+		child += strlen(child) + 1;
+		if (off + len == node->childlen)
+			break;
+	}
+
+	data = talloc_array(in, char, genlen + len + 1);
+	if (!data) {
+		send_error(conn, ENOMEM);
+		return;
+	}
+
+	memcpy(data, gen, genlen);
+	memcpy(data + genlen, node->children + off, len);
+	if (off + len == node->childlen) {
+		data[genlen + len] = 0;
+		len++;
+	}
+
+	send_reply(conn, XS_DIRECTORY_PART, data, genlen + len);
+}
+
 static void do_read(struct connection *conn, struct buffered_data *in)
 {
 	struct node *node;
@@ -1334,6 +1397,10 @@  static void process_message(struct connection *conn, struct buffered_data *in)
 		do_reset_watches(conn, in);
 		break;
 
+	case XS_DIRECTORY_PART:
+		send_directory_part(conn, in);
+		break;
+
 	default:
 		eprintf("Client unknown operation %i", in->hdr.msg.type);
 		send_error(conn, ENOSYS);
diff --git a/xen/include/public/io/xs_wire.h b/xen/include/public/io/xs_wire.h
index 0a0cdbc..9a6f8eb 100644
--- a/xen/include/public/io/xs_wire.h
+++ b/xen/include/public/io/xs_wire.h
@@ -50,6 +50,9 @@  enum xsd_sockmsg_type
     XS_SET_TARGET,
     XS_RESTRICT,
     XS_RESET_WATCHES,
+    XS_DIRECTORY_PART,
+
+    XS_NEXT_ENTRY,      /* First unused type. */
 
     XS_INVALID = 0xffff /* Guaranteed to remain an invalid type */
 };