diff mbox series

[3/9] strbuf: provide function to append whole lines

Message ID 4b0a963ea5c47a951b95aa0a03966315b3e8299d.1585129842.git.ps@pks.im (mailing list archive)
State New, archived
Headers show
Series Support for transactions in `git-update-ref --stdin` | expand

Commit Message

Patrick Steinhardt March 25, 2020, 9:53 a.m. UTC
While the strbuf interface already provides functions to read a line
into it that completely replaces its current contents, we do not have an
interface that allows for appending lines without discarding current
contents.

Add a new function `strbuf_appendwholeline` that reads a line including
its terminating character into a strbuf non-destructively. This is a
preparatory step for git-update-ref(1) reading standard input line-wise
instead of as a block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 strbuf.c | 10 ++++++++++
 strbuf.h |  6 ++++++
 2 files changed, 16 insertions(+)

Comments

Junio C Hamano March 27, 2020, 9:04 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> While the strbuf interface already provides functions to read a line
> into it that completely replaces its current contents, we do not have an
> interface that allows for appending lines without discarding current
> contents.
>
> Add a new function `strbuf_appendwholeline` that reads a line including
> its terminating character into a strbuf non-destructively. This is a
> preparatory step for git-update-ref(1) reading standard input line-wise
> instead of as a block.

Hmph.  If we were to gain more uses of this function over time, the
necessity for an extra strbuf_addbuf() becomes annoying.  I wonder
if we should be doing this patch the other way around, i.e. move the
whole implementation, except for the early

	if (feof(fp))
		return EOF;
	strbuf_reset(sb);

part, and call it strbuf_addwholeline(), and strbuf_getwholeline()
becomes a thin wrapper around it that resets the incoming buffer
before calling strbuf_addwholeline().  The logic to determine EOF
would need to be updated if we go that route (i.e. instead of
checking sb->len is zero in the !HAVE_GETDELIM side of the
implementation, we need to see if the number is the same as before)
but it should be minor.

There is a stale comment in the HAVE_GETDELIM side of the curent
implementation, by the way.  I think our xrealloc no longer tries
to "recover" from ENOMEM.

Having said all that, I am fine with this version, at least for now.

> +int strbuf_appendwholeline(struct strbuf *sb, FILE *fp, int term)
> +{
> +	struct strbuf line = STRBUF_INIT;
> +	if (strbuf_getwholeline(&line, fp, term))
> +		return EOF;

So, if caller wants to read the stream until it runs out, it can do

	strbuf_init(&sb);
	while (strbuf_appendwholeline(&sb, fp, '\n') != EOF)
		; /* keep going */

If the caller knows how many lines to read, EOF-return can be used
only for error checking, e.g.

	strbuf_init(&sb);
	for (count = 0; count < 5; count++)
		if (strbuf_appendwholeline(&sb, fp, '\n') == EOF)
			die("cannot grab 5 lines");

It becomes cumbersome if the input lines are terminated, though.
The caller wouldn't be using strbuf_appendwholeline() interface,
e.g.

	strbuf_init(&accum);
	while ((strbuf_getwholeline(&sb, fp, '\n') != EOF) &&
               strcmp(sb.buf, "done\n"))
		strbuf_addbuf(&accum, &sb);

Anyway, I was primarily wondering if returning EOF, which perfectly
makes sense for getwholeline(), still makes sense for addwholeline(),
and it seems that it is still useful.

> +	strbuf_addbuf(sb, &line);
> +	strbuf_release(&line);
> +	return 0;
> +}
> +
>  static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term)
>  {
>  	if (strbuf_getwholeline(sb, fp, term))
> diff --git a/strbuf.h b/strbuf.h
> index ce8e49c0b2..411063ca76 100644
> --- a/strbuf.h
> +++ b/strbuf.h
> @@ -502,6 +502,12 @@ int strbuf_getline(struct strbuf *sb, FILE *file);
>   */
>  int strbuf_getwholeline(struct strbuf *sb, FILE *file, int term);
>  
> +/**
> + * Like `strbuf_getwholeline`, but appends the line instead of
> + * resetting the buffer first.
> + */
> +int strbuf_appendwholeline(struct strbuf *sb, FILE *file, int term);
> +
>  /**
>   * Like `strbuf_getwholeline`, but operates on a file descriptor.
>   * It reads one character at a time, so it is very slow.  Do not
Patrick Steinhardt March 30, 2020, 1:25 p.m. UTC | #2
On Fri, Mar 27, 2020 at 02:04:18PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > While the strbuf interface already provides functions to read a line
> > into it that completely replaces its current contents, we do not have an
> > interface that allows for appending lines without discarding current
> > contents.
> >
> > Add a new function `strbuf_appendwholeline` that reads a line including
> > its terminating character into a strbuf non-destructively. This is a
> > preparatory step for git-update-ref(1) reading standard input line-wise
> > instead of as a block.
> 
> Hmph.  If we were to gain more uses of this function over time, the
> necessity for an extra strbuf_addbuf() becomes annoying.  I wonder
> if we should be doing this patch the other way around, i.e. move the
> whole implementation, except for the early
> 
> 	if (feof(fp))
> 		return EOF;
> 	strbuf_reset(sb);
> 
> part, and call it strbuf_addwholeline(), and strbuf_getwholeline()
> becomes a thin wrapper around it that resets the incoming buffer
> before calling strbuf_addwholeline().  The logic to determine EOF
> would need to be updated if we go that route (i.e. instead of
> checking sb->len is zero in the !HAVE_GETDELIM side of the
> implementation, we need to see if the number is the same as before)
> but it should be minor.

Unfortunately it's not that easy for the HAVE_GETDELIM case, though, as
we cannot call getdelim(3P) to append to an already populated buffer. So
we'd have to call getdelim(3P) either on a NULL line and append manually
or on an empty line in case the strbuf doesn't have any contents. While
doable, I wanted to keep out this additional complexity for now.

Patrick
Junio C Hamano March 30, 2020, 5:12 p.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> Unfortunately it's not that easy for the HAVE_GETDELIM case, though, as
> we cannot call getdelim(3P) to append to an already populated buffer. So
> we'd have to call getdelim(3P) either on a NULL line and append manually
> or on an empty line in case the strbuf doesn't have any contents. While
> doable, I wanted to keep out this additional complexity for now.

Ahh, I see.  Thanks for a clarification.
diff mbox series

Patch

diff --git a/strbuf.c b/strbuf.c
index bb0065ccaf..6e74901bfa 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -690,6 +690,16 @@  int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 }
 #endif
 
+int strbuf_appendwholeline(struct strbuf *sb, FILE *fp, int term)
+{
+	struct strbuf line = STRBUF_INIT;
+	if (strbuf_getwholeline(&line, fp, term))
+		return EOF;
+	strbuf_addbuf(sb, &line);
+	strbuf_release(&line);
+	return 0;
+}
+
 static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term)
 {
 	if (strbuf_getwholeline(sb, fp, term))
diff --git a/strbuf.h b/strbuf.h
index ce8e49c0b2..411063ca76 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -502,6 +502,12 @@  int strbuf_getline(struct strbuf *sb, FILE *file);
  */
 int strbuf_getwholeline(struct strbuf *sb, FILE *file, int term);
 
+/**
+ * Like `strbuf_getwholeline`, but appends the line instead of
+ * resetting the buffer first.
+ */
+int strbuf_appendwholeline(struct strbuf *sb, FILE *file, int term);
+
 /**
  * Like `strbuf_getwholeline`, but operates on a file descriptor.
  * It reads one character at a time, so it is very slow.  Do not