[1/5] eoie: default to not writing EOIE section
diff mbox series

Message ID 20181120061147.GB144753@google.com
State New
Headers show
Series
  • Avoid confusing messages from new index extensions
Related show

Commit Message

Jonathan Nieder Nov. 20, 2018, 6:11 a.m. UTC
Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
2018-10-10) Git defaults to writing the new EOIE section when writing
out an index file.  Usually that is a good thing because it improves
threaded performance, but when a Git repository is shared with older
versions of Git, it produces a confusing warning:

  $ git status
  ignoring EOIE extension
  HEAD detached at 371ed0defa
  nothing to commit, working tree clean

Let's introduce the new index extension more gently.  First we'll roll
out the new version of Git that understands it, and then once
sufficiently many users are using such a version, we can flip the
default to writing it by default.

Introduce a '[index] recordEndOfIndexEntries' configuration variable
to allow interested users to benefit from this index extension early.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Rebased.  No other change from v1.

As Jonathan pointed out, it would be nice to have tests here.  Ben,
any advice for how I could write some in a followup change?  E.g. does
Derrick Stolee's tracing based testing trick apply here?

 Documentation/config/index.txt |  7 +++++++
 read-cache.c                   | 11 ++++++++++-
 t/t1700-split-index.sh         | 11 +++++++----
 3 files changed, 24 insertions(+), 5 deletions(-)

Comments

Ben Peart Nov. 20, 2018, 1:06 p.m. UTC | #1
On 11/20/2018 1:11 AM, Jonathan Nieder wrote:
> Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
> 2018-10-10) Git defaults to writing the new EOIE section when writing
> out an index file.  Usually that is a good thing because it improves
> threaded performance, but when a Git repository is shared with older
> versions of Git, it produces a confusing warning:
> 
>    $ git status
>    ignoring EOIE extension
>    HEAD detached at 371ed0defa
>    nothing to commit, working tree clean
> 
> Let's introduce the new index extension more gently.  First we'll roll
> out the new version of Git that understands it, and then once
> sufficiently many users are using such a version, we can flip the
> default to writing it by default.
> 
> Introduce a '[index] recordEndOfIndexEntries' configuration variable
> to allow interested users to benefit from this index extension early.
> 
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
> Rebased.  No other change from v1.
> 
> As Jonathan pointed out, it would be nice to have tests here.  Ben,
> any advice for how I could write some in a followup change?  E.g. does
> Derrick Stolee's tracing based testing trick apply here?
> 
>   Documentation/config/index.txt |  7 +++++++
>   read-cache.c                   | 11 ++++++++++-
>   t/t1700-split-index.sh         | 11 +++++++----
>   3 files changed, 24 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt
> index 4b94b6bedc..8e138aba7a 100644
> --- a/Documentation/config/index.txt
> +++ b/Documentation/config/index.txt
> @@ -1,3 +1,10 @@
> +index.recordEndOfIndexEntries::
> +	Specifies whether the index file should include an "End Of Index
> +	Entry" section. This reduces index load time on multiprocessor
> +	machines but produces a message "ignoring EOIE extension" when
> +	reading the index using Git versions before 2.20. Defaults to
> +	'false'.
> +
>   index.threads::
>   	Specifies the number of threads to spawn when loading the index.
>   	This is meant to reduce index load time on multiprocessor machines.
> diff --git a/read-cache.c b/read-cache.c
> index 4ca81286c0..1e9c772603 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2689,6 +2689,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
>   		rollback_lock_file(lockfile);
>   }
>   
> +static int record_eoie(void)
> +{
> +	int val;

I believe you are going to want to initialize val to 0 here as it is on 
the stack so is not guaranteed to be zero.

> +
> +	if (!git_config_get_bool("index.recordendofindexentries", &val))
> +		return val;
> +	return 0;
> +}
> +
>   /*
>    * On success, `tempfile` is closed. If it is the temporary file
>    * of a `struct lock_file`, we will therefore effectively perform
> @@ -2936,7 +2945,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
>   	 * read.  Write it out regardless of the strip_extensions parameter as we need it
>   	 * when loading the shared index.
>   	 */
> -	if (offset) {
> +	if (offset && record_eoie()) {
>   		struct strbuf sb = STRBUF_INIT;
>   
>   		write_eoie_extension(&sb, &eoie_c, offset);
> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
> index 2ac47aa0e4..0cbac64e28 100755
> --- a/t/t1700-split-index.sh
> +++ b/t/t1700-split-index.sh
> @@ -25,14 +25,17 @@ test_expect_success 'enable split index' '
>   	git update-index --split-index &&
>   	test-tool dump-split-index .git/index >actual &&
>   	indexversion=$(test-tool index-version <.git/index) &&
> +
> +	# NEEDSWORK: Stop hard-coding checksums.
>   	if test "$indexversion" = "4"
>   	then
> -		own=3527df833c6c100d3d1d921a9a782d62a8be4b58
> -		base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb
> +		own=432ef4b63f32193984f339431fd50ca796493569
> +		base=508851a7f0dfa8691e9f69c7f055865389012491
>   	else
> -		own=5e9b60117ece18da410ddecc8b8d43766a0e4204
> -		base=4370042739b31cd17a5c5cd6043a77c9a00df113
> +		own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339
> +		base=39d890139ee5356c7ef572216cebcd27aa41f9df
>   	fi &&
> +
>   	cat >expect <<-EOF &&
>   	own $own
>   	base $base
>
SZEDER Gábor Nov. 20, 2018, 1:21 p.m. UTC | #2
On Tue, Nov 20, 2018 at 08:06:16AM -0500, Ben Peart wrote:
> >diff --git a/read-cache.c b/read-cache.c
> >index 4ca81286c0..1e9c772603 100644
> >--- a/read-cache.c
> >+++ b/read-cache.c
> >@@ -2689,6 +2689,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
> >  		rollback_lock_file(lockfile);
> >  }
> >+static int record_eoie(void)
> >+{
> >+	int val;
> 
> I believe you are going to want to initialize val to 0 here as it is on the
> stack so is not guaranteed to be zero.

The git_config_get_bool() call below will initialize it anyway.

> >+
> >+	if (!git_config_get_bool("index.recordendofindexentries", &val))
> >+		return val;
> >+	return 0;
> >+}
Ben Peart Nov. 20, 2018, 3:01 p.m. UTC | #3
On 11/20/2018 1:11 AM, Jonathan Nieder wrote:
> Since 3b1d9e04 (eoie: add End of Index Entry (EOIE) extension,
> 2018-10-10) Git defaults to writing the new EOIE section when writing
> out an index file.  Usually that is a good thing because it improves
> threaded performance, but when a Git repository is shared with older
> versions of Git, it produces a confusing warning:
> 
>    $ git status
>    ignoring EOIE extension
>    HEAD detached at 371ed0defa
>    nothing to commit, working tree clean
> 
> Let's introduce the new index extension more gently.  First we'll roll
> out the new version of Git that understands it, and then once
> sufficiently many users are using such a version, we can flip the
> default to writing it by default.
> 
> Introduce a '[index] recordEndOfIndexEntries' configuration variable
> to allow interested users to benefit from this index extension early.
> 
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
> Rebased.  No other change from v1.
> 
> As Jonathan pointed out, it would be nice to have tests here.  Ben,
> any advice for how I could write some in a followup change?  E.g. does
> Derrick Stolee's tracing based testing trick apply here?
> 

I suppose a 'test-dump-eoie' could be written along the lines of 
test-dump-fsmonitor or test-dump-untracked-cache.  Unlike those, there 
isn't much state to dump other than the existence of the extension and 
the offset.  That could be used to test that the new settings are 
working properly.
Jeff King Nov. 21, 2018, 4:46 p.m. UTC | #4
On Tue, Nov 20, 2018 at 02:21:51PM +0100, SZEDER Gábor wrote:

> On Tue, Nov 20, 2018 at 08:06:16AM -0500, Ben Peart wrote:
> > >diff --git a/read-cache.c b/read-cache.c
> > >index 4ca81286c0..1e9c772603 100644
> > >--- a/read-cache.c
> > >+++ b/read-cache.c
> > >@@ -2689,6 +2689,15 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
> > >  		rollback_lock_file(lockfile);
> > >  }
> > >+static int record_eoie(void)
> > >+{
> > >+	int val;
> > 
> > I believe you are going to want to initialize val to 0 here as it is on the
> > stack so is not guaranteed to be zero.
> 
> The git_config_get_bool() call below will initialize it anyway.

Yes, there are two ways to write this. With a conditional to initialize
and return or to return the default, as we have here:

> > >+	if (!git_config_get_bool("index.recordendofindexentries", &val))
> > >+		return val;
> > >+	return 0;

Or initialize the default ahead of time, and rely on the function not to
modify it when the entry is missing:

  int val = 0;
  git_config_get_bool("index.whatever", &val);
  return val;

I think either is perfectly fine, but since I also had to look at it
twice to make sure it was doing the right thing, I figured it is worth
mentioning as a possible style/convention thing we may want to decide
on.

-Peff
Junio C Hamano Nov. 22, 2018, 12:47 a.m. UTC | #5
Jeff King <peff@peff.net> writes:

> Yes, there are two ways to write this. With a conditional to initialize
> and return or to return the default, as we have here:
>
>> > >+	if (!git_config_get_bool("index.recordendofindexentries", &val))
>> > >+		return val;
>> > >+	return 0;
>
> Or initialize the default ahead of time, and rely on the function not to
> modify it when the entry is missing:
>
>   int val = 0;
>   git_config_get_bool("index.whatever", &val);
>   return val;
>
> I think either is perfectly fine, but since I also had to look at it
> twice to make sure it was doing the right thing, I figured it is worth
> mentioning as a possible style/convention thing we may want to decide
> on.

I too think either is fine, and both rely on the git_config_get_*()
to modify the value return only when it sees that it is set.

I'd choose the latter when the default value is simple, as the
reader does not have to even know what the return value from the
git_config_get_*() function means to follow what is going on.

On the other hand, the former (i.e. the original by Jonathan) is
more flexible, and it makes it possible to write a piece of code,
which computes a default that is expensive to build only when
necessary, in the most natural way.  The readers do need to be aware
of how the functin signals "I didn't get anything" with its return
value, though.

I do not mind standardizing on the latter, though.  A caller with an
expensive default can initialize val to an impossible "sentinel"
value that signals the fact that git_config_get_*() did not get
anything, as long as the type has a natural sentinel (like -1 for a
bool to signal "unset"), and code that comes either immediately
after git_config_get_*() or much much later in the control flow can
check for the sentinel to see if it needs to compute the expensive
default.

Patch
diff mbox series

diff --git a/Documentation/config/index.txt b/Documentation/config/index.txt
index 4b94b6bedc..8e138aba7a 100644
--- a/Documentation/config/index.txt
+++ b/Documentation/config/index.txt
@@ -1,3 +1,10 @@ 
+index.recordEndOfIndexEntries::
+	Specifies whether the index file should include an "End Of Index
+	Entry" section. This reduces index load time on multiprocessor
+	machines but produces a message "ignoring EOIE extension" when
+	reading the index using Git versions before 2.20. Defaults to
+	'false'.
+
 index.threads::
 	Specifies the number of threads to spawn when loading the index.
 	This is meant to reduce index load time on multiprocessor machines.
diff --git a/read-cache.c b/read-cache.c
index 4ca81286c0..1e9c772603 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2689,6 +2689,15 @@  void update_index_if_able(struct index_state *istate, struct lock_file *lockfile
 		rollback_lock_file(lockfile);
 }
 
+static int record_eoie(void)
+{
+	int val;
+
+	if (!git_config_get_bool("index.recordendofindexentries", &val))
+		return val;
+	return 0;
+}
+
 /*
  * On success, `tempfile` is closed. If it is the temporary file
  * of a `struct lock_file`, we will therefore effectively perform
@@ -2936,7 +2945,7 @@  static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
 	 * read.  Write it out regardless of the strip_extensions parameter as we need it
 	 * when loading the shared index.
 	 */
-	if (offset) {
+	if (offset && record_eoie()) {
 		struct strbuf sb = STRBUF_INIT;
 
 		write_eoie_extension(&sb, &eoie_c, offset);
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index 2ac47aa0e4..0cbac64e28 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -25,14 +25,17 @@  test_expect_success 'enable split index' '
 	git update-index --split-index &&
 	test-tool dump-split-index .git/index >actual &&
 	indexversion=$(test-tool index-version <.git/index) &&
+
+	# NEEDSWORK: Stop hard-coding checksums.
 	if test "$indexversion" = "4"
 	then
-		own=3527df833c6c100d3d1d921a9a782d62a8be4b58
-		base=746f7ab2ed44fb839efdfbffcf399d0b113fb4cb
+		own=432ef4b63f32193984f339431fd50ca796493569
+		base=508851a7f0dfa8691e9f69c7f055865389012491
 	else
-		own=5e9b60117ece18da410ddecc8b8d43766a0e4204
-		base=4370042739b31cd17a5c5cd6043a77c9a00df113
+		own=8299b0bcd1ac364e5f1d7768efb62fa2da79a339
+		base=39d890139ee5356c7ef572216cebcd27aa41f9df
 	fi &&
+
 	cat >expect <<-EOF &&
 	own $own
 	base $base