diff mbox series

[v1,1/3] read-cache: add post-indexchanged hook

Message ID 20190208195115.12156-2-peartben@gmail.com (mailing list archive)
State New, archived
Headers show
Series Add post-indexchanged hook | expand

Commit Message

Ben Peart Feb. 8, 2019, 7:51 p.m. UTC
From: Ben Peart <benpeart@microsoft.com>

Add a post-indexchanged hook that is invoked after the index is written in
do_write_locked_index().

This hook is meant primarily for notification, and cannot affect
the outcome of git commands that trigger the index write.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
---
 builtin/reset.c        |  1 +
 builtin/update-index.c |  2 ++
 cache.h                |  4 +++-
 read-cache.c           | 14 ++++++++++++--
 unpack-trees.c         |  2 ++
 5 files changed, 20 insertions(+), 3 deletions(-)

Comments

brian m. carlson Feb. 8, 2019, 11:53 p.m. UTC | #1
On Fri, Feb 08, 2019 at 02:51:13PM -0500, Ben Peart wrote:
> From: Ben Peart <benpeart@microsoft.com>
> 
> Add a post-indexchanged hook that is invoked after the index is written in
> do_write_locked_index().
> 
> This hook is meant primarily for notification, and cannot affect
> the outcome of git commands that trigger the index write.
> 
> Signed-off-by: Ben Peart <benpeart@microsoft.com>

First, I think the tests should be merged into this commit. That's what
we typically do.

I'm also going to bikeshed slightly and suggest "post-index-changed",
since we normally use dashes between words in our hook names.

> diff --git a/cache.h b/cache.h
> index 27fe635f62..46eb862d3e 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -338,7 +338,9 @@ struct index_state {
>  	struct cache_time timestamp;
>  	unsigned name_hash_initialized : 1,
>  		 initialized : 1,
> -		 drop_cache_tree : 1;
> +		 drop_cache_tree : 1,
> +		 updated_workdir : 1,
> +		 updated_skipworktree : 1;

How important is it that we expose whether the skip-worktree bit is
changed? I can understand if we expose the workdir is updated, since
that's a thing a general user of this hook is likely to be interested
in. However, I'm not sure that for a general-purpose hook, the
skip-worktree bit is interesting.

> diff --git a/read-cache.c b/read-cache.c
> index 0e0c93edc9..0fcfa8a075 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -17,6 +17,7 @@
>  #include "commit.h"
>  #include "blob.h"
>  #include "resolve-undo.h"
> +#include "run-command.h"
>  #include "strbuf.h"
>  #include "varint.h"
>  #include "split-index.h"
> @@ -2999,8 +3000,17 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
>  	if (ret)
>  		return ret;
>  	if (flags & COMMIT_LOCK)
> -		return commit_locked_index(lock);
> -	return close_lock_file_gently(lock);
> +		ret = commit_locked_index(lock);
> +	else
> +		ret = close_lock_file_gently(lock);
> +
> +	run_hook_le(NULL, "post-indexchanged",
> +			istate->updated_workdir ? "1" : "0",
> +			istate->updated_skipworktree ? "1" : "0", NULL);

I have, in general, some concerns about this API. First, I think we need
to consider that if we're going to expose various bits of information,
we might in the future want to expose more such bits. If so, adding
integer parameters is not likely to be a good way to do this. It's hard
to remember and if a binary is used as the hook, it may not always
handle additional arguments gracefully like shell scripts tend to.

If we're not going to expose the skip-worktree bit, then I suppose one
argument is fine. Otherwise, it might be better to expose key-value
pairs on stdin instead, or something like that.

Finally, I have questions about performance. What's the overhead of
determining whether the hook exists in this code path when there isn't
one? Since the index is frequently used, and can be written out as an
optimization by some commands, it would be nice to keep overhead low if
the hook isn't present.
Ben Peart Feb. 12, 2019, 5:39 p.m. UTC | #2
On 2/8/2019 6:53 PM, brian m. carlson wrote:
> On Fri, Feb 08, 2019 at 02:51:13PM -0500, Ben Peart wrote:
>> From: Ben Peart <benpeart@microsoft.com>
>>
>> Add a post-indexchanged hook that is invoked after the index is written in
>> do_write_locked_index().
>>
>> This hook is meant primarily for notification, and cannot affect
>> the outcome of git commands that trigger the index write.
>>
>> Signed-off-by: Ben Peart <benpeart@microsoft.com>
> 
> First, I think the tests should be merged into this commit. That's what
> we typically do.

Happy to.  In fact, I'd be happy to add the documentation as well and 
have a single commit. That's what _I'd_ typically do for something small 
like this. :)

> 
> I'm also going to bikeshed slightly and suggest "post-index-changed",
> since we normally use dashes between words in our hook names.
> 

I can do that as well to help make it more consistent.

>> diff --git a/cache.h b/cache.h
>> index 27fe635f62..46eb862d3e 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -338,7 +338,9 @@ struct index_state {
>>   	struct cache_time timestamp;
>>   	unsigned name_hash_initialized : 1,
>>   		 initialized : 1,
>> -		 drop_cache_tree : 1;
>> +		 drop_cache_tree : 1,
>> +		 updated_workdir : 1,
>> +		 updated_skipworktree : 1;
> 
> How important is it that we expose whether the skip-worktree bit is
> changed? I can understand if we expose the workdir is updated, since
> that's a thing a general user of this hook is likely to be interested
> in. However, I'm not sure that for a general-purpose hook, the
> skip-worktree bit is interesting.
> 

In our use case, we needed the skip-worktree flag because if something 
clears the skip-worktree bit on a file, we need to start paying 
attention to it in the work directory.  This flag tells us that may have 
happened and enables us to not have to do the extra work for other index 
changed events that don't change the index without updating the working 
directory.

Initially this was just to deal with 'reset --mixed' as it behaves 
differently with regards to updating the index and working directory 
than most other commands.  However, the update-index command can also 
arbitrarily clear the skip-worktree bit so we renamed the flag to be 
more generic.

>> diff --git a/read-cache.c b/read-cache.c
>> index 0e0c93edc9..0fcfa8a075 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -17,6 +17,7 @@
>>   #include "commit.h"
>>   #include "blob.h"
>>   #include "resolve-undo.h"
>> +#include "run-command.h"
>>   #include "strbuf.h"
>>   #include "varint.h"
>>   #include "split-index.h"
>> @@ -2999,8 +3000,17 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
>>   	if (ret)
>>   		return ret;
>>   	if (flags & COMMIT_LOCK)
>> -		return commit_locked_index(lock);
>> -	return close_lock_file_gently(lock);
>> +		ret = commit_locked_index(lock);
>> +	else
>> +		ret = close_lock_file_gently(lock);
>> +
>> +	run_hook_le(NULL, "post-indexchanged",
>> +			istate->updated_workdir ? "1" : "0",
>> +			istate->updated_skipworktree ? "1" : "0", NULL);
> 
> I have, in general, some concerns about this API. First, I think we need
> to consider that if we're going to expose various bits of information,
> we might in the future want to expose more such bits. If so, adding
> integer parameters is not likely to be a good way to do this. It's hard
> to remember and if a binary is used as the hook, it may not always
> handle additional arguments gracefully like shell scripts tend to.
> 

Binaries deal with a variable number of arguments all the time via `int 
argc, const char **argv` so this isn't a problem (we actually use a 
binary for this hook already).

> If we're not going to expose the skip-worktree bit, then I suppose one
> argument is fine. Otherwise, it might be better to expose key-value
> pairs on stdin instead, or something like that.
> 

I'm not sure what else we may want to add in the future; this is all 
we've needed for our uses.  For now, I'd suggest we keep it simple and 
just pass them as command line parameters like we do with the other 
hooks.  It's easy to add additional arguments in the future and if we 
ever get to where that is unwieldy, we can address it then (YAGNI).

> Finally, I have questions about performance. What's the overhead of
> determining whether the hook exists in this code path when there isn't
> one? Since the index is frequently used, and can be written out as an
> optimization by some commands, it would be nice to keep overhead low if
> the hook isn't present.
> 

If you ever hit this code path, we've just updated the index which means 
we read the index file, did an lstat() on every file in the repo plus 
various refs, config files, etc, and then wrote out a new index file. 
Adding one more test for a hooks existence doesn't have any measurable 
impact.

Thank you for the feedback!
diff mbox series

Patch

diff --git a/builtin/reset.c b/builtin/reset.c
index 4d18a461fa..e173afcaac 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -380,6 +380,7 @@  int cmd_reset(int argc, const char **argv, const char *prefix)
 			int flags = quiet ? REFRESH_QUIET : REFRESH_IN_PORCELAIN;
 			if (read_from_tree(&pathspec, &oid, intent_to_add))
 				return 1;
+			the_index.updated_skipworktree = 1;
 			if (!quiet && get_git_work_tree()) {
 				uint64_t t_begin, t_delta_in_ms;
 
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 02ace602b9..cf731640fa 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1071,6 +1071,8 @@  int cmd_update_index(int argc, const char **argv, const char *prefix)
 	if (entries < 0)
 		die("cache corrupted");
 
+	the_index.updated_skipworktree = 1;
+
 	/*
 	 * Custom copy of parse_options() because we want to handle
 	 * filename arguments as they come.
diff --git a/cache.h b/cache.h
index 27fe635f62..46eb862d3e 100644
--- a/cache.h
+++ b/cache.h
@@ -338,7 +338,9 @@  struct index_state {
 	struct cache_time timestamp;
 	unsigned name_hash_initialized : 1,
 		 initialized : 1,
-		 drop_cache_tree : 1;
+		 drop_cache_tree : 1,
+		 updated_workdir : 1,
+		 updated_skipworktree : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	struct object_id oid;
diff --git a/read-cache.c b/read-cache.c
index 0e0c93edc9..0fcfa8a075 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -17,6 +17,7 @@ 
 #include "commit.h"
 #include "blob.h"
 #include "resolve-undo.h"
+#include "run-command.h"
 #include "strbuf.h"
 #include "varint.h"
 #include "split-index.h"
@@ -2999,8 +3000,17 @@  static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 	if (ret)
 		return ret;
 	if (flags & COMMIT_LOCK)
-		return commit_locked_index(lock);
-	return close_lock_file_gently(lock);
+		ret = commit_locked_index(lock);
+	else
+		ret = close_lock_file_gently(lock);
+
+	run_hook_le(NULL, "post-indexchanged",
+			istate->updated_workdir ? "1" : "0",
+			istate->updated_skipworktree ? "1" : "0", NULL);
+	istate->updated_workdir = 0;
+	istate->updated_skipworktree = 0;
+
+	return ret;
 }
 
 static int write_split_index(struct index_state *istate,
diff --git a/unpack-trees.c b/unpack-trees.c
index 3563daae1a..8665a4a7c0 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1637,6 +1637,8 @@  int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
+
+		o->result.updated_workdir = 1;
 		discard_index(o->dst_index);
 		*o->dst_index = o->result;
 	} else {