diff mbox series

[v2] builtin/clone: avoid failure with GIT_DEFAULT_HASH

Message ID 20200915015845.4149976-1-sandals@crustytoothpaste.net (mailing list archive)
State Superseded
Headers show
Series [v2] builtin/clone: avoid failure with GIT_DEFAULT_HASH | expand

Commit Message

brian m. carlson Sept. 15, 2020, 1:58 a.m. UTC
If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to
"sha256", then we can end up with a repository where the repository
format version is 0 but the extensions.objectformat key is set to
"sha256".  This is both wrong (the user has a SHA-1 repository) and
nonfunctional (because the extension cannot be used in a v0 repository).

This happens because in a clone, we initially set up the repository, and
then change its algorithm based on what the remote side tells us it's
using.  We've initially set up the repository as SHA-256 in this case,
and then later on reset the repository version without clearing the
extension.

We could just always set the extension in this case, but that would mean
that our SHA-1 repositories weren't compatible with older Git versions,
even though there's no reason why they shouldn't be.  And we also don't
want to initialize the repository as SHA-1 initially, since that means
if we're cloning an empty repository, we'll have failed to honor the
GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a
SHA-256 repository.

Neither of those are appealing, so let's tell the repository
initialization code if we're doing a reinit like this, and if so, to
clear the extension if we're using SHA-1.  This makes sure we produce a
valid and functional repository and doesn't break any of our other use
cases.

Reported-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
Changes since v1:
* Use git_config_set_gently to make things work with SHA-1 repos as well
  as SHA-256 repos.

Diff-intervalle contre v1 :
1:  32d3357460 ! 1:  1becbbbb50 builtin/clone: avoid failure with GIT_DEFAULT_HASH
    @@ builtin/init-db.c: void initialize_repository_version(int hash_algo)
      		git_config_set("extensions.objectformat",
      			       hash_algos[hash_algo].name);
     +	else if (reinit)
    -+		git_config_set("extensions.objectformat", NULL);
    ++		git_config_set_gently("extensions.objectformat", NULL);
      }
      
      static int create_default_files(const char *template_path,

 builtin/clone.c   |  2 +-
 builtin/init-db.c |  6 ++++--
 cache.h           |  2 +-
 t/t5601-clone.sh  | 10 ++++++++++
 4 files changed, 16 insertions(+), 4 deletions(-)

Comments

Junio C Hamano Sept. 15, 2020, 4:31 a.m. UTC | #1
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to
> "sha256", then we can end up with a repository where the repository
> format version is 0 but the extensions.objectformat key is set to
> "sha256".  This is both wrong (the user has a SHA-1 repository) and
> nonfunctional (because the extension cannot be used in a v0 repository).
>
> This happens because in a clone, we initially set up the repository, and
> then change its algorithm based on what the remote side tells us it's
> using.  We've initially set up the repository as SHA-256 in this case,
> and then later on reset the repository version without clearing the
> extension.
>
> We could just always set the extension in this case, but that would mean
> that our SHA-1 repositories weren't compatible with older Git versions,
> even though there's no reason why they shouldn't be.  And we also don't
> want to initialize the repository as SHA-1 initially, since that means
> if we're cloning an empty repository, we'll have failed to honor the
> GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a
> SHA-256 repository.
>
> Neither of those are appealing, so let's tell the repository
> initialization code if we're doing a reinit like this, and if so, to
> clear the extension if we're using SHA-1.  This makes sure we produce a
> valid and functional repository and doesn't break any of our other use
> cases.
>
> Reported-by: Matheus Tavares <matheus.bernardino@usp.br>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> Changes since v1:
> * Use git_config_set_gently to make things work with SHA-1 repos as well
>   as SHA-256 repos.

Hmph, the reason why v1's bug weren't caught was because it was only
tested with GIT_TEST_DEFAULT_HASH=sha256, right?  I am wondering if
adding two new tests that run the same end-user scenario except
for the choice of hash algorithms would be a good way to ensure this
will stay fixed.  Am I mistaken?

Thanks anyway for a quick turnaround.

> Diff-intervalle contre v1 :
> 1:  32d3357460 ! 1:  1becbbbb50 builtin/clone: avoid failure with GIT_DEFAULT_HASH
>     @@ builtin/init-db.c: void initialize_repository_version(int hash_algo)
>       		git_config_set("extensions.objectformat",
>       			       hash_algos[hash_algo].name);
>      +	else if (reinit)
>     -+		git_config_set("extensions.objectformat", NULL);
>     ++		git_config_set_gently("extensions.objectformat", NULL);
>       }
>       
>       static int create_default_files(const char *template_path,
>
>  builtin/clone.c   |  2 +-
>  builtin/init-db.c |  6 ++++--
>  cache.h           |  2 +-
>  t/t5601-clone.sh  | 10 ++++++++++
>  4 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index b087ee40c2..925a2e3dd6 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1235,7 +1235,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		 * Now that we know what algorithm the remote side is using,
>  		 * let's set ours to the same thing.
>  		 */
> -		initialize_repository_version(hash_algo);
> +		initialize_repository_version(hash_algo, 1);
>  		repo_set_hash_algo(the_repository, hash_algo);
>  
>  		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
> diff --git a/builtin/init-db.c b/builtin/init-db.c
> index cd3e760541..01bc648d41 100644
> --- a/builtin/init-db.c
> +++ b/builtin/init-db.c
> @@ -179,7 +179,7 @@ static int needs_work_tree_config(const char *git_dir, const char *work_tree)
>  	return 1;
>  }
>  
> -void initialize_repository_version(int hash_algo)
> +void initialize_repository_version(int hash_algo, int reinit)
>  {
>  	char repo_version_string[10];
>  	int repo_version = GIT_REPO_VERSION;
> @@ -195,6 +195,8 @@ void initialize_repository_version(int hash_algo)
>  	if (hash_algo != GIT_HASH_SHA1)
>  		git_config_set("extensions.objectformat",
>  			       hash_algos[hash_algo].name);
> +	else if (reinit)
> +		git_config_set_gently("extensions.objectformat", NULL);
>  }
>  
>  static int create_default_files(const char *template_path,
> @@ -277,7 +279,7 @@ static int create_default_files(const char *template_path,
>  		free(ref);
>  	}
>  
> -	initialize_repository_version(fmt->hash_algo);
> +	initialize_repository_version(fmt->hash_algo, 0);
>  
>  	/* Check filemode trustability */
>  	path = git_path_buf(&buf, "config");
> diff --git a/cache.h b/cache.h
> index cee8aa5dc3..c0072d43b1 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -629,7 +629,7 @@ int path_inside_repo(const char *prefix, const char *path);
>  int init_db(const char *git_dir, const char *real_git_dir,
>  	    const char *template_dir, int hash_algo,
>  	    const char *initial_branch, unsigned int flags);
> -void initialize_repository_version(int hash_algo);
> +void initialize_repository_version(int hash_algo, int reinit);
>  
>  void sanitize_stdfds(void);
>  int daemonize(void);
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index 15fb64c18d..570d989795 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -631,6 +631,16 @@ test_expect_success CASE_INSENSITIVE_FS 'colliding file detection' '
>  	test_i18ngrep "the following paths have collided" icasefs/warning
>  '
>  
> +test_expect_success 'clone with GIT_DEFAULT_HASH' '
> +	(
> +		sane_unset GIT_DEFAULT_HASH &&
> +		git init test
> +	) &&
> +	test_commit -C test foo &&
> +	git clone test test-clone &&
> +	git -C test-clone status
> +'
> +
>  partial_clone_server () {
>  	       SERVER="$1" &&
>
brian m. carlson Sept. 15, 2020, 10:51 p.m. UTC | #2
On 2020-09-15 at 04:31:14, Junio C Hamano wrote:
> > Changes since v1:
> > * Use git_config_set_gently to make things work with SHA-1 repos as well
> >   as SHA-256 repos.
> 
> Hmph, the reason why v1's bug weren't caught was because it was only
> tested with GIT_TEST_DEFAULT_HASH=sha256, right?  I am wondering if
> adding two new tests that run the same end-user scenario except
> for the choice of hash algorithms would be a good way to ensure this
> will stay fixed.  Am I mistaken?

Sure, I can do that.  I'll try to get a v3 out soon.
diff mbox series

Patch

diff --git a/builtin/clone.c b/builtin/clone.c
index b087ee40c2..925a2e3dd6 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1235,7 +1235,7 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 		 * Now that we know what algorithm the remote side is using,
 		 * let's set ours to the same thing.
 		 */
-		initialize_repository_version(hash_algo);
+		initialize_repository_version(hash_algo, 1);
 		repo_set_hash_algo(the_repository, hash_algo);
 
 		mapped_refs = wanted_peer_refs(refs, &remote->fetch);
diff --git a/builtin/init-db.c b/builtin/init-db.c
index cd3e760541..01bc648d41 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -179,7 +179,7 @@  static int needs_work_tree_config(const char *git_dir, const char *work_tree)
 	return 1;
 }
 
-void initialize_repository_version(int hash_algo)
+void initialize_repository_version(int hash_algo, int reinit)
 {
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;
@@ -195,6 +195,8 @@  void initialize_repository_version(int hash_algo)
 	if (hash_algo != GIT_HASH_SHA1)
 		git_config_set("extensions.objectformat",
 			       hash_algos[hash_algo].name);
+	else if (reinit)
+		git_config_set_gently("extensions.objectformat", NULL);
 }
 
 static int create_default_files(const char *template_path,
@@ -277,7 +279,7 @@  static int create_default_files(const char *template_path,
 		free(ref);
 	}
 
-	initialize_repository_version(fmt->hash_algo);
+	initialize_repository_version(fmt->hash_algo, 0);
 
 	/* Check filemode trustability */
 	path = git_path_buf(&buf, "config");
diff --git a/cache.h b/cache.h
index cee8aa5dc3..c0072d43b1 100644
--- a/cache.h
+++ b/cache.h
@@ -629,7 +629,7 @@  int path_inside_repo(const char *prefix, const char *path);
 int init_db(const char *git_dir, const char *real_git_dir,
 	    const char *template_dir, int hash_algo,
 	    const char *initial_branch, unsigned int flags);
-void initialize_repository_version(int hash_algo);
+void initialize_repository_version(int hash_algo, int reinit);
 
 void sanitize_stdfds(void);
 int daemonize(void);
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 15fb64c18d..570d989795 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -631,6 +631,16 @@  test_expect_success CASE_INSENSITIVE_FS 'colliding file detection' '
 	test_i18ngrep "the following paths have collided" icasefs/warning
 '
 
+test_expect_success 'clone with GIT_DEFAULT_HASH' '
+	(
+		sane_unset GIT_DEFAULT_HASH &&
+		git init test
+	) &&
+	test_commit -C test foo &&
+	git clone test test-clone &&
+	git -C test-clone status
+'
+
 partial_clone_server () {
 	       SERVER="$1" &&