diff mbox series

[on,master,v2] revision: use commit graph in get_reference()

Message ID 20181207215034.213211-1-jonathantanmy@google.com (mailing list archive)
State New, archived
Headers show
Series [on,master,v2] revision: use commit graph in get_reference() | expand

Commit Message

Jonathan Tan Dec. 7, 2018, 9:50 p.m. UTC
When fetching into a repository, a connectivity check is first made by
check_exist_and_connected() in builtin/fetch.c that runs:

  git rev-list --objects --stdin --not --all --quiet <(list of objects)

If the client repository has many refs, this command can be slow,
regardless of the nature of the server repository or what is being
fetched. A profiler reveals that most of the time is spent in
setup_revisions() (approx. 60/63), and of the time spent in
setup_revisions(), most of it is spent in parse_object() (approx.
49/60). This is because setup_revisions() parses the target of every ref
(from "--all"), and parse_object() reads the buffer of the object.

Reading the buffer is unnecessary if the repository has a commit graph
and if the ref points to a commit (which is typically the case). This
patch uses the commit graph wherever possible; on my computer, when I
run the above command with a list of 1 object on a many-ref repository,
I get a speedup from 1.8s to 1.0s.

Another way to accomplish this effect would be to modify parse_object()
to use the commit graph if possible; however, I did not want to change
parse_object()'s current behavior of always checking the object
signature of the returned object.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
This patch is now on master.

v2 makes use of the optimization Stolee describes in [1], except that I
have arranged the functions slightly differently. In particular, I
didn't want to add even more ways to obtain objects, so I let
parse_commit_in_graph() be able to take in either a commit shell or an
OID, and did not create the parse_probably_commit() function he
suggested. But I'm not really attached to this design choice, and can
change it if requested.

[1] https://public-inbox.org/git/aa0cd481-c135-47aa-2a69-e3dc71661caa@gmail.com/
---
 commit-graph.c             | 38 ++++++++++++++++++++++++++++----------
 commit-graph.h             | 12 ++++++++----
 commit.c                   |  2 +-
 revision.c                 |  5 ++++-
 t/helper/test-repository.c |  4 ++--
 5 files changed, 43 insertions(+), 18 deletions(-)

Comments

Junio C Hamano Dec. 9, 2018, 12:51 a.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
> ...
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.

Sounds good.

> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This patch is now on master.

OK.  

Obviously that won't apply to the base for v1 without conflicts, and
it of course applies cleanly on 'master', but the result of doing so
will cause the same conflicts when sb/more-repo-in-api is merged on
top, which means that the same conflicts need to be resolved if this
wants to be merged to 'next' (or 'pu', FWIW).

> diff --git a/commit-graph.c b/commit-graph.c
> index 40c855f185..a571b523b7 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -374,24 +375,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
>  	}
>  }
>  
> -static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
> +static struct commit *parse_commit_in_graph_one(struct repository *r,
> +						struct commit_graph *g,
> +						struct commit *shell,
> +						const struct object_id *oid)

Now the complexity of the behaviour of this function deserves to be
documented in a comment in front.  Let me see if I can get it
correctly without such a comment by explaining the function aloud.

The caller may or may not have already obtained an in-core commit
object for a given object name, so shell could be NULL but otherwise
it could be used for optimization.  When shell==NULL, the function
looks up the commit object using the oid parameter instead.  The
returned in-core commit has the parents etc. filled as if we ran
parse_commit() on it.  If the commit is not yet in the graph, the
caller may get a NULL even if the commit exists.

>  {
>  	uint32_t pos;
>  
> -	if (item->object.parsed)
> -		return 1;
> +	if (shell && shell->object.parsed)
> +		return shell;
>  
> -	if (find_commit_in_graph(item, g, &pos))
> -		return fill_commit_in_graph(item, g, pos);
> +	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
> +		pos = shell->graph_pos;
> +	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
> +		/* bsearch_graph sets pos */

Please spell an empty statement like so:

		; /* comment */

> +	} else {
> +		return NULL;

We come here when the commit (either came from shell or from oid) is
not found by bsearch_graph().  "Is the caller prepared for it, and
how?" is a natural question a reader would have.  Let's read on.

> +	}
>  
> -	return 0;
> +	if (!shell) {
> +		shell = lookup_commit(r, oid);
> +		if (!shell)
> +			return NULL;
> +	}
> +
> +	fill_commit_in_graph(shell, g, pos);
> +	return shell;
>  }
>  
> -int parse_commit_in_graph(struct repository *r, struct commit *item)
> +struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
> +				     const struct object_id *oid)
>  {
>  	if (!prepare_commit_graph(r))
>  		return 0;
> -	return parse_commit_in_graph_one(r->objects->commit_graph, item);
> +	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
> +					 oid);
>  }
>  
>  void load_commit_graph_info(struct repository *r, struct commit *item)
> @@ -1025,7 +1043,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
>  		}
>  
>  		graph_commit = lookup_commit(r, &cur_oid);
> -		if (!parse_commit_in_graph_one(g, graph_commit))
> +		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
>  			graph_report("failed to parse %s from commit-graph",
>  				     oid_to_hex(&cur_oid));
>  	}
> diff --git a/commit-graph.h b/commit-graph.h
> index 9db40b4d3a..8b7b5985dc 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -13,16 +13,20 @@ struct commit;
>  char *get_commit_graph_filename(const char *obj_dir);
>  
>  /*
> - * Given a commit struct, try to fill the commit struct info, including:
> + * If the given commit (identified by shell->object.oid or oid) is in the
> + * commit graph, returns a commit struct (reusing shell if it is not NULL)
> + * including the following info:
>   *  1. tree object
>   *  2. date
>   *  3. parents.
>   *
> - * Returns 1 if and only if the commit was found in the packed graph.
> + * If not, returns NULL. See parse_commit_buffer() for the fallback after this
> + * call.
>   *
> - * See parse_commit_buffer() for the fallback after this call.
> + * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
>   */

OK, the eventual caller is the caller of this thing, which should
have been prepared to see NULL for a commit that is too new.  So
that should be OK.

> -int parse_commit_in_graph(struct repository *r, struct commit *item);
> +struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
> +				     const struct object_id *oid);
>  
>  /*
>   * It is possible that we loaded commit contents from the commit buffer,
> diff --git a/commit.c b/commit.c
> index d13a7bc374..88eb580c5a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -456,7 +456,7 @@ int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
>  		return -1;
>  	if (item->object.parsed)
>  		return 0;
> -	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
> +	if (use_commit_graph && parse_commit_in_graph(the_repository, item, NULL))
>  		return 0;
>  	buffer = read_object_file(&item->object.oid, &type, &size);
>  	if (!buffer)
> diff --git a/revision.c b/revision.c
> index 13e0519c02..05fddb5880 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -213,7 +213,10 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  {
>  	struct object *object;
>  
> -	object = parse_object(revs->repo, oid);
> +	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
> +	if (!object)
> +		object = parse_object(revs->repo, oid);

OK and this is such a caller.  I think a general rule of thumb is
that we need to access recent history a lot more often than the
older part of the history, and having to fall back for more recent
commits feels a bit disturbing, but I do not see an easy way to
reverse the performance characteristics offhand.
Junio C Hamano Dec. 9, 2018, 1:49 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> When fetching into a repository, a connectivity check is first made by
>> check_exist_and_connected() in builtin/fetch.c that runs:
>> ...
>> Another way to accomplish this effect would be to modify parse_object()
>> to use the commit graph if possible; however, I did not want to change
>> parse_object()'s current behavior of always checking the object
>> signature of the returned object.
>
> Sounds good.
>
>> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
>> ---
>> This patch is now on master.
>
> OK.  
>
> Obviously that won't apply to the base for v1 without conflicts, and
> it of course applies cleanly on 'master', but the result of doing so
> will cause the same conflicts when sb/more-repo-in-api is merged on
> top, which means that the same conflicts need to be resolved if this
> wants to be merged to 'next' (or 'pu', FWIW).

So,... as I had to do the reverse rebase anyway, here is the
difference since the previous round, which I came up with by
comparing these two:

 (A) merge 'sb/more-repo-in-api' to 'master' and then merge v1 of
     this topic to the result.

 (B) apply your patch to 'master', and then merge
     'sb/more-repo-in-api' to the result.

diff --git a/commit-graph.c b/commit-graph.c
index f78a8e96b5..74a17789f8 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -286,7 +286,8 @@ void close_commit_graph(struct repository *r)
 	r->objects->commit_graph = NULL;
 }
 
-static int bsearch_graph(struct commit_graph *g, struct object_id *oid, uint32_t *pos)
+static int bsearch_graph(struct commit_graph *g, const struct object_id *oid,
+			 uint32_t *pos)
 {
 	return bsearch_hash(oid->hash, g->chunk_oid_fanout,
 			    g->chunk_oid_lookup, g->hash_len, pos);
@@ -377,26 +378,41 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 	}
 }
 
-static int parse_commit_in_graph_one(struct repository *r,
-				     struct commit_graph *g,
-				     struct commit *item)
+static struct commit *parse_commit_in_graph_one(struct repository *r,
+						struct commit_graph *g,
+						struct commit *shell,
+						const struct object_id *oid)
 {
 	uint32_t pos;
 
-	if (item->object.parsed)
-		return 1;
+	if (shell && shell->object.parsed)
+		return shell;
 
-	if (find_commit_in_graph(item, g, &pos))
-		return fill_commit_in_graph(r, item, g, pos);
+	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
+		pos = shell->graph_pos;
+	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
+		/* bsearch_graph sets pos */
+	} else {
+		return NULL;
+	}
 
-	return 0;
+	if (!shell) {
+		shell = lookup_commit(r, oid);
+		if (!shell)
+			return NULL;
+	}
+
+	fill_commit_in_graph(r, shell, g, pos);
+	return shell;
 }
 
-int parse_commit_in_graph(struct repository *r, struct commit *item)
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid)
 {
 	if (!prepare_commit_graph(r))
 		return 0;
-	return parse_commit_in_graph_one(r, r->objects->commit_graph, item);
+	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
+					 oid);
 }
 
 void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1033,7 +1049,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 		}
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		if (!parse_commit_in_graph_one(r, g, graph_commit))
+		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
 			graph_report("failed to parse %s from commit-graph",
 				     oid_to_hex(&cur_oid));
 	}
diff --git a/commit-graph.h b/commit-graph.h
index 9db40b4d3a..8b7b5985dc 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -13,16 +13,20 @@ struct commit;
 char *get_commit_graph_filename(const char *obj_dir);
 
 /*
- * Given a commit struct, try to fill the commit struct info, including:
+ * If the given commit (identified by shell->object.oid or oid) is in the
+ * commit graph, returns a commit struct (reusing shell if it is not NULL)
+ * including the following info:
  *  1. tree object
  *  2. date
  *  3. parents.
  *
- * Returns 1 if and only if the commit was found in the packed graph.
+ * If not, returns NULL. See parse_commit_buffer() for the fallback after this
+ * call.
  *
- * See parse_commit_buffer() for the fallback after this call.
+ * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
  */
-int parse_commit_in_graph(struct repository *r, struct commit *item);
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid);
 
 /*
  * It is possible that we loaded commit contents from the commit buffer,
diff --git a/commit.c b/commit.c
index a5333c7ac6..da7a1d3262 100644
--- a/commit.c
+++ b/commit.c
@@ -462,7 +462,7 @@ int repo_parse_commit_internal(struct repository *r,
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(r, item))
+	if (use_commit_graph && parse_commit_in_graph(r, item, NULL))
 		return 0;
 	buffer = repo_read_object_file(r, &item->object.oid, &type, &size);
 	if (!buffer)
diff --git a/revision.c b/revision.c
index 22aa109c14..05fddb5880 100644
--- a/revision.c
+++ b/revision.c
@@ -213,19 +213,9 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 {
 	struct object *object;
 
-	/*
-	 * If the repository has commit graphs, repo_parse_commit() avoids
-	 * reading the object buffer, so use it whenever possible.
-	 */
-	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
-		struct commit *c = lookup_commit(revs->repo, oid);
-		if (!repo_parse_commit(revs->repo, c))
-			object = (struct object *) c;
-		else
-			object = NULL;
-	} else {
+	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
+	if (!object)
 		object = parse_object(revs->repo, oid);
-	}
 
 	if (!object) {
 		if (revs->ignore_missing)
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index f7f8618445..689a0b652e 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -27,7 +27,7 @@ static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
 
 	c = lookup_commit(&r, commit_oid);
 
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 
 	printf("%"PRItime, c->date);
@@ -62,7 +62,7 @@ static void test_get_commit_tree_in_graph(const char *gitdir,
 	 * get_commit_tree_in_graph does not automatically parse the commit, so
 	 * parse it first.
 	 */
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 	tree = get_commit_tree_in_graph(&r, c);
 	if (!tree)
Jeff King Dec. 11, 2018, 10:54 a.m. UTC | #3
On Sun, Dec 09, 2018 at 09:51:28AM +0900, Junio C Hamano wrote:

> > -static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
> > +static struct commit *parse_commit_in_graph_one(struct repository *r,
> > +						struct commit_graph *g,
> > +						struct commit *shell,
> > +						const struct object_id *oid)
> 
> Now the complexity of the behaviour of this function deserves to be
> documented in a comment in front.  Let me see if I can get it
> correctly without such a comment by explaining the function aloud.
> 
> The caller may or may not have already obtained an in-core commit
> object for a given object name, so shell could be NULL but otherwise
> it could be used for optimization.  When shell==NULL, the function
> looks up the commit object using the oid parameter instead.  The
> returned in-core commit has the parents etc. filled as if we ran
> parse_commit() on it.  If the commit is not yet in the graph, the
> caller may get a NULL even if the commit exists.

Yeah, this was the part that took me a bit to figure out, as well. The
optimization here is really just avoiding a call to lookup_commit(),
which will do a single hash-table lookup. I wonder if that's actually
worth this more complex interface (as opposed to just always taking an
oid and then always returning a "struct commit", which could be old or
new).

-Peff
Jonathan Tan Dec. 12, 2018, 7:58 p.m. UTC | #4
> On Sun, Dec 09, 2018 at 09:51:28AM +0900, Junio C Hamano wrote:
> 
> > > -static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
> > > +static struct commit *parse_commit_in_graph_one(struct repository *r,
> > > +						struct commit_graph *g,
> > > +						struct commit *shell,
> > > +						const struct object_id *oid)
> > 
> > Now the complexity of the behaviour of this function deserves to be
> > documented in a comment in front.  Let me see if I can get it
> > correctly without such a comment by explaining the function aloud.
> > 
> > The caller may or may not have already obtained an in-core commit
> > object for a given object name, so shell could be NULL but otherwise
> > it could be used for optimization.  When shell==NULL, the function
> > looks up the commit object using the oid parameter instead.  The
> > returned in-core commit has the parents etc. filled as if we ran
> > parse_commit() on it.  If the commit is not yet in the graph, the
> > caller may get a NULL even if the commit exists.

In the next revision, I'll unify parse_commit_in_graph_one() (quoted
above) with parse_commit_in_graph(), so that the comment I wrote for the
latter can cover the entire functionality. I think the comment covers
the details that you outline here.

> Yeah, this was the part that took me a bit to figure out, as well. The
> optimization here is really just avoiding a call to lookup_commit(),
> which will do a single hash-table lookup. I wonder if that's actually
> worth this more complex interface (as opposed to just always taking an
> oid and then always returning a "struct commit", which could be old or
> new).

Avoidance of lookup_commit() is more important than an optimization, I
think. Here, we call lookup_commit() only when we know that that object
is a commit (by its presence in a commit graph). If we just called it
blindly, we might mistakenly create a commit for that hash when it is
actually an object of another type. (We could inline lookup_commit() in
parse_commit_in_graph_one(), removing the object creation part, but that
adds complexity as well.)
Jeff King Dec. 13, 2018, 1:27 a.m. UTC | #5
On Wed, Dec 12, 2018 at 11:58:12AM -0800, Jonathan Tan wrote:

> > Yeah, this was the part that took me a bit to figure out, as well. The
> > optimization here is really just avoiding a call to lookup_commit(),
> > which will do a single hash-table lookup. I wonder if that's actually
> > worth this more complex interface (as opposed to just always taking an
> > oid and then always returning a "struct commit", which could be old or
> > new).
> 
> Avoidance of lookup_commit() is more important than an optimization, I
> think. Here, we call lookup_commit() only when we know that that object
> is a commit (by its presence in a commit graph). If we just called it
> blindly, we might mistakenly create a commit for that hash when it is
> actually an object of another type. (We could inline lookup_commit() in
> parse_commit_in_graph_one(), removing the object creation part, but that
> adds complexity as well.)

I was thinking we would only do so in the happy path when we find a
commit. I.e., something like:

  obj = lookup_object(oid); /* does not auto-vivify */
  if (obj && obj->parsed)
	return obj;

  if (we_have_it_in_commit_graph) {
	commit = obj || lookup_commit(oid);
	fill_in_details_from_commit_graph(commit);
	return &commit->obj;
  } else {
	return parse_object(oid);
  }

which is more along the lines of that parse_probably_commit() that
Stolee mentioned.

-Peff
Derrick Stolee Dec. 13, 2018, 4:20 p.m. UTC | #6
On 12/12/2018 8:27 PM, Jeff King wrote:
> On Wed, Dec 12, 2018 at 11:58:12AM -0800, Jonathan Tan wrote:
>
>>> Yeah, this was the part that took me a bit to figure out, as well. The
>>> optimization here is really just avoiding a call to lookup_commit(),
>>> which will do a single hash-table lookup. I wonder if that's actually
>>> worth this more complex interface (as opposed to just always taking an
>>> oid and then always returning a "struct commit", which could be old or
>>> new).
>> Avoidance of lookup_commit() is more important than an optimization, I
>> think. Here, we call lookup_commit() only when we know that that object
>> is a commit (by its presence in a commit graph). If we just called it
>> blindly, we might mistakenly create a commit for that hash when it is
>> actually an object of another type. (We could inline lookup_commit() in
>> parse_commit_in_graph_one(), removing the object creation part, but that
>> adds complexity as well.)
> I was thinking we would only do so in the happy path when we find a
> commit. I.e., something like:
>
>    obj = lookup_object(oid); /* does not auto-vivify */
>    if (obj && obj->parsed)
> 	return obj;
>
>    if (we_have_it_in_commit_graph) {
> 	commit = obj || lookup_commit(oid);
> 	fill_in_details_from_commit_graph(commit);
> 	return &commit->obj;
>    } else {
> 	return parse_object(oid);
>    }
>
> which is more along the lines of that parse_probably_commit() that
> Stolee mentioned.

This approach is what I had in mind. Thanks for making it more concrete!

-Stolee
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 40c855f185..a571b523b7 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -286,7 +286,8 @@  void close_commit_graph(struct repository *r)
 	r->objects->commit_graph = NULL;
 }
 
-static int bsearch_graph(struct commit_graph *g, struct object_id *oid, uint32_t *pos)
+static int bsearch_graph(struct commit_graph *g, const struct object_id *oid,
+			 uint32_t *pos)
 {
 	return bsearch_hash(oid->hash, g->chunk_oid_fanout,
 			    g->chunk_oid_lookup, g->hash_len, pos);
@@ -374,24 +375,41 @@  static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
 	}
 }
 
-static int parse_commit_in_graph_one(struct commit_graph *g, struct commit *item)
+static struct commit *parse_commit_in_graph_one(struct repository *r,
+						struct commit_graph *g,
+						struct commit *shell,
+						const struct object_id *oid)
 {
 	uint32_t pos;
 
-	if (item->object.parsed)
-		return 1;
+	if (shell && shell->object.parsed)
+		return shell;
 
-	if (find_commit_in_graph(item, g, &pos))
-		return fill_commit_in_graph(item, g, pos);
+	if (shell && shell->graph_pos != COMMIT_NOT_FROM_GRAPH) {
+		pos = shell->graph_pos;
+	} else if (bsearch_graph(g, shell ? &shell->object.oid : oid, &pos)) {
+		/* bsearch_graph sets pos */
+	} else {
+		return NULL;
+	}
 
-	return 0;
+	if (!shell) {
+		shell = lookup_commit(r, oid);
+		if (!shell)
+			return NULL;
+	}
+
+	fill_commit_in_graph(shell, g, pos);
+	return shell;
 }
 
-int parse_commit_in_graph(struct repository *r, struct commit *item)
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid)
 {
 	if (!prepare_commit_graph(r))
 		return 0;
-	return parse_commit_in_graph_one(r->objects->commit_graph, item);
+	return parse_commit_in_graph_one(r, r->objects->commit_graph, shell,
+					 oid);
 }
 
 void load_commit_graph_info(struct repository *r, struct commit *item)
@@ -1025,7 +1043,7 @@  int verify_commit_graph(struct repository *r, struct commit_graph *g)
 		}
 
 		graph_commit = lookup_commit(r, &cur_oid);
-		if (!parse_commit_in_graph_one(g, graph_commit))
+		if (!parse_commit_in_graph_one(r, g, graph_commit, NULL))
 			graph_report("failed to parse %s from commit-graph",
 				     oid_to_hex(&cur_oid));
 	}
diff --git a/commit-graph.h b/commit-graph.h
index 9db40b4d3a..8b7b5985dc 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -13,16 +13,20 @@  struct commit;
 char *get_commit_graph_filename(const char *obj_dir);
 
 /*
- * Given a commit struct, try to fill the commit struct info, including:
+ * If the given commit (identified by shell->object.oid or oid) is in the
+ * commit graph, returns a commit struct (reusing shell if it is not NULL)
+ * including the following info:
  *  1. tree object
  *  2. date
  *  3. parents.
  *
- * Returns 1 if and only if the commit was found in the packed graph.
+ * If not, returns NULL. See parse_commit_buffer() for the fallback after this
+ * call.
  *
- * See parse_commit_buffer() for the fallback after this call.
+ * Either shell or oid must be non-NULL. If both are non-NULL, oid is ignored.
  */
-int parse_commit_in_graph(struct repository *r, struct commit *item);
+struct commit *parse_commit_in_graph(struct repository *r, struct commit *shell,
+				     const struct object_id *oid);
 
 /*
  * It is possible that we loaded commit contents from the commit buffer,
diff --git a/commit.c b/commit.c
index d13a7bc374..88eb580c5a 100644
--- a/commit.c
+++ b/commit.c
@@ -456,7 +456,7 @@  int parse_commit_internal(struct commit *item, int quiet_on_missing, int use_com
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(the_repository, item))
+	if (use_commit_graph && parse_commit_in_graph(the_repository, item, NULL))
 		return 0;
 	buffer = read_object_file(&item->object.oid, &type, &size);
 	if (!buffer)
diff --git a/revision.c b/revision.c
index 13e0519c02..05fddb5880 100644
--- a/revision.c
+++ b/revision.c
@@ -213,7 +213,10 @@  static struct object *get_reference(struct rev_info *revs, const char *name,
 {
 	struct object *object;
 
-	object = parse_object(revs->repo, oid);
+	object = (struct object *) parse_commit_in_graph(revs->repo, NULL, oid);
+	if (!object)
+		object = parse_object(revs->repo, oid);
+
 	if (!object) {
 		if (revs->ignore_missing)
 			return object;
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index 6a84a53efb..63b928a883 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -22,7 +22,7 @@  static void test_parse_commit_in_graph(const char *gitdir, const char *worktree,
 
 	c = lookup_commit(&r, commit_oid);
 
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 
 	printf("%"PRItime, c->date);
@@ -52,7 +52,7 @@  static void test_get_commit_tree_in_graph(const char *gitdir,
 	 * get_commit_tree_in_graph does not automatically parse the commit, so
 	 * parse it first.
 	 */
-	if (!parse_commit_in_graph(&r, c))
+	if (!parse_commit_in_graph(&r, c, NULL))
 		die("Couldn't parse commit");
 	tree = get_commit_tree_in_graph(&r, c);
 	if (!tree)