
[RFC,1/2] refs: make _advance() check struct repo, not flag

Message ID 9187eab52552630863285ef5743a107ccc555495.1629933380.git.jonathantanmy@google.com (mailing list archive)
State New, archived
Series First steps towards iterating over submodule refs

Commit Message

Jonathan Tan Aug. 25, 2021, 11:23 p.m. UTC
Currently, ref iterators access the object store each time they advance
if and only if the boolean flag DO_FOR_EACH_INCLUDE_BROKEN is unset.
(The iterators access the object store because, if
DO_FOR_EACH_INCLUDE_BROKEN is unset, they need to attempt to resolve
each ref to determine that it is not broken.)

Also, the object store accessed is always that of the_repository, making
it impossible to iterate over a submodule's refs without
DO_FOR_EACH_INCLUDE_BROKEN (unless add_submodule_odb() is used).

As a first step in resolving both these problems, replace the
DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
is set, a NULL repository (representing access to no object store) is
used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
non-NULL repository (representing access to that repository's object
store) is used instead. Right now, the locations in which
non-the_repository support needs to be added are marked with BUG()
statements - in a future patch, these will be replaced. (NEEDSWORK: in
this RFC patch set, this has not been done)
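
The conversion rule in the paragraph above can be sketched in isolation. This is only an illustration, not code from the patch: `struct repository` is an empty stand-in for git's type, and `iteration_repo()` and `skips_broken()` are hypothetical helper names that do not exist in git.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for git's struct repository; only its address matters here. */
struct repository {
	int unused;
};

/*
 * Hypothetical helper: translate the old boolean into the new pointer.
 * "Include broken" means no object store is consulted, hence NULL.
 */
static struct repository *iteration_repo(int include_broken,
					 struct repository *r)
{
	/* Skipping broken refs requires an object store to check against. */
	assert(include_broken || r != NULL);
	return include_broken ? NULL : r;
}

/*
 * Hypothetical helper: a ref is skipped only when an object store is
 * available (repo != NULL) and the ref does not resolve in it.
 */
static int skips_broken(const struct repository *repo, int resolves)
{
	return repo != NULL && !resolves;
}
```

With these stand-ins, a caller that used to pass DO_FOR_EACH_INCLUDE_BROKEN ends up with a NULL pointer, and every other caller ends up with the repository whose object store is checked.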

I have considered and rejected the following design alternatives:

- Making all ref stores not access the object store during their
  _advance() callbacks, and making ref_iterator_advance() be responsible
  for checking the object store - thus, simplifying the code in that the
  logic of checking for the flag (current) or the pointer (after the
  equivalent of this commit) is only in one place instead of in every
  ref store's callback. However, the ref stores already make use of this
  flag for another reason - for determining if refs are resolvable when
  writing (search for "REF_STORE_ODB"). Thus, I decided to retain each
  ref store's knowledge of this flag.

- Teaching the ref iterator mechanism to never skip any ref. This has
  the same problem as above, and furthermore, all callers now need to
  handle unresolvable refs.

- Changing the _advance() callback to also take a repository object
  parameter, and either skipping or not skipping depending on whether
  that parameter is NULL. This burdens callers with carrying this
  information along with the iterator, and such calling code may be
  unclear as to why that parameter can be NULL in some cases and cannot
  in others.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 refs.c                | 48 +++++++++++++++++++++++--------------------
 refs/debug.c          |  4 ++--
 refs/files-backend.c  | 19 +++++++++++++----
 refs/packed-backend.c | 16 +++++++++++----
 refs/refs-internal.h  | 24 ++++++++++++----------
 5 files changed, 68 insertions(+), 43 deletions(-)

Comments

Han-Wen Nienhuys Aug. 26, 2021, 4:39 p.m. UTC | #1
On Thu, Aug 26, 2021 at 1:23 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Currently, ref iterators access the object store each time they advance
> if and only if the boolean flag DO_FOR_EACH_INCLUDE_BROKEN is unset.
> (The iterators access the object store because, if
> DO_FOR_EACH_INCLUDE_BROKEN is unset, they need to attempt to resolve
> each ref to determine that it is not broken.)
>
> Also, the object store accessed is always that of the_repository, making
> it impossible to iterate over a submodule's refs without
> DO_FOR_EACH_INCLUDE_BROKEN (unless add_submodule_odb() is used).
>
> As a first step in resolving both these problems, replace the
> DO_FOR_EACH_INCLUDE_BROKEN flag with a struct repository pointer. This
> commit is a mechanical conversion - whenever DO_FOR_EACH_INCLUDE_BROKEN
> is set, a NULL repository (representing access to no object store) is
> used instead, and whenever DO_FOR_EACH_INCLUDE_BROKEN is unset, a
> non-NULL repository (representing access to that repository's object
> store) is used instead. Right now, the locations in which
> non-the_repository support needs to be added are marked with BUG()
> statements - in a future patch, these will be replaced. (NEEDSWORK: in
> this RFC patch set, this has not been done)

From a design perspective, it would be nice if the ref backend
wouldn't need to know about the object store. Can't this be hidden in
the layer in refs.c that calls into the backends?

If they have to know about the object store, have you considered
passing the repository pointer in xxx_ref_store_create()? Then there
is no possibility of a mismatch between the repository pointer and the
ref store.

> - Making all ref stores not access the object store during their
>   _advance() callbacks, and making ref_iterator_advance() be responsible
>   for checking the object store - thus, simplifying the code in that the
>   logic of checking for the flag (current) or the pointer (after the
>   equivalent of this commit) is only in one place instead of in every
>   ref store's callback. However, the ref stores already make use of this
>   flag for another reason - for determining if refs are resolvable when
>   writing (search for "REF_STORE_ODB"). Thus, I decided to retain each

I looked, but I couldn't figure out how this flag is used.
Jonathan Tan Aug. 26, 2021, 10:24 p.m. UTC | #2
> from a design perspective, it would be nice if the ref backend
> wouldn't need to know about the object store. Can't this be hidden in
> the layer in refs.c that calls into the backends?

Thanks for taking a look.

The answer requires additional context, so I'll answer this at the end
of this email.

> If they have to know about the object store, have you considered
> passing the repository pointer in xxx_ref_store_create()? Then there
> is no possibility of a mismatch between the repository pointer and the
> ref store.

I thought about that, but didn't want to make things worse - the effort
in this patch set is, after all, to attempt to increase the dissociation
between the ref stores and a certain object store (that is,
the_repository's object store), and I thought that reintroducing an
association (albeit to arbitrary object stores instead of a hardcoded
object store) would be a step back.

But this may be the way to go - the ref stores already have a gitdir
field that we could replace with a struct repository field.

> > - Making all ref stores not access the object store during their
> >   _advance() callbacks, and making ref_iterator_advance() be responsible
> >   for checking the object store - thus, simplifying the code in that the
> >   logic of checking for the flag (current) or the pointer (after the
> >   equivalent of this commit) is only in one place instead of in every
> >   ref store's callback. However, the ref stores already make use of this
> >   flag for another reason - for determining if refs are resolvable when
> >   writing (search for "REF_STORE_ODB"). Thus, I decided to retain each
> 
> I looked, but I couldn't figure out how this flag is used.

I was thinking of files_ref_iterator_begin() setting a local variable
required_flags. Somehow I thought that files_pack_refs() relied on
files_ref_iterator_begin() setting that variable, but now I see that
that's not true - both functions are independently checking that the
underlying ref store supports ODB access, so I can remove ODB from
files_ref_iterator_begin() if I want to.

To go back to the question at the top, now I agree that hiding the ODB
access in _advance() in the layer in refs.c is possible. The last part
still accessing the ODB is files_pack_refs(), I think. Refactoring that
is possible, but I'll leave that to another patch set.

If you or anyone else has more questions or comments, please reply - and
in the meantime, I'll update this patch set to move the ODB access in
_advance() to the layer in refs.c.
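
As a rough illustration of that direction (not the updated patch), the broken-ref check can live in one generic advance function while the backend iterator yields every ref. Everything below is a toy model: `struct toy_ref`, `struct toy_iterator`, `backend_advance()`, `generic_advance()` and `count_refs()` are hypothetical names, and the `resolves` field stands in for a real object store lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for git's struct repository. */
struct repository {
	int unused;
};

/* Toy ref record; `resolves` stands in for an object store lookup. */
struct toy_ref {
	const char *refname;
	int resolves;
};

/* Toy iterator over an array of refs. */
struct toy_iterator {
	const struct toy_ref *refs;
	size_t nr, pos;
	struct repository *repo; /* NULL: include broken refs */
};

/* Backend-level advance: yields every ref and never checks objects. */
static const struct toy_ref *backend_advance(struct toy_iterator *iter)
{
	assert(iter != NULL);
	return iter->pos < iter->nr ? &iter->refs[iter->pos++] : NULL;
}

/* Generic-layer advance: the single place that skips broken refs. */
static const struct toy_ref *generic_advance(struct toy_iterator *iter)
{
	const struct toy_ref *ref;

	while ((ref = backend_advance(iter)) != NULL) {
		if (iter->repo && !ref->resolves)
			continue;
		return ref;
	}
	return NULL;
}

/* Count the refs that the generic layer lets through. */
static size_t count_refs(const struct toy_ref *refs, size_t nr,
			 struct repository *repo)
{
	struct toy_iterator iter = {
		.refs = refs, .nr = nr, .pos = 0, .repo = repo
	};
	size_t count = 0;

	while (generic_advance(&iter))
		count++;
	return count;
}
```

In this model the backend needs no knowledge of the repository pointer at all; only `generic_advance()` looks at it.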
Glen Choo Sept. 14, 2021, 10:41 p.m. UTC | #3
On Thu, Aug 26, 2021 at 03:24:39PM -0700, Jonathan Tan wrote:
> > If they have to know about the object store, have you considered
> > passing the repository pointer in xxx_ref_store_create()? Then there
> > is no possibility of a mismatch between the repository pointer and the
> > ref store.
> 
> I thought about that, but didn't want to make things worse - the effort
> in this patch set is, after all, to attempt to increase the dissociation
> between the ref stores and a certain object store (that is,
> the_repository's object store), and I thought that reintroducing an
> association (albeit to arbitrary object stores instead of a hardcoded
> object store) would be a step back.
> 
> But this may be the way to go - the ref stores already have a gitdir
> field that we could replace with a struct repository field.

I'm curious about how we'd want to resolve the general problem of ref
stores referencing odbs. 

A discussion I had with Jonathan Nieder suggests that ref stores are
doing two slightly related, but not equivalent things:

- a logical ref database that preserves its own consistency
- a layer of ref storage in such a ref database

In the current state of affairs, the files ref store and the packed ref
store seem to behave as a single logical ref database. An example of
this (that I care about in particular) is in refs/files-backend.c where
the files backend validates oids using the_repository's odb.
refs/packed-backend.c doesn't do any such validation, and presumably
just relies on the correctness of refs/files-backend.c. I assume that
this also explains why some functions in refs_be_packed are stubs.

The answer to whether or not a ref store should refer to a certain
object store seems unresolved because a ref store is trying to do two
separate things. Perhaps it is reasonable to associate a ref database
with an object store (so that it can validate its refs), but we would
prefer to dissociate the physical ref storage layer from the object
store. (I'm paraphrasing Jonathan Nieder here; this isn't an original
thought).

Perhaps this is a question we want to resolve when considering reftable
and other ref databases.
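
One way to picture the split between a logical ref database and a physical storage layer is the toy model below. It is a sketch under invented names, not git code: `struct toy_odb`, `struct toy_ref_storage`, `struct toy_ref_database` and the functions operating on them are all hypothetical. Only the database layer knows about an object store; the storage layer writes whatever it is handed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy object store: a fixed list of object names known to exist. */
struct toy_odb {
	const char *const *oids;
	size_t nr;
};

/* Physical storage layer: one slot, no knowledge of any object store. */
struct toy_ref_storage {
	const char *refname;
	const char *oid;
};

/* Logical ref database: storage plus the object store used to validate. */
struct toy_ref_database {
	struct toy_ref_storage *storage;
	const struct toy_odb *odb;
};

static int odb_has(const struct toy_odb *odb, const char *oid)
{
	size_t i;

	for (i = 0; i < odb->nr; i++)
		if (!strcmp(odb->oids[i], oid))
			return 1;
	return 0;
}

/* The storage layer writes whatever it is given. */
static void storage_write(struct toy_ref_storage *s,
			  const char *refname, const char *oid)
{
	s->refname = refname;
	s->oid = oid;
}

/*
 * The database layer rejects updates to objects missing from its odb.
 * Returns 0 on success, -1 if the update was rejected.
 */
static int database_update(struct toy_ref_database *db,
			   const char *refname, const char *oid)
{
	assert(db && db->storage && db->odb);
	if (!odb_has(db->odb, oid))
		return -1;
	storage_write(db->storage, refname, oid);
	return 0;
}
```

Under this split, a second storage layer could be added without it ever seeing the object store, since validation happens before `storage_write()` is reached.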
Han-Wen Nienhuys Sept. 15, 2021, 7:35 a.m. UTC | #4
On Wed, Sep 15, 2021 at 12:41 AM Glen Choo <chooglen@google.com> wrote:
> In the current state of affairs, the files ref store and the packed ref
> store seem to behave as a single logical ref database. An example of
> this (that I care about in particular) is in refs/files-backend.c where
> the files backend validates oids using the_repository's odb.
> refs/packed-backend.c doesn't do any such validation, and presumably
> just relies on the correctness of refs/files-backend.c. I assume that
> this also explains why some functions in refs_be_packed are stubs.

The loose/packed storage is implemented in terms of the files backend (the
public entry point) that defers to a packed backend in some cases. The
latter is implemented as a ref backend, but for no good reason.

> The answer to whether or not a ref store should refer to a certain
> object store seems unresolved because a ref store is trying to do two
> separate things. Perhaps it is reasonable to associate a ref database
> with an object store (so that it can validate its refs), but we would
> prefer to dissociate the physical ref storage layer from the object
> store. (I'm paraphrasing Jonathan Nieder here; this isn't an original
> thought).
>
> Perhaps this is a question we want to resolve when considering reftable
> and other ref databases.

Work on reftable shows that there are more egregious breaks of
abstraction boundaries. For example, there are still parts of the code
that equate

  (file under .git/ == ref)

You can find a good part of them if you run GIT_TEST_REFTABLE=1 with
the reftable support switched on. Another place where API contracts
are unclear is resolving symrefs: at first sight, you'd think that a
ref backend should just provide storage for a refname => {symref,
commit SHA-1, tag + commit SHA-1} mapping. However, in some places it
is currently necessary for the ref backend to resolve symrefs. You can
find these places by grepping for refs_resolve_ref_unsafe() in the
files backend.

I think Jonathan is right, but I also think that teasing apart the ref
backend and the ODB is premature until the ref backend itself is a
strongly enforced abstraction boundary.
Jonathan Tan Sept. 16, 2021, 5:24 p.m. UTC | #5
> The answer to whether or not a ref store should refer to a certain
> object store seems unresolved because a ref store is trying to do two
> separate things. Perhaps it is reasonable to associate a ref database
> with an object store (so that it can validate its refs), but we would
> prefer to dissociate the physical ref storage layer from the object
> store. (I'm paraphrasing Jonathan Nieder here; this isn't an original
> thought).
> 
> Perhaps this is a question we want to resolve when considering reftable
> and other ref databases.

Either adding an explicit dependency on an object store to a ref store
or dissociating it would be an improvement over what we have now, which
is an implicit dependency on the_repository's object store. Of the two,
I also prefer dissociating it. In practice, if I remember correctly, the
part that checks object existence during ref writing is the last
dependency, so if we can eliminate that without a convoluted design, I
think it's worth dissociating.
Jonathan Tan Sept. 16, 2021, 5:26 p.m. UTC | #6
> On Wed, Sep 15, 2021 at 12:41 AM Glen Choo <chooglen@google.com> wrote:
> > In the current state of affairs, the files ref store and the packed ref
> > store seem to behave as a single logical ref database. An example of
> > this (that I care about in particular) is in refs/files-backend.c where
> > the files backend validates oids using the_repository's odb.
> > refs/packed-backend.c doesn't do any such validation, and presumably
> > just relies on the correctness of refs/files-backend.c. I assume that
> > this also explains why some functions in refs_be_packed are stubs.
> 
> The loose/packed storage is implemented in terms of files backend (the
> public entry point) that defers to a packed backend in some cases. The
> latter is implemented as a ref backend, but for no good reason.

Yes, the packed backend doesn't need to be a ref backend.

> I think Jonathan is right, but I also think that teasing apart the ref
> backend and the ODB is premature until the ref backend itself is a
> strongly enforced abstraction boundary.

I think both efforts can proceed independently.
Junio C Hamano Sept. 16, 2021, 9:56 p.m. UTC | #7
Jonathan Tan <jonathantanmy@google.com> writes:

>> On Wed, Sep 15, 2021 at 12:41 AM Glen Choo <chooglen@google.com> wrote:
>> > In the current state of affairs, the files ref store and the packed ref
>> > store seem to behave as a single logical ref database. An example of
>> > this (that I care about in particular) is in refs/files-backend.c where
>> > the files backend validates oids using the_repository's odb.
>> > refs/packed-backend.c doesn't do any such validation, and presumably
>> > just relies on the correctness of refs/files-backend.c. I assume that
>> > this also explains why some functions in refs_be_packed are stubs.
>> 
>> The loose/packed storage is implemented in terms of files backend (the
>> public entry point) that defers to a packed backend in some cases. The
>> latter is implemented as a ref backend, but for no good reason.
>
> Yes, the packed backend doesn't need to be a ref backend.

Sorry, I do not follow.  Do you mean we cannot have a version of Git
that offers say a read-only access to the repository without any
loose refs, with the default ref backend being the packed one?

Or do you mean that we can ignore such a hypothetical use case and
could reimplement the files backend that can also understand the
$GIT_DIR/packed-refs file directly without "deferring to another ref
backend which is 'packed'"?
Jonathan Tan Sept. 16, 2021, 10:05 p.m. UTC | #8
> > Yes, the packed backend doesn't need to be a ref backend.
> 
> Sorry, I do not follow.  Do you mean we cannot have a version of Git
> that offers say a read-only access to the repository without any
> loose refs, with the default ref backend being the packed one?
> 
> Or do you mean that we can ignore such a hypothetical use case and
> could reimplement the files backend that can also understand the
> $GIT_DIR/packed-refs file directly without "deferring to another ref
> backend which is 'packed'"?

I meant the latter - more specifically, the files backend could defer to
functions that are not necessarily inside a struct ref_storage_be.
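
To make "defer to functions" concrete, here is a toy sketch in which a files-style lookup falls back to a plain packed-refs helper rather than to a second backend reached through a vtable. All names (`struct toy_entry`, `struct toy_files_store`, `packed_lookup()`, `files_lookup()`) are hypothetical; the only behavior borrowed from git is that a loose ref takes precedence over a packed ref of the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy refname -> object name pair. */
struct toy_entry {
	const char *refname;
	const char *oid;
};

/* Plain helper: no vtable, just a lookup over packed entries. */
static const char *packed_lookup(const struct toy_entry *packed, size_t nr,
				 const char *refname)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(packed[i].refname, refname))
			return packed[i].oid;
	return NULL;
}

/* Toy files store holding both loose and packed entries. */
struct toy_files_store {
	const struct toy_entry *loose;
	size_t loose_nr;
	const struct toy_entry *packed;
	size_t packed_nr;
};

/* Loose refs take precedence; otherwise defer to the packed helper. */
static const char *files_lookup(const struct toy_files_store *store,
				const char *refname)
{
	size_t i;

	assert(store != NULL);
	for (i = 0; i < store->loose_nr; i++)
		if (!strcmp(store->loose[i].refname, refname))
			return store->loose[i].oid;
	return packed_lookup(store->packed, store->packed_nr, refname);
}
```

The point of the sketch is only that `packed_lookup()` is an ordinary function: nothing about it requires a full set of backend callbacks, which is why some of those callbacks can end up as stubs.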

Patch

diff --git a/refs.c b/refs.c
index 8b9f7c3a80..35b85f3e79 100644
--- a/refs.c
+++ b/refs.c
@@ -1413,16 +1413,16 @@  int head_ref(each_ref_fn fn, void *cb_data)
 
 struct ref_iterator *refs_ref_iterator_begin(
 		struct ref_store *refs,
-		const char *prefix, int trim, int flags)
+		const char *prefix, int trim, struct repository *repo,
+		int flags)
 {
 	struct ref_iterator *iter;
 
 	if (ref_paranoia < 0)
 		ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 0);
-	if (ref_paranoia)
-		flags |= DO_FOR_EACH_INCLUDE_BROKEN;
 
-	iter = refs->be->iterator_begin(refs, prefix, flags);
+	iter = refs->be->iterator_begin(refs, prefix,
+					ref_paranoia ? NULL : repo, flags);
 
 	/*
 	 * `iterator_begin()` already takes care of prefix, but we
@@ -1442,13 +1442,16 @@  struct ref_iterator *refs_ref_iterator_begin(
  * Call fn for each reference in the specified submodule for which the
  * refname begins with prefix. If trim is non-zero, then trim that
  * many characters off the beginning of each refname before passing
- * the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
- * include broken references in the iteration. If fn ever returns a
+ * the refname to fn. If fn ever returns a
  * non-zero value, stop the iteration and return that value;
  * otherwise, return 0.
+ *
+ * See the documentation of refs_ref_iterator_begin() for more information on
+ * the repo parameter.
  */
 static int do_for_each_repo_ref(struct repository *r, const char *prefix,
-				each_repo_ref_fn fn, int trim, int flags,
+				each_repo_ref_fn fn, int trim,
+				struct repository *repo, int flags,
 				void *cb_data)
 {
 	struct ref_iterator *iter;
@@ -1457,7 +1460,7 @@  static int do_for_each_repo_ref(struct repository *r, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, repo, flags);
 
 	return do_for_each_repo_ref_iterator(r, iter, fn, cb_data);
 }
@@ -1479,7 +1482,8 @@  static int do_for_each_ref_helper(struct repository *r,
 }
 
 static int do_for_each_ref(struct ref_store *refs, const char *prefix,
-			   each_ref_fn fn, int trim, int flags, void *cb_data)
+			   each_ref_fn fn, int trim, struct repository *repo,
+			   int flags, void *cb_data)
 {
 	struct ref_iterator *iter;
 	struct do_for_each_ref_help hp = { fn, cb_data };
@@ -1487,7 +1491,7 @@  static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 	if (!refs)
 		return 0;
 
-	iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
+	iter = refs_ref_iterator_begin(refs, prefix, trim, repo, flags);
 
 	return do_for_each_repo_ref_iterator(the_repository, iter,
 					do_for_each_ref_helper, &hp);
@@ -1495,7 +1499,7 @@  static int do_for_each_ref(struct ref_store *refs, const char *prefix,
 
 int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, "", fn, 0, 0, cb_data);
+	return do_for_each_ref(refs, "", fn, 0, the_repository, 0, cb_data);
 }
 
 int for_each_ref(each_ref_fn fn, void *cb_data)
@@ -1506,7 +1510,7 @@  int for_each_ref(each_ref_fn fn, void *cb_data)
 int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
 			 each_ref_fn fn, void *cb_data)
 {
-	return do_for_each_ref(refs, prefix, fn, strlen(prefix), 0, cb_data);
+	return do_for_each_ref(refs, prefix, fn, strlen(prefix), the_repository, 0, cb_data);
 }
 
 int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
@@ -1518,10 +1522,10 @@  int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data, unsig
 {
 	unsigned int flag = 0;
 
-	if (broken)
-		flag = DO_FOR_EACH_INCLUDE_BROKEN;
 	return do_for_each_ref(get_main_ref_store(the_repository),
-			       prefix, fn, 0, flag, cb_data);
+			       prefix, fn, 0,
+			       broken ? NULL : the_repository,
+			       flag, cb_data);
 }
 
 int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
@@ -1530,16 +1534,16 @@  int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
 {
 	unsigned int flag = 0;
 
-	if (broken)
-		flag = DO_FOR_EACH_INCLUDE_BROKEN;
-	return do_for_each_ref(refs, prefix, fn, 0, flag, cb_data);
+	return do_for_each_ref(refs, prefix, fn, 0,
+			       broken ? NULL : the_repository,
+			       flag, cb_data);
 }
 
 int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
 {
 	return do_for_each_repo_ref(r, git_replace_ref_base, fn,
 				    strlen(git_replace_ref_base),
-				    DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
+				    NULL, 0, cb_data);
 }
 
 int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
@@ -1548,7 +1552,7 @@  int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
 	int ret;
 	strbuf_addf(&buf, "%srefs/", get_git_namespace());
 	ret = do_for_each_ref(get_main_ref_store(the_repository),
-			      buf.buf, fn, 0, 0, cb_data);
+			      buf.buf, fn, 0, the_repository, 0, cb_data);
 	strbuf_release(&buf);
 	return ret;
 }
@@ -1556,7 +1560,7 @@  int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
 int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
 {
 	return do_for_each_ref(refs, "", fn, 0,
-			       DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
+			       NULL, 0, cb_data);
 }
 
 int for_each_rawref(each_ref_fn fn, void *cb_data)
@@ -2263,7 +2267,7 @@  int refs_verify_refname_available(struct ref_store *refs,
 	strbuf_addch(&dirname, '/');
 
 	iter = refs_ref_iterator_begin(refs, dirname.buf, 0,
-				       DO_FOR_EACH_INCLUDE_BROKEN);
+				       NULL, 0);
 	while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
 		if (skip &&
 		    string_list_has_string(skip, iter->refname))
diff --git a/refs/debug.c b/refs/debug.c
index 1a7a9e11cf..753d5da893 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -224,11 +224,11 @@  static struct ref_iterator_vtable debug_ref_iterator_vtable = {
 
 static struct ref_iterator *
 debug_ref_iterator_begin(struct ref_store *ref_store, const char *prefix,
-			 unsigned int flags)
+			 struct repository *repo, unsigned int flags)
 {
 	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
 	struct ref_iterator *res =
-		drefs->refs->be->iterator_begin(drefs->refs, prefix, flags);
+		drefs->refs->be->iterator_begin(drefs->refs, prefix, repo, flags);
 	struct debug_ref_iterator *diter = xcalloc(1, sizeof(*diter));
 	base_ref_iterator_init(&diter->base, &debug_ref_iterator_vtable, 1);
 	diter->iter = res;
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 677b7e4cdd..4c42db1092 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -730,6 +730,7 @@  struct files_ref_iterator {
 	struct ref_iterator base;
 
 	struct ref_iterator *iter0;
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -744,7 +745,13 @@  static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		    ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
 			continue;
 
-		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		if (iter->repo && iter->repo != the_repository)
+			/*
+			 * NEEDSWORK: make ref_resolves_to_object() support
+			 * arbitrary repositories
+			 */
+			BUG("iter->repo must be NULL or the_repository");
+		if (iter->repo &&
 		    !ref_resolves_to_object(iter->iter0->refname,
 					    iter->iter0->oid,
 					    iter->iter0->flags))
@@ -793,7 +800,7 @@  static struct ref_iterator_vtable files_ref_iterator_vtable = {
 
 static struct ref_iterator *files_ref_iterator_begin(
 		struct ref_store *ref_store,
-		const char *prefix, unsigned int flags)
+		const char *prefix, struct repository *repo, unsigned int flags)
 {
 	struct files_ref_store *refs;
 	struct ref_iterator *loose_iter, *packed_iter, *overlay_iter;
@@ -801,7 +808,7 @@  static struct ref_iterator *files_ref_iterator_begin(
 	struct ref_iterator *ref_iterator;
 	unsigned int required_flags = REF_STORE_READ;
 
-	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
+	if (repo)
 		required_flags |= REF_STORE_ODB;
 
 	refs = files_downcast(ref_store, required_flags, "ref_iterator_begin");
@@ -836,10 +843,13 @@  static struct ref_iterator *files_ref_iterator_begin(
 	 * references, and (if needed) do our own check for broken
 	 * ones in files_ref_iterator_advance(), after we have merged
 	 * the packed and loose references.
+	 *
+	 * Do this by not supplying any repo, regardless of whether a repo was
+	 * supplied to files_ref_iterator_begin().
 	 */
 	packed_iter = refs_ref_iterator_begin(
 			refs->packed_ref_store, prefix, 0,
-			DO_FOR_EACH_INCLUDE_BROKEN);
+			NULL, 0);
 
 	overlay_iter = overlay_ref_iterator_begin(loose_iter, packed_iter);
 
@@ -848,6 +858,7 @@  static struct ref_iterator *files_ref_iterator_begin(
 	base_ref_iterator_init(ref_iterator, &files_ref_iterator_vtable,
 			       overlay_iter->ordered);
 	iter->iter0 = overlay_iter;
+	iter->repo = repo;
 	iter->flags = flags;
 
 	return ref_iterator;
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index f8aa97d799..bc2302a6e0 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -776,6 +776,7 @@  struct packed_ref_iterator {
 	struct object_id oid, peeled;
 	struct strbuf refname_buf;
 
+	struct repository *repo;
 	unsigned int flags;
 };
 
@@ -863,7 +864,13 @@  static int packed_ref_iterator_advance(struct ref_iterator *ref_iterator)
 		    ref_type(iter->base.refname) != REF_TYPE_PER_WORKTREE)
 			continue;
 
-		if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
+		if (iter->repo && iter->repo != the_repository)
+			/*
+			 * NEEDSWORK: make ref_resolves_to_object() support
+			 * arbitrary repositories
+			 */
+			BUG("iter->repo must be NULL or the_repository");
+		if (iter->repo &&
 		    !ref_resolves_to_object(iter->base.refname, &iter->oid,
 					    iter->flags))
 			continue;
@@ -913,7 +920,7 @@  static struct ref_iterator_vtable packed_ref_iterator_vtable = {
 
 static struct ref_iterator *packed_ref_iterator_begin(
 		struct ref_store *ref_store,
-		const char *prefix, unsigned int flags)
+		const char *prefix, struct repository *repo, unsigned int flags)
 {
 	struct packed_ref_store *refs;
 	struct snapshot *snapshot;
@@ -922,7 +929,7 @@  static struct ref_iterator *packed_ref_iterator_begin(
 	struct ref_iterator *ref_iterator;
 	unsigned int required_flags = REF_STORE_READ;
 
-	if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN))
+	if (repo)
 		required_flags |= REF_STORE_ODB;
 	refs = packed_downcast(ref_store, required_flags, "ref_iterator_begin");
 
@@ -954,6 +961,7 @@  static struct ref_iterator *packed_ref_iterator_begin(
 
 	iter->base.oid = &iter->oid;
 
+	iter->repo = repo;
 	iter->flags = flags;
 
 	if (prefix && *prefix)
@@ -1137,7 +1145,7 @@  static int write_with_updates(struct packed_ref_store *refs,
 	 * of updates is exhausted, leave i set to updates->nr.
 	 */
 	iter = packed_ref_iterator_begin(&refs->base, "",
-					 DO_FOR_EACH_INCLUDE_BROKEN);
+					 NULL, 0);
 	if ((ok = ref_iterator_advance(iter)) != ITER_OK)
 		iter = NULL;
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 3155708345..b20fa1f5cd 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -245,9 +245,6 @@  int refs_rename_ref_available(struct ref_store *refs,
 /* We allow "recursive" symbolic refs. Only within reason, though */
 #define SYMREF_MAXDEPTH 5
 
-/* Include broken references in a do_for_each_ref*() iteration: */
-#define DO_FOR_EACH_INCLUDE_BROKEN 0x01
-
 /*
  * Reference iterators
  *
@@ -349,16 +346,19 @@  int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
  * Return an iterator that goes over each reference in `refs` for
  * which the refname begins with prefix. If trim is non-zero, then
  * trim that many characters off the beginning of each refname.
- * The output is ordered by refname. The following flags are supported:
+ * The output is ordered by refname.
+ *
+ * Pass NULL as repo to include broken references in the iteration, or non-NULL
+ * to skip references that do not resolve to an object in the given repo.
  *
- * DO_FOR_EACH_INCLUDE_BROKEN: include broken references in
- *         the iteration.
+ * The following flags are supported:
  *
  * DO_FOR_EACH_PER_WORKTREE_ONLY: only produce REF_TYPE_PER_WORKTREE refs.
  */
 struct ref_iterator *refs_ref_iterator_begin(
 		struct ref_store *refs,
-		const char *prefix, int trim, int flags);
+		const char *prefix, int trim, struct repository *repo,
+		int flags);
 
 /*
  * A callback function used to instruct merge_ref_iterator how to
@@ -446,8 +446,9 @@  void base_ref_iterator_free(struct ref_iterator *iter);
 /*
  * backend-specific implementation of ref_iterator_advance. For symrefs, the
  * function should set REF_ISSYMREF, and it should also dereference the symref
- * to provide the OID referent. If DO_FOR_EACH_INCLUDE_BROKEN is set, symrefs
- * with non-existent referents and refs pointing to non-existent object names
+ * to provide the OID referent. If a NULL repo was passed to the _begin()
+ * function that created this iterator, symrefs with non-existent referents and
+ * refs pointing to non-existent object names
  * should also be returned. If DO_FOR_EACH_PER_WORKTREE_ONLY, only
  * REF_TYPE_PER_WORKTREE refs should be returned.
  */
@@ -504,7 +505,7 @@  int do_for_each_repo_ref_iterator(struct repository *r,
  * where all reference backends will presumably store their
  * per-worktree refs.
  */
-#define DO_FOR_EACH_PER_WORKTREE_ONLY 0x02
+#define DO_FOR_EACH_PER_WORKTREE_ONLY 0x01
 
 struct ref_store;
 
@@ -569,7 +570,8 @@  typedef int copy_ref_fn(struct ref_store *ref_store,
  */
 typedef struct ref_iterator *ref_iterator_begin_fn(
 		struct ref_store *ref_store,
-		const char *prefix, unsigned int flags);
+		const char *prefix, struct repository *repo,
+		unsigned int flags);
 
 /* reflog functions */