[v2,1/3] factor out refresh_and_write_cache function

Series

make sure stash refreshes the index properly
[v2,0/3] make sure stash refreshes the index properly
[v2,1/3] factor out refresh_and_write_cache function
[v2,2/3] merge: use refresh_and_write_cache
[v2,3/3] stash: make sure to write refreshed cache

Message ID

20190829182748.43802-2-t.gummerer@gmail.com (mailing list archive)

State

New, archived

Headers

From: Thomas Gummerer <t.gummerer@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Joel Teichroeb <joel@teichroeb.net>,
 Johannes Schindelin <Johannes.Schindelin@gmx.de>, Jeff King <peff@peff.net>,
	=?utf-8?q?Martin_=C3=85gren?= <martin.agren@gmail.com>,
 Thomas Gummerer <t.gummerer@gmail.com>
Subject: [PATCH v2 1/3] factor out refresh_and_write_cache function
Date: Thu, 29 Aug 2019 19:27:46 +0100
Message-Id: <20190829182748.43802-2-t.gummerer@gmail.com>
In-Reply-To: <20190829182748.43802-1-t.gummerer@gmail.com>
References: <20190827101408.76757-1-t.gummerer@gmail.com>
 <20190829182748.43802-1-t.gummerer@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: git-owner@vger.kernel.org
Precedence: bulk


Commit Message

Thomas Gummerer Aug. 29, 2019, 6:27 p.m. UTC

Getting the lock for the index, refreshing it and then writing it is a
pattern that happens more than once throughout the codebase.  Factor
out the refresh_and_write_cache function from builtin/am.c to
read-cache.c, so it can be re-used in other places in a subsequent
commit.

Note that we return different error codes for failing to refresh the
cache, and failing to write the index.  The current caller only cares
about failing to write the index.  However, the other callers we're
going to convert in subsequent patches will need this distinction.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 builtin/am.c | 16 ++--------------
 cache.h      | 13 +++++++++++++
 read-cache.c | 17 +++++++++++++++++
 3 files changed, 32 insertions(+), 14 deletions(-)
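
The return convention described in the commit message (1 for a failed
refresh, -1 for a failed write, 0 on success) lets each caller choose
its own policy.  A minimal sketch of two such policies follows.  The
function names are hypothetical and do not appear in the patch; only
the "< 0" check mirrors what the converted builtin/am.c caller does.

```c
/*
 * Return convention documented in the patch:
 *    1  refreshing the index failed
 *   -1  writing the index failed
 *    0  success
 */

/* Hypothetical policy: like the "git am" caller, only a failed write is fatal. */
int am_style_is_fatal(int ret)
{
	return ret < 0;
}

/* Hypothetical policy: a caller that treats a failed refresh as fatal too. */
int strict_is_fatal(int ret)
{
	return ret != 0;
}
```

With these, a return value of 1 is tolerated by the first policy and
rejected by the second, which is the distinction the commit message
says later callers will need.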

Comments

Martin Ågren Aug. 30, 2019, 3:07 p.m. UTC | #1

On Thu, 29 Aug 2019 at 20:28, Thomas Gummerer <t.gummerer@gmail.com> wrote:
> +int repo_refresh_and_write_index(struct  repository *repo,
> +                                unsigned int refresh_flags,
> +                                unsigned int write_flags,
> +                                const struct pathspec *pathspec,
> +                                char *seen, const char *header_msg)
> +{
> +       struct lock_file lock_file = LOCK_INIT;
> +
> +       repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
> +       if (refresh_index(repo->index, refresh_flags, pathspec, seen, header_msg))
> +               return 1;
> +       if (write_locked_index(repo->index, &lock_file, COMMIT_LOCK | write_flags))
> +               return -1;
> +       return 0;
> +}

AFAIU, the_repository->index == &the_index so this patch is a noop on
the converted user as far as that aspect is concerned.

There's a difference in behavior that I'm not sure about: We used
to ignore the return value of `refresh_cache()`, i.e. we didn't care
whether it had any errors. I have no idea whether that's safe to do --
especially as we go on to write the index. So I don't know whether this
patch fixes a bug by introducing the early return. Or if it *introduces*
a bug by bailing too aggressively. Do you know more?

(This conversion provides REFRESH_QUIET, which seems to suppress certain
errors, but not all.)

In any case, that early return introduces a bug with the lockfile, that
much I know. We need to roll back the lockfile before doing the early
return. I should have seen that already in your previous version.. :-(

The above makes me think that once this new function is in good shape,
the commit introducing it could sell it as "this is hard to get right --
let's implement it correctly once and for all". ;-)

Martin
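
The lockfile concern raised above can be modelled in isolation.  The
sketch below is not git code: struct model_lock and the model_*
functions are hypothetical stand-ins that only track whether the lock
is held.  It assumes, as the cache.h comment in the patch states, that
writing with COMMIT_LOCK always commits or rolls back the lock, so the
only path that needs an explicit rollback is the early return after a
failed refresh.  In git itself that rollback would presumably be a call
to rollback_lock_file().

```c
#include <assert.h>

/* Hypothetical stand-in for struct lock_file: tracks only "held or not". */
struct model_lock {
	int held;
};

/* Stand-in for taking the index lock. */
void model_hold(struct model_lock *lk)
{
	lk->held = 1;
}

/* Stand-in for rolling back the lock; rolling back an unheld lock is a bug. */
void model_rollback(struct model_lock *lk)
{
	assert(lk->held);
	lk->held = 0;
}

/*
 * Stand-in for writing with COMMIT_LOCK: the lock is released whether
 * the write succeeds (commit) or fails (rollback).
 */
int model_write_and_commit(struct model_lock *lk, int write_fails)
{
	lk->held = 0;
	return write_fails ? -1 : 0;
}

/* Same return convention as the patch: 1 refresh failed, -1 write failed, 0 ok. */
int model_refresh_and_write(struct model_lock *lk, int refresh_fails, int write_fails)
{
	model_hold(lk);
	if (refresh_fails) {
		model_rollback(lk);	/* the step missing from the v2 early return */
		return 1;
	}
	if (model_write_and_commit(lk, write_fails))
		return -1;
	return 0;
}
```

On all three paths the lock ends up released.  Without the rollback
call, the refresh-failure path would return with the lock still held.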

Junio C Hamano Aug. 30, 2019, 5:06 p.m. UTC | #2

Martin Ågren <martin.agren@gmail.com> writes:

> There's a difference in behavior that I'm not sure about: We used
> to ignore the return value of `refresh_cache()`, i.e. we didn't care
> whether it had any errors. I have no idea whether that's safe to do --
> especially as we go on to write the index. So I don't know whether this
> patch fixes a bug by introducing the early return. Or if it *introduces*
> a bug by bailing too aggressively. Do you know more?

One common reason why refresh_cache() fails is because the index is
unmerged (i.e. has one or more higher-stage entries).  After an
attempt to refresh, this would not write out the index in such a
case, which might even be a more correct thing to do than the original
in the original context of "git am" implementation.  The next thing
that happens after the caller calls this function is to ask
repo_index_has_changes(), and we'd say "the index is dirty" whether
the index is written back or not from such a state.

> The above makes me think that once this new function is in good shape,
> the commit introducing it could sell it as "this is hard to get right --
> let's implement it correctly once and for all". ;-)

Yes, that is a more severe issue.

Thomas Gummerer Sept. 2, 2019, 5:15 p.m. UTC | #3

On 08/30, Junio C Hamano wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
> 
> > There's a difference in behavior that I'm not sure about: We used
> > to ignore the return value of `refresh_cache()`, i.e. we didn't care
> > whether it had any errors. I have no idea whether that's safe to do --
> > especially as we go on to write the index. So I don't know whether this
> > patch fixes a bug by introducing the early return. Or if it *introduces*
> > a bug by bailing too aggressively. Do you know more?
> 
> One common reason why refresh_cache() fails is because the index is
> unmerged (i.e. has one or more higher-stage entries).  After an
> attempt to refresh, this would not write out the index in such a
> case, which might even be a more correct thing to do than the original
> in the original context of "git am" implementation.  The next thing
> that happens after the caller calls this function is to ask
> repo_index_has_changes(), and we'd say "the index is dirty" whether
> the index is written back or not from such a state.

Looking at the other callsites, we seem to do something similar
everywhere, and usually fail if the index has unmerged entries.  So
the refreshed index would only not be written out in the case where
there are unmerged entries, and we fail later, which I think is okay.

> > The above makes me think that once this new function is in good shape,
> > the commit introducing it could sell it as "this is hard to get right --
> > let's implement it correctly once and for all". ;-)
> 
> Yes, that is a more severe issue.

With this do you mean what you quoted above, or that the lockfile is
not rolled back?  I agree that the lockfile not being rolled back if
'refresh_cache()' fails is indeed the bigger issue, and I'll fix that
in v3.  I can also add something like the above to the commit message,
just wanted to make sure I'm not missing something subtle in what you
quoted above.

Junio C Hamano Sept. 3, 2019, 5:43 p.m. UTC | #4

Thomas Gummerer <t.gummerer@gmail.com> writes:

> On 08/30, Junio C Hamano wrote:
>> Martin Ågren <martin.agren@gmail.com> writes:
>> ...
>> > The above makes me think that once this new function is in good shape,
>> > the commit introducing it could sell it as "this is hard to get right --
>> > let's implement it correctly once and for all". ;-)
>> 
>> Yes, that is a more severe issue.
>
> With this do you mean what you quoted above, or that the lockfile is
> not rolled back?  I agree that the lockfile not being rolled back if
> 'refresh_cache()' fails is indeed the bigger issue, and I'll fix that
> in v3.  I can also add something like the above to the commit message,
> just wanted to make sure I'm not missing something subtle in what you
> quoted above.

You didn't miss anything, other than that I trimmed my quote too
much and ended up confusing you.

Thanks.

diff --git a/builtin/am.c b/builtin/am.c
index 1aea657a7f..ddedd2b9d4 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1071,19 +1071,6 @@  static const char *msgnum(const struct am_state *state)
 	return sb.buf;
 }
 
-/**
- * Refresh and write index.
- */
-static void refresh_and_write_cache(void)
-{
-	struct lock_file lock_file = LOCK_INIT;
-
-	hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
-	refresh_cache(REFRESH_QUIET);
-	if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
-		die(_("unable to write index file"));
-}
-
 /**
  * Dies with a user-friendly message on how to proceed after resolving the
  * problem. This message can be overridden with state->resolvemsg.
@@ -1703,7 +1690,8 @@  static void am_run(struct am_state *state, int resume)
 
 	unlink(am_path(state, "dirtyindex"));
 
-	refresh_and_write_cache();
+	if (refresh_and_write_cache(REFRESH_QUIET, 0) < 0)
+		die(_("unable to write index file"));
 
 	if (repo_index_has_changes(the_repository, NULL, &sb)) {
 		write_state_bool(state, "dirtyindex", 1);
diff --git a/cache.h b/cache.h
index b1da1ab08f..987d289e8f 100644
--- a/cache.h
+++ b/cache.h
@@ -414,6 +414,7 @@  extern struct index_state the_index;
 #define add_file_to_cache(path, flags) add_file_to_index(&the_index, (path), (flags))
 #define chmod_cache_entry(ce, flip) chmod_index_entry(&the_index, (ce), (flip))
 #define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL, NULL)
+#define refresh_and_write_cache(refresh_flags, write_flags) repo_refresh_and_write_index(the_repository, (refresh_flags), (write_flags), NULL, NULL, NULL)
 #define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))
 #define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))
 #define cache_dir_exists(name, namelen) index_dir_exists(&the_index, (name), (namelen))
@@ -812,6 +813,18 @@  void fill_stat_cache_info(struct index_state *istate, struct cache_entry *ce, st
 #define REFRESH_IN_PORCELAIN	0x0020	/* user friendly output, not "needs update" */
 #define REFRESH_PROGRESS	0x0040  /* show progress bar if stderr is tty */
 int refresh_index(struct index_state *, unsigned int flags, const struct pathspec *pathspec, char *seen, const char *header_msg);
+/*
+ * Refresh the index and write it to disk.
+ *
+ * 'refresh_flags' is passed directly to 'refresh_index()', while
+ * 'COMMIT_LOCK | write_flags' is passed to 'write_locked_index()', so
+ * the lockfile is always either committed or rolled back.
+ *
+ * Return 1 if refreshing the cache failed, -1 if writing the cache to
+ * disk failed, 0 on success.
+ */
+int repo_refresh_and_write_index(struct repository*, unsigned int refresh_flags, unsigned int write_flags, const struct pathspec *, char *seen, const char *header_msg);
+
 struct cache_entry *refresh_cache_entry(struct index_state *, struct cache_entry *, unsigned int);
 
 void set_alternate_index_output(const char *);
diff --git a/read-cache.c b/read-cache.c
index 52ffa8a313..72662df077 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1472,6 +1472,23 @@  static void show_file(const char * fmt, const char * name, int in_porcelain,
 	printf(fmt, name);
 }
 
+int repo_refresh_and_write_index(struct  repository *repo,
+				 unsigned int refresh_flags,
+				 unsigned int write_flags,
+				 const struct pathspec *pathspec,
+				 char *seen, const char *header_msg)
+{
+	struct lock_file lock_file = LOCK_INIT;
+
+	repo_hold_locked_index(repo, &lock_file, LOCK_DIE_ON_ERROR);
+	if (refresh_index(repo->index, refresh_flags, pathspec, seen, header_msg))
+		return 1;
+	if (write_locked_index(repo->index, &lock_file, COMMIT_LOCK | write_flags))
+		return -1;
+	return 0;
+}
+
+
 int refresh_index(struct index_state *istate, unsigned int flags,
 		  const struct pathspec *pathspec,
 		  char *seen, const char *header_msg)
