[20/23] fsmonitor: force update index when fsmonitor token advances

Message ID: d699ad597d2c40cd2c0fe8abbf75b5386d4cf0f2.1617291666.git.gitgitgadget@gmail.com
State: New
Series: Builtin FSMonitor Feature

Commit Message

Jeff Hostetler April 1, 2021, 3:41 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when the
fsmonitor response contains a different token to ensure that the index
is written to disk.

Normally, when the fsmonitor response includes a tracked file, the
index is always updated.  Similarly, the index might be updated when
the response alters the untracked-cache (when enabled).  However, in
cases where neither of those causes the index to be considered changed,
the fsmonitor response is wasted: the new token is never written to
disk.  Subsequent commands then continue to make requests with the same
token and, if there have been no further changes in the working
directory, receive the same response.
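
The stale-token loop described above can be modeled with a small
sketch.  This is a toy model, not code from git: `toy_disk_index`,
`toy_run_command()`, the `force_on_new_token` switch and the value of
`TOY_FSMONITOR_CHANGED` are all hypothetical names chosen for
illustration, and the model assumes the response touches no tracked
files and does not alter the untracked-cache.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for git's FSMONITOR_CHANGED bit. */
#define TOY_FSMONITOR_CHANGED (1u << 0)

/* Toy model of the token stored in the on-disk index ("" if none). */
struct toy_disk_index {
	char token[64];
};

/*
 * Simulate one command: read the on-disk token, receive a response
 * carrying `response_token`, and write the index only if it was marked
 * changed.  Returns 1 if the index (and thus the token) was written.
 */
static int toy_run_command(struct toy_disk_index *disk,
			   const char *response_token,
			   int force_on_new_token)
{
	unsigned int cache_changed = 0;

	assert(response_token != NULL);

	/* The check this patch adds: a different token marks the index dirty. */
	if (force_on_new_token && disk->token[0] &&
	    strcmp(disk->token, response_token))
		cache_changed |= TOY_FSMONITOR_CHANGED;

	if (!cache_changed)
		return 0; /* nothing written; the stale token stays on disk */

	snprintf(disk->token, sizeof(disk->token), "%s", response_token);
	return 1;
}
```

In this model, calling `toy_run_command()` with `force_on_new_token`
set to 0 leaves the old token on disk no matter how often it runs, so
every later command would ask for the same range again.  With it set to
1, the first call that sees a different token writes it out, and a
repeat call with an unchanged token writes nothing.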

This was observed on Windows after a large checkout.  On Windows, the
kernel emits events for the files that are changed as they are
changed.  However, it might delay events for the containing
directories until the system is more idle (or, so it seems, until
someone scans the directory).  The first status following a checkout
would get the list of files.  The subsequent status commands would get
the list of directories as the events trickled out.  But they would
never catch up: the index was not updated, so the token was not
advanced.

This list of directories caused `wt_status_collect_untracked()` to
unnecessarily spend time actually scanning them during each command.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 fsmonitor.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Derrick Stolee April 27, 2021, 2:52 p.m. UTC | #1
On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:...
> +	/*
> +	 * If the fsmonitor response and the subsequent scan of the disk
> +	 * did not cause the in-memory index to be marked dirty, then force
> +	 * it so that we advance the fsmonitor token in our extension, so
> +	 * that future requests don't keep re-requesting the same range.
> +	 */
> +	if (istate->fsmonitor_last_update &&
> +	    strcmp(istate->fsmonitor_last_update, last_update_token.buf))
> +		istate->cache_changed |= FSMONITOR_CHANGED;
> +

This could lead to extra index writes that don't normally happen in
the case without the FS Monitor feature. I'm particularly sensitive
to this because my sparse-index work is trying to solve for the
I/O cost of large indexes, but perhaps this cost is worth the benefit.

I'll keep an eye out as I do performance testing.

Thanks,
-Stolee

Patch

diff --git a/fsmonitor.c b/fsmonitor.c
index d7e18fc8cd47..8b544e31f29f 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -353,6 +353,16 @@  void refresh_fsmonitor(struct index_state *istate)
 	}
 	strbuf_release(&query_result);
 
+	/*
+	 * If the fsmonitor response and the subsequent scan of the disk
+	 * did not cause the in-memory index to be marked dirty, then force
+	 * it so that we advance the fsmonitor token in our extension, so
+	 * that future requests don't keep re-requesting the same range.
+	 */
+	if (istate->fsmonitor_last_update &&
+	    strcmp(istate->fsmonitor_last_update, last_update_token.buf))
+		istate->cache_changed |= FSMONITOR_CHANGED;
+
 	/* Now that we've updated istate, save the last_update_token */
 	FREE_AND_NULL(istate->fsmonitor_last_update);
 	istate->fsmonitor_last_update = strbuf_detach(&last_update_token, NULL);