[18/23] fsmonitor--daemon: introduce client delay for testing

Message ID c6d5f045fb5644306a3676e5fa4145ba4c6e9b93.1617291666.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Series Builtin FSMonitor Feature

Commit Message

Jeff Hostetler April 1, 2021, 3:41 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Define GIT_TEST_FSMONITOR_CLIENT_DELAY as a millisecond delay.

Introduce an artificial delay when processing client requests.
This makes the CI/PR test suite a little more stable and avoids
the need to load up test scripts with sleep statements to avoid
racy failures.  This was mostly seen on 1- or 2-core CI build
machines, where the test script would create a file and quickly
try to confirm that the daemon had seen it *before* the daemon
had received the kernel event, causing a test failure.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsmonitor--daemon.c | 38 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

Comments

Derrick Stolee April 27, 2021, 1:36 p.m. UTC | #1
On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Define GIT_TEST_FSMONITOR_CLIENT_DELAY as a millisecond delay.

This is a second delay introduced in this feature, but the units
are different. Could we put a unit in the name? Perhaps a "_MS"
suffix.
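
To illustrate the rename, here is a minimal standalone sketch of the
lookup with the unit in the variable name. The name
GIT_TEST_FSMONITOR_CLIENT_DELAY_MS and both helper names are
hypothetical and are not part of the patch. The sketch also swaps
atoi() for strtol() so that trailing garbage and out-of-range values
are rejected. That is an optional hardening step, not something the
patch requires, and it uses only the C standard library rather than
any of Git's own parsing helpers.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a millisecond delay from a string.
 * Returns 0 (meaning "no delay") for NULL, empty, non-numeric,
 * negative, or out-of-range input.
 */
int parse_delay_ms(const char *s)
{
	char *end;
	long ms;

	if (!s || !*s)
		return 0;

	errno = 0;
	ms = strtol(s, &end, 10);
	if (errno || end == s || *end != '\0' || ms < 0 || ms > INT_MAX)
		return 0;

	return (int)ms;
}

/*
 * Hypothetical variant of lookup_client_test_delay() that reads an
 * environment variable carrying the suggested "_MS" suffix.  The
 * result is cached after the first call, as in the patch.
 */
int lookup_client_test_delay_ms(void)
{
	static int delay_ms = -1;

	if (delay_ms < 0)
		delay_ms = parse_delay_ms(
			getenv("GIT_TEST_FSMONITOR_CLIENT_DELAY_MS"));

	return delay_ms;
}
```

With the unit in the name, a test script that exports the variable
documents at the call site that the value is in milliseconds.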

> Introduce an artificial delay when processing client requests.
> This makes the CI/PR test suite a little more stable and avoids
> the need to load up test scripts with sleep statements to avoid
> racy failures.  This was mostly seen on 1- or 2-core CI build
> machines, where the test script would create a file and quickly
> try to confirm that the daemon had seen it *before* the daemon
> had received the kernel event, causing a test failure.

Isn't the cookie file supposed to prevent this from happening?

Yes, our test suite interacts with the filesystem and Git commands
more quickly than a human user would, but Git is used all the time
by scripts or build machines to quickly process data. The FS
Monitor feature should be robust to such a situation.

I feel that as currently described, this patch is only hiding a
bug that shows up during heavy use.

Perhaps the test failures are limited to a small number of
specific tests that are checking the FS Monitor daemon in a
non-standard way, especially in a way that circumvents the
cookie file. In this case, I'd like to see _in this patch_ how
the environment variable is used in the test suite.

I understand that it is difficult to simultaneously build a new
feature like this in small increments, but the biggest issue I
have with the series' organization so far is that we are 18
patches deep and I still haven't seen a single test. I think
this patch only serves the test suite, so it would be good to
delay it until its value can be seen in a test script.

Looking ahead, I see that you insert it as a blanket statement
in the t7527 test script, which seems like it has potential to
hide bugs instead of being an isolated cover for a specific
interaction.

As for the code, it all looks correct. However, please update
t/README with a description of the new GIT_TEST_* variable.

Thanks,
-Stolee

Patch

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index e9a9aea59ad6..0cb09ef0b984 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -150,6 +150,30 @@  static int do_as_client__send_flush(void)
 	return 0;
 }
 
+static int lookup_client_test_delay(void)
+{
+	static int delay_ms = -1;
+
+	const char *s;
+	int ms;
+
+	if (delay_ms >= 0)
+		return delay_ms;
+
+	delay_ms = 0;
+
+	s = getenv("GIT_TEST_FSMONITOR_CLIENT_DELAY");
+	if (!s)
+		return delay_ms;
+
+	ms = atoi(s);
+	if (ms < 0)
+		return delay_ms;
+
+	delay_ms = ms;
+	return delay_ms;
+}
+
 /*
  * Requests to and from a FSMonitor Protocol V2 provider use an opaque
  * "token" as a virtual timestamp.  Clients can request a summary of all
@@ -526,6 +550,18 @@  static int do_handle_client(struct fsmonitor_daemon_state *state,
 		return SIMPLE_IPC_QUIT;
 	}
 
+	/*
+	 * For testing purposes, introduce an artificial delay in this
+	 * worker to allow the filesystem listener thread to receive
+	 * any fs events that may have been generated by the client
+	 * process on the other end of the pipe/socket.  This helps
+	 * make the CI/PR test suite runs a little more predictable
+	 * and hopefully eliminates the need to introduce `sleep`
+	 * commands in the test scripts.
+	 */
+	if (state->test_client_delay_ms)
+		sleep_millisec(state->test_client_delay_ms);
+
 	if (!strcmp(command, "flush")) {
 		/*
 		 * Flush all of our cached data and generate a new token
@@ -1038,7 +1074,7 @@  static int fsmonitor_run_daemon(void)
 	pthread_mutex_init(&state.main_lock, NULL);
 	state.error_code = 0;
 	state.current_token_data = fsmonitor_new_token_data();
-	state.test_client_delay_ms = 0;
+	state.test_client_delay_ms = lookup_client_test_delay();
 
 	/* Prepare to (recursively) watch the <worktree-root> directory. */
 	strbuf_init(&state.path_worktree_watch, 0);