diff mbox series

[v2,1/8] maintenance: add get_random_minute()

Message ID edc08023ed51890f0390aacf783d7213e82704a7.1691699987.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 89024a0ab018bb6e8ad2e4a6500b98b889088c54
Headers show
Series maintenance: schedule maintenance on a random minute | expand

Commit Message

Derrick Stolee Aug. 10, 2023, 8:39 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

When we initially created background maintenance -- with its hourly,
daily, and weekly schedules -- we considered the effects of all clients
launching fetches to the server every hour on the hour. The worry of
DDoSing server hosts was noted, but left as something we would consider
for a future update.

As background maintenance has gained more adoption over the past three
years, our worries about DDoSing the big Git hosts has been unfounded.
Those systems, especially those serving public repositories, are already
resilient to thundering herds of much smaller scale.

However, sometimes organizations spin up specific custom server
infrastructure either in addition to or on top of their Git host. Some
of these technologies are built for a different range of scale, and can
hit concurrency limits sooner. Organizations with such custom
infrastructures are more likely to recommend tools like `scalar` which
furthers their adoption of background maintenance.

To help solve for this, create get_random_minute() as a method to help
Git select a random minute when creating schedules in the future. The
integrations with this method do not yet exist, but will follow in
future changes.

To avoid multiple sources of randomness in the Git codebase, create a
new helper function, git_rand(), that returns a random uint32_t. This is
similar to how rand() returns a random nonnegative value, except it is
based on csprng_bytes() which is cryptographic and will return values
larger than RAND_MAX.

One thing that is important for testability is that we notice when we
are under a test scenario and return a predictable result. The schedules
themselves are not checked for this value, but at least one launchctl
test checks that we do not unnecessarily reboot the schedule if it has
not changed from a previous version.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c | 10 ++++++++++
 wrapper.c    | 10 ++++++++++
 wrapper.h    |  6 ++++++
 3 files changed, 26 insertions(+)

Comments

Taylor Blau Aug. 10, 2023, 9:25 p.m. UTC | #1
On Thu, Aug 10, 2023 at 08:39:40PM +0000, Derrick Stolee via GitGitGadget wrote:
> ---
>  builtin/gc.c | 10 ++++++++++
>  wrapper.c    | 10 ++++++++++
>  wrapper.h    |  6 ++++++
>  3 files changed, 26 insertions(+)

Very nice, this patch LGTM. Thanks for working on it!

Thanks,
Taylor
diff mbox series

Patch

diff --git a/builtin/gc.c b/builtin/gc.c
index 19d73067aad..2ebae7bc17c 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1708,6 +1708,16 @@  static int get_schedule_cmd(const char **cmd, int *is_available)
 	return 1;
 }
 
+MAYBE_UNUSED
+static int get_random_minute(void)
+{
+	/* Use a static value when under tests. */
+	if (getenv("GIT_TEST_MAINT_SCHEDULER"))
+		return 13;
+
+	return git_rand() % 60;
+}
+
 static int is_launchctl_available(void)
 {
 	const char *cmd = "launchctl";
diff --git a/wrapper.c b/wrapper.c
index 5160c9e28de..48065c4f533 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -819,3 +819,13 @@  int csprng_bytes(void *buf, size_t len)
 	return 0;
 #endif
 }
+
+uint32_t git_rand(void)
+{
+	uint32_t result;
+
+	if (csprng_bytes(&result, sizeof(result)) < 0)
+		die(_("unable to get random bytes"));
+
+	return result;
+}
diff --git a/wrapper.h b/wrapper.h
index 79a9c1b5077..79c7321bb37 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -139,4 +139,10 @@  void sleep_millisec(int millisec);
  */
 int csprng_bytes(void *buf, size_t len);
 
+/*
+ * Returns a random uint32_t, uniformly distributed across all possible
+ * values.
+ */
+uint32_t git_rand(void);
+
 #endif /* WRAPPER_H */