From patchwork Mon Nov 15 23:50:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF679C433F5 for ; Tue, 16 Nov 2021 03:24:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8B02861B66 for ; Tue, 16 Nov 2021 03:24:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238332AbhKPD1U (ORCPT ); Mon, 15 Nov 2021 22:27:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238440AbhKPDZq (ORCPT ); Mon, 15 Nov 2021 22:25:46 -0500 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C9CDC034034 for ; Mon, 15 Nov 2021 15:51:07 -0800 (PST) Received: by mail-wr1-x42c.google.com with SMTP id d27so33931255wrb.6 for ; Mon, 15 Nov 2021 15:51:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ZJSdRpAOPi8ANGhN75oZ+f3ReQ3Eqxx1jYSgDCP4EXs=; b=RPK3xHDrhMjqRV9iZFzHdc8L5a/lPAWMh42hEYdC5rkIZIuayb8QbG7pcK/0C/T8SC 6gqXLrC6tZbGN4J0HPwdgn6BOlBVx80gOtczgHEFD5eVKBxQJPrkkdNhvhEJypQs0Jus xzQUkly+7MTwQSC4TpY1jvRfC8MAEctJsflce/sIWfFhnTtxgSnroeWYDryZ82kDa07J 1eK9ET1CPGJJOvGZxYgDpC0vmizPSj1IsyzhxkMrgzWMBTtgCqcP4+vHlPKpxRJ4RxAl hJDefkB7zkwfxADVxQS3b7FVlX0ddvyCDk+JPJjXMoh81/Zn87yCs0z7cHtbMq0Uglqv ow3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ZJSdRpAOPi8ANGhN75oZ+f3ReQ3Eqxx1jYSgDCP4EXs=; b=16bcMVot8RxcC0VeEjI/Jc/+7I9WIqi6nTJc/jG2OTiC1meC+eDB4HwNVomA74K4ym PcW3wYrMGUSAxLF1w6t73aZnxVIvZFgDIH3O0mw3f/BNwn88BveIghpvWW9ChL1FbPj6 xJK5SE37KyycCKJr7qRInfpMaMXSX3I094tJZ4NzFqj4GR2KwJZEnSvM7ZkViC4rP7Yv /a5dCXPZ05UAXNi/bXp/xnv1HRUwZgLSZCXR7nVV9oCBOpUTx7TrBr2J6b4qOXJ24p+m BLLJDsUbjeItGVX0CVxtcN+jMk2rq8MDezaN+VW3dyLvmEq+XBwxh3c0pZbSon1yW1Yo Jr3w== X-Gm-Message-State: AOAM530/voaCd5GZwZMqpBDQt+sHMwp+ZLPAwwWRPFul7FkMtcZDO8BH K4uDQ6mOQtxBbCuL2ODbdfHHQNuuy/U= X-Google-Smtp-Source: ABdhPJw2E5MTHML4TW01jcGsANWwF6Pr83SWEm4lP9H4kRDkI37s310ZofGWRNn9a/uJ6FUsb3CZnQ== X-Received: by 2002:adf:fa0b:: with SMTP id m11mr3762639wrr.152.1637020265648; Mon, 15 Nov 2021 15:51:05 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h15sm744518wmq.32.2021.11.15.15.51.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:05 -0800 (PST) Message-Id: <6b27afa60e05c6c0b7752f1bcf6629c446ede520.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:50:55 +0000 Subject: [PATCH v9 1/9] tmp-objdir: new API for creating temporary writable databases Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh The tmp_objdir API provides the ability to create temporary object directories, but was designed with the goal of having subprocesses access these object stores, followed by the main process migrating objects from it to the main object store or just deleting it. The subprocesses would view it as their primary datastore and write to it. Here we add the tmp_objdir_replace_primary_odb function that replaces the current process's writable "main" object directory with the specified one. The previous main object directory is restored in either tmp_objdir_migrate or tmp_objdir_destroy. For the --remerge-diff usecase, add a new `will_destroy` flag in `struct object_database` to mark ephemeral object databases that do not require fsync durability. Add 'git prune' support for removing temporary object databases, and make sure that they have a name starting with tmp_ and containing an operation-specific name. Based-on-patch-by: Elijah Newren Signed-off-by: Neeraj Singh Signed-off-by: Junio C Hamano --- builtin/prune.c | 23 +++++++++++++++--- builtin/receive-pack.c | 2 +- environment.c | 5 ++++ object-file.c | 44 +++++++++++++++++++++++++++++++-- object-store.h | 19 +++++++++++++++ object.c | 2 +- tmp-objdir.c | 55 +++++++++++++++++++++++++++++++++++++++--- tmp-objdir.h | 29 +++++++++++++++++++--- 8 files changed, 165 insertions(+), 14 deletions(-) diff --git a/builtin/prune.c b/builtin/prune.c index 485c9a3c56f..a76e6a5f0e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -18,6 +18,7 @@ static int show_only; static int verbose; static timestamp_t expire; static int show_progress = -1; +static struct strbuf remove_dir_buf = STRBUF_INIT; static int prune_tmp_file(const char *fullpath) { @@ -26,10 +27,20 @@ static int prune_tmp_file(const char *fullpath) return error("Could not stat '%s'", fullpath); if (st.st_mtime > expire) return 0; - if (show_only || verbose) - printf("Removing stale temporary file %s\n", fullpath); - if (!show_only) - unlink_or_warn(fullpath); + if (S_ISDIR(st.st_mode)) { + if (show_only || verbose) + printf("Removing stale temporary directory %s\n", fullpath); + if (!show_only) { + strbuf_reset(&remove_dir_buf); + strbuf_addstr(&remove_dir_buf, fullpath); + remove_dir_recursively(&remove_dir_buf, 0); + } + } else { + if (show_only || verbose) + printf("Removing stale temporary file %s\n", fullpath); + if (!show_only) + unlink_or_warn(fullpath); + } return 0; } @@ -97,6 +108,9 @@ static int prune_cruft(const char *basename, const char *path, void *data) static int prune_subdir(unsigned int nr, const char *path, void *data) { + if (verbose) + printf("Removing directory %s\n", path); + if (!show_only) rmdir(path); return 0; @@ -184,5 +198,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix) prune_shallow(show_only ? PRUNE_SHOW_ONLY : 0); } + strbuf_release(&remove_dir_buf); return 0; } diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 49b846d9605..8815e24cde5 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -2213,7 +2213,7 @@ static const char *unpack(int err_fd, struct shallow_info *si) strvec_push(&child.args, alt_shallow_file); } - tmp_objdir = tmp_objdir_create(); + tmp_objdir = tmp_objdir_create("incoming"); if (!tmp_objdir) { if (err_fd > 0) close(err_fd); diff --git a/environment.c b/environment.c index 9da7f3c1a19..342400fcaad 100644 --- a/environment.c +++ b/environment.c @@ -17,6 +17,7 @@ #include "commit.h" #include "strvec.h" #include "object-store.h" +#include "tmp-objdir.h" #include "chdir-notify.h" #include "shallow.h" @@ -331,10 +332,14 @@ static void update_relative_gitdir(const char *name, void *data) { char *path = reparent_relative_path(old_cwd, new_cwd, get_git_dir()); + struct tmp_objdir *tmp_objdir = tmp_objdir_unapply_primary_odb(); trace_printf_key(&trace_setup_key, "setup: move $GIT_DIR to '%s'", path); + set_git_dir_1(path); + if (tmp_objdir) + tmp_objdir_reapply_primary_odb(tmp_objdir, old_cwd, new_cwd); free(path); } diff --git a/object-file.c b/object-file.c index c3d866a287e..0b6a61aeaff 100644 --- a/object-file.c +++ b/object-file.c @@ -683,6 +683,43 @@ void add_to_alternates_memory(const char *reference) '\n', NULL, 0); } +struct object_directory *set_temporary_primary_odb(const char *dir, int will_destroy) +{ + struct object_directory *new_odb; + + /* + * Make sure alternates are initialized, or else our entry may be + * overwritten when they are. + */ + prepare_alt_odb(the_repository); + + /* + * Make a new primary odb and link the old primary ODB in as an + * alternate + */ + new_odb = xcalloc(1, sizeof(*new_odb)); + new_odb->path = xstrdup(dir); + new_odb->will_destroy = will_destroy; + new_odb->next = the_repository->objects->odb; + the_repository->objects->odb = new_odb; + return new_odb->next; +} + +void restore_primary_odb(struct object_directory *restore_odb, const char *old_path) +{ + struct object_directory *cur_odb = the_repository->objects->odb; + + if (strcmp(old_path, cur_odb->path)) + BUG("expected %s as primary object store; found %s", + old_path, cur_odb->path); + + if (cur_odb->next != restore_odb) + BUG("we expect the old primary object store to be the first alternate"); + + the_repository->objects->odb = restore_odb; + free_object_directory(cur_odb); +} + /* * Compute the exact path an alternate is at and returns it. In case of * error NULL is returned and the human readable error is added to `err` @@ -1809,8 +1846,11 @@ int hash_object_file(const struct git_hash_algo *algo, const void *buf, /* Finalize a file on disk, and close it. */ static void close_loose_object(int fd) { - if (fsync_object_files) - fsync_or_die(fd, "loose object file"); + if (!the_repository->objects->odb->will_destroy) { + if (fsync_object_files) + fsync_or_die(fd, "loose object file"); + } + if (close(fd) != 0) die_errno(_("error when closing loose object file")); } diff --git a/object-store.h b/object-store.h index 952efb6a4be..cb173e69392 100644 --- a/object-store.h +++ b/object-store.h @@ -27,6 +27,11 @@ struct object_directory { uint32_t loose_objects_subdir_seen[8]; /* 256 bits */ struct oidtree *loose_objects_cache; + /* + * This object store is ephemeral, so there is no need to fsync. + */ + int will_destroy; + /* * Path to the alternative object store. If this is a relative path, * it is relative to the current working directory. @@ -58,6 +63,17 @@ void add_to_alternates_file(const char *dir); */ void add_to_alternates_memory(const char *dir); +/* + * Replace the current writable object directory with the specified temporary + * object directory; returns the former primary object directory. + */ +struct object_directory *set_temporary_primary_odb(const char *dir, int will_destroy); + +/* + * Restore a previous ODB replaced by set_temporary_main_odb. + */ +void restore_primary_odb(struct object_directory *restore_odb, const char *old_path); + /* * Populate and return the loose object cache array corresponding to the * given object ID. @@ -68,6 +84,9 @@ struct oidtree *odb_loose_cache(struct object_directory *odb, /* Empty the loose object cache for the specified object directory. */ void odb_clear_loose_cache(struct object_directory *odb); +/* Clear and free the specified object directory */ +void free_object_directory(struct object_directory *odb); + struct packed_git { struct hashmap_entry packmap_ent; struct packed_git *next; diff --git a/object.c b/object.c index 23a24e678a8..048f96a260e 100644 --- a/object.c +++ b/object.c @@ -513,7 +513,7 @@ struct raw_object_store *raw_object_store_new(void) return o; } -static void free_object_directory(struct object_directory *odb) +void free_object_directory(struct object_directory *odb) { free(odb->path); odb_clear_loose_cache(odb); diff --git a/tmp-objdir.c b/tmp-objdir.c index b8d880e3626..3d38eeab66b 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -1,5 +1,6 @@ #include "cache.h" #include "tmp-objdir.h" +#include "chdir-notify.h" #include "dir.h" #include "sigchain.h" #include "string-list.h" @@ -11,6 +12,8 @@ struct tmp_objdir { struct strbuf path; struct strvec env; + struct object_directory *prev_odb; + int will_destroy; }; /* @@ -38,6 +41,9 @@ static int tmp_objdir_destroy_1(struct tmp_objdir *t, int on_signal) if (t == the_tmp_objdir) the_tmp_objdir = NULL; + if (!on_signal && t->prev_odb) + restore_primary_odb(t->prev_odb, t->path.buf); + /* * This may use malloc via strbuf_grow(), but we should * have pre-grown t->path sufficiently so that this @@ -52,6 +58,7 @@ static int tmp_objdir_destroy_1(struct tmp_objdir *t, int on_signal) */ if (!on_signal) tmp_objdir_free(t); + return err; } @@ -121,7 +128,7 @@ static int setup_tmp_objdir(const char *root) return ret; } -struct tmp_objdir *tmp_objdir_create(void) +struct tmp_objdir *tmp_objdir_create(const char *prefix) { static int installed_handlers; struct tmp_objdir *t; @@ -129,11 +136,16 @@ struct tmp_objdir *tmp_objdir_create(void) if (the_tmp_objdir) BUG("only one tmp_objdir can be used at a time"); - t = xmalloc(sizeof(*t)); + t = xcalloc(1, sizeof(*t)); strbuf_init(&t->path, 0); strvec_init(&t->env); - strbuf_addf(&t->path, "%s/incoming-XXXXXX", get_object_directory()); + /* + * Use a string starting with tmp_ so that the builtin/prune.c code + * can recognize any stale objdirs left behind by a crash and delete + * them. + */ + strbuf_addf(&t->path, "%s/tmp_objdir-%s-XXXXXX", get_object_directory(), prefix); /* * Grow the strbuf beyond any filename we expect to be placed in it. @@ -269,6 +281,13 @@ int tmp_objdir_migrate(struct tmp_objdir *t) if (!t) return 0; + if (t->prev_odb) { + if (the_repository->objects->odb->will_destroy) + BUG("migrating an ODB that was marked for destruction"); + restore_primary_odb(t->prev_odb, t->path.buf); + t->prev_odb = NULL; + } + strbuf_addbuf(&src, &t->path); strbuf_addstr(&dst, get_object_directory()); @@ -292,3 +311,33 @@ void tmp_objdir_add_as_alternate(const struct tmp_objdir *t) { add_to_alternates_memory(t->path.buf); } + +void tmp_objdir_replace_primary_odb(struct tmp_objdir *t, int will_destroy) +{ + if (t->prev_odb) + BUG("the primary object database is already replaced"); + t->prev_odb = set_temporary_primary_odb(t->path.buf, will_destroy); + t->will_destroy = will_destroy; +} + +struct tmp_objdir *tmp_objdir_unapply_primary_odb(void) +{ + if (!the_tmp_objdir || !the_tmp_objdir->prev_odb) + return NULL; + + restore_primary_odb(the_tmp_objdir->prev_odb, the_tmp_objdir->path.buf); + the_tmp_objdir->prev_odb = NULL; + return the_tmp_objdir; +} + +void tmp_objdir_reapply_primary_odb(struct tmp_objdir *t, const char *old_cwd, + const char *new_cwd) +{ + char *path; + + path = reparent_relative_path(old_cwd, new_cwd, t->path.buf); + strbuf_reset(&t->path); + strbuf_addstr(&t->path, path); + free(path); + tmp_objdir_replace_primary_odb(t, t->will_destroy); +} diff --git a/tmp-objdir.h b/tmp-objdir.h index b1e45b4c75d..a3145051f25 100644 --- a/tmp-objdir.h +++ b/tmp-objdir.h @@ -10,7 +10,7 @@ * * Example: * - * struct tmp_objdir *t = tmp_objdir_create(); + * struct tmp_objdir *t = tmp_objdir_create("incoming"); * if (!run_command_v_opt_cd_env(cmd, 0, NULL, tmp_objdir_env(t)) && * !tmp_objdir_migrate(t)) * printf("success!\n"); @@ -22,9 +22,10 @@ struct tmp_objdir; /* - * Create a new temporary object directory; returns NULL on failure. + * Create a new temporary object directory with the specified prefix; + * returns NULL on failure. */ -struct tmp_objdir *tmp_objdir_create(void); +struct tmp_objdir *tmp_objdir_create(const char *prefix); /* * Return a list of environment strings, suitable for use with @@ -51,4 +52,26 @@ int tmp_objdir_destroy(struct tmp_objdir *); */ void tmp_objdir_add_as_alternate(const struct tmp_objdir *); +/* + * Replaces the main object store in the current process with the temporary + * object directory and makes the former main object store an alternate. + * If will_destroy is nonzero, the object directory may not be migrated. + */ +void tmp_objdir_replace_primary_odb(struct tmp_objdir *, int will_destroy); + +/* + * If the primary object database was replaced by a temporary object directory, + * restore it to its original value while keeping the directory contents around. + * Returns NULL if the primary object database was not replaced. + */ +struct tmp_objdir *tmp_objdir_unapply_primary_odb(void); + +/* + * Reapplies the former primary temporary object database, after protentially + * changing its relative path. + */ +void tmp_objdir_reapply_primary_odb(struct tmp_objdir *, const char *old_cwd, + const char *new_cwd); + + #endif /* TMP_OBJDIR_H */ From patchwork Mon Nov 15 23:50:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621243 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DDBFC433EF for ; Tue, 16 Nov 2021 03:24:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 798A761B66 for ; Tue, 16 Nov 2021 03:24:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243709AbhKPD1R (ORCPT ); Mon, 15 Nov 2021 22:27:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238427AbhKPDZq (ORCPT ); Mon, 15 Nov 2021 22:25:46 -0500 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1A15C034035 for ; Mon, 15 Nov 2021 15:51:07 -0800 (PST) Received: by mail-wr1-x42c.google.com with SMTP id n29so33843331wra.11 for ; Mon, 15 Nov 2021 15:51:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=nIwWOzyteUPgtg6V/gbptI+dO14mDOtHt15Ldl+xrEE=; b=ABfTrJFrd7ePzBAXSZwidfuUXa2cy8LaIS+qYTevawsq1pYRgi7TFnPZqz/NWdbecU IxD3RJJwFewpQr/RwWB8dkvSF0ywZR1jxfTuI4MDGyWyRxfeLlUlxGyaFna0ucK8so/O q3Zt/BVC91a91CW8uBFKcdFJXcl2gVOpY/3y2KHXRSXZXKL3jL3Pv3zZXcn75MQOXtDx cJGhYI335v5+useMkp9DcOQSWwds4YtSyuJd94loZyi6nxFKJqmUnNP09yMLr/JYOAhu 9uCrc/iXhApatjeaFdymY72VmdBOae8ur4ws7lgln0MW9j89zxIbP6e59Zp/glpVv/KU hPTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=nIwWOzyteUPgtg6V/gbptI+dO14mDOtHt15Ldl+xrEE=; b=ArmN911r6LPbLPLlmHeci9xIq9iH2lw+QP1nDte7bAT2k8vF8WU01Xts2JTVs+faCU IzyC+5NNmlC4Blrdfb8yibSoSOYpQ5nt79eDSSqoEv1ntNcTj6Q0KqpFWXhDqaQfABWD edK4+uJun3/jawgrQhtvKHrCkpN9f5oVyjKkKI3I99pHDKwUmwBkwDsk3U2y/a07dDjd aH3l2gO96g2ofBhAZPr1skl1xnObbSIK7fjJqhhscywK7vKUda7XMos7KJqfhMdlizbo gGFeTnyIdWVExbWY984YIJgCEwog5oY3fEqzfz++E2hzcYEfvV6R7QyZzyaZVrYRRyNI hhqA== X-Gm-Message-State: AOAM531JEN3f3G794XyTeoP+ZuQ84Nf5cWKN18BME+IO3533edC7ZReI xKkWyFywrmiqooJYQsZD+63KRGRsfYg= X-Google-Smtp-Source: ABdhPJyfZC7QjLjtvV2vRz+3v5hPZx6UUGOUYidj+sGr/nyEZcxGxir5sIIbgTy/1UvDmh1iYLxlpQ== X-Received: by 2002:a5d:6d07:: with SMTP id e7mr3558582wrq.311.1637020266404; Mon, 15 Nov 2021 15:51:06 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u23sm16677717wru.21.2021.11.15.15.51.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:05 -0800 (PST) Message-Id: <71817cccfb9573929723a34fe9fb0db72498d4b9.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:50:56 +0000 Subject: [PATCH v9 2/9] tmp-objdir: disable ref updates when replacing the primary odb Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh When creating a subprocess with a temporary ODB, we set the GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not to update refs, since the tmp-objdir may go away. Introduce a similar mechanism for in-process temporary ODBs when we call tmp_objdir_replace_primary_odb. Now both mechanisms set the disable_ref_updates flag on the odb, which is queried by the ref_transaction_prepare function. Note: This change adds an assumption that the state of the_repository is relevant for any ref transaction that might be initiated. Unwinding this assumption should be straightforward by saving the relevant repository to query in the transaction or the ref_store. Peff's test case was invoking ref updates via the cachetextconv setting. That particular code silently does nothing when a ref update is forbidden. See the call to notes_cache_put in fill_textconv where errors are ignored. Reported-by: Jeff King Signed-off-by: Neeraj Singh Signed-off-by: Junio C Hamano --- environment.c | 4 ++++ object-file.c | 6 ++++++ object-store.h | 9 ++++++++- refs.c | 2 +- repository.c | 2 ++ repository.h | 1 + 6 files changed, 22 insertions(+), 2 deletions(-) diff --git a/environment.c b/environment.c index 342400fcaad..2701dfeeec8 100644 --- a/environment.c +++ b/environment.c @@ -169,6 +169,10 @@ void setup_git_env(const char *git_dir) args.graft_file = getenv_safe(&to_free, GRAFT_ENVIRONMENT); args.index_file = getenv_safe(&to_free, INDEX_ENVIRONMENT); args.alternate_db = getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT); + if (getenv(GIT_QUARANTINE_ENVIRONMENT)) { + args.disable_ref_updates = 1; + } + repo_set_gitdir(the_repository, git_dir, &args); strvec_clear(&to_free); diff --git a/object-file.c b/object-file.c index 0b6a61aeaff..659ef7623ff 100644 --- a/object-file.c +++ b/object-file.c @@ -699,6 +699,12 @@ struct object_directory *set_temporary_primary_odb(const char *dir, int will_des */ new_odb = xcalloc(1, sizeof(*new_odb)); new_odb->path = xstrdup(dir); + + /* + * Disable ref updates while a temporary odb is active, since + * the objects in the database may roll back. + */ + new_odb->disable_ref_updates = 1; new_odb->will_destroy = will_destroy; new_odb->next = the_repository->objects->odb; the_repository->objects->odb = new_odb; diff --git a/object-store.h b/object-store.h index cb173e69392..9ae9262c340 100644 --- a/object-store.h +++ b/object-store.h @@ -27,10 +27,17 @@ struct object_directory { uint32_t loose_objects_subdir_seen[8]; /* 256 bits */ struct oidtree *loose_objects_cache; + /* + * This is a temporary object store created by the tmp_objdir + * facility. Disable ref updates since the objects in the store + * might be discarded on rollback. + */ + unsigned int disable_ref_updates : 1; + /* * This object store is ephemeral, so there is no need to fsync. */ - int will_destroy; + unsigned int will_destroy : 1; /* * Path to the alternative object store. If this is a relative path, diff --git a/refs.c b/refs.c index d7cc0a23a3b..27ec7d1fc64 100644 --- a/refs.c +++ b/refs.c @@ -2137,7 +2137,7 @@ int ref_transaction_prepare(struct ref_transaction *transaction, break; } - if (getenv(GIT_QUARANTINE_ENVIRONMENT)) { + if (the_repository->objects->odb->disable_ref_updates) { strbuf_addstr(err, _("ref updates forbidden inside quarantine environment")); return -1; diff --git a/repository.c b/repository.c index c5b90ba93ea..dce8e35ac20 100644 --- a/repository.c +++ b/repository.c @@ -80,6 +80,8 @@ void repo_set_gitdir(struct repository *repo, expand_base_dir(&repo->objects->odb->path, o->object_dir, repo->commondir, "objects"); + repo->objects->odb->disable_ref_updates = o->disable_ref_updates; + free(repo->objects->alternate_db); repo->objects->alternate_db = xstrdup_or_null(o->alternate_db); expand_base_dir(&repo->graft_file, o->graft_file, diff --git a/repository.h b/repository.h index a057653981c..7c04e99ac5c 100644 --- a/repository.h +++ b/repository.h @@ -158,6 +158,7 @@ struct set_gitdir_args { const char *graft_file; const char *index_file; const char *alternate_db; + int disable_ref_updates; }; void repo_set_gitdir(struct repository *repo, const char *root, From patchwork Mon Nov 15 23:50:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1109CC433EF for ; Tue, 16 Nov 2021 03:24:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EF0F361B96 for ; Tue, 16 Nov 2021 03:24:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238442AbhKPD1W (ORCPT ); Mon, 15 Nov 2021 22:27:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238438AbhKPDZq (ORCPT ); Mon, 15 Nov 2021 22:25:46 -0500 Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 88AB4C126D00 for ; Mon, 15 Nov 2021 15:51:08 -0800 (PST) Received: by mail-wr1-x432.google.com with SMTP id i5so33939161wrb.2 for ; Mon, 15 Nov 2021 15:51:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=0YFH/7O9dujfKmmu6ZRk/KJCH7LOCPzTEaGTMr5K95k=; b=ECHcU8am5+RFG6dX4B+xmMUNphEDTu0KVXq/cF/QlMNtFrdu8DPwEH6jWdfr5Lnpc4 rsL8szD5DSnXc9yzRXeYZPG82Hrpd3vAmmbS3kLyaN+OXrgl2TJy4ACTXycxsw9Fej5Z BE2b2ezAqVjjbYNL3Swk46QeNjPVpu32ftepzian+J9HC2+F2JBsopTd2rw4581dvMdJ sYcYjKzrCoDnFI8mHCY2M4mS8OnKxKBatDf8WkC/RUS08EnBdIUgusZcQysGpeIL6Vu5 Sp3y6wvZDkHCEaDHw9ylpA0mc/LvlCZGD3KiXB85bsF3Z+gz343PErZqi8u3O4HlPRWG /Qsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=0YFH/7O9dujfKmmu6ZRk/KJCH7LOCPzTEaGTMr5K95k=; b=x1pnTKQqHArqhbFFnyJgm6Ax50tCpgfz0rblTvwdXNZ/WG/z6nYl5HkC5h16+ioUt5 UJ9K9rIyPLfxIyqS/8esYpo6YjP+mw4EwSkRjzuK8reGrC2AuvqExFhTpTHggLqq6fpo mocc+VKKBO8a0Gg8mbZgSZU1HRv83vn49P7fBHpZ48lqp0/aXI5WTlJ0ua8a+7ToZFPz dFNfieYAB3IDC4ZH4MKb69EW+e+AMFLb69JJyZxl5xInLzp44Hl2u0eUmSKgNdo4JWzM mXvB1NN60MSzcHFFA42vRmzAu2tZ6aJ21nS14dvppeBPUDgjsZRqs6rSZTh6gHqMpAY/ ClBg== X-Gm-Message-State: AOAM530ssmSx2BaZhOS7j0JWOaQMpQYXoVp2b/7KqAV1IcPsdsLK7fqO ovOmwDbb+99NjDLQBPy+lHQxIRAKIyw= X-Google-Smtp-Source: ABdhPJy+PDM0FNw1uQpHHxx90WyJ9HOn98kTUjSpiMpFUbWvLAO5dRZmFI7Fm8waIohmi2hSEaMyHA== X-Received: by 2002:a5d:4b0e:: with SMTP id v14mr3912505wrq.196.1637020267084; Mon, 15 Nov 2021 15:51:07 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h1sm687041wmb.7.2021.11.15.15.51.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:06 -0800 (PST) Message-Id: <8fd1ca4c00aa94dbb75a34359727bdbd3adccf60.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:50:57 +0000 Subject: [PATCH v9 3/9] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure. * Rename 'state' variable to 'bulk_checkin_state', since we will later be adding 'bulk_fsync_state'. This also makes the variable easier to find in the debugger, since the name is more unique. * Move the 'plugged' data member of 'bulk_checkin_state' into a separate static variable. Doing this avoids resetting the variable in finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we seem to unintentionally disable the plugging functionality the first time a new packfile must be created due to packfile size limits. While disabling the plugging state only results in suboptimal behavior for the current code, it would be fatal for the bulk-fsync functionality later in this patch series. Signed-off-by: Neeraj Singh --- bulk-checkin.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/bulk-checkin.c b/bulk-checkin.c index 8785b2ac806..6ae18401e04 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -10,9 +10,9 @@ #include "packfile.h" #include "object-store.h" -static struct bulk_checkin_state { - unsigned plugged:1; +static int bulk_checkin_plugged; +static struct bulk_checkin_state { char *pack_tmp_name; struct hashfile *f; off_t offset; @@ -21,7 +21,7 @@ static struct bulk_checkin_state { struct pack_idx_entry **written; uint32_t alloc_written; uint32_t nr_written; -} state; +} bulk_checkin_state; static void finish_tmp_packfile(struct strbuf *basename, const char *pack_tmp_name, @@ -277,21 +277,23 @@ int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags) { - int status = deflate_to_pack(&state, oid, fd, size, type, + int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type, path, flags); - if (!state.plugged) - finish_bulk_checkin(&state); + if (!bulk_checkin_plugged) + finish_bulk_checkin(&bulk_checkin_state); return status; } void plug_bulk_checkin(void) { - state.plugged = 1; + assert(!bulk_checkin_plugged); + bulk_checkin_plugged = 1; } void unplug_bulk_checkin(void) { - state.plugged = 0; - if (state.f) - finish_bulk_checkin(&state); + assert(bulk_checkin_plugged); + bulk_checkin_plugged = 0; + if (bulk_checkin_state.f) + finish_bulk_checkin(&bulk_checkin_state); } From patchwork Mon Nov 15 23:50:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D121C433F5 for ; Tue, 16 Nov 2021 03:24:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5EFA261B66 for ; Tue, 16 Nov 2021 03:24:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244078AbhKPD1i (ORCPT ); Mon, 15 Nov 2021 22:27:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238045AbhKPD0W (ORCPT ); Mon, 15 Nov 2021 22:26:22 -0500 Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [IPv6:2a00:1450:4864:20::329]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9FB1AC126D01 for ; Mon, 15 Nov 2021 15:51:09 -0800 (PST) Received: by mail-wm1-x329.google.com with SMTP id o29so15383404wms.2 for ; Mon, 15 Nov 2021 15:51:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=r4HSghyssrnnAl9X8BKAG/kF9JqVpxhpcwT8Bkye6Cs=; b=biRmsKmxEyb2PRnCJ8liRfvbXTZyDn+Wo4tQV64hkOGa7eesM8n7ewhaeA4KYRXAf9 wc8kUXw0bkvCwIvAjHdBNruN/yUzRpx85pbunbcrm77/cWbzwsG/YszFFfF4xrQr1sAH xgdvuXibnNzuLWWpwqnWnD8Xm6kqlbPU3Tm2NcQJTxRZb/76zPv8832W9GxHafkH6ZbH prakK+kP2tDw9M12yc1Zke4CH5cPCiSQ10ym0RBgrlY+xHLsPtHtfP1FgEpgYIXqinTL SgtTEejL1z0HT2BwPtbwpcXWxrow7VtEl05Veor6w+G8AVz/HdthlD89LYHkfBbPB5GC aQew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=r4HSghyssrnnAl9X8BKAG/kF9JqVpxhpcwT8Bkye6Cs=; b=h9qLfnwga1Gw82WJVTLPS3yb0cbo44gYeo5pdV02u3Buy9DhDCKVByc6bNv0GgrkX6 UsxGTflHIl0ja28iKKfddTojHomM9Znpb0wkhaME5I3clTxzq3ILRQuWBD5vFg9ngoaF bzLkjpyTQsE9+rPyUead0yGWKh15CTNBA77Ti/y8n4/q2nCbx3pi7t9UI84v8AQtDbVq OFJ9/Yclvm4oME/dHTN+NuJZJ0/1h0SkV7BD1bbpffhuwTJQkQ31I94JaDn7a6hEjbLL Xh3SwvzyeSy92LWiHxVpo/edftCiN+30hdgX/BDYvA04SjikHeCYq211TEL2xnFifSB7 ka9w== X-Gm-Message-State: AOAM5316V4+UAkjZMzlMHy3/w9RO2EfP0t/55Dvk5tBLMXKVkYI0KVUR jVQkDqBBHwrE7Hi0M5MQKI11ACPrdg0= X-Google-Smtp-Source: ABdhPJz0oBM7DhJ3NedIdcKR7MGuuSL5nqK86Q23tjKVmKsNYr/ycpZ+ASkfvh19CsSTDn6pKdwfYg== X-Received: by 2002:a1c:770e:: with SMTP id t14mr62633534wmi.173.1637020267932; Mon, 15 Nov 2021 15:51:07 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o10sm19475620wri.15.2021.11.15.15.51.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:07 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 15 Nov 2021 23:50:58 +0000 Subject: [PATCH v9 4/9] core.fsyncobjectfiles: batched disk flushes Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh When adding many objects to a repo with core.fsyncObjectFiles set to true, the cost of fsync'ing each object file can become prohibitive. One major source of the cost of fsync is the implied flush of the hardware writeback cache within the disk drive. Fortunately, Windows, and macOS offer mechanisms to write data from the filesystem page cache without initiating a hardware flush. Linux has the sync_file_range API, which issues a pagecache writeback request reliably after version 5.2. This patch introduces a new 'core.fsyncObjectFiles = batch' option that batches up hardware flushes. It hooks into the bulk-checkin plugging and unplugging functionality and takes advantage of tmp-objdir. When the new mode is enabled do the following for each new object: 1. Create the object in a tmp-objdir. 2. Issue a pagecache writeback request and wait for it to complete. At the end of the entire transaction when unplugging bulk checkin: 1. Issue an fsync against a dummy file to flush the hardware writeback cache, which should by now have processed the tmp-objdir writes. 2. Rename all of the tmp-objdir files to their final names. 3. When updating the index and/or refs, we assume that Git will issue another fsync internal to that operation. This is not the case today, but may be a good extension to those components. On a filesystem with a singular journal that is updated during name operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS we would expect the fsync to trigger a journal writeout so that this sequence is enough to ensure that the user's data is durable by the time the git command returns. This change also updates the macOS code to trigger a real hardware flush via fnctl(fd, F_FULLFSYNC) when fsync_or_die is called. Previously, on macOS there was no guarantee of durability since a simple fsync(2) call does not flush any hardware caches. _Performance numbers_: Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD. Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD. Windows - Same host as Linux, a preview version of Windows 11. This number is from a patch later in the series. Adding 500 files to the repo with 'git add' Times reported in seconds. core.fsyncObjectFiles | Linux | Mac | Windows ----------------------|-------|-------|-------- false | 0.06 | 0.35 | 0.61 true | 1.88 | 11.18 | 2.47 batch | 0.15 | 0.41 | 1.53 Signed-off-by: Neeraj Singh --- Documentation/config/core.txt | 29 +++++++++++---- Makefile | 6 ++++ bulk-checkin.c | 68 +++++++++++++++++++++++++++++++++++ bulk-checkin.h | 2 ++ cache.h | 8 ++++- config.c | 7 +++- config.mak.uname | 1 + configure.ac | 8 +++++ environment.c | 2 +- git-compat-util.h | 7 ++++ object-file.c | 12 ++++++- wrapper.c | 44 +++++++++++++++++++++++ write-or-die.c | 2 +- 13 files changed, 185 insertions(+), 11 deletions(-) diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index c04f62a54a1..200b4d9f06e 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -548,12 +548,29 @@ core.whitespace:: errors. The default tab width is 8. Allowed values are 1 to 63. core.fsyncObjectFiles:: - This boolean will enable 'fsync()' when writing object files. -+ -This is a total waste of time and effort on a filesystem that orders -data writes properly, but can be useful for filesystems that do not use -journalling (traditional UNIX filesystems) or that only journal metadata -and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback"). + A value indicating the level of effort Git will expend in + trying to make objects added to the repo durable in the event + of an unclean system shutdown. This setting currently only + controls loose objects in the object store, so updates to any + refs or the index may not be equally durable. ++ +* `false` allows data to remain in file system caches according to + operating system policy, whence it may be lost if the system loses power + or crashes. +* `true` triggers a data integrity flush for each loose object added to the + object store. This is the safest setting that is likely to ensure durability + across all operating systems and file systems that honor the 'fsync' system + call. However, this setting comes with a significant performance cost on + common hardware. Git does not currently fsync parent directories for + newly-added files, so some filesystems may still allow data to be lost on + system crash. +* `batch` enables an experimental mode that uses interfaces available in some + operating systems to write loose object data with a minimal set of FLUSH + CACHE (or equivalent) commands sent to the storage controller. If the + operating system interfaces are not available, this mode behaves the same as + `true`. This mode is expected to be as safe as `true` on macOS for repos + stored on HFS+ or APFS filesystems and on Windows for repos stored on NTFS or + ReFS. core.preloadIndex:: Enable parallel index preload for operations like 'git diff' diff --git a/Makefile b/Makefile index 12be39ac497..241dc322c09 100644 --- a/Makefile +++ b/Makefile @@ -406,6 +406,8 @@ all:: # # Define HAVE_CLOCK_MONOTONIC if your platform has CLOCK_MONOTONIC. # +# Define HAVE_SYNC_FILE_RANGE if your platform has sync_file_range. +# # Define NEEDS_LIBRT if your platform requires linking with librt (glibc version # before 2.17) for clock_gettime and CLOCK_MONOTONIC. # @@ -1884,6 +1886,10 @@ ifdef HAVE_CLOCK_MONOTONIC BASIC_CFLAGS += -DHAVE_CLOCK_MONOTONIC endif +ifdef HAVE_SYNC_FILE_RANGE + BASIC_CFLAGS += -DHAVE_SYNC_FILE_RANGE +endif + ifdef NEEDS_LIBRT EXTLIBS += -lrt endif diff --git a/bulk-checkin.c b/bulk-checkin.c index 6ae18401e04..4deee1af46e 100644 --- a/bulk-checkin.c +++ b/bulk-checkin.c @@ -3,14 +3,20 @@ */ #include "cache.h" #include "bulk-checkin.h" +#include "lockfile.h" #include "repository.h" #include "csum-file.h" #include "pack.h" #include "strbuf.h" +#include "string-list.h" +#include "tmp-objdir.h" #include "packfile.h" #include "object-store.h" static int bulk_checkin_plugged; +static int needs_batch_fsync; + +static struct tmp_objdir *bulk_fsync_objdir; static struct bulk_checkin_state { char *pack_tmp_name; @@ -79,6 +85,34 @@ clear_exit: reprepare_packed_git(the_repository); } +/* + * Cleanup after batch-mode fsync_object_files. + */ +static void do_batch_fsync(void) +{ + /* + * Issue a full hardware flush against a temporary file to ensure + * that all objects are durable before any renames occur. The code in + * fsync_loose_object_bulk_checkin has already issued a writeout + * request, but it has not flushed any writeback cache in the storage + * hardware. + */ + + if (needs_batch_fsync) { + struct strbuf temp_path = STRBUF_INIT; + struct tempfile *temp; + + strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory()); + temp = xmks_tempfile(temp_path.buf); + fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp)); + delete_tempfile(&temp); + strbuf_release(&temp_path); + } + + if (bulk_fsync_objdir) + tmp_objdir_migrate(bulk_fsync_objdir); +} + static int already_written(struct bulk_checkin_state *state, struct object_id *oid) { int i; @@ -273,6 +307,25 @@ static int deflate_to_pack(struct bulk_checkin_state *state, return 0; } +void fsync_loose_object_bulk_checkin(int fd) +{ + assert(fsync_object_files == FSYNC_OBJECT_FILES_BATCH); + + /* + * If we have a plugged bulk checkin, we issue a call that + * cleans the filesystem page cache but avoids a hardware flush + * command. Later on we will issue a single hardware flush + * before as part of do_batch_fsync. + */ + if (bulk_checkin_plugged && + git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) { + if (!needs_batch_fsync) + needs_batch_fsync = 1; + } else { + fsync_or_die(fd, "loose object file"); + } +} + int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags) @@ -287,6 +340,19 @@ int index_bulk_checkin(struct object_id *oid, void plug_bulk_checkin(void) { assert(!bulk_checkin_plugged); + + /* + * A temporary object directory is used to hold the files + * while they are not fsynced. + */ + if (fsync_object_files == FSYNC_OBJECT_FILES_BATCH) { + bulk_fsync_objdir = tmp_objdir_create("bulk-fsync"); + if (!bulk_fsync_objdir) + die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch")); + + tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0); + } + bulk_checkin_plugged = 1; } @@ -296,4 +362,6 @@ void unplug_bulk_checkin(void) bulk_checkin_plugged = 0; if (bulk_checkin_state.f) finish_bulk_checkin(&bulk_checkin_state); + + do_batch_fsync(); } diff --git a/bulk-checkin.h b/bulk-checkin.h index b26f3dc3b74..08f292379b6 100644 --- a/bulk-checkin.h +++ b/bulk-checkin.h @@ -6,6 +6,8 @@ #include "cache.h" +void fsync_loose_object_bulk_checkin(int fd); + int index_bulk_checkin(struct object_id *oid, int fd, size_t size, enum object_type type, const char *path, unsigned flags); diff --git a/cache.h b/cache.h index eba12487b99..6d6e6770ecc 100644 --- a/cache.h +++ b/cache.h @@ -985,7 +985,13 @@ void reset_shared_repository(void); extern int read_replace_refs; extern char *git_replace_ref_base; -extern int fsync_object_files; +enum fsync_object_files_mode { + FSYNC_OBJECT_FILES_OFF, + FSYNC_OBJECT_FILES_ON, + FSYNC_OBJECT_FILES_BATCH +}; + +extern enum fsync_object_files_mode fsync_object_files; extern int core_preload_index; extern int precomposed_unicode; extern int protect_hfs; diff --git a/config.c b/config.c index c5873f3a706..5eb36ecd77a 100644 --- a/config.c +++ b/config.c @@ -1491,7 +1491,12 @@ static int git_default_core_config(const char *var, const char *value, void *cb) } if (!strcmp(var, "core.fsyncobjectfiles")) { - fsync_object_files = git_config_bool(var, value); + if (value && !strcmp(value, "batch")) + fsync_object_files = FSYNC_OBJECT_FILES_BATCH; + else if (git_config_bool(var, value)) + fsync_object_files = FSYNC_OBJECT_FILES_ON; + else + fsync_object_files = FSYNC_OBJECT_FILES_OFF; return 0; } diff --git a/config.mak.uname b/config.mak.uname index 3236a4918a3..5ead1377667 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -57,6 +57,7 @@ ifeq ($(uname_S),Linux) HAVE_CLOCK_MONOTONIC = YesPlease # -lrt is needed for clock_gettime on glibc <= 2.16 NEEDS_LIBRT = YesPlease + HAVE_SYNC_FILE_RANGE = YesPlease HAVE_GETDELIM = YesPlease SANE_TEXT_GREP=-a FREAD_READS_DIRECTORIES = UnfortunatelyYes diff --git a/configure.ac b/configure.ac index 031e8d3fee8..c711037d625 100644 --- a/configure.ac +++ b/configure.ac @@ -1090,6 +1090,14 @@ AC_COMPILE_IFELSE([CLOCK_MONOTONIC_SRC], [AC_MSG_RESULT([no]) HAVE_CLOCK_MONOTONIC=]) GIT_CONF_SUBST([HAVE_CLOCK_MONOTONIC]) + +# +# Define HAVE_SYNC_FILE_RANGE=YesPlease if sync_file_range is available. +GIT_CHECK_FUNC(sync_file_range, + [HAVE_SYNC_FILE_RANGE=YesPlease], + [HAVE_SYNC_FILE_RANGE]) +GIT_CONF_SUBST([HAVE_SYNC_FILE_RANGE]) + # # Define NO_SETITIMER if you don't have setitimer. GIT_CHECK_FUNC(setitimer, diff --git a/environment.c b/environment.c index 2701dfeeec8..aeafe80235e 100644 --- a/environment.c +++ b/environment.c @@ -42,7 +42,7 @@ const char *git_attributes_file; const char *git_hooks_path; int zlib_compression_level = Z_BEST_SPEED; int pack_compression_level = Z_DEFAULT_COMPRESSION; -int fsync_object_files; +enum fsync_object_files_mode fsync_object_files; size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE; size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT; size_t delta_base_cache_limit = 96 * 1024 * 1024; diff --git a/git-compat-util.h b/git-compat-util.h index d70ce142861..4defd4ab200 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1214,6 +1214,13 @@ __attribute__((format (printf, 1, 2))) NORETURN void BUG(const char *fmt, ...); #endif +enum fsync_action { + FSYNC_WRITEOUT_ONLY, + FSYNC_HARDWARE_FLUSH +}; + +int git_fsync(int fd, enum fsync_action action); + /* * Preserves errno, prints a message, but gives no warning for ENOENT. * Returns 0 on success, which includes trying to unlink an object that does diff --git a/object-file.c b/object-file.c index 659ef7623ff..9d0aac792ae 100644 --- a/object-file.c +++ b/object-file.c @@ -1853,8 +1853,18 @@ int hash_object_file(const struct git_hash_algo *algo, const void *buf, static void close_loose_object(int fd) { if (!the_repository->objects->odb->will_destroy) { - if (fsync_object_files) + switch (fsync_object_files) { + case FSYNC_OBJECT_FILES_OFF: + break; + case FSYNC_OBJECT_FILES_ON: fsync_or_die(fd, "loose object file"); + break; + case FSYNC_OBJECT_FILES_BATCH: + fsync_loose_object_bulk_checkin(fd); + break; + default: + BUG("Invalid fsync_object_files mode."); + } } if (close(fd) != 0) diff --git a/wrapper.c b/wrapper.c index 36e12119d76..689288d2e31 100644 --- a/wrapper.c +++ b/wrapper.c @@ -546,6 +546,50 @@ int xmkstemp_mode(char *filename_template, int mode) return fd; } +int git_fsync(int fd, enum fsync_action action) +{ + switch (action) { + case FSYNC_WRITEOUT_ONLY: + +#ifdef __APPLE__ + /* + * on macOS, fsync just causes filesystem cache writeback but does not + * flush hardware caches. + */ + return fsync(fd); +#endif + +#ifdef HAVE_SYNC_FILE_RANGE + /* + * On linux 2.6.17 and above, sync_file_range is the way to issue + * a writeback without a hardware flush. An offset of 0 and size of 0 + * indicates writeout of the entire file and the wait flags ensure that all + * dirty data is written to the disk (potentially in a disk-side cache) + * before we continue. + */ + + return sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE | + SYNC_FILE_RANGE_WRITE | + SYNC_FILE_RANGE_WAIT_AFTER); +#endif + + errno = ENOSYS; + return -1; + + case FSYNC_HARDWARE_FLUSH: + +#ifdef __APPLE__ + return fcntl(fd, F_FULLFSYNC); +#else + return fsync(fd); +#endif + + default: + BUG("unexpected git_fsync(%d) call", action); + } + +} + static int warn_if_unremovable(const char *op, const char *file, int rc) { int err; diff --git a/write-or-die.c b/write-or-die.c index 0b1ec8190b6..cc8291d9794 100644 --- a/write-or-die.c +++ b/write-or-die.c @@ -57,7 +57,7 @@ void fprintf_or_die(FILE *f, const char *fmt, ...) void fsync_or_die(int fd, const char *msg) { - while (fsync(fd) < 0) { + while (git_fsync(fd, FSYNC_HARDWARE_FLUSH) < 0) { if (errno != EINTR) die_errno("fsync error on '%s'", msg); } From patchwork Mon Nov 15 23:50:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93FA8C4332F for ; Tue, 16 Nov 2021 03:24:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7A38761B66 for ; Tue, 16 Nov 2021 03:24:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345143AbhKPD1j (ORCPT ); Mon, 15 Nov 2021 22:27:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238407AbhKPD0X (ORCPT ); Mon, 15 Nov 2021 22:26:23 -0500 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 399BAC126D02 for ; Mon, 15 Nov 2021 15:51:10 -0800 (PST) Received: by mail-wm1-x334.google.com with SMTP id i8-20020a7bc948000000b0030db7b70b6bso542047wml.1 for ; Mon, 15 Nov 2021 15:51:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=RRVKUU/CDTvLO/mF2IPKnCbu1AteNB3iWsrxuugjmSU=; b=luUCwaXOHOJqwoVH4z6Qj2FgVX6Vz4OV0Kvl19xZGKR8Tlm/cma5ecMEKyOG/FNW1g 9RwKeIxMW0S046pju33m744gz+evVvxs+YXz6hDVvbw+RGuZZSB1t7aVTJe5IYFC88C5 gAxdQJS8G2+e2+N6dhrrN6HchWj/hx/hGD30JO1bBor5rPUiVZiEzfCJirmS3C5teUPT PVsTDQ+RnsMyqjx1GgF3ykmSbSdM0BRNuSWZ72VZr0Xhy8110iNz1Fs+MAxpK0YCqox8 f0HTRgzKvSEgJ6TRl3W260KgbXTPdlObFtfIpMmagcfdByO5vokIFPsQtTw9vt6FfX0h J2cQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=RRVKUU/CDTvLO/mF2IPKnCbu1AteNB3iWsrxuugjmSU=; b=x0OFKnSCnuxABPNFD3Xo/pZ9DRKIJwBTM3oh5Tyc54JgpaMpAVJlP3mLablPOh1g3g OY77MYuc8vOPb+ubZfEyIRBiB6u3RrTzuNTMw34cnzkw4CCNH/9Q/iBm7FEj2VQ6EP/6 mh2m1+YFlH0nGhoVXMSofyn92mx1UC3BbBWo5BVHmePqLjcccPRZP+FPIExzp9rftqgm Ydgj6xxhp8Lr3ZbMGzPH+Yr1sXNWOlErLuHtymPrzliGlauURv2KpQoCeo7T+Z/hCf6w yOXslyb62PkY1RCWa5guZHrprrbmioKpxU2QOKPEWE6iVmNZePTaTWJE8jgIJqySSxXr qRug== X-Gm-Message-State: AOAM533LDJSpLxUb5bH9UWmZDou32ICgAnJq/9UysP4Nd8pRnRdFl8ea ja5eu9/ouaQnHsRUCS6fA+hqXHJJeM8= X-Google-Smtp-Source: ABdhPJwfkjmT1jD29SbsmNJHi562S/UhAmPMH7eHUGCZ+DU8LzdaQiK+wKHuNwiaK84Gv2DafDA1yg== X-Received: by 2002:a1c:96:: with SMTP id 144mr11113156wma.126.1637020268695; Mon, 15 Nov 2021 15:51:08 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f8sm1016435wmf.2.2021.11.15.15.51.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:08 -0800 (PST) Message-Id: <951a559874e3e76c85e98d24c457e81a65de76e4.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:50:59 +0000 Subject: [PATCH v9 5/9] core.fsyncobjectfiles: add windows support for batch mode Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh This commit adds a win32 implementation for fsync_no_flush that is called git_fsync. The 'NtFlushBuffersFileEx' function being called is available since Windows 8. If the function is not available, we return -1 and Git falls back to doing a full fsync. The operating system is told to flush data only without a hardware flush primitive. A later full fsync will cause the metadata log to be flushed and then the disk cache to be flushed on NTFS and ReFS. Other filesystems will treat this as a full flush operation. I added a new file here for this system call so as not to conflict with downstream changes in the git-for-windows repository related to fscache. Signed-off-by: Neeraj Singh --- compat/mingw.h | 3 +++ compat/win32/flush.c | 28 ++++++++++++++++++++++++++++ config.mak.uname | 2 ++ contrib/buildsystems/CMakeLists.txt | 3 ++- wrapper.c | 4 ++++ 5 files changed, 39 insertions(+), 1 deletion(-) create mode 100644 compat/win32/flush.c diff --git a/compat/mingw.h b/compat/mingw.h index c9a52ad64a6..6074a3d3ced 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -329,6 +329,9 @@ int mingw_getpagesize(void); #define getpagesize mingw_getpagesize #endif +int win32_fsync_no_flush(int fd); +#define fsync_no_flush win32_fsync_no_flush + struct rlimit { unsigned int rlim_cur; }; diff --git a/compat/win32/flush.c b/compat/win32/flush.c new file mode 100644 index 00000000000..75324c24ee7 --- /dev/null +++ b/compat/win32/flush.c @@ -0,0 +1,28 @@ +#include "../../git-compat-util.h" +#include +#include "lazyload.h" + +int win32_fsync_no_flush(int fd) +{ + IO_STATUS_BLOCK io_status; + +#define FLUSH_FLAGS_FILE_DATA_ONLY 1 + + DECLARE_PROC_ADDR(ntdll.dll, NTSTATUS, NtFlushBuffersFileEx, + HANDLE FileHandle, ULONG Flags, PVOID Parameters, ULONG ParameterSize, + PIO_STATUS_BLOCK IoStatusBlock); + + if (!INIT_PROC_ADDR(NtFlushBuffersFileEx)) { + errno = ENOSYS; + return -1; + } + + memset(&io_status, 0, sizeof(io_status)); + if (NtFlushBuffersFileEx((HANDLE)_get_osfhandle(fd), FLUSH_FLAGS_FILE_DATA_ONLY, + NULL, 0, &io_status)) { + errno = EINVAL; + return -1; + } + + return 0; +} diff --git a/config.mak.uname b/config.mak.uname index 5ead1377667..5727fb093ca 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -455,6 +455,7 @@ endif CFLAGS = BASIC_CFLAGS = -nologo -I. -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE COMPAT_OBJS = compat/msvc.o compat/winansi.o \ + compat/win32/flush.o \ compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/trace2_win32_process_info.o \ @@ -630,6 +631,7 @@ ifeq ($(uname_S),MINGW) COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" COMPAT_OBJS += compat/mingw.o compat/winansi.o \ compat/win32/trace2_win32_process_info.o \ + compat/win32/flush.o \ compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/dirent.o diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index fd1399c440f..ef0c1e4976d 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -261,7 +261,8 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows") NOGDI OBJECT_CREATION_MODE=1 __USE_MINGW_ANSI_STDIO=0 USE_NED_ALLOCATOR OVERRIDE_STRDUP MMAP_PREVENTS_DELETE USE_WIN32_MMAP UNICODE _UNICODE HAVE_WPGMPTR ENSURE_MSYSTEM_IS_SET) - list(APPEND compat_SOURCES compat/mingw.c compat/winansi.c compat/win32/path-utils.c + list(APPEND compat_SOURCES compat/mingw.c compat/winansi.c + compat/win32/flush.c compat/win32/path-utils.c compat/win32/pthread.c compat/win32mmap.c compat/win32/syslog.c compat/win32/trace2_win32_process_info.c compat/win32/dirent.c compat/nedmalloc/nedmalloc.c compat/strdup.c) diff --git a/wrapper.c b/wrapper.c index 689288d2e31..ece3d2ca106 100644 --- a/wrapper.c +++ b/wrapper.c @@ -573,6 +573,10 @@ int git_fsync(int fd, enum fsync_action action) SYNC_FILE_RANGE_WAIT_AFTER); #endif +#ifdef fsync_no_flush + return fsync_no_flush(fd); +#endif + errno = ENOSYS; return -1; From patchwork Mon Nov 15 23:51:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621255 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17083C433FE for ; Tue, 16 Nov 2021 03:24:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F1EB461B66 for ; Tue, 16 Nov 2021 03:24:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344890AbhKPD1v (ORCPT ); Mon, 15 Nov 2021 22:27:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244139AbhKPD00 (ORCPT ); Mon, 15 Nov 2021 22:26:26 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE0D1C126D03 for ; Mon, 15 Nov 2021 15:51:10 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id w29so33827825wra.12 for ; Mon, 15 Nov 2021 15:51:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=hJTprmd6l9eGdcrH8bWYEQJ64Pw67KrU1jFu1jkOkoM=; b=R8sguujjto+EFvSRgU5AGV4NcXGppsPTEWu2pqpXh1++rEs8IZaTZBFmUmiYtiyGIc h4olNABc4tIdmiPGMFHAU5g/khpOF15BoxWnIxXhfLALuHj1YH82b2FccZNDvWVHAUjK l8awAh6vwZwWNqeDcIFat1sg/PvIfPRNHxXKagn/N0SWQbu5hlsnv6QoMoxLpuRurmN3 blRNnSU+p/NMyzCIxBO8EkwsHjMUrpVWzUD7BRCW1jS44KlXIoTuXbDsq8sVFXkuHE3G cNwcxprfKhNGoU1m8urr0uAy2phbQS2YeDUBDBRwT87J8FOPahDFjG/btmafnw7rpnbL eUFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=hJTprmd6l9eGdcrH8bWYEQJ64Pw67KrU1jFu1jkOkoM=; b=VDxmA6/mnJmjOldQWEYr9CHXYAz1ye+DYqoZABi8nZzX/t8JEkr8q2I5BsbKnTpiYe /BCH2kWZ2gfnfSvkoEaIY2PmzZi+QXJCRl0WlYyNPM/RKT/uUaHuTz70h7mcyR07esle vJeyM1cOB+SDLTyD2N+Kv3GdrYpir1kWBBnnTPwI4rijF68W52sLpTFn81cTHy8ZR+iG N8+K5/nTL7M2YNZFE0rB3/oTClculx2nN3bw6OpmLrRMkXywbrJOO44nYIryskZf3qg0 6r1ZC0wtj2ZCJnwJRj/4CigamQAPGxx43h3hqqhuUadYDCbJnqGzfn+afmcMzjmX/e1d JY2g== X-Gm-Message-State: AOAM530YAva1a6U4+yDmeiuixsV+fD6LXdfs1sbjAxtYHDyYB7X9htaq +Tu7RIFHZpC1+luYp53DNFGEFmp6TfA= X-Google-Smtp-Source: ABdhPJxtEs8y2xmQMy/j/izJ+5/4LCCnNvcUySdb8tT83PzSt+YC4WP+Io8iS06cdggn3OSORfHwKQ== X-Received: by 2002:a5d:58c5:: with SMTP id o5mr3810524wrf.15.1637020269361; Mon, 15 Nov 2021 15:51:09 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 38sm7180264wrc.1.2021.11.15.15.51.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:09 -0800 (PST) Message-Id: <4a40fd4a29a468b9ce320bc7b22f19e5a526fad6.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:51:00 +0000 Subject: [PATCH v9 6/9] update-index: use the bulk-checkin infrastructure Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh The update-index functionality is used internally by 'git stash push' to setup the internal stashed commit. This change enables bulk-checkin for update-index infrastructure to speed up adding new objects to the object database by leveraging the pack functionality and the new bulk-fsync functionality. There is some risk with this change, since under batch fsync, the object files will not be available until the update-index is entirely complete. This usage is unlikely, since any tool invoking update-index and expecting to see objects would have to synchronize with the update-index process after passing it a file path. Signed-off-by: Neeraj Singh --- builtin/update-index.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/builtin/update-index.c b/builtin/update-index.c index 187203e8bb5..dc7368bb1ee 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -5,6 +5,7 @@ */ #define USE_THE_INDEX_COMPATIBILITY_MACROS #include "cache.h" +#include "bulk-checkin.h" #include "config.h" #include "lockfile.h" #include "quote.h" @@ -1088,6 +1089,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) the_index.updated_skipworktree = 1; + /* we might be adding many objects to the object database */ + plug_bulk_checkin(); + /* * Custom copy of parse_options() because we want to handle * filename arguments as they come. @@ -1168,6 +1172,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix) strbuf_release(&buf); } + /* by now we must have added all of the new objects */ + unplug_bulk_checkin(); if (split_index > 0) { if (git_config_get_split_index() == 0) warning(_("core.splitIndex is set to false; " From patchwork Mon Nov 15 23:51:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08789C43217 for ; Tue, 16 Nov 2021 03:24:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E132A61B66 for ; Tue, 16 Nov 2021 03:24:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344719AbhKPD1w (ORCPT ); Mon, 15 Nov 2021 22:27:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58042 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244496AbhKPD00 (ORCPT ); Mon, 15 Nov 2021 22:26:26 -0500 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 64594C126D04 for ; Mon, 15 Nov 2021 15:51:11 -0800 (PST) Received: by mail-wm1-x335.google.com with SMTP id k37-20020a05600c1ca500b00330cb84834fso531249wms.2 for ; Mon, 15 Nov 2021 15:51:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Wx/NRBAYoIBhOKGldB61QZt0lf65wQv8k/Ky8embUfk=; b=Pj4O+6n8gLYHIlA93DRRbsozvYEN81vS9wst1So30k4LO29Hs7qHqHxfaVhTcCcIxH wToR+BeAD/qtC5AfF8dQg/frjNPGEGHtX8BnwJSttQoKNxpM5rXAA/Y0eYyPUytGbqXH OHow9qS0nsx/SlaDQQPIBEbWTOSfREqp9Y6B3iEmp0AYlPOXevrtkdorhSmyjqgAe6vP +s/96+9PQan95x6hrrxQAP9PMzj1ZRjoNLSE7a6PZx8+PBV8OYtl6s9VGHLUEayX02+t jWRI+Rr+bOWBg8FBeCzwut5ZgdUZVRdqRekzPPvaRo1ii1NIckgARgKFDLcln85wFrqu 3eAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Wx/NRBAYoIBhOKGldB61QZt0lf65wQv8k/Ky8embUfk=; b=rJmOVO+ddM8y8pMI9j/WB56NIH8RTUrUqlCI7jfsAE8/5O8NoBazKm2wxQjqahv03A xcPKmNGYeih9J/5HglK31SnEVkVhAiv+9WVZn9qpF0iO+ZgSurFBjph0rTp7hXY8mMh1 j0JbhNxYlcbPcGSgEeE7JipAleNcgPurek/XREApgmGiequSe+3ppSxwLqQEtbbMtrG6 q76M9ri74WVLwt0dHpBVkSnzwUyMMfGXuOlAoLV1GQrXT6pou3Q2GxwH2UKwmRYxmnzI KT38s+9V1nGPqvQdZU77eBw0X5TMuHIxb3p46dm9FOZ/JyM5b5yptN43XwAoyyK6HUGu 0zGg== X-Gm-Message-State: AOAM530uR6auIexbG3wG3TaXiRtqMKvTLXPx0q/IEofSq/l0vZGcEnVA 9VTs4cC/WQ+LUcWYLluLCRy7LHsSw9E= X-Google-Smtp-Source: ABdhPJyrLGwtb99NYnn55YnxBSnva5YMVe3LREkIUt94a1fMLX1v4T2tP+INNE32/OjOBIhvaB6NJQ== X-Received: by 2002:a05:600c:3b1b:: with SMTP id m27mr2704261wms.125.1637020269932; Mon, 15 Nov 2021 15:51:09 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b197sm658436wmb.24.2021.11.15.15.51.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:09 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 15 Nov 2021 23:51:01 +0000 Subject: [PATCH v9 7/9] unpack-objects: use the bulk-checkin infrastructure Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh The unpack-objects functionality is used by fetch, push, and fast-import to turn the transfered data into object database entries when there are fewer objects than the 'unpacklimit' setting. By enabling bulk-checkin when unpacking objects, we can take advantage of batched fsyncs. Signed-off-by: Neeraj Singh --- builtin/unpack-objects.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 4a9466295ba..51eb4f7b531 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -1,5 +1,6 @@ #include "builtin.h" #include "cache.h" +#include "bulk-checkin.h" #include "config.h" #include "object-store.h" #include "object.h" @@ -503,10 +504,12 @@ static void unpack_all(void) if (!quiet) progress = start_progress(_("Unpacking objects"), nr_objects); CALLOC_ARRAY(obj_list, nr_objects); + plug_bulk_checkin(); for (i = 0; i < nr_objects; i++) { unpack_one(i); display_progress(progress, i + 1); } + unplug_bulk_checkin(); stop_progress(&progress); if (delta_list) From patchwork Mon Nov 15 23:51:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621259 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7824BC4321E for ; Tue, 16 Nov 2021 03:24:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 591CB61B96 for ; Tue, 16 Nov 2021 03:24:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344644AbhKPD1x (ORCPT ); Mon, 15 Nov 2021 22:27:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245306AbhKPD01 (ORCPT ); Mon, 15 Nov 2021 22:26:27 -0500 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1CDB6C126D05 for ; Mon, 15 Nov 2021 15:51:12 -0800 (PST) Received: by mail-wm1-x336.google.com with SMTP id y196so15349223wmc.3 for ; Mon, 15 Nov 2021 15:51:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=nbUJeiKDuUGJbRc6leBVx5PGjyXkuy6mjccjLuAcYco=; b=bEVPyozRl0vNacTN4xEA2firNnjQDEEObL5SBJRuDg/b8H+l3UI9A7zyCmV3BiLh2U 2hbLHWeGNf8QqQ+B18/5B93kUMwSJdMqYVTFyBX6F/YvruyDu7Qm0Nq9tT8DU0acHHGu ptY9sLvstznHath8bpCWDnFLHXQ/iBUli6WY+D0+M1jJJ1ykteHdJjvD6wtwP/IoVap8 AmWqi74Edtzh1hVok9lRN+IsM+5MoCGPVqXjhMo2UrMukXIOkd3cRr02ZNSaFucBLf03 79PM5cvqWGSd6uAVrzUg/pW7Hqxe/Z4Nla1Us2BQlVrQqA0ujWGmTFT9RDTKPCcHqGlJ Gd7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=nbUJeiKDuUGJbRc6leBVx5PGjyXkuy6mjccjLuAcYco=; b=anrGe6kGPDUh9w8mnReQqkdrXyqjr7xv2B/2bL/kiml/+aGsS6K0ibolunJ0wZUt22 V4g10oKybEZlAXZi2GHbW6nfYpBiH5x/e5vLL8jNgAkudHws8+IEtAFbJwmuBOfLMO+b LJZL9pZuQAATTfnwTgkA5qGordqbyk5+pvcVyqcoXcqp7BeFOrEC+ZrCzoXuUBe4k2d+ dVHQoh47hTYUAt5hLwXfOYLewy96LTDozGLdeQc4I8ip1XrCQ7yWG4J3E6+X3/gTkbLL 2SjwVWkCK+1qcgfpubc9nS2wUQRysbYXoBWly89Yv6E+1123lS9/wsVtTXDtmYPSgQ3O EMbA== X-Gm-Message-State: AOAM530ku24bq61AC8dpvY+Kulopwsztwm9RF+aA2iiaxdDqYMM2Z+8T 7u+D/ZTIbWdTdMsN5aPo8kVxvauc8zQ= X-Google-Smtp-Source: ABdhPJxAkGQ7D8qncQtKpACo2ORATw1NVudkIhqZwGRAVNtzLY+eQAhRdhewDTuRDoEnSjlhsxIuTA== X-Received: by 2002:a1c:9842:: with SMTP id a63mr64298625wme.102.1637020270583; Mon, 15 Nov 2021 15:51:10 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f81sm725368wmf.22.2021.11.15.15.51.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:10 -0800 (PST) Message-Id: <270c24827d0a5b246ec0db31de30c1f8c25cd0c3.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:51:02 +0000 Subject: [PATCH v9 8/9] core.fsyncobjectfiles: tests for batch mode Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Add test cases to exercise batch mode for: * 'git add' * 'git stash' * 'git update-index' * 'git unpack-objects' These tests ensure that the added data winds up in the object database. In this change we introduce a new test helper lib-unique-files.sh. The goal of this library is to create a tree of files that have different oids from any other files that may have been created in the current test repo. This helps us avoid missing validation of an object being added due to it already being in the repo. Signed-off-by: Neeraj Singh --- t/lib-unique-files.sh | 36 ++++++++++++++++++++++++++++++++++++ t/t3700-add.sh | 20 ++++++++++++++++++++ t/t3903-stash.sh | 14 ++++++++++++++ t/t5300-pack-object.sh | 30 +++++++++++++++++++----------- 4 files changed, 89 insertions(+), 11 deletions(-) create mode 100644 t/lib-unique-files.sh diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh new file mode 100644 index 00000000000..a7de4ca8512 --- /dev/null +++ b/t/lib-unique-files.sh @@ -0,0 +1,36 @@ +# Helper to create files with unique contents + + +# Create multiple files with unique contents. Takes the number of +# directories, the number of files in each directory, and the base +# directory. +# +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files +# each in my_dir, all with unique +# contents. + +test_create_unique_files() { + test "$#" -ne 3 && BUG "3 param" + + local dirs=$1 + local files=$2 + local basedir=$3 + local counter=0 + test_tick + local basedata=$test_tick + + + rm -rf $basedir + + for i in $(test_seq $dirs) + do + local dir=$basedir/dir$i + + mkdir -p "$dir" + for j in $(test_seq $files) + do + counter=$((counter + 1)) + echo "$basedata.$counter" >"$dir/file$j.txt" + done + done +} diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 283a66955d6..aaecefda159 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -8,6 +8,8 @@ test_description='Test of git add, including the -- option.' TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh +. $TEST_DIRECTORY/lib-unique-files.sh + # Test the file mode "$1" of the file "$2" in the index. test_mode_in_index () { case "$(git ls-files -s "$2")" in @@ -34,6 +36,24 @@ test_expect_success \ 'Test that "git add -- -q" works' \ 'touch -- -q && git add -- -q' +test_expect_success 'git add: core.fsyncobjectfiles=batch' " + test_create_unique_files 2 4 fsync-files && + git -c core.fsyncobjectfiles=batch add -- ./fsync-files/ && + rm -f fsynced_files && + git ls-files --stage fsync-files/ > fsynced_files && + test_line_count = 8 fsynced_files && + awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e +" + +test_expect_success 'git update-index: core.fsyncobjectfiles=batch' " + test_create_unique_files 2 4 fsync-files2 && + find fsync-files2 ! -type d -print | xargs git -c core.fsyncobjectfiles=batch update-index --add -- && + rm -f fsynced_files2 && + git ls-files --stage fsync-files2/ > fsynced_files2 && + test_line_count = 8 fsynced_files2 && + awk -- '{print \$2}' fsynced_files2 | xargs -n1 git cat-file -e +" + test_expect_success \ 'git add: Test that executable bit is not used if core.filemode=0' \ 'git config core.filemode 0 && diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index f0a82be9de7..6324b52c874 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh +. $TEST_DIRECTORY/lib-unique-files.sh diff_cmp () { for i in "$1" "$2" @@ -1293,6 +1294,19 @@ test_expect_success 'stash handles skip-worktree entries nicely' ' git rev-parse --verify refs/stash:A.t ' +test_expect_success 'stash with core.fsyncobjectfiles=batch' " + test_create_unique_files 2 4 fsync-files && + git -c core.fsyncobjectfiles=batch stash push -u -- ./fsync-files/ && + rm -f fsynced_files && + + # The files were untracked, so use the third parent, + # which contains the untracked files + git ls-tree -r stash^3 -- ./fsync-files/ > fsynced_files && + test_line_count = 8 fsynced_files && + awk -- '{print \$3}' fsynced_files | xargs -n1 git cat-file -e +" + + test_expect_success 'stash -c stash.useBuiltin=false warning ' ' expected="stash.useBuiltin support has been removed" && diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index e13a8842075..38663dc1393 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -162,23 +162,23 @@ test_expect_success 'pack-objects with bogus arguments' ' check_unpack () { test_when_finished "rm -rf git2" && - git init --bare git2 && - git -C git2 unpack-objects -n <"$1".pack && - git -C git2 unpack-objects <"$1".pack && - (cd .git && find objects -type f -print) | - while read path - do - cmp git2/$path .git/$path || { - echo $path differs. - return 1 - } - done + git $2 init --bare git2 && + ( + git $2 -C git2 unpack-objects -n <"$1".pack && + git $2 -C git2 unpack-objects <"$1".pack && + git $2 -C git2 cat-file --batch-check="%(objectname)" + ) current && + cmp obj-list current } test_expect_success 'unpack without delta' ' check_unpack test-1-${packname_1} ' +test_expect_success 'unpack without delta (core.fsyncobjectfiles=batch)' ' + check_unpack test-1-${packname_1} "-c core.fsyncobjectfiles=batch" +' + test_expect_success 'pack with REF_DELTA' ' packname_2=$(git pack-objects --progress test-2 stderr) && check_deltas stderr -gt 0 @@ -188,6 +188,10 @@ test_expect_success 'unpack with REF_DELTA' ' check_unpack test-2-${packname_2} ' +test_expect_success 'unpack with REF_DELTA (core.fsyncobjectfiles=batch)' ' + check_unpack test-2-${packname_2} "-c core.fsyncobjectfiles=batch" +' + test_expect_success 'pack with OFS_DELTA' ' packname_3=$(git pack-objects --progress --delta-base-offset test-3 \ stderr) && @@ -198,6 +202,10 @@ test_expect_success 'unpack with OFS_DELTA' ' check_unpack test-3-${packname_3} ' +test_expect_success 'unpack with OFS_DELTA (core.fsyncobjectfiles=batch)' ' + check_unpack test-3-${packname_3} "-c core.fsyncobjectfiles=batch" +' + test_expect_success 'compare delta flavors' ' perl -e '\'' defined($_ = -s $_) or die for @ARGV; From patchwork Mon Nov 15 23:51:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12621261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77465C433EF for ; Tue, 16 Nov 2021 03:25:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 47B6361B96 for ; Tue, 16 Nov 2021 03:25:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345632AbhKPD1y (ORCPT ); Mon, 15 Nov 2021 22:27:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57336 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244575AbhKPD0a (ORCPT ); Mon, 15 Nov 2021 22:26:30 -0500 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CEBF8C126D06 for ; Mon, 15 Nov 2021 15:51:12 -0800 (PST) Received: by mail-wm1-x333.google.com with SMTP id c71-20020a1c9a4a000000b0032cdcc8cbafso1014322wme.3 for ; Mon, 15 Nov 2021 15:51:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=THp3oyYMQZ0IyvYbWDVtqI55HStotyepThNgwaAMuqY=; b=MtLJsnXXKYy/qq10u0szPk0uGaytJ+Ode/YLawZLT1PBzigIJgPno4T6pWhhw2huUt OEzO6faMoU1feDadnLhysHr5a2onxFsDckkMuPtE1hHRJXqRQ4MBB3QRDtlsshx69MRj xorF5hQxUBdWXwvyhJO6KdMAhIo8tH2ie8C5FaRF/F7Sr57EinAr3/rWkxEHjc5ongSm cSTapGdlhcaCqZU5MSOMNVE2qg6BdBtZkzQ7KzLbF3kMI9VGFlJm20CiLCtNWQ6modc6 CDDkXWXOfRKTXZa/Tienq5EPRWBOt+nh4h7C2I5jPD/ixahurYGDJh2q5n73BMa2Bu1k T5Iw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=THp3oyYMQZ0IyvYbWDVtqI55HStotyepThNgwaAMuqY=; b=n6Wh4z9KyHpRqe4oAWguLvKTGY47qSZfTV0zevLtyDync5s6M+n5N7XeHV6bkBOT4t WB+AWDOO63K/rstscA/3ihdGdENtvJBm9i8xua9Yi4+h0vAXjTWVmgVC7S+A9XTKibJS SOtQNmEDRCGB73vW9sihrfgej7/Q+xBt8e+o9kgZ0sPiEtfCiICG6zeMJHGWAZ6V8W/P DC3J4Gea0jKjsAumNmfrhC/ylJuCGlTKtpxMNdln20UJ8fezd64H0nMxOD07p1IE/Lfb X2gG78pPTcYFrjW2Qe+7ZMmosBc3E7pv7FV5VxnY+31Z4kREN2kpmCpUQNXKERcxZkaL /WtQ== X-Gm-Message-State: AOAM532jzXTFtCDlXGJxpwmITDSAmVZFEVo8wah/3u0+ESXBkdDcZNqX DfIZDTIqR01VmpNIRRH1QGh0/CO5ftU= X-Google-Smtp-Source: ABdhPJxigg6kPI6WiSKanLnLLVhp1TPfdSg80Hm2ugNbSMYJdeZgHJIcxhqQvv2pZoAy69dXVARgfw== X-Received: by 2002:a1c:43c2:: with SMTP id q185mr63379317wma.30.1637020271247; Mon, 15 Nov 2021 15:51:11 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j8sm15268697wrh.16.2021.11.15.15.51.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 15 Nov 2021 15:51:10 -0800 (PST) Message-Id: <12d99641f4c3738efbc5c70ef4f8d70eccb6fc8b.1637020263.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 15 Nov 2021 23:51:03 +0000 Subject: [PATCH v9 9/9] core.fsyncobjectfiles: performance tests for add and stash Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , Elijah Newren , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Add a basic performance test for "git add" and "git stash" of a lot of new objects with various fsync settings. Signed-off-by: Neeraj Singh --- t/perf/p3700-add.sh | 43 ++++++++++++++++++++++++++++++++++++++++ t/perf/p3900-stash.sh | 46 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+) create mode 100755 t/perf/p3700-add.sh create mode 100755 t/perf/p3900-stash.sh diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh new file mode 100755 index 00000000000..e93c08a2e70 --- /dev/null +++ b/t/perf/p3700-add.sh @@ -0,0 +1,43 @@ +#!/bin/sh +# +# This test measures the performance of adding new files to the object database +# and index. The test was originally added to measure the effect of the +# core.fsyncObjectFiles=batch mode, which is why we are testing different values +# of that setting explicitly and creating a lot of unique objects. + +test_description="Tests performance of add" + +. ./perf-lib.sh + +. $TEST_DIRECTORY/lib-unique-files.sh + +test_perf_default_repo +test_checkout_worktree + +dir_count=10 +files_per_dir=50 +total_files=$((dir_count * files_per_dir)) + +# We need to create the files each time we run the perf test, but +# we do not want to measure the cost of creating the files, so run +# the tet once. +if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1 +then + echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2 + GIT_PERF_REPEAT_COUNT=1 +fi + +for m in false true batch +do + test_expect_success "create the files for core.fsyncObjectFiles=$m" ' + git reset --hard && + # create files across directories + test_create_unique_files $dir_count $files_per_dir files + ' + + test_perf "add $total_files files (core.fsyncObjectFiles=$m)" " + git -c core.fsyncobjectfiles=$m add files + " +done + +test_done diff --git a/t/perf/p3900-stash.sh b/t/perf/p3900-stash.sh new file mode 100755 index 00000000000..c9fcd0c03eb --- /dev/null +++ b/t/perf/p3900-stash.sh @@ -0,0 +1,46 @@ +#!/bin/sh +# +# This test measures the performance of adding new files to the object database +# and index. The test was originally added to measure the effect of the +# core.fsyncObjectFiles=batch mode, which is why we are testing different values +# of that setting explicitly and creating a lot of unique objects. + +test_description="Tests performance of stash" + +. ./perf-lib.sh + +. $TEST_DIRECTORY/lib-unique-files.sh + +test_perf_default_repo +test_checkout_worktree + +dir_count=10 +files_per_dir=50 +total_files=$((dir_count * files_per_dir)) + +# We need to create the files each time we run the perf test, but +# we do not want to measure the cost of creating the files, so run +# the tet once. +if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1 +then + echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2 + GIT_PERF_REPEAT_COUNT=1 +fi + +for m in false true batch +do + test_expect_success "create the files for core.fsyncObjectFiles=$m" ' + git reset --hard && + # create files across directories + test_create_unique_files $dir_count $files_per_dir files + ' + + # We only stash files in the 'files' subdirectory since + # the perf test infrastructure creates files in the + # current working directory that need to be preserved + test_perf "stash 500 files (core.fsyncObjectFiles=$m)" " + git -c core.fsyncobjectfiles=$m stash push -u -- files + " +done + +test_done