From patchwork Tue Sep 28 23:32:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524059
Message-Id: <6e65f68fd6d4d90b0a7bca2e2e57ace9ad749266.1632871971.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:43 +0000
Subject: [PATCH v7 1/9] object-file.c: do not rename in a temp odb
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

From: Neeraj Singh

If a temporary ODB is active, as determined by GIT_QUARANTINE_PATH
being set, create object files with their final names. This avoids an
extra rename beyond what is needed to merge the temporary ODB in
tmp_objdir_migrate.
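
The two creation strategies described above can be sketched outside of Git
as follows. This is a minimal illustration under stated assumptions, not the
patch's implementation: create_objfile_sketch and finalize_sketch are
hypothetical names, plain POSIX mkstemp stands in for Git's
git_mkstemp_mode, and the caller passes the "is this a temporary ODB" flag
explicitly.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Git's create_objfile): create "<dir>/<name>".
 * In a temporary ODB the file is created exclusively under its final
 * name and tmp is set to the empty string. Otherwise a
 * "<dir>/tmp_obj_XXXXXX" file is created and its path is left in tmp
 * so that it can be renamed later.
 */
static int create_objfile_sketch(const char *dir, const char *name,
				 int odb_is_temp,
				 char *tmp, size_t tmp_len,
				 char *final_path, size_t final_len)
{
	assert(dir && name && tmp_len > 0);
	snprintf(final_path, final_len, "%s/%s", dir, name);
	if (odb_is_temp) {
		tmp[0] = '\0';
		/* O_EXCL makes the call fail if the object already exists. */
		return open(final_path, O_CREAT | O_EXCL | O_RDWR, 0444);
	}
	snprintf(tmp, tmp_len, "%s/tmp_obj_XXXXXX", dir);
	return mkstemp(tmp);
}

/* An empty tmp means the file is already in place, so skip the rename. */
static int finalize_sketch(const char *tmp, const char *final_path)
{
	if (!*tmp)
		return 0;
	return rename(tmp, final_path);
}
```

The empty-string convention mirrors the one the patch documents for
finalize_object_file; everything else here is an assumption made for the
sake of a small example.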

Creating an object file with the expected final name should be okay
since the git process writing to the temporary object store is the only
writer, and it only invokes write_loose_object/create_object_file after
checking that the object doesn't exist.

Signed-off-by: Neeraj Singh
---
 environment.c  |  4 ++++
 object-file.c  | 51 ++++++++++++++++++++++++++++++++++----------------
 object-store.h |  6 ++++++
 repository.c   |  2 ++
 repository.h   |  1 +
 5 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/environment.c b/environment.c
index b4ba4fa22db..30fca67e6d6 100644
--- a/environment.c
+++ b/environment.c
@@ -176,6 +176,10 @@ void setup_git_env(const char *git_dir)
 	args.graft_file = getenv_safe(&to_free, GRAFT_ENVIRONMENT);
 	args.index_file = getenv_safe(&to_free, INDEX_ENVIRONMENT);
 	args.alternate_db = getenv_safe(&to_free, ALTERNATE_DB_ENVIRONMENT);
+	if (getenv(GIT_QUARANTINE_ENVIRONMENT)) {
+		args.object_dir_is_temp = 1;
+	}
+
 	repo_set_gitdir(the_repository, git_dir, &args);
 	strvec_clear(&to_free);
 
diff --git a/object-file.c b/object-file.c
index be4f94ecf3b..49c53f801f7 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1826,12 +1826,17 @@ static void write_object_file_prepare(const struct git_hash_algo *algo,
 }
 
 /*
- * Move the just written object into its final resting place.
+ * Move the just written object into its final resting place,
+ * unless it is already there, as indicated by an empty string for
+ * tmpfile.
  */
 int finalize_object_file(const char *tmpfile, const char *filename)
 {
 	int ret = 0;
 
+	if (!*tmpfile)
+		goto out;
+
 	if (object_creation_mode == OBJECT_CREATION_USES_RENAMES)
 		goto try_rename;
 	else if (link(tmpfile, filename))
@@ -1904,21 +1909,37 @@ static inline int directory_size(const char *filename)
 }
 
 /*
- * This creates a temporary file in the same directory as the final
- * 'filename'
+ * This creates a loose object file for the specified object id.
+ * If we're working in a temporary object directory, the file is
+ * created with its final filename, otherwise it is created with
+ * a temporary name and renamed by finalize_object_file.
+ * If no rename is required, an empty string is returned in tmp.
  *
  * We want to avoid cross-directory filename renames, because those
  * can have problems on various filesystems (FAT, NFS, Coda).
  */
-static int create_tmpfile(struct strbuf *tmp, const char *filename)
+static int create_objfile(const struct object_id *oid, struct strbuf *tmp,
+			  struct strbuf *filename)
 {
-	int fd, dirlen = directory_size(filename);
+	int fd, dirlen, is_retrying = 0;
+	const char *object_name;
+	static const int object_mode = 0444;
 
+	loose_object_path(the_repository, filename, oid);
+	dirlen = directory_size(filename->buf);
+
+retry_create:
 	strbuf_reset(tmp);
-	strbuf_add(tmp, filename, dirlen);
-	strbuf_addstr(tmp, "tmp_obj_XXXXXX");
-	fd = git_mkstemp_mode(tmp->buf, 0444);
-	if (fd < 0 && dirlen && errno == ENOENT) {
+	if (!the_repository->objects->odb->is_temp) {
+		strbuf_add(tmp, filename->buf, dirlen);
+		object_name = "tmp_obj_XXXXXX";
+		strbuf_addstr(tmp, object_name);
+		fd = git_mkstemp_mode(tmp->buf, object_mode);
+	} else {
+		fd = open(filename->buf, O_CREAT | O_EXCL | O_RDWR, object_mode);
+	}
+
+	if (fd < 0 && dirlen && errno == ENOENT && !is_retrying) {
 		/*
 		 * Make sure the directory exists; note that the contents
 		 * of the buffer are undefined after mkstemp returns an
@@ -1926,15 +1947,15 @@ static int create_tmpfile(struct strbuf *tmp, const char *filename)
 		 * scratch.
 		 */
 		strbuf_reset(tmp);
-		strbuf_add(tmp, filename, dirlen - 1);
+		strbuf_add(tmp, filename->buf, dirlen - 1);
 		if (mkdir(tmp->buf, 0777) && errno != EEXIST)
 			return -1;
 		if (adjust_shared_perm(tmp->buf))
 			return -1;
 
 		/* Try again */
-		strbuf_addstr(tmp, "/tmp_obj_XXXXXX");
-		fd = git_mkstemp_mode(tmp->buf, 0444);
+		is_retrying = 1;
+		goto retry_create;
 	}
 	return fd;
 }
@@ -1951,14 +1972,12 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
 	static struct strbuf tmp_file = STRBUF_INIT;
 	static struct strbuf filename = STRBUF_INIT;
 
-	loose_object_path(the_repository, &filename, oid);
-
-	fd = create_tmpfile(&tmp_file, filename.buf);
+	fd = create_objfile(oid, &tmp_file, &filename);
 	if (fd < 0) {
 		if (errno == EACCES)
 			return error(_("insufficient permission for adding an object to repository database %s"), get_object_directory());
 		else
-			return error_errno(_("unable to create temporary file"));
+			return error_errno(_("unable to create object file"));
 	}
 
 	/* Set it up */
diff --git a/object-store.h b/object-store.h
index c5130d8baea..551639f173d 100644
--- a/object-store.h
+++ b/object-store.h
@@ -27,6 +27,12 @@ struct object_directory {
 	uint32_t loose_objects_subdir_seen[8]; /* 256 bits */
 	struct oidtree *loose_objects_cache;
 
+	/*
+	 * This is a temporary object store, so there is no need to
+	 * create new objects via rename.
+	 */
+	int is_temp;
+
 	/*
 	 * Path to the alternative object store. If this is a relative path,
 	 * it is relative to the current working directory.
diff --git a/repository.c b/repository.c
index 710a3b4bf87..75966153b75 100644
--- a/repository.c
+++ b/repository.c
@@ -80,6 +80,8 @@ void repo_set_gitdir(struct repository *repo,
 	expand_base_dir(&repo->objects->odb->path, o->object_dir,
 			repo->commondir, "objects");
 
+	repo->objects->odb->is_temp = o->object_dir_is_temp;
+
 	free(repo->objects->alternate_db);
 	repo->objects->alternate_db = xstrdup_or_null(o->alternate_db);
 	expand_base_dir(&repo->graft_file, o->graft_file,
diff --git a/repository.h b/repository.h
index 3740c93bc0f..d3711367a6f 100644
--- a/repository.h
+++ b/repository.h
@@ -162,6 +162,7 @@ struct set_gitdir_args {
 	const char *graft_file;
 	const char *index_file;
 	const char *alternate_db;
+	int object_dir_is_temp;
 };
 
 void repo_set_gitdir(struct repository *repo, const char *root,

From patchwork Tue Sep 28 23:32:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524065
Message-Id: <6ce72a709a11686b9082439a257fd5f58e5eb0f7.1632871971.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:44 +0000
Subject: [PATCH v7 2/9] tmp-objdir: new API for creating temporary writable databases
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

From: Neeraj Singh

This patch is based on work by Elijah Newren. Any bugs, however, are my
own.

The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it. The
subprocesses would view it as their primary datastore and write to it.

Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.

For the --remerge-diff use case, add a new `will_destroy` flag in
`struct object_directory` to mark ephemeral object databases that do
not require fsync durability.

Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.

Signed-off-by: Neeraj Singh
---
 builtin/prune.c        | 22 +++++++++++++++++----
 builtin/receive-pack.c |  2 +-
 object-file.c          | 45 ++++++++++++++++++++++++++++++++++++++++--
 object-store.h         | 21 +++++++++++++++++++-
 object.c               |  2 +-
 tmp-objdir.c           | 32 +++++++++++++++++++++++++++---
 tmp-objdir.h           | 14 ++++++++++---
 7 files changed, 123 insertions(+), 15 deletions(-)

diff --git a/builtin/prune.c b/builtin/prune.c
index 02c6ab7cbaa..9c72ecf5a58 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -18,6 +18,7 @@ static int show_only;
 static int verbose;
 static timestamp_t expire;
 static int show_progress = -1;
+static struct strbuf remove_dir_buf = STRBUF_INIT;
 
 static int prune_tmp_file(const char *fullpath)
 {
@@ -26,10 +27,19 @@ static int prune_tmp_file(const char *fullpath)
 		return error("Could not stat '%s'", fullpath);
 	if (st.st_mtime > expire)
 		return 0;
-	if (show_only || verbose)
-		printf("Removing stale temporary file %s\n", fullpath);
-	if (!show_only)
-		unlink_or_warn(fullpath);
+	if (S_ISDIR(st.st_mode)) {
+		if (show_only || verbose)
+			printf("Removing stale temporary directory %s\n", fullpath);
+		if (!show_only) {
+			strbuf_addstr(&remove_dir_buf, fullpath);
+			remove_dir_recursively(&remove_dir_buf, 0);
+		}
+	} else {
+		if (show_only || verbose)
+			printf("Removing stale temporary file %s\n", fullpath);
+		if (!show_only)
+			unlink_or_warn(fullpath);
+	}
 	return 0;
 }
 
@@ -97,6 +107,9 @@ static int prune_cruft(const char *basename, const char *path, void *data)
 
 static int prune_subdir(unsigned int nr, const char *path, void *data)
 {
+	if (verbose)
+		printf("Removing directory %s\n", path);
+
 	if (!show_only)
 		rmdir(path);
 	return 0;
@@ -185,5 +198,6 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 		prune_shallow(show_only ? PRUNE_SHOW_ONLY : 0);
 	}
 
+	strbuf_release(&remove_dir_buf);
 	return 0;
 }
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 48960a9575b..418a42ca069 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -2208,7 +2208,7 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		strvec_push(&child.args, alt_shallow_file);
 	}
 
-	tmp_objdir = tmp_objdir_create();
+	tmp_objdir = tmp_objdir_create("incoming");
 	if (!tmp_objdir) {
 		if (err_fd > 0)
 			close(err_fd);
diff --git a/object-file.c b/object-file.c
index 49c53f801f7..1a3ad558c45 100644
--- a/object-file.c
+++ b/object-file.c
@@ -751,6 +751,44 @@ void add_to_alternates_memory(const char *reference)
 			     '\n', NULL, 0);
 }
 
+struct object_directory *set_temporary_primary_odb(const char *dir, int will_destroy)
+{
+	struct object_directory *new_odb;
+
+	/*
+	 * Make sure alternates are initialized, or else our entry may be
+	 * overwritten when they are.
+	 */
+	prepare_alt_odb(the_repository);
+
+	/*
+	 * Make a new primary odb and link the old primary ODB in as an
+	 * alternate
+	 */
+	new_odb = xcalloc(1, sizeof(*new_odb));
+	new_odb->path = xstrdup(dir);
+	new_odb->is_temp = 1;
+	new_odb->will_destroy = will_destroy;
+	new_odb->next = the_repository->objects->odb;
+	the_repository->objects->odb = new_odb;
+	return new_odb->next;
+}
+
+void restore_primary_odb(struct object_directory *restore_odb, const char *old_path)
+{
+	struct object_directory *cur_odb = the_repository->objects->odb;
+
+	if (strcmp(old_path, cur_odb->path))
+		BUG("expected %s as primary object store; found %s",
+		    old_path, cur_odb->path);
+
+	if (cur_odb->next != restore_odb)
+		BUG("we expect the old primary object store to be the first alternate");
+
+	the_repository->objects->odb = restore_odb;
+	free_object_directory(cur_odb);
+}
+
 /*
  * Compute the exact path an alternate is at and returns it. In case of
  * error NULL is returned and the human readable error is added to `err`
@@ -1893,8 +1931,11 @@ int hash_object_file(const struct git_hash_algo *algo, const void *buf,
 /* Finalize a file on disk, and close it. */
 static void close_loose_object(int fd)
 {
-	if (fsync_object_files)
-		fsync_or_die(fd, "loose object file");
+	if (!the_repository->objects->odb->will_destroy) {
+		if (fsync_object_files)
+			fsync_or_die(fd, "loose object file");
+	}
+
 	if (close(fd) != 0)
 		die_errno(_("error when closing loose object file"));
 }
diff --git a/object-store.h b/object-store.h
index 551639f173d..5bc9da6634e 100644
--- a/object-store.h
+++ b/object-store.h
@@ -31,7 +31,12 @@ struct object_directory {
 	 * This is a temporary object store, so there is no need to
 	 * create new objects via rename.
 	 */
-	int is_temp;
+	int is_temp : 8;
+
+	/*
+	 * This object store is ephemeral, so there is no need to fsync.
+	 */
+	int will_destroy : 8;
 
 	/*
 	 * Path to the alternative object store. If this is a relative path,
@@ -64,6 +69,17 @@ void add_to_alternates_file(const char *dir);
  */
 void add_to_alternates_memory(const char *dir);
 
+/*
+ * Replace the current writable object directory with the specified temporary
+ * object directory; returns the former primary object directory.
+ */
+struct object_directory *set_temporary_primary_odb(const char *dir, int will_destroy);
+
+/*
+ * Restore a previous ODB replaced by set_temporary_primary_odb.
+ */
+void restore_primary_odb(struct object_directory *restore_odb, const char *old_path);
+
 /*
  * Populate and return the loose object cache array corresponding to the
  * given object ID.
@@ -74,6 +90,9 @@ struct oidtree *odb_loose_cache(struct object_directory *odb,
 /* Empty the loose object cache for the specified object directory. */
 void odb_clear_loose_cache(struct object_directory *odb);
 
+/* Clear and free the specified object directory */
+void free_object_directory(struct object_directory *odb);
+
 struct packed_git {
 	struct hashmap_entry packmap_ent;
 	struct packed_git *next;
diff --git a/object.c b/object.c
index 4e85955a941..98635bc4043 100644
--- a/object.c
+++ b/object.c
@@ -513,7 +513,7 @@ struct raw_object_store *raw_object_store_new(void)
 	return o;
 }
 
-static void free_object_directory(struct object_directory *odb)
+void free_object_directory(struct object_directory *odb)
 {
 	free(odb->path);
 	odb_clear_loose_cache(odb);
diff --git a/tmp-objdir.c b/tmp-objdir.c
index b8d880e3626..366ffe28511 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -11,6 +11,7 @@
 struct tmp_objdir {
 	struct strbuf path;
 	struct strvec env;
+	struct object_directory *prev_odb;
 };
 
 /*
@@ -38,6 +39,9 @@ static int tmp_objdir_destroy_1(struct tmp_objdir *t, int on_signal)
 	if (t == the_tmp_objdir)
 		the_tmp_objdir = NULL;
 
+	if (!on_signal && t->prev_odb)
+		restore_primary_odb(t->prev_odb, t->path.buf);
+
 	/*
 	 * This may use malloc via strbuf_grow(), but we should
 	 * have pre-grown t->path sufficiently so that this
@@ -52,6 +56,7 @@ static int tmp_objdir_destroy_1(struct tmp_objdir *t, int on_signal)
 	 */
 	if (!on_signal)
 		tmp_objdir_free(t);
+
 	return err;
 }
 
@@ -121,7 +126,7 @@ static int setup_tmp_objdir(const char *root)
 	return ret;
 }
 
-struct tmp_objdir *tmp_objdir_create(void)
+struct tmp_objdir *tmp_objdir_create(const char *prefix)
 {
 	static int installed_handlers;
 	struct tmp_objdir *t;
@@ -129,11 +134,16 @@ struct tmp_objdir *tmp_objdir_create(void)
 	if (the_tmp_objdir)
 		BUG("only one tmp_objdir can be used at a time");
 
-	t = xmalloc(sizeof(*t));
+	t = xcalloc(1, sizeof(*t));
 	strbuf_init(&t->path, 0);
 	strvec_init(&t->env);
 
-	strbuf_addf(&t->path, "%s/incoming-XXXXXX", get_object_directory());
+	/*
+	 * Use a string starting with tmp_ so that the builtin/prune.c code
+	 * can recognize any stale objdirs left behind by a crash and delete
+	 * them.
+	 */
+	strbuf_addf(&t->path, "%s/tmp_objdir-%s-XXXXXX", get_object_directory(), prefix);
 
 	/*
 	 * Grow the strbuf beyond any filename we expect to be placed in it.
@@ -269,6 +279,15 @@ int tmp_objdir_migrate(struct tmp_objdir *t)
 	if (!t)
 		return 0;
 
+
+
+	if (t->prev_odb) {
+		if (the_repository->objects->odb->will_destroy)
+			BUG("migrating an ODB that was marked for destruction");
+		restore_primary_odb(t->prev_odb, t->path.buf);
+		t->prev_odb = NULL;
+	}
+
 	strbuf_addbuf(&src, &t->path);
 	strbuf_addstr(&dst, get_object_directory());
 
@@ -292,3 +311,10 @@ void tmp_objdir_add_as_alternate(const struct tmp_objdir *t)
 {
 	add_to_alternates_memory(t->path.buf);
 }
+
+void tmp_objdir_replace_primary_odb(struct tmp_objdir *t, int will_destroy)
+{
+	if (t->prev_odb)
+		BUG("the primary object database is already replaced");
+	t->prev_odb = set_temporary_primary_odb(t->path.buf, will_destroy);
+}
diff --git a/tmp-objdir.h b/tmp-objdir.h
index b1e45b4c75d..75754cbfba6 100644
--- a/tmp-objdir.h
+++ b/tmp-objdir.h
@@ -10,7 +10,7 @@
  *
  * Example:
  *
- *	struct tmp_objdir *t = tmp_objdir_create();
+ *	struct tmp_objdir *t = tmp_objdir_create("incoming");
  *	if (!run_command_v_opt_cd_env(cmd, 0, NULL, tmp_objdir_env(t)) &&
  *	    !tmp_objdir_migrate(t))
  *		printf("success!\n");
@@ -22,9 +22,10 @@
 struct tmp_objdir;
 
 /*
- * Create a new temporary object directory; returns NULL on failure.
+ * Create a new temporary object directory with the specified prefix;
+ * returns NULL on failure.
  */
-struct tmp_objdir *tmp_objdir_create(void);
+struct tmp_objdir *tmp_objdir_create(const char *prefix);
 
 /*
  * Return a list of environment strings, suitable for use with
@@ -51,4 +52,11 @@ int tmp_objdir_destroy(struct tmp_objdir *);
  */
 void tmp_objdir_add_as_alternate(const struct tmp_objdir *);
 
+/*
+ * Replaces the main object store in the current process with the temporary
+ * object directory and makes the former main object store an alternate.
+ * If will_destroy is nonzero, the object directory may not be migrated.
+ */
+void tmp_objdir_replace_primary_odb(struct tmp_objdir *, int will_destroy);
+
 #endif /* TMP_OBJDIR_H */

From patchwork Tue Sep 28 23:32:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524061
Date: Tue, 28 Sep 2021 23:32:45 +0000
Subject: [PATCH v7 3/9] bulk-checkin: rename 'state' variable and separate 'plugged' boolean
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

From: Neeraj Singh

Preparation for adding bulk-fsync to the bulk-checkin.c infrastructure.

* Rename 'state' variable to 'bulk_checkin_state', since we will later
  be adding 'bulk_fsync_state'. This also makes the variable easier to
  find in the debugger, since the name is more distinctive.

* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
  static variable. Doing this avoids resetting the variable in
  finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
  seem to unintentionally disable the plugging functionality the first
  time a new packfile must be created due to packfile size limits. While
  disabling the plugging state only results in suboptimal behavior for
  the current code, it would be fatal for the bulk-fsync functionality
  later in this patch series.

Signed-off-by: Neeraj Singh
---
 bulk-checkin.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 8785b2ac806..6ae18401e04 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -10,9 +10,9 @@
 #include "packfile.h"
 #include "object-store.h"
 
-static struct bulk_checkin_state {
-	unsigned plugged:1;
+static int bulk_checkin_plugged;
 
+static struct bulk_checkin_state {
 	char *pack_tmp_name;
 	struct hashfile *f;
 	off_t offset;
@@ -21,7 +21,7 @@ static struct bulk_checkin_state {
 	struct pack_idx_entry **written;
 	uint32_t alloc_written;
 	uint32_t nr_written;
-} state;
+} bulk_checkin_state;
 
 static void finish_tmp_packfile(struct strbuf *basename,
 				const char *pack_tmp_name,
@@ -277,21 +277,23 @@ int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
 {
-	int status = deflate_to_pack(&state, oid, fd, size, type,
+	int status = deflate_to_pack(&bulk_checkin_state, oid, fd, size, type,
 				     path, flags);
-	if (!state.plugged)
-		finish_bulk_checkin(&state);
+	if (!bulk_checkin_plugged)
+		finish_bulk_checkin(&bulk_checkin_state);
 	return status;
 }
 
 void plug_bulk_checkin(void)
 {
-	state.plugged = 1;
+	assert(!bulk_checkin_plugged);
+	bulk_checkin_plugged = 1;
 }
 
 void unplug_bulk_checkin(void)
 {
-	state.plugged = 0;
-	if (state.f)
-		finish_bulk_checkin(&state);
+	assert(bulk_checkin_plugged);
+	bulk_checkin_plugged = 0;
+	if (bulk_checkin_state.f)
+		finish_bulk_checkin(&bulk_checkin_state);
 }

From patchwork Tue Sep 28 23:32:46 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524073
Message-Id: <55556bb3e90225263fa39d8813d1831eda135eb5.1632871972.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:46 +0000
Subject: [PATCH v7 4/9] core.fsyncobjectfiles: batched disk flushes
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

When adding many objects to a repo with core.fsyncObjectFiles set to
true, the cost of fsync'ing each object file can become prohibitive.

One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. Fortunately, Windows
and macOS offer mechanisms to write data from the filesystem page cache
without initiating a hardware flush. Linux has the sync_file_range API,
which issues a pagecache writeback request reliably after version 5.2.

This patch introduces a new 'core.fsyncObjectFiles = batch' option that
batches up hardware flushes. It hooks into the bulk-checkin plugging and
unplugging functionality and takes advantage of tmp-objdir.

When the new mode is enabled, do the following for each new object:
1. Create the object in a tmp-objdir.
2. Issue a pagecache writeback request and wait for it to complete.

At the end of the entire transaction when unplugging bulk checkin:
1. Issue an fsync against a dummy file to flush the hardware writeback
   cache, which should by now have processed the tmp-objdir writes.
2. Rename all of the tmp-objdir files to their final names.
3. When updating the index and/or refs, we assume that Git will issue
   another fsync internal to that operation. This is not the case today,
   but may be a good extension to those components.

On a filesystem with a single journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns.

This change also updates the macOS code to trigger a real hardware flush
via fcntl(fd, F_FULLFSYNC) when fsync_or_die is called. Previously, on
macOS there was no guarantee of durability since a simple fsync(2) call
does not flush any hardware caches.
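
Illustration only, not part of the patch: the per-object and
end-of-transaction steps described in this message can be sketched in
plain POSIX C. This is a minimal sketch under stated assumptions:
batch_write_objects and writeout_only are hypothetical names, plain
fsync() stands in for both the writeout-only request and the hardware
flush, and the object names obj0, obj1, ... are made up. It shows only
the ordering of operations, not Git's implementation or its performance
benefit.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for a writeout-only request; here just fsync(). */
static int writeout_only(int fd)
{
	return fsync(fd);
}

/*
 * Hypothetical helper: create obj0..obj<count-1> in tmp_dir, flush once
 * through a dummy file, then rename every object into final_dir.
 * Returns 0 on success and -1 on the first failure.
 */
static int batch_write_objects(const char *tmp_dir, const char *final_dir,
			       int count)
{
	char src[4096], dst[4096];
	int i, fd;

	/* Per object: create it in the temporary directory and write it out. */
	for (i = 0; i < count; i++) {
		snprintf(src, sizeof(src), "%s/obj%d", tmp_dir, i);
		fd = open(src, O_WRONLY | O_CREAT | O_EXCL, 0444);
		if (fd < 0)
			return -1;
		if (write(fd, "data\n", 5) != 5 || writeout_only(fd) < 0) {
			close(fd);
			return -1;
		}
		if (close(fd) < 0)
			return -1;
	}

	/* End of transaction, step 1: one flush against a dummy file. */
	snprintf(src, sizeof(src), "%s/bulk_fsync_dummy", tmp_dir);
	fd = open(src, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return -1;
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	unlink(src);

	/* Step 2: rename the objects to their final names. */
	for (i = 0; i < count; i++) {
		snprintf(src, sizeof(src), "%s/obj%d", tmp_dir, i);
		snprintf(dst, sizeof(dst), "%s/obj%d", final_dir, i);
		if (rename(src, dst) < 0)
			return -1;
	}
	return 0;
}
```

A caller would pass two existing directories on the same filesystem,
for example batch_write_objects(tmp, final, 500). Keeping the renames
after the single flush is the point of the design: no object becomes
visible under its final name before its data has been flushed.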

_Performance numbers_:

Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.
	  This number is from a patch later in the series.

Adding 500 files to the repo with 'git add'. Times reported in seconds.

core.fsyncObjectFiles | Linux | Mac   | Windows
----------------------|-------|-------|--------
                false | 0.06  |  0.35 | 0.61
                true  | 1.88  | 11.18 | 2.47
                batch | 0.15  |  0.41 | 1.53

Signed-off-by: Neeraj Singh
---
 Documentation/config/core.txt | 29 ++++++++++++---
 Makefile                      |  6 +++
 bulk-checkin.c                | 70 +++++++++++++++++++++++++++++++++++
 bulk-checkin.h                |  2 +
 cache.h                       |  8 +++-
 config.c                      |  7 +++-
 config.mak.uname              |  1 +
 configure.ac                  |  8 ++++
 environment.c                 |  2 +-
 git-compat-util.h             |  7 ++++
 object-file.c                 | 12 +++++-
 tmp-objdir.c                  |  2 -
 wrapper.c                     | 44 ++++++++++++++++++++++
 write-or-die.c                |  2 +-
 14 files changed, 187 insertions(+), 13 deletions(-)

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index c04f62a54a1..200b4d9f06e 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -548,12 +548,29 @@ core.whitespace::
   errors. The default tab width is 8. Allowed values are 1 to 63.
 
 core.fsyncObjectFiles::
-	This boolean will enable 'fsync()' when writing object files.
-+
-This is a total waste of time and effort on a filesystem that orders
-data writes properly, but can be useful for filesystems that do not use
-journalling (traditional UNIX filesystems) or that only journal metadata
-and not file contents (OS X's HFS+, or Linux ext3 with "data=writeback").
+	A value indicating the level of effort Git will expend in
+	trying to make objects added to the repo durable in the event
+	of an unclean system shutdown. This setting currently only
+	controls loose objects in the object store, so updates to any
+	refs or the index may not be equally durable.
++
+* `false` allows data to remain in file system caches according to
+  operating system policy, whence it may be lost if the system loses power
+  or crashes.
+* `true` triggers a data integrity flush for each loose object added to the
+  object store. This is the safest setting that is likely to ensure durability
+  across all operating systems and file systems that honor the 'fsync' system
+  call. However, this setting comes with a significant performance cost on
+  common hardware. Git does not currently fsync parent directories for
+  newly-added files, so some filesystems may still allow data to be lost on
+  system crash.
+* `batch` enables an experimental mode that uses interfaces available in some
+  operating systems to write loose object data with a minimal set of FLUSH
+  CACHE (or equivalent) commands sent to the storage controller. If the
+  operating system interfaces are not available, this mode behaves the same as
+  `true`. This mode is expected to be as safe as `true` on macOS for repos
+  stored on HFS+ or APFS filesystems and on Windows for repos stored on NTFS or
+  ReFS.
 
 core.preloadIndex::
 	Enable parallel index preload for operations like 'git diff'
diff --git a/Makefile b/Makefile
index a9f9b689f0c..313b3dc7cd6 100644
--- a/Makefile
+++ b/Makefile
@@ -406,6 +406,8 @@ all::
 #
 # Define HAVE_CLOCK_MONOTONIC if your platform has CLOCK_MONOTONIC.
 #
+# Define HAVE_SYNC_FILE_RANGE if your platform has sync_file_range.
+#
 # Define NEEDS_LIBRT if your platform requires linking with librt (glibc version
 # before 2.17) for clock_gettime and CLOCK_MONOTONIC.
 #
@@ -1874,6 +1876,10 @@ ifdef HAVE_CLOCK_MONOTONIC
 	BASIC_CFLAGS += -DHAVE_CLOCK_MONOTONIC
 endif
 
+ifdef HAVE_SYNC_FILE_RANGE
+	BASIC_CFLAGS += -DHAVE_SYNC_FILE_RANGE
+endif
+
 ifdef NEEDS_LIBRT
 	EXTLIBS += -lrt
 endif
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 6ae18401e04..e6c830f9c0f 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -3,14 +3,20 @@
  */
 #include "cache.h"
 #include "bulk-checkin.h"
+#include "lockfile.h"
 #include "repository.h"
 #include "csum-file.h"
 #include "pack.h"
 #include "strbuf.h"
+#include "string-list.h"
+#include "tmp-objdir.h"
 #include "packfile.h"
 #include "object-store.h"
 
 static int bulk_checkin_plugged;
+static int needs_batch_fsync;
+
+static struct tmp_objdir *bulk_fsync_objdir;
 
 static struct bulk_checkin_state {
 	char *pack_tmp_name;
@@ -79,6 +85,34 @@ clear_exit:
 	reprepare_packed_git(the_repository);
 }
 
+/*
+ * Cleanup after batch-mode fsync_object_files.
+ */
+static void do_batch_fsync(void)
+{
+	/*
+	 * Issue a full hardware flush against a temporary file to ensure
+	 * that all objects are durable before any renames occur. The code in
+	 * fsync_loose_object_bulk_checkin has already issued a writeout
+	 * request, but it has not flushed any writeback cache in the storage
+	 * hardware.
+	 */
+
+	if (needs_batch_fsync) {
+		struct strbuf temp_path = STRBUF_INIT;
+		struct tempfile *temp;
+
+		strbuf_addf(&temp_path, "%s/bulk_fsync_XXXXXX", get_object_directory());
+		temp = xmks_tempfile(temp_path.buf);
+		fsync_or_die(get_tempfile_fd(temp), get_tempfile_path(temp));
+		delete_tempfile(&temp);
+		strbuf_release(&temp_path);
+	}
+
+	if (bulk_fsync_objdir)
+		tmp_objdir_migrate(bulk_fsync_objdir);
+}
+
 static int already_written(struct bulk_checkin_state *state, struct object_id *oid)
 {
 	int i;
@@ -273,6 +307,26 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
 	return 0;
 }
 
+void fsync_loose_object_bulk_checkin(int fd)
+{
+	assert(fsync_object_files == FSYNC_OBJECT_FILES_BATCH);
+
+	/*
+	 * If we have a plugged bulk checkin, we issue a call that
+	 * cleans the filesystem page cache but avoids a hardware flush
+	 * command. Later on we will issue a single hardware flush
+	 * before the renames, as part of do_batch_fsync.
+	 */
+	if (bulk_checkin_plugged &&
+	    git_fsync(fd, FSYNC_WRITEOUT_ONLY) >= 0) {
+		assert(the_repository->objects->odb->is_temp);
+		if (!needs_batch_fsync)
+			needs_batch_fsync = 1;
+	} else {
+		fsync_or_die(fd, "loose object file");
+	}
+}
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags)
@@ -287,6 +341,20 @@ int index_bulk_checkin(struct object_id *oid,
 void plug_bulk_checkin(void)
 {
 	assert(!bulk_checkin_plugged);
+
+	/*
+	 * Create a temporary object directory if the current
+	 * object directory is not already temporary.
+	 */
+	if (fsync_object_files == FSYNC_OBJECT_FILES_BATCH &&
+	    !the_repository->objects->odb->is_temp) {
+		bulk_fsync_objdir = tmp_objdir_create("bulk-fsync");
+		if (!bulk_fsync_objdir)
+			die(_("Could not create temporary object directory for core.fsyncobjectfiles=batch"));
+
+		tmp_objdir_replace_primary_odb(bulk_fsync_objdir, 0);
+	}
+
 	bulk_checkin_plugged = 1;
 }
 
@@ -296,4 +364,6 @@ void unplug_bulk_checkin(void)
 	bulk_checkin_plugged = 0;
 	if (bulk_checkin_state.f)
 		finish_bulk_checkin(&bulk_checkin_state);
+
+	do_batch_fsync();
 }
diff --git a/bulk-checkin.h b/bulk-checkin.h
index b26f3dc3b74..08f292379b6 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -6,6 +6,8 @@
 
 #include "cache.h"
 
+void fsync_loose_object_bulk_checkin(int fd);
+
 int index_bulk_checkin(struct object_id *oid,
 		       int fd, size_t size, enum object_type type,
 		       const char *path, unsigned flags);
diff --git a/cache.h b/cache.h
index f6295f3b048..1ed8137b5e6 100644
--- a/cache.h
+++ b/cache.h
@@ -984,7 +984,13 @@ void reset_shared_repository(void);
 extern int read_replace_refs;
 extern char *git_replace_ref_base;
 
-extern int fsync_object_files;
+enum fsync_object_files_mode {
+	FSYNC_OBJECT_FILES_OFF,
+	FSYNC_OBJECT_FILES_ON,
+	FSYNC_OBJECT_FILES_BATCH
+};
+
+extern enum fsync_object_files_mode fsync_object_files;
 extern int core_preload_index;
 extern int precomposed_unicode;
 extern int protect_hfs;
diff --git a/config.c b/config.c
index 2edf835262f..8315d020eeb 100644
--- a/config.c
+++ b/config.c
@@ -1506,7 +1506,12 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 	}
 
 	if (!strcmp(var, "core.fsyncobjectfiles")) {
-		fsync_object_files = git_config_bool(var, value);
+		if (value && !strcmp(value, "batch"))
+			fsync_object_files = FSYNC_OBJECT_FILES_BATCH;
+		else if (git_config_bool(var, value))
+			fsync_object_files = FSYNC_OBJECT_FILES_ON;
+		else
+			fsync_object_files = FSYNC_OBJECT_FILES_OFF;
 		return 0;
 	}
 
diff --git a/config.mak.uname b/config.mak.uname
index 76516aaa9a5..e6d482fbcc6 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -53,6 +53,7 @@ ifeq ($(uname_S),Linux)
 	HAVE_CLOCK_MONOTONIC = YesPlease
 	# -lrt is needed for clock_gettime on glibc <= 2.16
 	NEEDS_LIBRT = YesPlease
+	HAVE_SYNC_FILE_RANGE = YesPlease
 	HAVE_GETDELIM = YesPlease
 	SANE_TEXT_GREP=-a
 	FREAD_READS_DIRECTORIES = UnfortunatelyYes
diff --git a/configure.ac b/configure.ac
index 031e8d3fee8..c711037d625 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1090,6 +1090,14 @@ AC_COMPILE_IFELSE([CLOCK_MONOTONIC_SRC],
 	[AC_MSG_RESULT([no])
 	HAVE_CLOCK_MONOTONIC=])
 GIT_CONF_SUBST([HAVE_CLOCK_MONOTONIC])
+
+#
+# Define HAVE_SYNC_FILE_RANGE=YesPlease if sync_file_range is available.
+GIT_CHECK_FUNC(sync_file_range,
+	[HAVE_SYNC_FILE_RANGE=YesPlease],
+	[HAVE_SYNC_FILE_RANGE])
+GIT_CONF_SUBST([HAVE_SYNC_FILE_RANGE])
+
 #
 # Define NO_SETITIMER if you don't have setitimer.
 GIT_CHECK_FUNC(setitimer,
diff --git a/environment.c b/environment.c
index 30fca67e6d6..371a73c1e30 100644
--- a/environment.c
+++ b/environment.c
@@ -42,7 +42,7 @@ const char *git_attributes_file;
 const char *git_hooks_path;
 int zlib_compression_level = Z_BEST_SPEED;
 int pack_compression_level = Z_DEFAULT_COMPRESSION;
-int fsync_object_files;
+enum fsync_object_files_mode fsync_object_files;
 size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
 size_t delta_base_cache_limit = 96 * 1024 * 1024;
diff --git a/git-compat-util.h b/git-compat-util.h
index 7c99eef6612..9daee873782 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1213,6 +1213,13 @@ __attribute__((format (printf, 1, 2))) NORETURN
 void BUG(const char *fmt, ...);
 #endif
 
+enum fsync_action {
+	FSYNC_WRITEOUT_ONLY,
+	FSYNC_HARDWARE_FLUSH
+};
+
+int git_fsync(int fd, enum fsync_action action);
+
 /*
  * Preserves errno, prints a message, but gives no warning for ENOENT.
  * Returns 0 on success, which includes trying to unlink an object that does
diff --git a/object-file.c b/object-file.c
index 1a3ad558c45..8ea1348f0db 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1932,8 +1932,18 @@ int hash_object_file(const struct git_hash_algo *algo, const void *buf,
 static void close_loose_object(int fd)
 {
 	if (!the_repository->objects->odb->will_destroy) {
-		if (fsync_object_files)
+		switch (fsync_object_files) {
+		case FSYNC_OBJECT_FILES_OFF:
+			break;
+		case FSYNC_OBJECT_FILES_ON:
 			fsync_or_die(fd, "loose object file");
+			break;
+		case FSYNC_OBJECT_FILES_BATCH:
+			fsync_loose_object_bulk_checkin(fd);
+			break;
+		default:
+			BUG("Invalid fsync_object_files mode.");
+		}
 	}
 
 	if (close(fd) != 0)
diff --git a/tmp-objdir.c b/tmp-objdir.c
index 366ffe28511..c26cb5eafee 100644
--- a/tmp-objdir.c
+++ b/tmp-objdir.c
@@ -279,8 +279,6 @@ int tmp_objdir_migrate(struct tmp_objdir *t)
 	if (!t)
 		return 0;
-
-
-
 	if (t->prev_odb) {
 		if (the_repository->objects->odb->will_destroy)
 			BUG("migrating and ODB that was marked for destruction");
diff --git a/wrapper.c b/wrapper.c
index 7c6586af321..bb4f9f043ce 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -540,6 +540,50 @@ int xmkstemp_mode(char *filename_template, int mode)
 	return fd;
 }
 
+int git_fsync(int fd, enum fsync_action action)
+{
+	switch (action) {
+	case FSYNC_WRITEOUT_ONLY:
+
+#ifdef __APPLE__
+		/*
+		 * On macOS, fsync just causes filesystem cache writeback but does not
+		 * flush hardware caches.
+		 */
+		return fsync(fd);
+#endif
+
+#ifdef HAVE_SYNC_FILE_RANGE
+		/*
+		 * On Linux 2.6.17 and above, sync_file_range is the way to issue
+		 * a writeback without a hardware flush. An offset of 0 and size of 0
+		 * indicates writeout of the entire file and the wait flags ensure that all
+		 * dirty data is written to the disk (potentially in a disk-side cache)
+		 * before we continue.
+		 */
+
+		return sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE |
+						 SYNC_FILE_RANGE_WRITE |
+						 SYNC_FILE_RANGE_WAIT_AFTER);
+#endif
+
+		errno = ENOSYS;
+		return -1;
+
+	case FSYNC_HARDWARE_FLUSH:
+
+#ifdef __APPLE__
+		return fcntl(fd, F_FULLFSYNC);
+#else
+		return fsync(fd);
+#endif
+
+	default:
+		BUG("unexpected git_fsync(%d) call", action);
+	}
+
+}
+
 static int warn_if_unremovable(const char *op, const char *file, int rc)
 {
 	int err;
diff --git a/write-or-die.c b/write-or-die.c
index 0b1ec8190b6..cc8291d9794 100644
--- a/write-or-die.c
+++ b/write-or-die.c
@@ -57,7 +57,7 @@ void fprintf_or_die(FILE *f, const char *fmt, ...)
 
 void fsync_or_die(int fd, const char *msg)
 {
-	while (fsync(fd) < 0) {
+	while (git_fsync(fd, FSYNC_HARDWARE_FLUSH) < 0) {
 		if (errno != EINTR)
 			die_errno("fsync error on '%s'", msg);
 	}

From patchwork Tue Sep 28 23:32:47 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524067
Message-Id: <6c33e79d6f0499bc24e4b01c44bb290556f61eef.1632871972.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:47 +0000
Subject: [PATCH v7 5/9] core.fsyncobjectfiles: add windows support for batch mode
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

This commit adds a win32 implementation for fsync_no_flush that is
called from git_fsync. The 'NtFlushBuffersFileEx' function being called
is available since Windows 8. If the function is not available, we
return -1 and Git falls back to doing a full fsync.

The operating system is told to flush file data only, without issuing a
hardware flush primitive. A later full fsync will cause the metadata log
to be flushed and then the disk cache to be flushed on NTFS and ReFS.
Other filesystems will treat this as a full flush operation.

I added a new file here for this system call so as not to conflict with
downstream changes in the git-for-windows repository related to fscache.
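
Illustration only, not part of the patch: the "return -1 and fall back
to a full fsync" behaviour described in this message can be sketched
portably. Everything named here (flush_with_fallback, writeout_fn,
writeout_unavailable, enum flush_used) is hypothetical and exists only
for this sketch; it assumes POSIX fsync() as the full flush and does not
call any Windows API.

```c
#include <errno.h>
#include <unistd.h>

/* Records which path was taken, so a caller can tell them apart. */
enum flush_used { USED_WRITEOUT_ONLY, USED_FULL_FSYNC };

/* Hypothetical hook type for a platform writeout-only call. */
typedef int (*writeout_fn)(int fd);

/* Simulates a platform where the writeout-only primitive is missing. */
static int writeout_unavailable(int fd)
{
	(void)fd;
	errno = ENOSYS;
	return -1;
}

/*
 * Try the writeout-only hook first. If there is no hook or it fails,
 * fall back to a full fsync(). Returns 0 on success, -1 on error.
 */
static int flush_with_fallback(int fd, writeout_fn writeout,
			       enum flush_used *used)
{
	if (writeout && writeout(fd) >= 0) {
		*used = USED_WRITEOUT_ONLY;
		return 0;
	}
	*used = USED_FULL_FSYNC;
	return fsync(fd);
}
```

Calling flush_with_fallback(fd, writeout_unavailable, &used) on an open
file takes the fsync() path, which mirrors what happens on a system
where the lazily loaded function cannot be resolved.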

Signed-off-by: Neeraj Singh
---
 compat/mingw.h                      |  3 +++
 compat/win32/flush.c                | 28 ++++++++++++++++++++++++++++
 config.mak.uname                    |  2 ++
 contrib/buildsystems/CMakeLists.txt |  3 ++-
 wrapper.c                           |  4 ++++
 5 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 compat/win32/flush.c

diff --git a/compat/mingw.h b/compat/mingw.h
index c9a52ad64a6..6074a3d3ced 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -329,6 +329,9 @@ int mingw_getpagesize(void);
 #define getpagesize mingw_getpagesize
 #endif
 
+int win32_fsync_no_flush(int fd);
+#define fsync_no_flush win32_fsync_no_flush
+
 struct rlimit {
 	unsigned int rlim_cur;
 };
diff --git a/compat/win32/flush.c b/compat/win32/flush.c
new file mode 100644
index 00000000000..75324c24ee7
--- /dev/null
+++ b/compat/win32/flush.c
@@ -0,0 +1,28 @@
+#include "../../git-compat-util.h"
+#include <winternl.h>
+#include "lazyload.h"
+
+int win32_fsync_no_flush(int fd)
+{
+	IO_STATUS_BLOCK io_status;
+
+#define FLUSH_FLAGS_FILE_DATA_ONLY 1
+
+	DECLARE_PROC_ADDR(ntdll.dll, NTSTATUS, NtFlushBuffersFileEx,
+			  HANDLE FileHandle, ULONG Flags, PVOID Parameters, ULONG ParameterSize,
+			  PIO_STATUS_BLOCK IoStatusBlock);
+
+	if (!INIT_PROC_ADDR(NtFlushBuffersFileEx)) {
+		errno = ENOSYS;
+		return -1;
+	}
+
+	memset(&io_status, 0, sizeof(io_status));
+	if (NtFlushBuffersFileEx((HANDLE)_get_osfhandle(fd), FLUSH_FLAGS_FILE_DATA_ONLY,
+				NULL, 0, &io_status)) {
+		errno = EINVAL;
+		return -1;
+	}
+
+	return 0;
+}
diff --git a/config.mak.uname b/config.mak.uname
index e6d482fbcc6..34c93314a50 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -451,6 +451,7 @@ endif
 	CFLAGS =
 	BASIC_CFLAGS = -nologo -I. -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE
 	COMPAT_OBJS = compat/msvc.o compat/winansi.o \
+		compat/win32/flush.o \
 		compat/win32/path-utils.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
 		compat/win32/trace2_win32_process_info.o \
@@ -626,6 +627,7 @@ ifneq (,$(findstring MINGW,$(uname_S)))
 	COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
 	COMPAT_OBJS += compat/mingw.o compat/winansi.o \
 		compat/win32/trace2_win32_process_info.o \
+		compat/win32/flush.o \
 		compat/win32/path-utils.o \
 		compat/win32/pthread.o compat/win32/syslog.o \
 		compat/win32/dirent.o
diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 171b4124afe..b573a5ee122 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -261,7 +261,8 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
 				NOGDI OBJECT_CREATION_MODE=1 __USE_MINGW_ANSI_STDIO=0
 				USE_NED_ALLOCATOR OVERRIDE_STRDUP MMAP_PREVENTS_DELETE USE_WIN32_MMAP
 				UNICODE _UNICODE HAVE_WPGMPTR ENSURE_MSYSTEM_IS_SET)
-	list(APPEND compat_SOURCES compat/mingw.c compat/winansi.c compat/win32/path-utils.c
+	list(APPEND compat_SOURCES compat/mingw.c compat/winansi.c
+		compat/win32/flush.c compat/win32/path-utils.c
 		compat/win32/pthread.c compat/win32mmap.c compat/win32/syslog.c
 		compat/win32/trace2_win32_process_info.c compat/win32/dirent.c
 		compat/nedmalloc/nedmalloc.c compat/strdup.c)
diff --git a/wrapper.c b/wrapper.c
index bb4f9f043ce..1a1e2fba9c9 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -567,6 +567,10 @@ int git_fsync(int fd, enum fsync_action action)
 						 SYNC_FILE_RANGE_WAIT_AFTER);
 #endif
 
+#ifdef fsync_no_flush
+		return fsync_no_flush(fd);
+#endif
+
 		errno = ENOSYS;
 		return -1;
 

From patchwork Tue Sep 28 23:32:48 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524069
Message-Id: <09dbff1004ed9b7ae501f7b1cfc91cd743195fd6.1632871972.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:48 +0000
Subject: [PATCH v7 6/9] update-index: use the bulk-checkin infrastructure
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

The update-index functionality is used internally by 'git stash push' to
set up the internal stashed commit.

This change enables bulk-checkin for update-index infrastructure to
speed up adding new objects to the object database by leveraging the
pack functionality and the new bulk-fsync functionality.

There is some risk with this change, since under batch fsync, the object
files will not be available until the update-index is entirely complete.
This usage is unlikely, since any tool invoking update-index and
expecting to see objects would have to synchronize with the update-index
process after passing it a file path.

Signed-off-by: Neeraj Singh
---
 builtin/update-index.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 187203e8bb5..dc7368bb1ee 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -5,6 +5,7 @@
  */
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "lockfile.h"
 #include "quote.h"
@@ -1088,6 +1089,9 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 
 	the_index.updated_skipworktree = 1;
 
+	/* we might be adding many objects to the object database */
+	plug_bulk_checkin();
+
 	/*
 	 * Custom copy of parse_options() because we want to handle
 	 * filename arguments as they come.
@@ -1168,6 +1172,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	/* by now we must have added all of the new objects */
+	unplug_bulk_checkin();
 	if (split_index > 0) {
 		if (git_config_get_split_index() == 0)
 			warning(_("core.splitIndex is set to false; "

From patchwork Tue Sep 28 23:32:49 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524075
Message-Id: <1eced9f9f9a882749eb4210908e6561c51c48d87.1632871972.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:49 +0000
Subject: [PATCH v7 7/9] unpack-objects: use the bulk-checkin infrastructure
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transferred data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling bulk-checkin when unpacking objects, we can take advantage
of batched fsyncs.

Signed-off-by: Neeraj Singh
---
 builtin/unpack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 4a9466295ba..51eb4f7b531 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "object-store.h"
 #include "object.h"
@@ -503,10 +504,12 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
+	plug_bulk_checkin();
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
 		display_progress(progress, i + 1);
 	}
+	unplug_bulk_checkin();
 	stop_progress(&progress);
 
 	if (delta_list)

From patchwork Tue Sep 28 23:32:50 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)"
X-Patchwork-Id: 12524071
Message-Id: <7aaa08d5f5fc2b3c83bc48de8dac25dea90d956b.1632871972.git.gitgitgadget@gmail.com>
Date: Tue, 28 Sep 2021 23:32:50 +0000
Subject: [PATCH v7 8/9] core.fsyncobjectfiles: tests for batch mode
To: git@vger.kernel.org
Cc: Neeraj-Personal, Johannes Schindelin, Jeff King, Jeff Hostetler,
 Christoph Hellwig, Ævar Arnfjörð Bjarmason, "Randall S. Becker",
 Bagas Sanjaya, "Neeraj K. Singh", Neeraj Singh
X-Mailing-List: git@vger.kernel.org
From: Neeraj Singh

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added
due to it already being in the repo.

Signed-off-by: Neeraj Singh
---
 t/lib-unique-files.sh  | 36 ++++++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 20 ++++++++++++++++++++
 t/t3903-stash.sh       | 14 ++++++++++++++
 t/t5300-pack-object.sh | 30 +++++++++++++++++++-----------
 4 files changed, 89 insertions(+), 11 deletions(-)
 create mode 100644 t/lib-unique-files.sh

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..a7de4ca8512
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,36 @@
+# Helper to create files with unique contents
+
+
+# Create multiple files with unique contents. Takes the number of
+# directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#					 each in my_dir, all with unique
+#					 contents.
+ +test_create_unique_files() { + test "$#" -ne 3 && BUG "3 param" + + local dirs=$1 + local files=$2 + local basedir=$3 + local counter=0 + test_tick + local basedata=$test_tick + + + rm -rf $basedir + + for i in $(test_seq $dirs) + do + local dir=$basedir/dir$i + + mkdir -p "$dir" + for j in $(test_seq $files) + do + counter=$((counter + 1)) + echo "$basedata.$counter" >"$dir/file$j.txt" + done + done +} diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 4086e1ebbc9..36049a53ff7 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -7,6 +7,8 @@ test_description='Test of git add, including the -- option.' . ./test-lib.sh +. $TEST_DIRECTORY/lib-unique-files.sh + # Test the file mode "$1" of the file "$2" in the index. test_mode_in_index () { case "$(git ls-files -s "$2")" in @@ -33,6 +35,24 @@ test_expect_success \ 'Test that "git add -- -q" works' \ 'touch -- -q && git add -- -q' +test_expect_success 'git add: core.fsyncobjectfiles=batch' " + test_create_unique_files 2 4 fsync-files && + git -c core.fsyncobjectfiles=batch add -- ./fsync-files/ && + rm -f fsynced_files && + git ls-files --stage fsync-files/ > fsynced_files && + test_line_count = 8 fsynced_files && + awk -- '{print \$2}' fsynced_files | xargs -n1 git cat-file -e +" + +test_expect_success 'git update-index: core.fsyncobjectfiles=batch' " + test_create_unique_files 2 4 fsync-files2 && + find fsync-files2 ! 
-type d -print | xargs git -c core.fsyncobjectfiles=batch update-index --add -- && + rm -f fsynced_files2 && + git ls-files --stage fsync-files2/ > fsynced_files2 && + test_line_count = 8 fsynced_files2 && + awk -- '{print \$2}' fsynced_files2 | xargs -n1 git cat-file -e +" + test_expect_success \ 'git add: Test that executable bit is not used if core.filemode=0' \ 'git config core.filemode 0 && diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 873aa56e359..2fc819e5584 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -9,6 +9,7 @@ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh +. $TEST_DIRECTORY/lib-unique-files.sh diff_cmp () { for i in "$1" "$2" @@ -1293,6 +1294,19 @@ test_expect_success 'stash handles skip-worktree entries nicely' ' git rev-parse --verify refs/stash:A.t ' +test_expect_success 'stash with core.fsyncobjectfiles=batch' " + test_create_unique_files 2 4 fsync-files && + git -c core.fsyncobjectfiles=batch stash push -u -- ./fsync-files/ && + rm -f fsynced_files && + + # The files were untracked, so use the third parent, + # which contains the untracked files + git ls-tree -r stash^3 -- ./fsync-files/ > fsynced_files && + test_line_count = 8 fsynced_files && + awk -- '{print \$3}' fsynced_files | xargs -n1 git cat-file -e +" + + test_expect_success 'stash -c stash.useBuiltin=false warning ' ' expected="stash.useBuiltin support has been removed" && diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index e13a8842075..38663dc1393 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -162,23 +162,23 @@ test_expect_success 'pack-objects with bogus arguments' ' check_unpack () { test_when_finished "rm -rf git2" && - git init --bare git2 && - git -C git2 unpack-objects -n <"$1".pack && - git -C git2 unpack-objects <"$1".pack && - (cd .git && find objects -type f -print) | - while read path - do - cmp git2/$path .git/$path || { - echo $path differs. 
- return 1 - } - done + git $2 init --bare git2 && + ( + git $2 -C git2 unpack-objects -n <"$1".pack && + git $2 -C git2 unpack-objects <"$1".pack && + git $2 -C git2 cat-file --batch-check="%(objectname)" + ) <obj-list >current && + cmp obj-list current } test_expect_success 'unpack without delta' ' check_unpack test-1-${packname_1} ' +test_expect_success 'unpack without delta (core.fsyncobjectfiles=batch)' ' + check_unpack test-1-${packname_1} "-c core.fsyncobjectfiles=batch" +' + test_expect_success 'pack with REF_DELTA' ' packname_2=$(git pack-objects --progress test-2 <obj-list 2>stderr) && check_deltas stderr -gt 0 @@ -188,6 +188,10 @@ test_expect_success 'unpack with REF_DELTA' ' check_unpack test-2-${packname_2} ' +test_expect_success 'unpack with REF_DELTA (core.fsyncobjectfiles=batch)' ' + check_unpack test-2-${packname_2} "-c core.fsyncobjectfiles=batch" +' + test_expect_success 'pack with OFS_DELTA' ' packname_3=$(git pack-objects --progress --delta-base-offset test-3 \ <obj-list 2>stderr) && @@ -198,6 +202,10 @@ test_expect_success 'unpack with OFS_DELTA' ' check_unpack test-3-${packname_3} ' +test_expect_success 'unpack with OFS_DELTA (core.fsyncobjectfiles=batch)' ' + check_unpack test-3-${packname_3} "-c core.fsyncobjectfiles=batch" +' + test_expect_success 'compare delta flavors' ' perl -e '\'' defined($_ = -s $_) or die for @ARGV; From patchwork Tue Sep 28 23:32:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Neeraj Singh (WINDOWS-SFS)" X-Patchwork-Id: 12524077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 245ECC433FE for ; Tue, 28 Sep 2021 23:33:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 036B26136F for ; Tue, 28 Sep 2021 23:33:09 +0000 (UTC) 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243324AbhI1Xet (ORCPT ); Tue, 28 Sep 2021 19:34:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243296AbhI1Xek (ORCPT ); Tue, 28 Sep 2021 19:34:40 -0400 Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B2A33C061760 for ; Tue, 28 Sep 2021 16:32:59 -0700 (PDT) Received: by mail-wr1-x431.google.com with SMTP id d21so1058528wra.12 for ; Tue, 28 Sep 2021 16:32:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=THp3oyYMQZ0IyvYbWDVtqI55HStotyepThNgwaAMuqY=; b=OcTgiBD6bHtPiNjBeCeCE2qzsKY8Ds6qbTAi28kkLi4/Ub4ebWQcxn4rNwe/K3C80H seSaV3QU2tzbAC9sIxBTHOCQdfb0kxxa1Vdv2aYC/sghmuyTIKv5XGJnzywNTIERxMjm ZWrM+c43/84Tu79enXppMTJIzM4NnUoXJslw6gP5mQ3GeMkVl7O07mmOlLauaK0dMVAA uU+kkYGcZLoHbXKl2rqn4ceMOnHFbKkLHRg9Rgnj3qnED93q00GZMbqMkwkAOoZwIY2Z 1v3iiAQrxqHr6O+qVAPrCAveTV+IRJ8z4LDF/jsz6W+wYlTbdobbBwKoCC3Bur1hIsoz YcSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=THp3oyYMQZ0IyvYbWDVtqI55HStotyepThNgwaAMuqY=; b=4vU+9sJLvYyFMFzetiStI8UkNeeDC2WK9QKZ8OgppciavAS25zCuAuizQWjAQRS6BF nW1nNHCplUyVqA3Okdru5xSATCS6buSrQX1gKVay/lXWIUsB/5nDs1oi+Yu/js9ZT+FK B/aHJ2F48pv33oYW5IyXrTzxu+XogqW2r3juAyJfxxfIO4zHOqJZQ7pCOaEiP6PpvB/F pYSAnZzj5JW/vnhwWK97upGe4lQE09XjsYBgj3L78lo+7a+PPD32eJkbSBsnZ95pdlN7 xgbftrcr5++yHadUyKep9z9vEB8CWAuTV8l/1pn78pDnmu1B6C7eMuqwg2hqBnTHtHVr c0jg== X-Gm-Message-State: AOAM532NzMciBTqHGVQOfPp7DWdwzE0GF2O6vESRnfUz6FJ9JB0/OTfS PnH1yQwMA05kCD7/IhZHuNrA9ZNLGrQ= X-Google-Smtp-Source: 
ABdhPJyJOqmmzsZFOuD5B+ttleB6ureIhKQxCnvcWRw+4m5uEI9htzBwLhHLdsU9MSottwmhstT3/A== X-Received: by 2002:a05:6000:1546:: with SMTP id 6mr3188593wry.305.1632871978412; Tue, 28 Sep 2021 16:32:58 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q126sm4033476wma.10.2021.09.28.16.32.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 Sep 2021 16:32:58 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Tue, 28 Sep 2021 23:32:51 +0000 Subject: [PATCH v7 9/9] core.fsyncobjectfiles: performance tests for add and stash Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Neeraj-Personal , Johannes Schindelin , Jeff King , Jeff Hostetler , Christoph Hellwig , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , "Randall S. Becker" , Bagas Sanjaya , "Neeraj K. Singh" , Neeraj Singh Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Neeraj Singh From: Neeraj Singh Add a basic performance test for "git add" and "git stash" of a lot of new objects with various fsync settings. Signed-off-by: Neeraj Singh --- t/perf/p3700-add.sh | 43 ++++++++++++++++++++++++++++++++++++++++ t/perf/p3900-stash.sh | 46 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+) create mode 100755 t/perf/p3700-add.sh create mode 100755 t/perf/p3900-stash.sh diff --git a/t/perf/p3700-add.sh b/t/perf/p3700-add.sh new file mode 100755 index 00000000000..e93c08a2e70 --- /dev/null +++ b/t/perf/p3700-add.sh @@ -0,0 +1,43 @@ +#!/bin/sh +# +# This test measures the performance of adding new files to the object database +# and index. The test was originally added to measure the effect of the +# core.fsyncObjectFiles=batch mode, which is why we are testing different values +# of that setting explicitly and creating a lot of unique objects. + +test_description="Tests performance of add" + +. ./perf-lib.sh + +. 
$TEST_DIRECTORY/lib-unique-files.sh + +test_perf_default_repo +test_checkout_worktree + +dir_count=10 +files_per_dir=50 +total_files=$((dir_count * files_per_dir)) + +# We need to create the files each time we run the perf test, but +# we do not want to measure the cost of creating the files, so run +# the test once. +if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1 +then + echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2 + GIT_PERF_REPEAT_COUNT=1 +fi + +for m in false true batch +do + test_expect_success "create the files for core.fsyncObjectFiles=$m" ' + git reset --hard && + # create files across directories + test_create_unique_files $dir_count $files_per_dir files + ' + + test_perf "add $total_files files (core.fsyncObjectFiles=$m)" " + git -c core.fsyncobjectfiles=$m add files + " +done + +test_done diff --git a/t/perf/p3900-stash.sh b/t/perf/p3900-stash.sh new file mode 100755 index 00000000000..c9fcd0c03eb --- /dev/null +++ b/t/perf/p3900-stash.sh @@ -0,0 +1,46 @@ +#!/bin/sh +# +# This test measures the performance of adding new files to the object database +# and index. The test was originally added to measure the effect of the +# core.fsyncObjectFiles=batch mode, which is why we are testing different values +# of that setting explicitly and creating a lot of unique objects. + +test_description="Tests performance of stash" + +. ./perf-lib.sh + +. $TEST_DIRECTORY/lib-unique-files.sh + +test_perf_default_repo +test_checkout_worktree + +dir_count=10 +files_per_dir=50 +total_files=$((dir_count * files_per_dir)) + +# We need to create the files each time we run the perf test, but +# we do not want to measure the cost of creating the files, so run +# the test once. 
+if test "${GIT_PERF_REPEAT_COUNT-1}" -ne 1 +then + echo "warning: Setting GIT_PERF_REPEAT_COUNT=1" >&2 + GIT_PERF_REPEAT_COUNT=1 +fi + +for m in false true batch +do + test_expect_success "create the files for core.fsyncObjectFiles=$m" ' + git reset --hard && + # create files across directories + test_create_unique_files $dir_count $files_per_dir files + ' + + # We only stash files in the 'files' subdirectory since + # the perf test infrastructure creates files in the + # current working directory that need to be preserved + test_perf "stash $total_files files (core.fsyncObjectFiles=$m)" " + git -c core.fsyncobjectfiles=$m stash push -u -- files + " +done + +test_done
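
For readers who want to try the unique-files idea from t/lib-unique-files.sh outside the git test harness, here is a minimal standalone sketch. It is not part of the series. The name create_unique_files is hypothetical, and since test_tick, test_seq and BUG are unavailable outside test-lib.sh, it assumes that "$(date +%s).$$" is unique enough as a seed and uses plain while loops. Because POSIX sh has no "local", the variables it sets are global.

```shell
#!/bin/sh
# Hypothetical standalone variant of test_create_unique_files.
# Usage: create_unique_files <dirs> <files-per-dir> <basedir>
create_unique_files () {
	if test "$#" -ne 3
	then
		echo "usage: create_unique_files <dirs> <files-per-dir> <basedir>" >&2
		return 1
	fi

	dirs=$1
	files=$2
	basedir=$3
	counter=0

	# Assumption: epoch seconds plus the shell's PID stand in for the
	# test_tick value that the real helper uses as its seed.
	basedata="$(date +%s).$$"

	# Start from an empty tree, as the helper in the patch does.
	rm -rf "$basedir"

	i=1
	while test "$i" -le "$dirs"
	do
		dir=$basedir/dir$i
		mkdir -p "$dir"

		j=1
		while test "$j" -le "$files"
		do
			# The running counter makes every file's content distinct.
			counter=$((counter + 1))
			echo "$basedata.$counter" >"$dir/file$j.txt"
			j=$((j + 1))
		done
		i=$((i + 1))
	done
}
```

Calling "create_unique_files 10 50 files" would produce the same 10 x 50 layout that the perf scripts request from the real helper.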