From patchwork Wed Jun 19 07:41:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13703487
Date: Wed, 19 Jun 2024 15:41:01 +0800
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine
Subject: [GSoC][PATCH v4 1/7] fsck: add refs check interfaces to interact
 with fsck error levels

git-fsck(1) focuses on checking the consistency of the object database.
It relies on "fsck_options" to interact with fsck error levels. However,
"fsck_options" was designed for object checks and contains many fields
that relate only to the object database.

In order to add ref checks, create two new structs named
"fsck_refs_options" and "fsck_objs_options". Move the object-related
fields from "fsck_options" to "fsck_objs_options". "fsck_options" now
has three groups of members:

1. The "fsck_refs_options".
2. The "fsck_objs_options".
3. The settings common to both refs and objects.

Because the common settings stay in "fsck_options", the setup process
can be reused without any code changes.

Also add related macros to align with the current code. Because some
fields moved from "fsck_options" to "fsck_objs_options", change the
affected code to use "fsck_options.objs_options" instead of
"fsck_options" itself.

The static function "report" in "fsck.c" reports problems related to
the object database and cannot be reused for refs. Provide a
"fsck_refs_report" function to integrate the fsck error levels into the
reference consistency check.
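As a rough illustration of the layout described above, here is a
simplified, standalone sketch. It is not the code in this patch: every
"sketch_" name is hypothetical, the object-side struct is reduced to a
placeholder, and the severity is passed in directly, whereas the real
"fsck_refs_report" looks it up from a message id. It only shows the
nested options struct, the nested default macros, and how FATAL and
INFO are folded into ERROR and WARN before the ref callback runs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified severity levels; not git's real enum. */
enum sketch_msg_type {
	SKETCH_IGNORE,
	SKETCH_INFO,
	SKETCH_WARN,
	SKETCH_ERROR,
	SKETCH_FATAL
};

struct sketch_options;

/* Ref callback: receives a ref name instead of an object id. */
typedef int (*sketch_refs_error)(struct sketch_options *o,
				 const char *name,
				 enum sketch_msg_type msg_type,
				 const char *message);

struct sketch_refs_options {
	sketch_refs_error error_func;
};

/* Placeholder for walk, skiplist, gitmodules_found and friends. */
struct sketch_objs_options {
	int placeholder;
};

/* Two specialised parts plus the settings shared by both. */
struct sketch_options {
	struct sketch_refs_options refs_options;
	struct sketch_objs_options objs_options;
	unsigned strict:1;
};

static int sketch_refs_error_function(struct sketch_options *o,
				      const char *name,
				      enum sketch_msg_type msg_type,
				      const char *message)
{
	(void)o;
	if (msg_type == SKETCH_WARN) {
		fprintf(stderr, "warning: %s: %s\n", name, message);
		return 0;
	}
	fprintf(stderr, "error: %s: %s\n", name, message);
	return 1;
}

/* Nested defaults, mirroring the macro layering in the patch. */
#define SKETCH_REFS_OPTIONS_DEFAULT { \
	.error_func = sketch_refs_error_function, \
}
#define SKETCH_OPTIONS_DEFAULT { \
	.refs_options = SKETCH_REFS_OPTIONS_DEFAULT, \
}

static int sketch_refs_report(struct sketch_options *o,
			      const char *name,
			      enum sketch_msg_type msg_type,
			      const char *message)
{
	assert(o && o->refs_options.error_func);

	if (msg_type == SKETCH_IGNORE)
		return 0;

	/* The callback only ever sees WARN or ERROR. */
	if (msg_type == SKETCH_FATAL)
		msg_type = SKETCH_ERROR;
	else if (msg_type == SKETCH_INFO)
		msg_type = SKETCH_WARN;

	return o->refs_options.error_func(o, name, msg_type, message);
}
```

With this shape, a caller can swap only "refs_options.error_func" and
leave the object-side callbacks and the shared "strict" bit untouched.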

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 builtin/fsck.c           |   8 +--
 builtin/index-pack.c     |   2 +-
 builtin/mktag.c          |   5 +-
 builtin/unpack-objects.c |   2 +-
 fetch-pack.c             |   8 +--
 fsck.c                   | 113 +++++++++++++++++++++++++++++----------
 fsck.h                   |  79 ++++++++++++++++++++-------
 object-file.c            |   2 +-
 8 files changed, 161 insertions(+), 58 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index d13a226c2e..a5b82e228c 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -233,7 +233,7 @@ static void mark_unreachable_referents(const struct object_id *oid)
 		object_as_type(obj, type, 0);
 	}
 
-	options.walk = mark_used;
+	options.objs_options.walk = mark_used;
 	fsck_walk(obj, NULL, &options);
 	if (obj->type == OBJ_TREE)
 		free_tree_buffer((struct tree *)obj);
@@ -936,9 +936,9 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
 	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
-	fsck_walk_options.walk = mark_object;
-	fsck_obj_options.walk = mark_used;
-	fsck_obj_options.error_func = fsck_error_func;
+	fsck_walk_options.objs_options.walk = mark_object;
+	fsck_obj_options.objs_options.walk = mark_used;
+	fsck_obj_options.objs_options.error_func = fsck_error_func;
 	if (check_strict)
 		fsck_obj_options.strict = 1;
 
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 856428fef9..8824d7b8a9 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1746,7 +1746,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 		usage(index_pack_usage);
 
 	disable_replace_refs();
-	fsck_options.walk = mark_link;
+	fsck_options.objs_options.walk = mark_link;
 
 	reset_pack_idx_option(&opts);
 	opts.flags |= WRITE_REV;
diff --git a/builtin/mktag.c b/builtin/mktag.c
index 4767f1a97e..e88475f009 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -91,8 +91,9 @@ int cmd_mktag(int argc, const char **argv, const char *prefix)
 	if (strbuf_read(&buf, 0, 0) < 0)
 		die_errno(_("could not read from stdin"));
 
-	fsck_options.error_func = mktag_fsck_error_func;
-	fsck_set_msg_type_from_ids(&fsck_options, FSCK_MSG_EXTRA_HEADER_ENTRY,
+	fsck_options.objs_options.error_func = mktag_fsck_error_func;
+	fsck_set_msg_type_from_ids(&fsck_options,
+				   FSCK_MSG_EXTRA_HEADER_ENTRY,
 				   FSCK_WARN);
 	/* config might set fsck.extraHeaderEntry=* again */
 	git_config(git_fsck_config, &fsck_options);
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index f1c85a00ae..368810a2a1 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -239,7 +239,7 @@ static int check_object(struct object *obj, enum object_type type,
 		die("Whoops! Cannot find object '%s'", oid_to_hex(&obj->oid));
 	if (fsck_object(obj, obj_buf->buffer, obj_buf->size, &fsck_options))
 		die("fsck error in packed object");
-	fsck_options.walk = check_object;
+	fsck_options.objs_options.walk = check_object;
 	if (fsck_walk(obj, NULL, &fsck_options))
 		die("Error on reachable objects of %s", oid_to_hex(&obj->oid));
 	write_cached_object(obj, obj_buf);
diff --git a/fetch-pack.c b/fetch-pack.c
index eba9e420ea..148f9bd371 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1220,7 +1220,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	} else
 		alternate_shallow_file = NULL;
 	if (get_pack(args, fd, pack_lockfiles, NULL, sought, nr_sought,
-		     &fsck_options.gitmodules_found))
+		     &fsck_options.objs_options.gitmodules_found))
 		die(_("git fetch-pack: fetch failed."));
 	if (fsck_finish(&fsck_options))
 		die("fsck failed");
@@ -1780,7 +1780,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 
 			if (get_pack(args, fd, pack_lockfiles,
 				     packfile_uris.nr ? &index_pack_args : NULL,
-				     sought, nr_sought, &fsck_options.gitmodules_found))
+				     sought, nr_sought,
+				     &fsck_options.objs_options.gitmodules_found))
 				die(_("git fetch-pack: fetch failed."));
 			do_check_stateless_delimiter(args->stateless_rpc, &reader);
 
@@ -1823,7 +1824,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 
 		packname[the_hash_algo->hexsz] = '\0';
 
-		parse_gitmodules_oids(cmd.out, &fsck_options.gitmodules_found);
+		parse_gitmodules_oids(cmd.out,
+				      &fsck_options.objs_options.gitmodules_found);
 
 		close(cmd.out);
 
diff --git a/fsck.c b/fsck.c
index e193930ae7..4949d2e110 100644
--- a/fsck.c
+++ b/fsck.c
@@ -203,7 +203,8 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 		if (!strcmp(buf, "skiplist")) {
 			if (equal == len)
 				die("skiplist requires a path");
-			oidset_parse_file(&options->skiplist, buf + equal + 1);
+			oidset_parse_file(&options->objs_options.skiplist,
+					  buf + equal + 1);
 			buf += len + 1;
 			continue;
 		}
@@ -220,7 +221,7 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values)
 static int object_on_skiplist(struct fsck_options *opts,
 			      const struct object_id *oid)
 {
-	return opts && oid && oidset_contains(&opts->skiplist, oid);
+	return opts && oid && oidset_contains(&opts->objs_options.skiplist, oid);
 }
 
 __attribute__((format (printf, 5, 6)))
@@ -249,8 +250,8 @@ static int report(struct fsck_options *options,
 
 	va_start(ap, fmt);
 	strbuf_vaddf(&sb, fmt, ap);
-	result = options->error_func(options, oid, object_type,
-				     msg_type, msg_id, sb.buf);
+	result = options->objs_options.error_func(options, oid, object_type,
+						  msg_type, msg_id, sb.buf);
 	strbuf_release(&sb);
 	va_end(ap);
 
@@ -259,20 +260,20 @@ static int report(struct fsck_options *options,
 
 void fsck_enable_object_names(struct fsck_options *options)
 {
-	if (!options->object_names)
-		options->object_names = kh_init_oid_map();
+	if (!options->objs_options.object_names)
+		options->objs_options.object_names = kh_init_oid_map();
 }
 
 const char *fsck_get_object_name(struct fsck_options *options,
 				 const struct object_id *oid)
 {
 	khiter_t pos;
-	if (!options->object_names)
+	if (!options->objs_options.object_names)
 		return NULL;
-	pos = kh_get_oid_map(options->object_names, *oid);
-	if (pos >= kh_end(options->object_names))
+	pos = kh_get_oid_map(options->objs_options.object_names, *oid);
+	if (pos >= kh_end(options->objs_options.object_names))
 		return NULL;
-	return kh_value(options->object_names, pos);
+	return kh_value(options->objs_options.object_names, pos);
 }
 
 void fsck_put_object_name(struct fsck_options *options,
@@ -284,15 +285,17 @@ void fsck_put_object_name(struct fsck_options *options,
 	khiter_t pos;
 	int hashret;
 
-	if (!options->object_names)
+	if (!options->objs_options.object_names)
 		return;
 
-	pos = kh_put_oid_map(options->object_names, *oid, &hashret);
+	pos = kh_put_oid_map(options->objs_options.object_names,
+			     *oid, &hashret);
 	if (!hashret)
 		return;
 	va_start(ap, fmt);
 	strbuf_vaddf(&buf, fmt, ap);
-	kh_value(options->object_names, pos) = strbuf_detach(&buf, NULL);
+	kh_value(options->objs_options.object_names, pos) =
+		strbuf_detach(&buf, NULL);
 	va_end(ap);
 }
 
@@ -318,6 +321,7 @@ const char *fsck_describe_object(struct fsck_options *options,
 
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
+	struct fsck_objs_options *objs_options = &options->objs_options;
 	struct tree_desc desc;
 	struct name_entry entry;
 	int res = 0;
@@ -342,14 +346,14 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op
 			if (name && obj)
 				fsck_put_object_name(options, &entry.oid, "%s%s/",
 						     name, entry.path);
-			result = options->walk(obj, OBJ_TREE, data, options);
+			result = objs_options->walk(obj, OBJ_TREE, data, options);
 		}
 		else if (S_ISREG(entry.mode) || S_ISLNK(entry.mode)) {
 			obj = (struct object *)lookup_blob(the_repository, &entry.oid);
 			if (name && obj)
 				fsck_put_object_name(options, &entry.oid, "%s%s",
 						     name, entry.path);
-			result = options->walk(obj, OBJ_BLOB, data, options);
+			result = objs_options->walk(obj, OBJ_BLOB, data, options);
 		}
 		else {
 			result = error("in tree %s: entry %s has bad mode %.6o",
@@ -366,6 +370,7 @@ static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *op
 
 static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_options *options)
 {
+	struct fsck_objs_options *objs_options = &options->objs_options;
 	int counter = 0, generation = 0, name_prefix_len = 0;
 	struct commit_list *parents;
 	int res;
@@ -380,8 +385,8 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio
 		fsck_put_object_name(options, get_commit_tree_oid(commit),
 				     "%s:", name);
 
-	result = options->walk((struct object *) repo_get_commit_tree(the_repository, commit),
-			       OBJ_TREE, data, options);
+	result = objs_options->walk((struct object *) repo_get_commit_tree(the_repository, commit),
+				    OBJ_TREE, data, options);
 	if (result < 0)
 		return result;
 	res = result;
@@ -423,7 +428,8 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio
 			else
 				fsck_put_object_name(options, oid, "%s^", name);
 		}
-		result = options->walk((struct object *)parents->item, OBJ_COMMIT, data, options);
+		result = objs_options->walk((struct object *)parents->item,
+					    OBJ_COMMIT, data, options);
 		if (result < 0)
 			return result;
 		if (!res)
@@ -436,12 +442,13 @@ static int fsck_walk_commit(struct commit *commit, void *data, struct fsck_optio
 static int fsck_walk_tag(struct tag *tag, void *data, struct fsck_options *options)
 {
 	const char *name = fsck_get_object_name(options, &tag->object.oid);
+	struct fsck_objs_options *objs_options = &options->objs_options;
 
 	if (parse_tag(tag))
 		return -1;
 	if (name)
 		fsck_put_object_name(options, &tag->tagged->oid, "%s", name);
-	return options->walk(tag->tagged, OBJ_ANY, data, options);
+	return objs_options->walk(tag->tagged, OBJ_ANY, data, options);
 }
 
 int fsck_walk(struct object *obj, void *data, struct fsck_options *options)
@@ -598,6 +605,7 @@ static int fsck_tree(const struct object_id *tree_oid,
 	unsigned o_mode;
 	const char *o_name;
 	struct name_stack df_dup_candidates = { NULL };
+	struct fsck_objs_options *objs_options = &options->objs_options;
 
 	if (init_tree_desc_gently(&desc, tree_oid, buffer, size,
 				  TREE_DESC_RAW_MODES)) {
@@ -628,7 +636,7 @@ static int fsck_tree(const struct object_id *tree_oid,
 
 		if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) {
 			if (!S_ISLNK(mode))
-				oidset_insert(&options->gitmodules_found,
+				oidset_insert(&objs_options->gitmodules_found,
 					      entry_oid);
 			else
 				retval += report(options,
@@ -639,7 +647,7 @@ static int fsck_tree(const struct object_id *tree_oid,
 
 		if (is_hfs_dotgitattributes(name) || is_ntfs_dotgitattributes(name)) {
 			if (!S_ISLNK(mode))
-				oidset_insert(&options->gitattributes_found,
+				oidset_insert(&objs_options->gitattributes_found,
 					      entry_oid);
 			else
 				retval += report(options, tree_oid, OBJ_TREE,
@@ -666,7 +674,7 @@ static int fsck_tree(const struct object_id *tree_oid,
 				has_dotgit |= is_ntfs_dotgit(backslash);
 				if (is_ntfs_dotgitmodules(backslash)) {
 					if (!S_ISLNK(mode))
-						oidset_insert(&options->gitmodules_found,
+						oidset_insert(&objs_options->gitmodules_found,
 							      entry_oid);
 					else
 						retval += report(options, tree_oid, OBJ_TREE,
@@ -1102,16 +1110,17 @@ static int fsck_gitmodules_fn(const char *var, const char *value,
 static int fsck_blob(const struct object_id *oid, const char *buf,
 		     unsigned long size, struct fsck_options *options)
 {
+	struct fsck_objs_options *objs_options = &options->objs_options;
 	int ret = 0;
 
 	if (object_on_skiplist(options, oid))
 		return 0;
 
-	if (oidset_contains(&options->gitmodules_found, oid)) {
+	if (oidset_contains(&objs_options->gitmodules_found, oid)) {
 		struct config_options config_opts = { 0 };
 		struct fsck_gitmodules_data data;
 
-		oidset_insert(&options->gitmodules_done, oid);
+		oidset_insert(&objs_options->gitmodules_done, oid);
 
 		if (!buf) {
 			/*
@@ -1137,13 +1146,14 @@ static int fsck_blob(const struct object_id *oid, const char *buf,
 		ret |= data.ret;
 	}
 
-	if (oidset_contains(&options->gitattributes_found, oid)) {
+	if (oidset_contains(&objs_options->gitattributes_found, oid)) {
 		const char *ptr;
 
-		oidset_insert(&options->gitattributes_done, oid);
+		oidset_insert(&objs_options->gitattributes_done, oid);
 
 		if (!buf || size > ATTR_MAX_FILE_SIZE) {
 			/*
+
 			 * A missing buffer here is a sign that the caller found the
 			 * blob too gigantic to load into memory. Let's just consider
 			 * that an error.
@@ -1197,6 +1207,20 @@ int fsck_buffer(const struct object_id *oid, enum object_type type,
 		      type);
 }
 
+int fsck_refs_error_function(struct fsck_options *o UNUSED,
+			     const char *name,
+			     enum fsck_msg_type msg_type,
+			     enum fsck_msg_id msg_id UNUSED,
+			     const char *message)
+{
+	if (msg_type == FSCK_WARN) {
+		warning("%s: %s", name, message);
+		return 0;
+	}
+	error("%s: %s", name, message);
+	return 1;
+}
+
 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid,
 			enum object_type object_type UNUSED,
@@ -1255,18 +1279,51 @@ static int fsck_blobs(struct oidset *blobs_found, struct oidset *blobs_done,
 
 int fsck_finish(struct fsck_options *options)
 {
+	struct fsck_objs_options *objs_options = &options->objs_options;
 	int ret = 0;
 
-	ret |= fsck_blobs(&options->gitmodules_found, &options->gitmodules_done,
+	ret |= fsck_blobs(&objs_options->gitmodules_found,
+			  &objs_options->gitmodules_done,
 			  FSCK_MSG_GITMODULES_MISSING, FSCK_MSG_GITMODULES_BLOB,
 			  options, ".gitmodules");
-	ret |= fsck_blobs(&options->gitattributes_found, &options->gitattributes_done,
+	ret |= fsck_blobs(&objs_options->gitattributes_found,
+			  &objs_options->gitattributes_done,
 			  FSCK_MSG_GITATTRIBUTES_MISSING, FSCK_MSG_GITATTRIBUTES_BLOB,
 			  options, ".gitattributes");
 
 	return ret;
 }
 
+int fsck_refs_report(struct fsck_options *o,
+		     const char *name,
+		     enum fsck_msg_id msg_id,
+		     const char *fmt, ...)
+{
+	va_list ap;
+	struct strbuf sb = STRBUF_INIT;
+	enum fsck_msg_type msg_type = fsck_msg_type(msg_id, o);
+	int ret = 0;
+
+	if (msg_type == FSCK_IGNORE)
+		return 0;
+
+	if (msg_type == FSCK_FATAL)
+		msg_type = FSCK_ERROR;
+	else if (msg_type == FSCK_INFO)
+		msg_type = FSCK_WARN;
+
+	prepare_msg_ids();
+	strbuf_addf(&sb, "%s: ", msg_id_info[msg_id].camelcased);
+
+	va_start(ap, fmt);
+	strbuf_vaddf(&sb, fmt, ap);
+	ret = o->refs_options.error_func(o, name, msg_type, msg_id, sb.buf);
+	strbuf_release(&sb);
+	va_end(ap);
+
+	return ret;
+}
+
 int git_fsck_config(const char *var, const char *value,
 		    const struct config_context *ctx, void *cb)
 {
diff --git a/fsck.h b/fsck.h
index 6085a384f6..0391dffbb0 100644
--- a/fsck.h
+++ b/fsck.h
@@ -103,6 +103,21 @@ void fsck_set_msg_type(struct fsck_options *options,
 void fsck_set_msg_types(struct fsck_options *options, const char *values);
 int is_valid_msg_type(const char *msg_id, const char *msg_type);
 
+/*
+ * callback function for fsck refs and reflogs.
+ */
+typedef int (*fsck_refs_error)(struct fsck_options *o,
+			       const char *name,
+			       enum fsck_msg_type msg_type,
+			       enum fsck_msg_id msg_id,
+			       const char *message);
+
+int fsck_refs_error_function(struct fsck_options *o,
+			     const char *name,
+			     enum fsck_msg_type msg_type,
+			     enum fsck_msg_id msg_id,
+			     const char *message);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -115,10 +130,12 @@ typedef int (*fsck_walk_func)(struct object *obj, enum object_type object_type,
 			      void *data, struct fsck_options *options);
 
 /* callback for fsck_object, type is FSCK_ERROR or FSCK_WARN */
-typedef int (*fsck_error)(struct fsck_options *o,
-			  const struct object_id *oid, enum object_type object_type,
-			  enum fsck_msg_type msg_type, enum fsck_msg_id msg_id,
-			  const char *message);
+typedef int (*fsck_obj_error)(struct fsck_options *o,
+			      const struct object_id *oid,
+			      enum object_type object_type,
+			      enum fsck_msg_type msg_type,
+			      enum fsck_msg_id msg_id,
+			      const char *message);
 
 int fsck_error_function(struct fsck_options *o,
 			const struct object_id *oid, enum object_type object_type,
@@ -131,11 +148,17 @@ int fsck_error_cb_print_missing_gitmodules(struct fsck_options *o,
 					   enum fsck_msg_id msg_id,
 					   const char *message);
 
-struct fsck_options {
+struct fsck_refs_options {
+	fsck_refs_error error_func;
+};
+
+#define FSCK_REFS_OPTIONS_DEFAULT { \
+	.error_func = fsck_refs_error_function, \
+}
+
+struct fsck_objs_options {
 	fsck_walk_func walk;
-	fsck_error error_func;
-	unsigned strict:1;
-	enum fsck_msg_type *msg_type;
+	fsck_obj_error error_func;
 	struct oidset skiplist;
 	struct oidset gitmodules_found;
 	struct oidset gitmodules_done;
@@ -144,29 +167,43 @@ struct fsck_options {
 	kh_oid_map_t *object_names;
 };
 
-#define FSCK_OPTIONS_DEFAULT { \
+#define FSCK_OBJS_OPTIONS_DEFAULT { \
+	.error_func = fsck_error_function, \
 	.skiplist = OIDSET_INIT, \
 	.gitmodules_found = OIDSET_INIT, \
 	.gitmodules_done = OIDSET_INIT, \
 	.gitattributes_found = OIDSET_INIT, \
 	.gitattributes_done = OIDSET_INIT, \
-	.error_func = fsck_error_function \
 }
-#define FSCK_OPTIONS_STRICT { \
-	.strict = 1, \
+#define FSCK_OBJS_OPTIONS_MISSING_GITMODULES { \
+	.error_func = fsck_error_cb_print_missing_gitmodules, \
 	.gitmodules_found = OIDSET_INIT, \
 	.gitmodules_done = OIDSET_INIT, \
 	.gitattributes_found = OIDSET_INIT, \
 	.gitattributes_done = OIDSET_INIT, \
-	.error_func = fsck_error_function, \
+}
+
+struct fsck_options {
+	struct fsck_refs_options refs_options;
+	struct fsck_objs_options objs_options;
+	enum fsck_msg_type *msg_type;
+	unsigned strict:1,
+		 verbose:1;
+};
+
+#define FSCK_OPTIONS_DEFAULT { \
+	.refs_options = FSCK_REFS_OPTIONS_DEFAULT, \
+	.objs_options = FSCK_OBJS_OPTIONS_DEFAULT, \
+}
+#define FSCK_OPTIONS_STRICT { \
+	.refs_options = FSCK_REFS_OPTIONS_DEFAULT, \
+	.objs_options = FSCK_OBJS_OPTIONS_DEFAULT, \
+	.strict = 1, \
 }
 #define FSCK_OPTIONS_MISSING_GITMODULES { \
+	.refs_options = FSCK_REFS_OPTIONS_DEFAULT, \
+	.objs_options = FSCK_OBJS_OPTIONS_MISSING_GITMODULES, \
 	.strict = 1, \
-	.gitmodules_found = OIDSET_INIT, \
-	.gitmodules_done = OIDSET_INIT, \
-	.gitattributes_found = OIDSET_INIT, \
-	.gitattributes_done = OIDSET_INIT, \
-	.error_func = fsck_error_cb_print_missing_gitmodules, \
 }
 
 /* descend in all linked child objects
@@ -209,6 +246,12 @@ int fsck_tag_standalone(const struct object_id *oid, const char *buffer,
  */
 int fsck_finish(struct fsck_options *options);
 
+__attribute__((format (printf, 4, 5)))
+int fsck_refs_report(struct fsck_options *o,
+		     const char *name,
+		     enum fsck_msg_id msg_id,
+		     const char *fmt, ...);
+
 /*
  * Subsystem for storing human-readable names for each object.
  *
diff --git a/object-file.c b/object-file.c
index d3cf4b8b2e..b027d70725 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2510,7 +2510,7 @@ static int index_mem(struct index_state *istate,
 		struct fsck_options opts = FSCK_OPTIONS_DEFAULT;
 
 		opts.strict = 1;
-		opts.error_func = hash_format_check_report;
+		opts.objs_options.error_func = hash_format_check_report;
 		if (fsck_buffer(null_oid(), type, buf, size, &opts))
 			die(_("refusing to create malformed object"));
 		fsck_finish(&opts);

From patchwork Wed Jun 19 07:41:35 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13703488
Date: Wed, 19 Jun 2024 15:41:35 +0800
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine
Subject: [GSoC][PATCH v4 2/7] refs: set up ref consistency check
 infrastructure

The interfaces defined in "ref_storage_be" are grouped by meaning into
five parts:

1. The name and the initialization interfaces.
2. The ref transaction interfaces.
3. The ref internal interfaces (pack, rename and copy).
4. The ref filesystem interfaces.
5. The reflog related interfaces.

To stay consistent with git-fsck(1), add a new interface named
"fsck_fn" at the end of "ref_storage_be". It does not fit into any of
the five categories above, so add a blank line before it to set it
apart from the others.
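The dispatch this sets up can be pictured with the standalone sketch
below. It is only an illustration under simplifying assumptions: all
"sketch_" names are hypothetical, the backend table is reduced to a
name plus the new callback, and the "problems" counter is invented so
that the stub has something to report (the stubs in this patch simply
return 0). It shows a generic entry point calling through the backend
table, and a "files"-like backend forwarding to its "packed" store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for fsck_options and ref_store. */
struct sketch_fsck_options {
	int verbose;
};

struct sketch_ref_store;

typedef int sketch_fsck_fn(struct sketch_ref_store *refs,
			   struct sketch_fsck_options *o);

struct sketch_backend {
	const char *name;
	/* ... the five groups of existing callbacks would sit here ... */

	sketch_fsck_fn *fsck;
};

struct sketch_ref_store {
	const struct sketch_backend *be;
	struct sketch_ref_store *packed; /* only used by the files-like store */
	int problems;                    /* invented for this sketch */
};

/* Leaf backend: reports non-zero if the store recorded any problem. */
static int sketch_packed_fsck(struct sketch_ref_store *refs,
			      struct sketch_fsck_options *o)
{
	(void)o;
	return refs->problems ? 1 : 0;
}

/* Composite backend: forwards the check to its embedded packed store. */
static int sketch_files_fsck(struct sketch_ref_store *refs,
			     struct sketch_fsck_options *o)
{
	return refs->packed->be->fsck(refs->packed, o);
}

static const struct sketch_backend sketch_be_packed = {
	.name = "packed",

	.fsck = sketch_packed_fsck,
};

static const struct sketch_backend sketch_be_files = {
	.name = "files",

	.fsck = sketch_files_fsck,
};

/* Generic entry point: callers never name a backend directly. */
static int sketch_refs_fsck(struct sketch_ref_store *refs,
			    struct sketch_fsck_options *o)
{
	assert(refs && refs->be && refs->be->fsck);
	return refs->be->fsck(refs, o);
}
```

Keeping the callback in the backend table means a later backend only
has to fill in one more member to take part in the check.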

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 refs.c                  |  5 +++++
 refs.h                  |  8 ++++++++
 refs/debug.c            | 11 +++++++++++
 refs/files-backend.c    | 15 ++++++++++++++-
 refs/packed-backend.c   |  8 ++++++++
 refs/refs-internal.h    |  6 ++++++
 refs/reftable-backend.c |  8 ++++++++
 7 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index 0c1e45f5a9..cb3b8ec36a 100644
--- a/refs.c
+++ b/refs.c
@@ -316,6 +316,11 @@ int check_refname_format(const char *refname, int flags)
 	return check_or_sanitize_refname(refname, flags, NULL);
 }
 
+int refs_fsck(struct ref_store *refs, struct fsck_options *o)
+{
+	return refs->be->fsck(refs, o);
+}
+
 void sanitize_refname_component(const char *refname, struct strbuf *out)
 {
 	if (check_or_sanitize_refname(refname, REFNAME_ALLOW_ONELEVEL, out))
diff --git a/refs.h b/refs.h
index 756039a051..93fa23c7ee 100644
--- a/refs.h
+++ b/refs.h
@@ -3,6 +3,7 @@
 
 #include "commit.h"
 
+struct fsck_options;
 struct object_id;
 struct ref_store;
 struct repository;
@@ -547,6 +548,13 @@ int refs_for_each_reflog(struct ref_store *refs, each_reflog_fn fn, void *cb_dat
  */
 int check_refname_format(const char *refname, int flags);
 
+/*
+ * Check the reference database for consistency. Return 0 if refs and
+ * reflogs are consistent, and non-zero otherwise. The errors will be
+ * written to stderr.
+ */
+int refs_fsck(struct ref_store *refs, struct fsck_options *o);
+
 /*
  * Apply the rules from check_refname_format, but mutate the result until it
  * is acceptable, and place the result in "out".
diff --git a/refs/debug.c b/refs/debug.c
index 547d9245b9..45e2e784a0 100644
--- a/refs/debug.c
+++ b/refs/debug.c
@@ -419,6 +419,15 @@ static int debug_reflog_expire(struct ref_store *ref_store, const char *refname,
 	return res;
 }
 
+static int debug_fsck(struct ref_store *ref_store,
+		      struct fsck_options *o)
+{
+	struct debug_ref_store *drefs = (struct debug_ref_store *)ref_store;
+	int res = drefs->refs->be->fsck(drefs->refs, o);
+	trace_printf_key(&trace_refs, "fsck: %d\n", res);
+	return res;
+}
+
 struct ref_storage_be refs_be_debug = {
 	.name = "debug",
 	.init = NULL,
@@ -451,4 +460,6 @@ struct ref_storage_be refs_be_debug = {
 	.create_reflog = debug_create_reflog,
 	.delete_reflog = debug_delete_reflog,
 	.reflog_expire = debug_reflog_expire,
+
+	.fsck = debug_fsck,
 };
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4519b46171..ba9e57d1e9 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3405,6 +3405,17 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store,
 	return ret;
 }
 
+static int files_fsck(struct ref_store *ref_store,
+		      struct fsck_options *o)
+{
+	int ret;
+	struct files_ref_store *refs =
+		files_downcast(ref_store, REF_STORE_READ, "fsck");
+
+	ret = refs->packed_ref_store->be->fsck(refs->packed_ref_store, o);
+	return ret;
+}
+
 struct ref_storage_be refs_be_files = {
 	.name = "files",
 	.init = files_ref_store_init,
@@ -3431,5 +3442,7 @@ struct ref_storage_be refs_be_files = {
 	.reflog_exists = files_reflog_exists,
 	.create_reflog = files_create_reflog,
 	.delete_reflog = files_delete_reflog,
-	.reflog_expire = files_reflog_expire
+	.reflog_expire = files_reflog_expire,
+
+	.fsck = files_fsck,
 };
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index c4c1e36aa2..ad3e8fb1d1 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1733,6 +1733,12 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s
 	return empty_ref_iterator_begin();
 }
 
+static int packed_fsck(struct ref_store *ref_store,
+		       struct fsck_options *o)
+{
+	return 0;
+}
+
 struct ref_storage_be refs_be_packed = {
 	.name = "packed",
 	.init = packed_ref_store_init,
@@ -1760,4 +1766,6 @@ struct ref_storage_be refs_be_packed = {
 	.create_reflog = NULL,
 	.delete_reflog = NULL,
 	.reflog_expire = NULL,
+
+	.fsck = packed_fsck,
 };
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index cbcb6f9c36..280acb7f9e 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -4,6 +4,7 @@
 #include "refs.h"
 #include "iterator.h"
 
+struct fsck_options;
 struct ref_transaction;
 
 /*
@@ -650,6 +651,9 @@ typedef int read_raw_ref_fn(struct ref_store *ref_store, const char *refname,
 typedef int read_symbolic_ref_fn(struct ref_store *ref_store, const char *refname,
 				 struct strbuf *referent);
 
+typedef int fsck_fn(struct ref_store *ref_store,
+		    struct fsck_options *o);
+
 struct ref_storage_be {
 	const char *name;
 	ref_store_init_fn *init;
@@ -677,6 +681,8 @@ struct ref_storage_be {
 	create_reflog_fn *create_reflog;
 	delete_reflog_fn *delete_reflog;
 	reflog_expire_fn *reflog_expire;
+
+	fsck_fn *fsck;
 };
 
 extern struct ref_storage_be refs_be_files;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 3a75d63eb5..2b2630d47a 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -2281,6 +2281,12 @@ static int reftable_be_reflog_expire(struct ref_store *ref_store,
 	return ret;
 }
 
+static int reftable_be_fsck(struct ref_store *ref_store,
+			    struct fsck_options *o)
+{
+	return 0;
+}
+
 struct ref_storage_be refs_be_reftable = {
 	.name = "reftable",
 	.init = reftable_be_init,
@@ -2308,4 +2314,6 @@ struct ref_storage_be refs_be_reftable = {
 	.create_reflog = reftable_be_create_reflog,
 	.delete_reflog = reftable_be_delete_reflog,
 	.reflog_expire = reftable_be_reflog_expire,
+
+	.fsck = reftable_be_fsck,
 };

From patchwork Wed Jun 19 07:42:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13703490
h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=TGrvTYjj5qxu44bU63Q3wLCWgfp16TiJWXB8Q+bWEsM=; b=RamOyiOzQXUjCo2Am+C3epPwOC2zgY1kxxboz4MboUPzGQWkIY82OtOySD6+x16r8A TFMES60pW8dZwuLUpNmLSyWX2QCL2RGNdyTPbyHnZwmCQZ0qR7Z7rZxlybZjb+vanq6U QL2N5ev5/XnilwyzIV9aeDkiQG1mXsTwY71TmQAYlMTJBvSGcvN/iRhEcK8uUUjwbm64 IKAJLvVM01OrlLaazklPZjClOaOIcLEXMsiz2Eed8EGcvm9jyh3fBoStW/vhujhk4geq RWByzp680wdmkW4QyggB/IsDmyjFs6h2K0heIAVxndAyIdn0pOEuHxAoENt9YT+tcWUU w8kA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718782933; x=1719387733; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=TGrvTYjj5qxu44bU63Q3wLCWgfp16TiJWXB8Q+bWEsM=; b=eS8PVbKnie9pTeUCOvP27w6rswIBK4HmjlAeDpHM5yR47rwpHBOlWjPCBTNYc3bH0E Nejp4TBrcvUIwMf1hhzqWplweZMSBOsCXmE1rwijcj9DobpjRQuoRA4ygZ6gxz5Muzhg 2b7U2OOQCCp9UQTQ35nXiQl4sRfl2DYTkLRgWNJO+WzCIBwD0/jMFtbm/zXp+TcTBdEW mILd8Wv5kv0ivSpeNcsA6zNuYEOUqHPXziJ5HtCHSIuxo/LHUSEGxHEWX1MrFBl7UuAr 6mInlyepUG50HnUik42+9Y8GqQ3Zt/CJexwLaWqp4a89PX/zGaXndtlwZGv9Yw0Bpn62 tdEQ== X-Gm-Message-State: AOJu0YyYF2gFrpu10ydKVirQrar02FYxClZrgC0rmOY02HS2f1jTnmmP RRDMLS8X/cAN8N4k4LUbUP5spAqt4v1DbSzhShx61+DSSdp67OEx6qsf3A== X-Google-Smtp-Source: AGHT+IFH0tPwW0beIRVknt+1lLwxWWu2yp+PBT8MUcf4wUMORSbVQoFg2q24HS+1FaV9ytNLppPjSA== X-Received: by 2002:a17:90a:df07:b0:2c1:ebc4:4f1f with SMTP id 98e67ed59e1d1-2c7b5d56b91mr1769098a91.33.1718782932988; Wed, 19 Jun 2024 00:42:12 -0700 (PDT) Received: from localhost ([2605:52c0:1:4cf:6c5a:92ff:fe25:ceff]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2c4a7603158sm14486223a91.27.2024.06.19.00.42.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 19 Jun 2024 00:42:12 -0700 (PDT) Date: Wed, 19 Jun 2024 15:42:11 +0800 From: shejialuo To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , Junio 
C Hamano , Eric Sunshine Subject: [GSoC][PATCH v4 3/7] builtin/refs: add verify subcommand Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Introduce a new subcommand "verify" in git-refs(1) to allow the user to check the reference database consistency. Mentored-by: Patrick Steinhardt Mentored-by: Karthik Nayak Signed-off-by: shejialuo --- Documentation/git-refs.txt | 13 +++++++++++ builtin/refs.c | 45 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 58 insertions(+) diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt index 5b99e04385..1244a85b64 100644 --- a/Documentation/git-refs.txt +++ b/Documentation/git-refs.txt @@ -10,6 +10,7 @@ SYNOPSIS -------- [verse] 'git refs migrate' --ref-format= [--dry-run] +'git refs verify' [--strict] [--verbose] DESCRIPTION ----------- @@ -22,6 +23,9 @@ COMMANDS migrate:: Migrate ref store between different formats. +verify:: + Verify reference database consistency. + OPTIONS ------- @@ -39,6 +43,15 @@ include::ref-storage-format.txt[] can be used to double check that the migration works as expected before performing the actual migration. +The following options are specific to 'git refs verify': + +--strict:: + Enable more strict checking, every WARN severity for the `Fsck Messages` + be seen as ERROR. See linkgit:git-fsck[1]. + +--verbose:: + When verifying the reference database consistency, be chatty. 
+ KNOWN LIMITATIONS ----------------- diff --git a/builtin/refs.c b/builtin/refs.c index 46dcd150d4..231a051f6b 100644 --- a/builtin/refs.c +++ b/builtin/refs.c @@ -1,4 +1,6 @@ #include "builtin.h" +#include "config.h" +#include "fsck.h" #include "parse-options.h" #include "refs.h" #include "repository.h" @@ -7,6 +9,9 @@ #define REFS_MIGRATE_USAGE \ N_("git refs migrate --ref-format= [--dry-run]") +#define REFS_VERIFY_USAGE \ + N_("git refs verify [--strict] [--verbose]") + static int cmd_refs_migrate(int argc, const char **argv, const char *prefix) { const char * const migrate_usage[] = { @@ -58,15 +63,55 @@ static int cmd_refs_migrate(int argc, const char **argv, const char *prefix) return err; } +static int cmd_refs_verify(int argc, const char **argv, const char *prefix) +{ + const char * const verify_usage[] = { + REFS_VERIFY_USAGE, + NULL, + }; + int ret = 0; + unsigned int verbose = 0, strict = 0; + struct fsck_options fsck_options = FSCK_OPTIONS_DEFAULT; + struct option options[] = { + OPT__VERBOSE(&verbose, N_("be verbose")), + OPT_BOOL(0, "strict", &strict, N_("enable strict checking")), + OPT_END(), + }; + + argc = parse_options(argc, argv, prefix, options, verify_usage, 0); + if (argc) + usage(_("too many arguments")); + + if (verbose) + fsck_options.verbose = 1; + if (strict) + fsck_options.strict = 1; + + git_config(git_fsck_config, &fsck_options); + prepare_repo_settings(the_repository); + + ret = refs_fsck(get_main_ref_store(the_repository), &fsck_options); + + /* + * Explicitly free the allocated array and object skiplist set. Because + * we reuse `git_fsck_config` here. It will still set the skiplist. 
+ */ + free(fsck_options.msg_type); + oidset_clear(&fsck_options.objs_options.skiplist); + return ret; +} + int cmd_refs(int argc, const char **argv, const char *prefix) { const char * const refs_usage[] = { REFS_MIGRATE_USAGE, + REFS_VERIFY_USAGE, NULL, }; parse_opt_subcommand_fn *fn = NULL; struct option opts[] = { OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate), + OPT_SUBCOMMAND("verify", &fn, cmd_refs_verify), OPT_END(), }; From patchwork Wed Jun 19 07:42:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: shejialuo X-Patchwork-Id: 13703491 Received: from mail-pf1-f179.google.com (mail-pf1-f179.google.com [209.85.210.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4EE67770E8 for ; Wed, 19 Jun 2024 07:42:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718782960; cv=none; b=gKL5xXo8WV4Vg8cnv2YHiy5y8ORMin+lec8bX8/tTypc6t7SUPILrhJQ175avDDNDdISW8FLYP4axxpy8s/+EpjLLGJxcAWnMBtiams344kMULNmErQcBmbTUeksmAtsOPTL7Mia0TgQY0EM/4nUpIVNEUcRTCBO36xPqaVA8jo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718782960; c=relaxed/simple; bh=t7B+KROCFH4j6b5oWVBjS8FdcdYfBYsgyJx97g70jUU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=t0Fwo8Q9AEvtVMbogpjzz3hGi/QWaWgB6um4BuUTPVelz8qS1Qwym2YvnG+RRd3opeF3TWoDWOSMLtJvW3SkdETNlHGpLsA5ZUM6xFW+lSHYWKRTVHmCtM+c/5g5qfINy3OmuCZFBm0RG//iArh6wlnBN9i4/4gHxJIR+2R2RQ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=CzLRrlwE; arc=none smtp.client-ip=209.85.210.179 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="CzLRrlwE" Received: by mail-pf1-f179.google.com with SMTP id d2e1a72fcca58-704313fa830so4926745b3a.3 for ; Wed, 19 Jun 2024 00:42:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718782958; x=1719387758; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=xGUQWad6A3sg1h/q4M7oBoBoUiwy19gq84I5o14dazM=; b=CzLRrlwEHcekTE7k/tYSoJKWfA62qyohihZzQGNCk+f6W62ZRxwFFDZWLXQsn5xD7h 73qBzfjKSzn7rPOWIefsAaEuf3voLbFaeOcRtqXCmx2TN5yWaUSB5hh5fKRJcaajWb6H TU7TTRcCg/nY0kvwg04d4CdmvcL2/ZFeztRkJPM95NPAE+Ij6mPpr4w+815F9CGCVTgr i+1PZJLAg8Jti2nnpRHtkjnaSnwWX/UCEAkkNErJPlzTdb39SncLE4Ydkg3IPePn+Sif TwHgAWXXkm9+7T93BDYAC3CztQN+V26Ae9UpFT2tEob4EA2WfFK4nt0nJzK0giFscvDx wDNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718782958; x=1719387758; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=xGUQWad6A3sg1h/q4M7oBoBoUiwy19gq84I5o14dazM=; b=HKA6Hh/KOLBJdZFyt2l5T56IkQSQAGa2gZiN2FBMkpEMvlHYnDhJDnhP3NXFY5DWOg OBzXE968yBszJC6rJe3mHi6UPEmVdcDC1fZv0bQAPDRlYoeQgeRJrgfKV0ZweqEoei6t jd8b/v165Fx8j7UFkbEeX1kCPwEytH0Nm2n/2UJKdLHiX4UnZt9z0abXV9VgV1eYZBuE 6F+XGN/ztIDLSMjPhL1A4SNSMyeOaBsB/c7YvmXFX2cJsw2j9ggPz9jLZ4Qob8YNQ61+ Ehyd6X3qfxabvegGWgIjAgx0W0P96WxNVXlUU/o4QY/4bW/Wb0V0gPpwlBq1xcxS9MOh aDWw== X-Gm-Message-State: AOJu0YxG2H47mFMuJpC9SA9UonXNqgz1F+jEwpaDh6amEEERN/IYXH8a eQpEQ90uEs3eTWaHaD6w5sAvz4vX0n1P6Z/4M7OuWErUpxbBDjyArEH/4Q== X-Google-Smtp-Source: 
AGHT+IGn+eiP5xtZ6/kBB5gx0zTJR2+Q3FUnLyhLK7OEnGo5pmnCRQ4Q9FNlj+WNw25Ol0/SgxbU0Q== X-Received: by 2002:a05:6a00:1144:b0:705:c860:13c with SMTP id d2e1a72fcca58-70629d06645mr2055489b3a.34.1718782957868; Wed, 19 Jun 2024 00:42:37 -0700 (PDT) Received: from localhost ([2605:52c0:1:4cf:6c5a:92ff:fe25:ceff]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-705ccb4158fsm10071183b3a.108.2024.06.19.00.42.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 19 Jun 2024 00:42:37 -0700 (PDT) Date: Wed, 19 Jun 2024 15:42:36 +0800 From: shejialuo To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , Junio C Hamano , Eric Sunshine Subject: [GSoC][PATCH v4 4/7] builtin/fsck: add `git-refs verify` child process Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Introduce a new function "fsck_refs" that initializes and runs a child process to execute the "git-refs verify" command. 
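For readers unfamiliar with the pattern, the following is a minimal standalone sketch of "run a child process and fold its failure into an error bitmask". It deliberately uses plain POSIX calls rather than git's internal run-command API; `run_child`, `check_with_child` and `SKETCH_ERROR_REFS` are hypothetical names chosen for illustration and are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical error bit, standing in for ERROR_REFS in builtin/fsck.c. */
#define SKETCH_ERROR_REFS 0x01

/*
 * Run argv[0] with the given arguments and wait for it to finish.
 * Returns the child's exit code, or -1 if the child could not be
 * started or did not exit normally.
 */
static int run_child(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv && argv[0]);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127); /* only reached when exec failed */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Set the error bit when the child reports failure; keep other bits. */
static unsigned int check_with_child(char *const argv[],
				     unsigned int errors_found)
{
	if (run_child(argv))
		errors_found |= SKETCH_ERROR_REFS;
	return errors_found;
}
```

The sketch treats any non-zero result as failure, which matches how the patch reacts to a non-zero return from the child. Git's own helper additionally takes care of locating the git executable and building the argument vector, which this sketch leaves to the caller.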
Mentored-by: Patrick Steinhardt Mentored-by: Karthik Nayak Signed-off-by: shejialuo --- builtin/fsck.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index a5b82e228c..cd988ff167 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -896,6 +896,21 @@ static int check_pack_rev_indexes(struct repository *r, int show_progress) return res; } +static void fsck_refs(void) +{ + struct child_process refs_verify = CHILD_PROCESS_INIT; + child_process_init(&refs_verify); + refs_verify.git_cmd = 1; + strvec_pushl(&refs_verify.args, "refs", "verify", NULL); + if (verbose) + strvec_push(&refs_verify.args, "--verbose"); + if (check_strict) + strvec_push(&refs_verify.args, "--strict"); + + if (run_command(&refs_verify)) + errors_found |= ERROR_REFS; +} + static char const * const fsck_usage[] = { N_("git fsck [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]\n" " [--[no-]full] [--strict] [--verbose] [--lost-found]\n" @@ -1065,6 +1080,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) check_connectivity(); + fsck_refs(); + if (the_repository->settings.core_commit_graph) { struct child_process commit_graph_verify = CHILD_PROCESS_INIT; From patchwork Wed Jun 19 07:43:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: shejialuo X-Patchwork-Id: 13703492 Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com [209.85.214.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7209D770F0 for ; Wed, 19 Jun 2024 07:43:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718783037; cv=none; 
b=MxRu2ot3gJKAGdZoQtEbkfMjPxdETr8nIkaGsNA1TRabXaxbmzvZ6KqrQuqFmGRZRnApDiQHjcQqVp6cBfojI5XcTY+1umxDG3nqv+ktnfmEmvoPBK6DzDeZ9KV8CifMrHWbcuxjjbO8tBOKeMC+lhSp0Cf1PoI5to4kct2nA6k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718783037; c=relaxed/simple; bh=08n7DPHdXvQUAImOxYUWMSvkjmt4sf5l7yGSwwRjkf0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=OZU0th5CQtRaxC8k9of0MXOsjwePPLkRUabJwemm8K9a8RfNKlMCOqC8aUYeeFzhGp8l1v6Szd/adiP7Or+cBmlMZhi2VnpH+7PKPQwul879RQSaQTkDizkh1MDr9eB+h937jOpo1W4by4fMsvwbUsBpr2BoFY3Na6l65W61htQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=DSmWQ0z6; arc=none smtp.client-ip=209.85.214.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="DSmWQ0z6" Received: by mail-pl1-f177.google.com with SMTP id d9443c01a7336-1f99fe4dc5aso11287265ad.0 for ; Wed, 19 Jun 2024 00:43:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718783035; x=1719387835; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=VrURskpzped2/CY0H6zThCVDKrCajOhzffR3MJDStK4=; b=DSmWQ0z6OgF4Zg6tNDtQOSWsOvRpAZdB1XWj0o8VAEuQ91MR7qT6mH4zMHoF2ASzSI 40DRTgvBB6G2IvSedB6Ls3LOt7CU3aNToGQ3105Jm3XJ5f7y3UNjoweBP6d07wMR96YP JtxifIkqqDRWBKgyvcd+ZkkiIuDXhHGGNiIu0obfR97wqi3Yqk0ldg8YPXXuotE2fET5 WDQWn5y5EDFSIYa6lEu4s032/xdEeEhPlliYsu2KYI2jMhADbu/vEzd0FG7xgUDHDk7a 
PI3Hj2br/q1x1UmYBt1TEVCHU7pRN272xkrLpqemFZnw72KZ0WcRe8x2opWDD9xXAf+j ql6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718783035; x=1719387835; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=VrURskpzped2/CY0H6zThCVDKrCajOhzffR3MJDStK4=; b=Qt/Vv1cR/mL+C4jiHxXeHBNu5sGLhW+lMT6fvdfr1GctMNOcuOr03RL+QqQ+zscV5t 48Eg5JjM4PHNwIXEB+puDJoTS6k9gD10+87kNqsm9ba0Rg9VjSvN32zitzSKBnkXqGRZ TuXGDSR3v7zPmh0gMzxipMxk6PKgCvZRMIwA2T2zjoYyeJM48fQSpU4DuM3jueP14znn uLqfbxxlf/tzTMxnP+LC8LgdhNcPsOpg8hLnFeZeZXjEqXA/F36wAcJlKsgaZvC4NTjJ HH3wfN3pcYaUHuS14LfjjM+METjvlci+dVqxSKAwXd/77EnKNfFrDZjqWU/kRl0o9zCB zmKQ== X-Gm-Message-State: AOJu0YxMtbylNMP5kcmUtXxEEyaUxX4NDLfU65jp4stka8AaUABa1MlB o/qyqO1PvAbJYAHlZtx0dy9uB87gvaeTpUqxNaGIbaXSDnWNHM6OP0fdYg== X-Google-Smtp-Source: AGHT+IEfEh6IBXRlSOch+36hRY9pOdAQacig57+3drMJmqzJpE3m9G4gDKgwYNvl4GtAUfwogThIBQ== X-Received: by 2002:a17:902:e54e:b0:1f7:111c:2d3a with SMTP id d9443c01a7336-1f9aa47cc5cmr19200265ad.65.1718783034625; Wed, 19 Jun 2024 00:43:54 -0700 (PDT) Received: from localhost ([2605:52c0:1:4cf:6c5a:92ff:fe25:ceff]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1f855f13dd2sm110040195ad.215.2024.06.19.00.43.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 19 Jun 2024 00:43:54 -0700 (PDT) Date: Wed, 19 Jun 2024 15:43:53 +0800 From: shejialuo To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , Junio C Hamano , Eric Sunshine Subject: [GSoC][PATCH v4 5/7] files-backend: add unified interface for refs scanning Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Refs and reflogs are checked in the same way: we scan the corresponding directory and check every regular file or symbolic link found in it.
Introduce a unified interface for scanning directories for files-backend. Mentored-by: Patrick Steinhardt Mentored-by: Karthik Nayak Signed-off-by: shejialuo --- refs/files-backend.c | 77 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 76 insertions(+), 1 deletion(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index ba9e57d1e9..5156530774 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -4,6 +4,7 @@ #include "../gettext.h" #include "../hash.h" #include "../hex.h" +#include "../fsck.h" #include "../refs.h" #include "refs-internal.h" #include "ref-cache.h" @@ -3405,6 +3406,78 @@ static int files_ref_store_remove_on_disk(struct ref_store *ref_store, return ret; } +/* + * For refs and reflogs, they share a unified interface when scanning + * the whole directory. This function is used as the callback for each + * regular file or symlink in the directory. + */ +typedef int (*files_fsck_refs_fn)(struct fsck_options *o, + const char *gitdir, + const char *refs_check_dir, + struct dir_iterator *iter); + +static int files_fsck_refs_dir(struct ref_store *ref_store, + struct fsck_options *o, + const char *refs_check_dir, + files_fsck_refs_fn *fsck_refs_fns) +{ + const char *gitdir = ref_store->gitdir; + struct strbuf sb = STRBUF_INIT; + struct dir_iterator *iter; + int iter_status; + int ret = 0; + + strbuf_addf(&sb, "%s/%s", gitdir, refs_check_dir); + + iter = dir_iterator_begin(sb.buf, 0); + + if (!iter) { + ret = error_errno("cannot open directory %s", sb.buf); + goto out; + } + + while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) { + if (S_ISDIR(iter->st.st_mode)) { + continue; + } else if (S_ISREG(iter->st.st_mode) || + S_ISLNK(iter->st.st_mode)) { + if (o->verbose) + fprintf_ln(stderr, "Checking %s/%s", + refs_check_dir, iter->relative_path); + for (size_t i = 0; fsck_refs_fns[i]; i++) { + if (fsck_refs_fns[i](o, gitdir, refs_check_dir, iter)) + ret = -1; + } + } else { + ret = error(_("unexpected file type for '%s'"), 
+ iter->basename); + } + } + + if (iter_status != ITER_DONE) + ret = error(_("failed to iterate over '%s'"), sb.buf); + +out: + strbuf_release(&sb); + return ret; +} + +static int files_fsck_refs(struct ref_store *ref_store, + struct fsck_options *o) +{ + int ret; + files_fsck_refs_fn fsck_refs_fns[]= { + NULL + }; + + if (o->verbose) + fprintf_ln(stderr, "Checking references consistency"); + + ret = files_fsck_refs_dir(ref_store, o, "refs", fsck_refs_fns); + + return ret; +} + static int files_fsck(struct ref_store *ref_store, struct fsck_options *o) { @@ -3412,7 +3485,9 @@ static int files_fsck(struct ref_store *ref_store, struct files_ref_store *refs = files_downcast(ref_store, REF_STORE_READ, "fsck"); - ret = refs->packed_ref_store->be->fsck(refs->packed_ref_store, o); + ret = refs->packed_ref_store->be->fsck(refs->packed_ref_store, o) + | files_fsck_refs(ref_store, o); + return ret; } From patchwork Wed Jun 19 07:44:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: shejialuo X-Patchwork-Id: 13703493 Received: from mail-pf1-f176.google.com (mail-pf1-f176.google.com [209.85.210.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A90F774061 for ; Wed, 19 Jun 2024 07:44:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718783063; cv=none; b=hINzSDmvV7ESQ5j1UXLDjiNAnnl/eVdPc5zRLK2+q+2z50XK3GPj6+5Tbv1svP88kK9RWSuprpVW3xvCKA7smHr2VrIYKEs+17hMGolc00yleYL1SUyfx6qnuiPz8L6rCU3o1SCiN5cx0OTjJLFJPKkZkJOmVckLuQokNQURMp8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718783063; c=relaxed/simple; bh=NpMW6CCWLHF2PYn0cmABustB9CwZlAkvCP2fWauXhQw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: 
Content-Type:Content-Disposition:In-Reply-To; b=QMMCE67Q2uFfJ7t0H7TjhqZmFYyF+jYIujLNERHI2pbsBhTlXRiCE42BxylO2UbIJKYnDrqSjjtbnLZqkWW5gklGdxGbM6mol7VaVuQBDrgtV6l1BzZQVoZztsoievNQ+vEDjVFxJN7xew8aMye2Kdemq+GvJMJre6a0xvqTj3w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Z/26MUn8; arc=none smtp.client-ip=209.85.210.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Z/26MUn8" Received: by mail-pf1-f176.google.com with SMTP id d2e1a72fcca58-705fff50de2so410244b3a.1 for ; Wed, 19 Jun 2024 00:44:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718783059; x=1719387859; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=l9T6MJkW7rhzMbTiB/u8tHJ8/mRjirGurHSrpQGNbCg=; b=Z/26MUn8JicUIfK45YOqo0/fd/QKu9dn7OsbzuF52BipxXI/nkRSRlNuclDOY0MfIP xHZT9gEQGE+I16vD/LpJsEqNNsTTcTqhYwT/4o1rLU7dCznKUyJlEjegvoyqKs32Y4XP XF5CXq3XsXMf9jYiXFU6Ryrrsvuee2OOolbLBlaPE0rj5VRJlCRSraWFei8SxcEaZEd3 WlHs1mqgH9SVXBpDLr22NTJoA1RNUtuR2FW45ioKn0mTb0QOhPIv2Kaj8ZBUI0mW0oI+ 4W61OqV0emSINdMSiMN+F4gu0TjL9UmKWRVqy9/5OQda+Ilg8dTkrrVv6IM0Z9H0NrMf vWtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718783059; x=1719387859; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=l9T6MJkW7rhzMbTiB/u8tHJ8/mRjirGurHSrpQGNbCg=; 
b=E2aw2mhVc2w2+yS7hLKUfk0/ESrROzwz12+5QLgZDOcFGpT1d3ebTKGlUJMiBIo+aJ RQfhbffE6/cOxcosaEDEmeOFcckd1GOX9CAyCvQ9HtCWOs3iwSiBcONxZKRXv+GA0bJR i9tPuG+gQgnC0UqoG9nZ8NFjnUZI7cxpIk9AO+wZdDCub6id9gapxAxfjLXQ8upIK+Ab rJYSLm8fNhnfRjyZWbVqk/ReW6PPgwyD5B8d7I8gZHc6p1zOzQoNK6v7fEpNLb0E1jKI xTOX6jVEaYXyzuTg3Fzs/zgmVVCw5puwd6tFjJIszI6MXBme2AhDaGeDiQijrTWZKfCW cudQ== X-Gm-Message-State: AOJu0YzwJ6Aw5VYcVRlrxUML0zEmES2krAziASWFqHllR70RkAVzQKQw FXW8lrvh8SFNU2GgdfJB+a71V92VLJp8HhkugeJCWxPyaAJGKZp8wcdeWg== X-Google-Smtp-Source: AGHT+IHyvP3AuzxUrk3catjN2dWI/ZiHCyTsWWlZVOmrGTBLx4FXZitx0Coja4mXd7MUtiOXC76PSg== X-Received: by 2002:a05:6a00:3115:b0:705:a47a:9c66 with SMTP id d2e1a72fcca58-7061ab8f09cmr6099719b3a.9.1718783058821; Wed, 19 Jun 2024 00:44:18 -0700 (PDT) Received: from localhost ([2605:52c0:1:4cf:6c5a:92ff:fe25:ceff]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-6fee2d34cc2sm8942453a12.64.2024.06.19.00.44.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 19 Jun 2024 00:44:18 -0700 (PDT) Date: Wed, 19 Jun 2024 15:44:17 +0800 From: shejialuo To: git@vger.kernel.org Cc: Patrick Steinhardt , Karthik Nayak , Junio C Hamano , Eric Sunshine Subject: [GSoC][PATCH v4 6/7] fsck: add ref name check for files backend Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: git-fsck(1) only checks references implicitly; it does not catch refs with badly formatted names, such as a standalone "@" or a name ending with ".lock". To provide such checks, add a new fsck message id "badRefName" with default type ERROR. Use the existing "check_refname_format" to check the ref name explicitly, and add a new test to verify the functionality.
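To make the kinds of names being rejected concrete, here is a small illustrative sketch of a per-component name check. It is an assumption-laden simplification: it covers only the three cases exercised by the new test (a leading ".", a ".lock" suffix, a standalone "@") plus the empty name, whereas git's real "check_refname_format" enforces more rules. The name `sketch_check_ref_component` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified, illustrative check of a single refname component.
 * Returns 0 if the component is acceptable and -1 otherwise.
 * This is NOT git's check_refname_format(); it implements only a
 * few of its rules.
 */
static int sketch_check_ref_component(const char *name)
{
	static const char lock_suffix[] = ".lock";
	const size_t suffix_len = sizeof(lock_suffix) - 1;
	size_t len;

	assert(name);
	len = strlen(name);

	if (!len)
		return -1; /* empty component */
	if (name[0] == '.')
		return -1; /* e.g. ".branch-1" */
	if (!strcmp(name, "@"))
		return -1; /* standalone "@" */
	if (len >= suffix_len &&
	    !strcmp(name + len - suffix_len, lock_suffix))
		return -1; /* e.g. "tag-1.lock" */
	return 0;
}
```

Note that "@" is only rejected when it is the whole component; a name that merely contains "@" passes this sketch.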
Mentored-by: Patrick Steinhardt Mentored-by: Karthik Nayak Signed-off-by: shejialuo --- Documentation/fsck-msgids.txt | 3 + fsck.h | 1 + refs/files-backend.c | 20 +++++++ t/t0602-reffiles-fsck.sh | 101 ++++++++++++++++++++++++++++++++++ 4 files changed, 125 insertions(+) create mode 100755 t/t0602-reffiles-fsck.sh diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt index f643585a34..dab4012246 100644 --- a/Documentation/fsck-msgids.txt +++ b/Documentation/fsck-msgids.txt @@ -19,6 +19,9 @@ `badParentSha1`:: (ERROR) A commit object has a bad parent sha1. +`badRefName`:: + (ERROR) A ref has a bad name. + `badTagName`:: (INFO) A tag has an invalid format. diff --git a/fsck.h b/fsck.h index 0391dffbb0..f26dec2ea4 100644 --- a/fsck.h +++ b/fsck.h @@ -31,6 +31,7 @@ enum fsck_msg_type { FUNC(BAD_NAME, ERROR) \ FUNC(BAD_OBJECT_SHA1, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ + FUNC(BAD_REF_NAME, ERROR) \ FUNC(BAD_TIMEZONE, ERROR) \ FUNC(BAD_TREE, ERROR) \ FUNC(BAD_TREE_SHA1, ERROR) \ diff --git a/refs/files-backend.c b/refs/files-backend.c index 5156530774..5bc233b524 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -3416,6 +3416,25 @@ typedef int (*files_fsck_refs_fn)(struct fsck_options *o, const char *refs_check_dir, struct dir_iterator *iter); +static int files_fsck_refs_name(struct fsck_options *o, + const char *gitdir UNUSED, + const char *refs_check_dir, + struct dir_iterator *iter) +{ + struct strbuf sb = STRBUF_INIT; + int ret = 0; + + if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) { + strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path); + ret = fsck_refs_report(o, sb.buf, + FSCK_MSG_BAD_REF_NAME, + "invalid refname format"); + } + + strbuf_release(&sb); + return ret; +} + static int files_fsck_refs_dir(struct ref_store *ref_store, struct fsck_options *o, const char *refs_check_dir, @@ -3467,6 +3486,7 @@ static int files_fsck_refs(struct ref_store *ref_store, { int ret; files_fsck_refs_fn 
fsck_refs_fns[]= { + files_fsck_refs_name, NULL }; diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh new file mode 100755 index 0000000000..b2db58d2c6 --- /dev/null +++ b/t/t0602-reffiles-fsck.sh @@ -0,0 +1,101 @@ +#!/bin/sh + +test_description='Test reffiles backend consistency check' + +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +GIT_TEST_DEFAULT_REF_FORMAT=files +export GIT_TEST_DEFAULT_REF_FORMAT + +. ./test-lib.sh + +test_expect_success 'ref name should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 && + git tag multi_hierarchy/tag-2 + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/heads/.branch-1: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/.branch-1 && + test_cmp expect err + ) && + ( + cd repo && + cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/tag-1.lock: badRefName: invalid refname format + EOF + rm $tag_dir_prefix/tag-1.lock && + test_cmp expect err + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/heads/@: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/@ && + test_cmp expect err + ) && + ( + cd repo && + cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format + EOF + rm $tag_dir_prefix/multi_hierarchy/@ && + test_cmp expect 
err + ) +' + +test_expect_success 'ref name check should be adapted into fsck messages' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && + git -c fsck.badRefName=warn fsck 2>err && + cat >expect <<-EOF && + warning: refs/heads/.branch-1: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/.branch-1 && + test_cmp expect err + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ && + git -c fsck.badRefName=ignore fsck 2>err && + test_must_be_empty err + ) +' + +test_done
From patchwork Wed Jun 19 07:44:35 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: shejialuo
X-Patchwork-Id: 13703494
Date: Wed, 19 Jun 2024 15:44:35 +0800
From: shejialuo
To: git@vger.kernel.org
Cc: Patrick Steinhardt, Karthik Nayak, Junio C Hamano, Eric Sunshine
Subject: [GSoC][PATCH v4 7/7] fsck: add ref content check for files backend
X-Mailing-List: git@vger.kernel.org

Enhance the git-fsck(1) command by adding a check for reference content in the files backend. The new check validates symrefs, real symbolic links and regular refs.

To detect trailing content in regular refs, add a new parameter `trailing` to `parse_loose_ref_contents`.

For symrefs, `parse_loose_ref_contents` sets the "referent". A symbolic link, however, can be either absolute or relative. Use "strbuf_add_real_path" to resolve the symbolic link to an absolute path, then use "skip_prefix" to strip the gitdir prefix so that the result has the same form as a symref "referent".
Thus, symrefs and symbolic links can share the same interface. Add a new function "files_fsck_symref_target" that checks the following:

1. whether the pointee is under the `refs/` directory.
2. whether the pointee name is a valid refname.
3. whether the pointee path has an unexpected file type in the filesystem (neither a regular file nor a symbolic link).

Finally, add the following fsck messages:

1. "badRefContent(ERROR)": A ref has bad content.
2. "badSymrefPointee(ERROR)": The pointee of a symref is bad.
3. "trailingRefContent(WARN)": A ref has trailing content.

Mentored-by: Patrick Steinhardt
Mentored-by: Karthik Nayak
Signed-off-by: shejialuo
---
 Documentation/fsck-msgids.txt | 9 +++
 fsck.h | 3 +
 refs.c | 2 +-
 refs/files-backend.c | 145 +++++++++++++++++++++++++++++++++-
 refs/refs-internal.h | 5 +-
 t/t0602-reffiles-fsck.sh | 110 ++++++++++++++++++++++++++
 6 files changed, 269 insertions(+), 5 deletions(-)

diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt index dab4012246..b1630a478b 100644 --- a/Documentation/fsck-msgids.txt +++ b/Documentation/fsck-msgids.txt @@ -19,9 +19,15 @@ `badParentSha1`:: (ERROR) A commit object has a bad parent sha1. +`badRefContent`:: + (ERROR) A ref has bad content. + `badRefName`:: (ERROR) A ref has a bad name. +`badSymrefPointee`:: + (ERROR) The pointee of a symref is bad. + `badTagName`:: (INFO) A tag has an invalid format. @@ -167,6 +173,9 @@ `nullSha1`:: (WARN) Tree contains entries pointing to a null sha1. +`trailingRefContent`:: + (WARN) A ref has trailing content. + `treeNotSorted`:: (ERROR) A tree is not properly sorted.
diff --git a/fsck.h b/fsck.h index f26dec2ea4..8afee05f20 100644 --- a/fsck.h +++ b/fsck.h @@ -32,6 +32,8 @@ enum fsck_msg_type { FUNC(BAD_OBJECT_SHA1, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ FUNC(BAD_REF_NAME, ERROR) \ + FUNC(BAD_REF_CONTENT, ERROR) \ + FUNC(BAD_SYMREF_POINTEE, ERROR) \ FUNC(BAD_TIMEZONE, ERROR) \ FUNC(BAD_TREE, ERROR) \ FUNC(BAD_TREE_SHA1, ERROR) \ @@ -72,6 +74,7 @@ enum fsck_msg_type { FUNC(HAS_DOTDOT, WARN) \ FUNC(HAS_DOTGIT, WARN) \ FUNC(NULL_SHA1, WARN) \ + FUNC(TRAILING_REF_CONTENT, WARN) \ FUNC(ZERO_PADDED_FILEMODE, WARN) \ FUNC(NUL_IN_COMMIT, WARN) \ FUNC(LARGE_PATHNAME, WARN) \ diff --git a/refs.c b/refs.c index cb3b8ec36a..5fc34d39b1 100644 --- a/refs.c +++ b/refs.c @@ -1744,7 +1744,7 @@ static int refs_read_special_head(struct ref_store *ref_store, } result = parse_loose_ref_contents(content.buf, oid, referent, type, - failure_errno); + failure_errno, NULL); done: strbuf_release(&full_path); diff --git a/refs/files-backend.c b/refs/files-backend.c index 5bc233b524..3b8e76dab5 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1,4 +1,5 @@ #include "../git-compat-util.h" +#include "../abspath.h" #include "../copy.h" #include "../environment.h" #include "../gettext.h" @@ -551,7 +552,7 @@ static int read_ref_internal(struct ref_store *ref_store, const char *refname, strbuf_rtrim(&sb_contents); buf = sb_contents.buf; - ret = parse_loose_ref_contents(buf, oid, referent, type, &myerr); + ret = parse_loose_ref_contents(buf, oid, referent, type, &myerr, NULL); out: if (ret && !myerr) @@ -587,7 +588,7 @@ static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn int parse_loose_ref_contents(const char *buf, struct object_id *oid, struct strbuf *referent, unsigned int *type, - int *failure_errno) + int *failure_errno, const char **trailing) { const char *p; if (skip_prefix(buf, "ref:", &buf)) { @@ -609,6 +610,10 @@ int parse_loose_ref_contents(const char *buf, struct object_id *oid, *failure_errno = EINVAL; 
return -1; } + + if (trailing) + *trailing = p; + return 0; } @@ -3435,6 +3440,141 @@ static int files_fsck_refs_name(struct fsck_options *o, return ret; } +/* + * Check the symref "pointee_name" and "pointee_path". The caller should + * make sure that "pointee_path" is absolute. For a symref, "pointee_name" + * is the content after "ref:". For a symbolic link, "pointee_name" is + * the path relative to "gitdir". + */ +static int files_fsck_symref_target(struct fsck_options *o, + const char *refname, + const char *pointee_name, + const char *pointee_path) +{ + const char *p = NULL; + struct stat st; + int ret = 0; + + if (!skip_prefix(pointee_name, "refs/", &p)) { + + ret = fsck_refs_report(o, refname, + FSCK_MSG_BAD_SYMREF_POINTEE, + "point to target out of refs hierarchy"); + goto out; + } + + if (check_refname_format(pointee_name, 0)) { + ret = fsck_refs_report(o, refname, + FSCK_MSG_BAD_SYMREF_POINTEE, + "point to invalid refname"); + } + + if (lstat(pointee_path, &st) < 0) + goto out; + + if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) { + ret = fsck_refs_report(o, refname, + FSCK_MSG_BAD_SYMREF_POINTEE, + "point to invalid target"); + goto out; + } +out: + return ret; +} + +static int files_fsck_refs_content(struct fsck_options *o, + const char *gitdir, + const char *refs_check_dir, + struct dir_iterator *iter) +{ + struct strbuf pointee_path = STRBUF_INIT, + ref_content = STRBUF_INIT, + abs_gitdir = STRBUF_INIT, + referent = STRBUF_INIT, + refname = STRBUF_INIT; + const char *trailing = NULL; + int failure_errno = 0; + unsigned int type = 0; + struct object_id oid; + int ret = 0; + + strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path); + + /* + * If the file is a symlink, we need to only check the connectivity + * of the destination object.
+ */ + if (S_ISLNK(iter->st.st_mode)) { + const char *pointee_name = NULL; + + strbuf_add_real_path(&pointee_path, iter->path.buf); + + strbuf_add_absolute_path(&abs_gitdir, gitdir); + strbuf_normalize_path(&abs_gitdir); + if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1])) + strbuf_addch(&abs_gitdir, '/'); + + if (!skip_prefix(pointee_path.buf, + abs_gitdir.buf, &pointee_name)) { + ret = fsck_refs_report(o, refname.buf, + FSCK_MSG_BAD_SYMREF_POINTEE, + "point to target outside gitdir"); + goto clean; + } + + ret = files_fsck_symref_target(o, refname.buf, pointee_name, + pointee_path.buf); + goto clean; + } + + if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) { + ret = error_errno(_("%s/%s: unable to read the ref"), + refs_check_dir, iter->relative_path); + goto clean; + } + + if (parse_loose_ref_contents(ref_content.buf, &oid, + &referent, &type, + &failure_errno, &trailing)) { + ret = fsck_refs_report(o, refname.buf, + FSCK_MSG_BAD_REF_CONTENT, + "invalid ref content"); + goto clean; + } + + /* + * If the ref is a symref, we need to check the destination name and + * connectivity. + */ + if (referent.len && (type & REF_ISSYMREF)) { + strbuf_rtrim(&referent); + strbuf_addf(&pointee_path, "%s/%s", gitdir, referent.buf); + + ret = files_fsck_symref_target(o, refname.buf, referent.buf, + pointee_path.buf); + goto clean; + } else { + /* + * Only regular refs can have trailing garbage, which + * is reported as a warning.
+ */ + if (trailing && (*trailing != '\0' && *trailing != '\n')) { + ret = fsck_refs_report(o, refname.buf, + FSCK_MSG_TRAILING_REF_CONTENT, + "trailing garbage in ref"); + goto clean; + } + } + +clean: + strbuf_release(&abs_gitdir); + strbuf_release(&pointee_path); + strbuf_release(&refname); + strbuf_release(&ref_content); + strbuf_release(&referent); + return ret; +} + static int files_fsck_refs_dir(struct ref_store *ref_store, struct fsck_options *o, const char *refs_check_dir, @@ -3487,6 +3627,7 @@ static int files_fsck_refs(struct ref_store *ref_store, int ret; files_fsck_refs_fn fsck_refs_fns[]= { files_fsck_refs_name, + files_fsck_refs_content, NULL }; diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 280acb7f9e..1126c6102a 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -709,11 +709,12 @@ struct ref_store { /* * Parse contents of a loose ref file. *failure_errno maybe be set to EINVAL for - * invalid contents. + * invalid contents. For a regular ref, *trailing (if non-NULL) is set to the + * first character after the object ID; it is left untouched for a symref.
*/ int parse_loose_ref_contents(const char *buf, struct object_id *oid, struct strbuf *referent, unsigned int *type, - int *failure_errno); + int *failure_errno, const char **trailing); /* * Fill in the generic part of refs and add it to our collection of diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh index b2db58d2c6..35bf40ee64 100755 --- a/t/t0602-reffiles-fsck.sh +++ b/t/t0602-reffiles-fsck.sh @@ -98,4 +98,114 @@ test_expect_success 'ref name check should be adapted into fsck messages' ' ) ' +test_expect_success 'regular ref content should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 && + git checkout -b a/b/tag-2 + ) && + ( + cd repo && + printf "%s garbage" "$(git rev-parse branch-1)" > $branch_dir_prefix/branch-1-garbage && + git fsck 2>err && + cat >expect <<-EOF && + warning: refs/heads/branch-1-garbage: trailingRefContent: trailing garbage in ref + EOF + rm $branch_dir_prefix/branch-1-garbage && + test_cmp expect err + ) && + ( + cd repo && + printf "%s garbage" "$(git rev-parse tag-1)" > $tag_dir_prefix/tag-1-garbage && + test_must_fail git -c fsck.trailingRefContent=error fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/tag-1-garbage: trailingRefContent: trailing garbage in ref + EOF + rm $tag_dir_prefix/tag-1-garbage && + test_cmp expect err + ) && + ( + cd repo && + printf "%s " "$(git rev-parse tag-2)" > $tag_dir_prefix/tag-2-garbage && + git fsck 2>err && + cat >expect <<-EOF && + warning: refs/tags/tag-2-garbage: trailingRefContent: trailing garbage in ref + EOF + rm $tag_dir_prefix/tag-2-garbage && + test_cmp expect err + ) && + ( + cd repo && + printf "xfsazqfxcadas" > $tag_dir_prefix/tag-2-bad && + test_must_fail git refs verify 
2>err && + cat >expect <<-EOF && + error: refs/tags/tag-2-bad: badRefContent: invalid ref content + EOF + rm $tag_dir_prefix/tag-2-bad && + test_cmp expect err + ) && + ( + cd repo && + printf "xfsazqfxcadas" > $branch_dir_prefix/a/b/branch-2-bad && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/heads/a/b/branch-2-bad: badRefContent: invalid ref content + EOF + rm $branch_dir_prefix/a/b/branch-2-bad && + test_cmp expect err + ) +' + +test_expect_success 'symbolic ref content should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 + ) && + ( + cd repo && + printf "ref: refs/heads/.branch" > $branch_dir_prefix/branch-2-bad && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/heads/branch-2-bad: badSymrefPointee: point to invalid refname + EOF + rm $branch_dir_prefix/branch-2-bad && + test_cmp expect err + ) && + ( + cd repo && + printf "ref: refs/heads" > $branch_dir_prefix/branch-2-bad && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/heads/branch-2-bad: badSymrefPointee: point to invalid target + EOF + rm $branch_dir_prefix/branch-2-bad && + test_cmp expect err + ) && + ( + cd repo && + printf "ref: logs/maint-v2.45" > $branch_dir_prefix/branch-2-bad && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: refs/heads/branch-2-bad: badSymrefPointee: point to target out of refs hierarchy + EOF + rm $branch_dir_prefix/branch-2-bad && + test_cmp expect err + ) +' + test_done
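
Note for readers of the tests above: the outcomes they expect can be reproduced with a small standalone sketch. This is not Git's code. It only imitates the decision order of `parse_loose_ref_contents` followed by the trailing-content check in `files_fsck_refs_content`, and it checks only the `refs/` prefix of a symref target (no refname-format or filesystem check). The names `check_loose_ref`, `enum ref_check` and `HEX_OID_LEN` are hypothetical, and a 40-character SHA-1 object ID is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, loosely named after the new fsck messages. */
enum ref_check {
	REF_OK,
	REF_BAD_CONTENT,        /* cf. badRefContent */
	REF_TRAILING_CONTENT,   /* cf. trailingRefContent */
	REF_BAD_SYMREF_POINTEE  /* cf. badSymrefPointee */
};

#define HEX_OID_LEN 40 /* assumption: SHA-1 repository */

static int starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Classify the raw content of a loose ref file (NUL-terminated). */
enum ref_check check_loose_ref(const char *buf)
{
	const char *p;
	size_t i;

	assert(buf != NULL);

	/* Symref: "ref:" followed by optional whitespace and the target. */
	if (starts_with(buf, "ref:")) {
		buf += strlen("ref:");
		while (isspace((unsigned char)*buf))
			buf++;
		return starts_with(buf, "refs/") ?
			REF_OK : REF_BAD_SYMREF_POINTEE;
	}

	/* Regular ref: must start with a full hexadecimal object ID. */
	for (i = 0; i < HEX_OID_LEN; i++)
		if (!isxdigit((unsigned char)buf[i]))
			return REF_BAD_CONTENT;

	/* The object ID must be followed by end of string or whitespace. */
	p = buf + HEX_OID_LEN;
	if (*p != '\0' && !isspace((unsigned char)*p))
		return REF_BAD_CONTENT;

	/* Anything other than NUL or a newline right after it is "trailing". */
	if (*p != '\0' && *p != '\n')
		return REF_TRAILING_CONTENT;

	return REF_OK;
}
```

With this sketch, an object ID followed by a single space is classified as trailing content, which matches the `tag-2-garbage` case, while `xfsazqfxcadas` is bad content and `ref: logs/maint-v2.45` is a bad symref pointee.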