From patchwork Tue Sep 8 19:48:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11764177 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D891959D for ; Tue, 8 Sep 2020 19:50:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BD12F207DE for ; Tue, 8 Sep 2020 19:50:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="foBvGfLR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731753AbgIHTuH (ORCPT ); Tue, 8 Sep 2020 15:50:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730916AbgIHTsq (ORCPT ); Tue, 8 Sep 2020 15:48:46 -0400 Received: from mail-qk1-x74a.google.com (mail-qk1-x74a.google.com [IPv6:2607:f8b0:4864:20::74a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 03FD6C061756 for ; Tue, 8 Sep 2020 12:48:46 -0700 (PDT) Received: by mail-qk1-x74a.google.com with SMTP id d184so31721qkf.15 for ; Tue, 08 Sep 2020 12:48:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=XcY6xyVhSYie1UA3uifnaQdJNJ91XES1Mf59m7OFhQU=; b=foBvGfLRrilPpyq0fMUXIIhYjRuOWcerRm7eTtds2dvAtWs0Ckb2Y5BzPo6x5pH/n5 Quz7oP3SZAKrq4ca6ZskUoTXXTCNXPTxNLWzilITn4a5Ct7o4c2tiBrGzYt/v39VyF7p NI6Mp9wEPU5Njh19T1OyhumbdkSLEKiG3gOkIVAKjyoyqZxgw/EC5BCdF5OHsSH13m1r lq143CjyFp6JvgjxK+4tkxtd8dyA+54kYDWFSt7Hdb2VZLSwb3yvmZmB7NvoxSoApMdI Wngy/LJq4rSumYF2FqI4eR1E5QQBTelgdDJTBreIhKKGvy6mDNAvRO2/F3OY500A0eYJ GFcA== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=XcY6xyVhSYie1UA3uifnaQdJNJ91XES1Mf59m7OFhQU=; b=hI8+JbTIKS6IYsQkKDqubu2InwRjLT2B3++swE83Nu3OeolodxxIJydWMPCeClmc7Y UXxzi/hVEtT9iejoKdBTso+R1olyBAAd1QfpKALnhBy9/JGydE+y3FiO1GwbKfaH9eQi DdaDm3FCcRM4pre9Ja4iDfKMGNbwuCxWdi4vHFapfmcBPHsvxubd3Mr3E2fFwNfKNH2e aVWRAm/rP3nEFFPzpQbKEzXhOR4u51RSbBOFq3Qydr2zMtuPdgIIsjtk5HZRzgXlJBt+ 7kdFcU/6wEkcSJZrgCjNoPc49u0Mrhdv2xISL25bQ7mjM7ipyei377pZRFmvqrsAfvzG 92LA== X-Gm-Message-State: AOAM533SkM5xY4GJl0y+RZjgb3/VVksAsJmTvu8VKY1guadQmAcEnEV4 YYqb069oqbUJMIVaC74aoBO200VN8y+x1Gn6stzcAN03Avyl60oh+l2VHsZCFm3T50w5jxaQ/zP 5B2x57O/jFIRk2n0MBO2aYYt8YuHUF7BuQSRL1ef4avnaSjZPv9X8a8dCy2QBKcRjeY5BYnb6hz NL X-Google-Smtp-Source: ABdhPJyRnc8HmLMKDtcmVINI7vu9723rrCBZQPakBEAhKqEFsvR3eFsVo6fNqYf//U9IAG1V/b266BSuNpTrGKRSPhNx X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a0c:c709:: with SMTP id w9mr735452qvi.26.1599594523749; Tue, 08 Sep 2020 12:48:43 -0700 (PDT) Date: Tue, 8 Sep 2020 12:48:29 -0700 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH v2 1/7] Documentation: deltaBaseCacheLimit is per-thread From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , gitster@pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Clarify that core.deltaBaseCacheLimit is per-thread, as can be seen from the fact that cache usage (base_cache_used in struct thread_local in builtin/index-pack.c) is tracked individually for each thread and compared against delta_base_cache_limit. 
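To illustrate the per-thread accounting described above, here is a minimal sketch, not the code in builtin/index-pack.c. The names struct thread_state, cache_and_check() and approx_total_budget() are hypothetical. Each thread compares only its own counter against the shared limit, so the process as a whole may hold roughly the limit multiplied by the number of threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread state, loosely modeled on struct thread_local. */
struct thread_state {
	size_t base_cache_used; /* bytes cached by this thread only */
};

/*
 * Record that thread t cached `size` more bytes and report whether
 * that thread alone is now over the limit. Other threads' usage is
 * deliberately not consulted.
 */
static int cache_and_check(struct thread_state *t, size_t size, size_t limit)
{
	t->base_cache_used += size;
	return t->base_cache_used > limit;
}

/*
 * Approximate process-wide budget when every thread fills its own
 * share. This is an assumption for illustration; a real thread may
 * briefly exceed its share before it prunes.
 */
static size_t approx_total_budget(size_t limit, size_t nr_threads)
{
	/* Guard against size_t overflow in the multiplication. */
	assert(nr_threads == 0 || limit <= (size_t)-1 / nr_threads);
	return limit * nr_threads;
}
```

With a limit of 100 bytes, two threads that each cache 60 bytes are both within budget even though together they hold 120 bytes.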
Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano --- Documentation/config/core.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index 74619a9c03..02002cf109 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -399,7 +399,7 @@ the largest projects. You probably do not need to adjust this value. Common unit suffixes of 'k', 'm', or 'g' are supported. core.deltaBaseCacheLimit:: - Maximum number of bytes to reserve for caching base objects + Maximum number of bytes per thread to reserve for caching base objects that may be referenced by multiple deltified objects. By storing the entire decompressed base objects in a cache Git is able to avoid unpacking and decompressing frequently used base From patchwork Tue Sep 8 19:48:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11764151 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A1F4759D for ; Tue, 8 Sep 2020 19:48:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 830C3207DE for ; Tue, 8 Sep 2020 19:48:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="gENVZcoo" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730809AbgIHTsz (ORCPT ); Tue, 8 Sep 2020 15:48:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730882AbgIHTsq (ORCPT ); Tue, 8 Sep 2020 15:48:46 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B64AC061757 for ; Tue, 8 Sep 2020 12:48:46 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id q130so175305ybb.11 for ; Tue, 08 Sep 2020 12:48:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=9bdwFdUqT3DPsyQ1QrJMIauvqEFN4VWAUUcDJ/J9fIo=; b=gENVZcookJJIF4TkoOw+1jS1jp6cxZBqvEhPZsw/Cqzpl82v2+JF7jNrcy0UYep5/M 1RcjoC6MzMDmK/fNoUF9F06VPz9F+p6ucQ6xbhKZIncyN9XwssHR3wYFDKC1ItHBLiFI sl9k0dfobGiGrvbagADZDgsW1FLY3ZcUrenK3CrWIjGHhKoWHOzwrSdpHEglEMJJ65uT VK+fvwIz0lZAB0W4qwjpGAUP8iOZPIqNBQJiYkHsSB60SpPQWwsYIImF1dJCrGM0Xkze hrSUsDtpQa9ElxQhhdVxIVrCa2UrRfwDI9Ys7toqydyNfZ8f6M7NhHEjn1wiy/qQvDV/ TmuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=9bdwFdUqT3DPsyQ1QrJMIauvqEFN4VWAUUcDJ/J9fIo=; b=eI91NbOwlcZGipfY+a+azRHbUK72LndJV2IP9t9Ix/e36aecqSMF4dfhFhF6s4VvS/ tj8wntH9ttBval48FCUSeI508VLsNNX3RCa8XVsiuzxPbki5VnePOgnBCUct4U3hSSEe anbboPimy0/DCJEa0DKZHKiFErlktNKyvjMP4zzAAu/HVMmCO7kHYumeKjZq1ubECCq0 ZQuN9qcoSCfjloworda7/CNrHszylaA5ZySlqkykUwlJ7rUAi2TBmsAiX333oDMytG2y 42IZzwcYB7tB65iLHk8Gricjyd4yhzltK+FVcMPQM612WDHXNUymKIXWiXqpml3TQ+yI Oeww== X-Gm-Message-State: AOAM5306hhuwQ+qMW70iF9s1TtBbfFSLpgqIpiWXXxJs2QyftXPHQao1 f/f9uqooc1C/mKpJZDevxz0w29U8e0MvbvgF4DzfrPLhhwc2tpcTVyHbJU/t8sQalf+yBv8qS5Y qD5uGEyKNyqj2jG9fjUyPAqlDIJbvmY/rpnfAWnUwx/Ejv+P92vPBaM0gKddEld2JH01ff/4vHs e/ X-Google-Smtp-Source: ABdhPJxi+k3a+5wyJbIexFuU5c1PEqqedtiJ85WpfLPUjA6GrRlL25ylFeur7KSUXVdfRXJ7FQOsCcCmjZZIfT4s+3cA X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a25:b122:: with SMTP id g34mr844712ybj.196.1599594525322; Tue, 08 Sep 2020 12:48:45 -0700 (PDT) Date: Tue, 8 Sep 2020 12:48:30 -0700 In-Reply-To: Message-Id: 
<00502dee35e2d21fda2b240b57381b09b098e198.1599594441.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH v2 2/7] index-pack: remove redundant parameter From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , gitster@pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org find_{ref,ofs}_delta_{,children} take an enum object_type parameter, but the object type is already present in the name of the function. Remove that parameter from these functions. Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 26 ++++++++++++-------------- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 9721bf1ffe..0e0889afc4 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -614,7 +614,7 @@ static int compare_ofs_delta_bases(off_t offset1, off_t offset2, 0; } -static int find_ofs_delta(const off_t offset, enum object_type type) +static int find_ofs_delta(const off_t offset) { int first = 0, last = nr_ofs_deltas; @@ -624,7 +624,8 @@ static int find_ofs_delta(const off_t offset, enum object_type type) int cmp; cmp = compare_ofs_delta_bases(offset, delta->offset, - type, objects[delta->obj_no].type); + OBJ_OFS_DELTA, + objects[delta->obj_no].type); if (!cmp) return next; if (cmp < 0) { @@ -637,10 +638,9 @@ static int find_ofs_delta(const off_t offset, enum object_type type) } static void find_ofs_delta_children(off_t offset, - int *first_index, int *last_index, - enum object_type type) + int *first_index, int *last_index) { - int first = find_ofs_delta(offset, type); + int first = find_ofs_delta(offset); int last = first; int end = nr_ofs_deltas - 1; @@ -668,7 +668,7 @@ static int compare_ref_delta_bases(const struct object_id *oid1, return oidcmp(oid1, oid2); } -static int find_ref_delta(const struct object_id *oid, enum object_type type) +static int 
find_ref_delta(const struct object_id *oid) { int first = 0, last = nr_ref_deltas; @@ -678,7 +678,8 @@ static int find_ref_delta(const struct object_id *oid, enum object_type type) int cmp; cmp = compare_ref_delta_bases(oid, &delta->oid, - type, objects[delta->obj_no].type); + OBJ_REF_DELTA, + objects[delta->obj_no].type); if (!cmp) return next; if (cmp < 0) { @@ -691,10 +692,9 @@ static int find_ref_delta(const struct object_id *oid, enum object_type type) } static void find_ref_delta_children(const struct object_id *oid, - int *first_index, int *last_index, - enum object_type type) + int *first_index, int *last_index) { - int first = find_ref_delta(oid, type); + int first = find_ref_delta(oid); int last = first; int end = nr_ref_deltas - 1; @@ -983,12 +983,10 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base, { if (base->ref_last == -1 && base->ofs_last == -1) { find_ref_delta_children(&base->obj->idx.oid, - &base->ref_first, &base->ref_last, - OBJ_REF_DELTA); + &base->ref_first, &base->ref_last); find_ofs_delta_children(base->obj->idx.offset, - &base->ofs_first, &base->ofs_last, - OBJ_OFS_DELTA); + &base->ofs_first, &base->ofs_last); if (base->ref_last == -1 && base->ofs_last == -1) { free(base->data); From patchwork Tue Sep 8 19:48:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11764153 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 86C9A1599 for ; Tue, 8 Sep 2020 19:49:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 68DAC207DE for ; Tue, 8 Sep 2020 19:49:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="nGbYiD3J" Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730721AbgIHTs6 (ORCPT ); Tue, 8 Sep 2020 15:48:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731053AbgIHTss (ORCPT ); Tue, 8 Sep 2020 15:48:48 -0400 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D2C34C0613ED for ; Tue, 8 Sep 2020 12:48:47 -0700 (PDT) Received: by mail-pf1-x44a.google.com with SMTP id q5so202819pfl.16 for ; Tue, 08 Sep 2020 12:48:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=7JKY3sywb5PoF2eqzKuwTh+/TXzU8WfCj9zyzzvqg9g=; b=nGbYiD3JziNmZPVyJUdvX5cpx7iOmZLTS9Oqu7XtkY61vmrScnNttCRhM+pGWw+cdI 1/Q46X4Morq7oSPl/sCc//MkRh//vaRtxQHKSnY/1NZtgEXMT/jzrcLp3BmiCPVYlmQ/ YgDv9r2+7ISl88827treg5guwbrMM2P4yIo1u6erHXLxs0mYXklAPjghRYS0fKLopfsQ 6ZxO8MahqjzVn4VSwQH4z1GjX+SXyQbL/t7baIMbHuSGHCz5uniysOyAPjXf5t/oaME8 BfQ10qJVijooRivfiptxwXTEiUvd7tL8vAHvWVTn4dLXu7K3IA4T1tG+fL771vfRzxRa AKJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=7JKY3sywb5PoF2eqzKuwTh+/TXzU8WfCj9zyzzvqg9g=; b=sAF2EWsIHIZRl+4CXBsY0ADQJTlvd1fmgUoTBneFOHizsDSh6ykFCOpk7f5agNGU2N 46axTbd63RAteYkvkksutedNPztjqeycUtvnbs4qZ7ls0EluRtJNbNmxNQoW+Z41PMdx ZN/Ouw7Dk+bhmTOrY7Pss+hI3Q43a/OAOTFHJ1Vurp8vwwEhSrWvqbRNK75oUuS9p6nd QYmi3JK6W38Ds00mVmgsY/G1w5PBuwo0/BC/4mQk41GB5HNFx2F7o0sMnoTqGMx5j1gA b5WMO6Y8FNC3FUVO06cOQUAdJlzk4jOjZh2kgbtYMUIKFLnajhiyczD0owGqUKKddudG rVoQ== X-Gm-Message-State: AOAM531gBKRBa58EkrDhVuQSFdWazHt1hWiT5tTfilzD2GJoVqys3Cgm ejEoWB0ZozSjZn1DoiPHTjvbsdShqSJqDXpHqXID9uJeIRQh98qcy0B5Ba8X/i8UwT4Ssv7WdPd 
cSi2gMzpuMKQptmdz5nL/JPpgUlYqfzO0rnpcmdo3QUDcz/9l7bJAEVKJz+uFhaIneCsanbpcjL 54 X-Google-Smtp-Source: ABdhPJx0tQbVLN37wvnTNrU4S5d5lHZKwLCj+2RFyfoySt22BrRfUIJwQMxO8xWGKiQjyWNkNi9PUmvvug7cWE2z7cEG X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a63:2fc7:: with SMTP id v190mr271529pgv.250.1599594527168; Tue, 08 Sep 2020 12:48:47 -0700 (PDT) Date: Tue, 8 Sep 2020 12:48:31 -0700 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH v2 3/7] index-pack: unify threaded and unthreaded code From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , gitster@pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 0e0889afc4..c7b4aef4e4 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1211,15 +1211,7 @@ static void resolve_deltas(void) cleanup_thread(); return; } - - for (i = 0; i < nr_objects; i++) { - struct object_entry *obj = &objects[i]; - - if (is_delta_type(obj->type)) - continue; - resolve_base(obj); - display_progress(progress, nr_resolved_deltas); - } + threaded_second_pass(&nothread_data); } /* From patchwork Tue Sep 8 19:48:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11764173 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D24A31599 for ; Tue, 8 Sep 2020 19:50:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B8ECF2145D for ; Tue, 8 Sep 2020 19:50:06 +0000 (UTC) 
Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="RkRbufaB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731248AbgIHTuC (ORCPT ); Tue, 8 Sep 2020 15:50:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731236AbgIHTsw (ORCPT ); Tue, 8 Sep 2020 15:48:52 -0400 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58E65C061786 for ; Tue, 8 Sep 2020 12:48:49 -0700 (PDT) Received: by mail-pf1-x44a.google.com with SMTP id b142so217547pfb.9 for ; Tue, 08 Sep 2020 12:48:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=a1xcHiPif6Jjk6UDgCvbltph9+bTMGsks5MA1oj2l6Q=; b=RkRbufaB0pOJN/hgXWpQyqaQy12LIlpeZ4MhODmaYASck4XywN+6BWlf8RCUntjOOT ABIC+g5NRuM1rKJQuOuc/ebsLLa5bNSBVb2l6PVvKRYtctoTbVyVvcgaIuYZvlcllm2l IHrfh6Add2APjtuuaxbQOYqn5JwuQqGIP86v0LGwFWQlKbP8aID2MJ2NPahvYHivnlaP Pkc1ZdmgcNmic2D4y7bGy1n/zuSGj5LBTV/D3JUJyN12LGei6fNOmte85A6y6fszEzE3 81l4QzFQPR7VrJk1vrT24uyMDAZ+vt9Zpm6YktQ5u1QIImWcQRAQugHmX9klSIOd3o6l rtog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=a1xcHiPif6Jjk6UDgCvbltph9+bTMGsks5MA1oj2l6Q=; b=dQmlojwB4aaHPj8XSFAuW3OY99f+wIVWLo5Ht1NxlMM2JQb5byPRewB4kB3v1DLLbS yo+4sLUpJvTHyeTaBl87kZQmW81A+X498l0rOXQvFp5HuHUTTFtoJLnPETjjQpGyU/Ga 5pDuDg/W9Hj0H696Frs4lpzpPzqkK6atHQ+Wi23c0fpbWAweTgmLVYAnU8FRIq/X1TMS HQ0XbiUx2HuL0TlX22SH43CchLLMQlmzmQVVdU+11yTK4EUbz3IiQiOixqbMQS8UUe4P f3zLdcxSrfVY5KgwSv3py1611nl5FaKHuwrR3ROmn+YLfDMda9HnkRhsnJPg2TdOxlJI oLdw== 
X-Gm-Message-State: AOAM531A+Xdcj+waskuGEocr9jE0GAjKsRcZrw3KT1KTAG+suvt5b+PT ZIdvFp678YqhcWaGoW/mS1XbSGRa/CViHmCgY0aWCueSnH8Yxub+2NnguBBJd345OnWcz7+NyvO NpRznCsO2Lc06PAB1ZKcuDL+KsgjdeksLuw/k5o0lq7d5MRiJbxVKnbKG45LuYRZDPu46yjt26B Mr X-Google-Smtp-Source: ABdhPJxS42KuUvONXoBLFe/7eByld3mANBVa0Es2pgaXpQNF0p+yrXbIw0I2tlGceiX3K1C+gio4wjb9EPwacT37jyni X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a17:90a:8817:: with SMTP id s23mr401870pjn.158.1599594528754; Tue, 08 Sep 2020 12:48:48 -0700 (PDT) Date: Tue, 8 Sep 2020 12:48:32 -0700 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH v2 4/7] index-pack: remove redundant child field From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , gitster@pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This is refactoring 1 of 2 to simplify struct base_data. In index-pack, each thread maintains a doubly-linked list of the delta chain that it is currently processing (the "base" and "child" pointers in struct base_data). When a thread exceeds the delta base cache limit and needs to reclaim memory, it uses the "child" pointers to traverse the lineage, reclaiming the memory of the eldest delta bases first. A subsequent patch will perform memory reclaiming in a different way and will thus no longer need the "child" pointer. Because the "child" pointer is redundant even now, remove it so that the aforementioned subsequent patch will be clearer. In the meantime, reclaim memory in the reverse order of the "base" pointers. 
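The interim reclaiming order described above (walk the "base" pointers, then free starting from the eldest ancestor) can be sketched in isolation as follows. This is a simplified standalone illustration, not the patch itself: struct node and prune_ancestors() are hypothetical stand-ins for struct base_data and prune_base_data(), the usage counter is passed explicitly instead of being read from thread-local state, and it is assumed that freeing an entry subtracts its size from that counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct base_data; only the fields needed here. */
struct node {
	struct node *base; /* parent in the delta chain, NULL at the root */
	void *data;        /* cached contents, NULL once reclaimed */
	size_t size;       /* size of the cached contents in bytes */
};

/*
 * Free the cached data of the ancestors of `youngest`, oldest first,
 * until *used is no longer above `limit`. `youngest` itself is never
 * freed. Returns the number of ancestors whose data was released.
 */
static size_t prune_ancestors(struct node *youngest, size_t *used, size_t limit)
{
	struct node **ancestry = NULL;
	struct node *b;
	size_t nr = 0, alloc = 0, freed = 0;

	assert(youngest != NULL);
	if (*used <= limit)
		return 0;

	/* Collect ancestors from youngest to oldest by following "base". */
	for (b = youngest->base; b != NULL; b = b->base) {
		if (nr == alloc) {
			size_t new_alloc = alloc ? alloc * 2 : 8;
			struct node **tmp = realloc(ancestry,
						    new_alloc * sizeof(*ancestry));
			if (!tmp)
				abort();
			ancestry = tmp;
			alloc = new_alloc;
		}
		ancestry[nr++] = b;
	}

	/* The last array element is the root, so walk the array backwards. */
	while (nr > 0 && *used > limit) {
		b = ancestry[--nr];
		if (b->data) {
			free(b->data);
			b->data = NULL;
			*used -= b->size;
			freed++;
		}
	}
	free(ancestry);
	return freed;
}
```

The temporary array costs one allocation per prune, which is the price of dropping the "child" pointer while still freeing the eldest bases first.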
Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 41 ++++++++++++++++++++++------------------- 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index c7b4aef4e4..c8db464557 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -34,7 +34,6 @@ struct object_stat { struct base_data { struct base_data *base; - struct base_data *child; struct object_entry *obj; void *data; unsigned long size; @@ -44,7 +43,6 @@ struct base_data { struct thread_local { pthread_t thread; - struct base_data *base_cache; size_t base_cache_used; int pack_fd; }; @@ -380,27 +378,37 @@ static void free_base_data(struct base_data *c) } } -static void prune_base_data(struct base_data *retain) +static void prune_base_data(struct base_data *youngest_child) { struct base_data *b; struct thread_local *data = get_thread_data(); - for (b = data->base_cache; - data->base_cache_used > delta_base_cache_limit && b; - b = b->child) { - if (b->data && b != retain) - free_base_data(b); + struct base_data **ancestry = NULL; + size_t nr = 0, alloc = 0; + ssize_t i; + + if (data->base_cache_used <= delta_base_cache_limit) + return; + + /* + * Free all ancestors of youngest_child until we have enough space, + * starting with the oldest. (We cannot free youngest_child itself.) 
+ */ + for (b = youngest_child->base; b != NULL; b = b->base) { + ALLOC_GROW(ancestry, nr + 1, alloc); + ancestry[nr++] = b; } + for (i = nr - 1; + i >= 0 && data->base_cache_used > delta_base_cache_limit; + i--) { + if (ancestry[i]->data) + free_base_data(ancestry[i]); + } + free(ancestry); } static void link_base_data(struct base_data *base, struct base_data *c) { - if (base) - base->child = c; - else - get_thread_data()->base_cache = c; - c->base = base; - c->child = NULL; if (c->data) get_thread_data()->base_cache_used += c->size; prune_base_data(c); @@ -408,11 +416,6 @@ static void link_base_data(struct base_data *base, struct base_data *c) static void unlink_base_data(struct base_data *c) { - struct base_data *base = c->base; - if (base) - base->child = NULL; - else - get_thread_data()->base_cache = NULL; free_base_data(c); } From patchwork Tue Sep 8 19:48:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11764169 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D34FC15E4 for ; Tue, 8 Sep 2020 19:49:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B73FD207DE for ; Tue, 8 Sep 2020 19:49:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="QcKaLynV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730926AbgIHTt4 (ORCPT ); Tue, 8 Sep 2020 15:49:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731091AbgIHTsw (ORCPT ); Tue, 8 Sep 2020 15:48:52 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com 
[IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 52324C061795 for ; Tue, 8 Sep 2020 12:48:51 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id s3so157124ybi.18 for ; Tue, 08 Sep 2020 12:48:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=nojDSosNK3fk9/Ttz8oIEPQVNcGaJNLyPfASgErRs9Q=; b=QcKaLynVrKSSXJl8VIL4KbipS3N4pdTt6WJhqyyMlCYjBhVMM/4YcdhnL899Ejr/6j qYZ6lwwpzO5D/1MRFB+V6jN/bY5Rkm9GtvSJFDGlLku8NT5Kknm+BUTG0ZFmp9a4InKb xgw9nfEDxRG7rjFCkk9DHIJP/Kam95UUg2azmgQS4d6U/aLRvF+ZjNiBMVFG7Icf6es+ u30uc83+gBIu3vy0erB8YvZzXg50MV1XQwQYKP3xX/Vuj/AbYL1V16QGyJRx9uaR2OKU WZKIaFsnVE3rbBsGXCIHc+6vIvCoY+nRpCQ4HnbPm2UMAvNMmb1k0KjJNhO7uaaCfxVp Cy3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=nojDSosNK3fk9/Ttz8oIEPQVNcGaJNLyPfASgErRs9Q=; b=S9+xV9qxixb4a490vU4mhybdvCk/Nr0DFqtwtj+yKSx845brZs55IaSuzyNweZvlxJ u2Nnd/B+mCPl7aHR+9ynIgUOLmeTdGl8+r9LAEMGtCcmFgPyqorvN6Jvn+0oHISTVdAa 4GUV7Oh8HzpxSDAizI5aqLil/YNEQSkFPAKc36gRtaFg/8rY1mWm6ySERyDlXFJtgwQe D3M3+QKAA58j8OzkwpJ4hESW/uVUFwWw02tNHijcuL/OBEYY+UUtZQn0+xQFc/KAxtEs /O6J3Mwcc5rnwi2sARsYbREgE2vyoRmRbMKeussB9GDT5oPrUXHptmjoOXy7cLwopYFb KDGQ== X-Gm-Message-State: AOAM533GjOsZvURc20wHkP1T9xIvaUJBB4d60NFAed2NnosPS6Yb3AoV TRj4dETb5c+ItqwmaPbju3L0p+TpLbLU3infj0M1TbZQ5n/SI/KeOd1+ANsBe39IO9ga4RyxbMs RTPVtGp99daRHM8o7QBUORETzjufpbyTDeXuyTJNMHu2HXTckbwqYCnd0hVtYyu4otk2AVxkwoC 1c X-Google-Smtp-Source: ABdhPJzX5wyvg3cPLw5M+zpAgMX/WJ4HXj3wlFE3oKFWZnIFep9vZ28YPXZyCmeDERFg2lt30FrvhBa19DRo1/LjR9tb X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a25:ca85:: with SMTP id a127mr730560ybg.113.1599594530376; Tue, 08 Sep 2020 12:48:50 -0700 (PDT) Date: Tue, 8 Sep 2020 12:48:33 
-0700 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog Subject: [PATCH v2 5/7] index-pack: calculate {ref,ofs}_{first,last} early From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , gitster@pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This is refactoring 2 of 2 to simplify struct base_data. Whenever we make a struct base_data, immediately calculate its delta children. This eliminates confusion as to when the {ref,ofs}_{first,last} fields are initialized. Before this patch, the delta children were calculated at the last possible moment. This allowed the members of struct base_data to be populated in any order, superficially useful when we have the object contents before the struct object_entry. But this makes reasoning about the state of struct base_data more complicated, hence this patch. Signed-off-by: Jonathan Tan Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 123 +++++++++++++++++++++---------------------- 1 file changed, 60 insertions(+), 63 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index c8db464557..94d0f53b03 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -33,12 +33,15 @@ struct object_stat { }; struct base_data { + /* Initialized by make_base(). */ struct base_data *base; struct object_entry *obj; - void *data; - unsigned long size; int ref_first, ref_last; int ofs_first, ofs_last; + + /* Not initialized by make_base(). 
*/ + void *data; + unsigned long size; }; struct thread_local { @@ -362,14 +365,6 @@ static void set_thread_data(struct thread_local *data) pthread_setspecific(key, data); } -static struct base_data *alloc_base_data(void) -{ - struct base_data *base = xcalloc(1, sizeof(struct base_data)); - base->ref_last = -1; - base->ofs_last = -1; - return base; -} - static void free_base_data(struct base_data *c) { if (c->data) { @@ -406,19 +401,6 @@ static void prune_base_data(struct base_data *youngest_child) free(ancestry); } -static void link_base_data(struct base_data *base, struct base_data *c) -{ - c->base = base; - if (c->data) - get_thread_data()->base_cache_used += c->size; - prune_base_data(c); -} - -static void unlink_base_data(struct base_data *c) -{ - free_base_data(c); -} - static int is_delta_type(enum object_type type) { return (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA); @@ -929,10 +911,25 @@ static void *get_base_data(struct base_data *c) return c->data; } -static void resolve_delta(struct object_entry *delta_obj, - struct base_data *base, struct base_data *result) +static struct base_data *make_base(struct object_entry *obj, + struct base_data *parent) { - void *base_data, *delta_data; + struct base_data *base = xcalloc(1, sizeof(struct base_data)); + base->base = parent; + base->obj = obj; + find_ref_delta_children(&obj->idx.oid, + &base->ref_first, &base->ref_last); + find_ofs_delta_children(obj->idx.offset, + &base->ofs_first, &base->ofs_last); + return base; +} + +static struct base_data *resolve_delta(struct object_entry *delta_obj, + struct base_data *base) +{ + void *base_data, *delta_data, *result_data; + struct base_data *result; + unsigned long result_size; if (show_stat) { int i = delta_obj - objects; @@ -946,19 +943,31 @@ static void resolve_delta(struct object_entry *delta_obj, } delta_data = get_data_from_pack(delta_obj); base_data = get_base_data(base); - result->obj = delta_obj; - result->data = patch_delta(base_data, base->size, - 
delta_data, delta_obj->size, &result->size); + result_data = patch_delta(base_data, base->size, + delta_data, delta_obj->size, &result_size); free(delta_data); - if (!result->data) + if (!result_data) bad_object(delta_obj->idx.offset, _("failed to apply delta")); - hash_object_file(the_hash_algo, result->data, result->size, + hash_object_file(the_hash_algo, result_data, result_size, type_name(delta_obj->real_type), &delta_obj->idx.oid); - sha1_object(result->data, NULL, result->size, delta_obj->real_type, + sha1_object(result_data, NULL, result_size, delta_obj->real_type, &delta_obj->idx.oid); + + result = make_base(delta_obj, base); + if (result->ref_last == -1 && result->ofs_last == -1) { + free(result_data); + } else { + result->data = result_data; + result->size = result_size; + get_thread_data()->base_cache_used += result->size; + prune_base_data(result); + } + counter_lock(); nr_resolved_deltas++; counter_unlock(); + + return result; } /* @@ -984,24 +993,9 @@ static int compare_and_swap_type(signed char *type, static struct base_data *find_unresolved_deltas_1(struct base_data *base, struct base_data *prev_base) { - if (base->ref_last == -1 && base->ofs_last == -1) { - find_ref_delta_children(&base->obj->idx.oid, - &base->ref_first, &base->ref_last); - - find_ofs_delta_children(base->obj->idx.offset, - &base->ofs_first, &base->ofs_last); - - if (base->ref_last == -1 && base->ofs_last == -1) { - free(base->data); - return NULL; - } - - link_base_data(prev_base, base); - } - if (base->ref_first <= base->ref_last) { struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no; - struct base_data *result = alloc_base_data(); + struct base_data *result; if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA, base->obj->real_type)) @@ -1009,7 +1003,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base, (uintmax_t)child->idx.offset, oid_to_hex(&base->obj->idx.oid)); - resolve_delta(child, base, result); + result = 
resolve_delta(child, base);
 		if (base->ref_first == base->ref_last && base->ofs_last == -1)
 			free_base_data(base);
@@ -1019,11 +1013,11 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 
 	if (base->ofs_first <= base->ofs_last) {
 		struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no;
-		struct base_data *result = alloc_base_data();
+		struct base_data *result;
 
 		assert(child->real_type == OBJ_OFS_DELTA);
 		child->real_type = base->obj->real_type;
-		resolve_delta(child, base, result);
+		result = resolve_delta(child, base);
 		if (base->ofs_first == base->ofs_last)
 			free_base_data(base);
 
@@ -1031,7 +1025,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 		return result;
 	}
 
-	unlink_base_data(base);
+	free_base_data(base);
 	return NULL;
 }
 
@@ -1074,9 +1068,8 @@ static int compare_ref_delta_entry(const void *a, const void *b)
 
 static void resolve_base(struct object_entry *obj)
 {
-	struct base_data *base_obj = alloc_base_data();
-	base_obj->obj = obj;
-	base_obj->data = NULL;
+	struct base_data *base_obj = make_base(obj, NULL);
+
 	find_unresolved_deltas(base_obj);
 }
 
@@ -1369,22 +1362,26 @@ static void fix_unresolved_deltas(struct hashfile *f)
 	for (i = 0; i < nr_ref_deltas; i++) {
 		struct ref_delta_entry *d = sorted_by_pos[i];
 		enum object_type type;
-		struct base_data *base_obj = alloc_base_data();
+		struct base_data *base;
+		void *data;
+		unsigned long size;
+		struct object_entry *obj;
 
 		if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
 			continue;
-		base_obj->data = read_object_file(&d->oid, &type,
-						  &base_obj->size);
-		if (!base_obj->data)
+		data = read_object_file(&d->oid, &type, &size);
+		if (!data)
 			continue;
 
 		if (check_object_signature(the_repository, &d->oid,
-					   base_obj->data, base_obj->size,
+					   data, size,
 					   type_name(type)))
 			die(_("local object %s is corrupt"), oid_to_hex(&d->oid));
-		base_obj->obj = append_obj_to_pack(f, d->oid.hash,
-					base_obj->data, base_obj->size, type);
-		find_unresolved_deltas(base_obj);
+		obj = append_obj_to_pack(f, d->oid.hash, data, size, type);
+		base = make_base(obj, NULL);
+		base->data = data;
+		base->size = size;
+		find_unresolved_deltas(base);
 		display_progress(progress, nr_resolved_deltas);
 	}
 	free(sorted_by_pos);

From patchwork Tue Sep 8 19:48:34 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 11764165
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 746A41599 for ; Tue, 8 Sep 2020 19:49:56 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 588822145D for ; Tue, 8 Sep 2020 19:49:56 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="hhpy1py7"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731706AbgIHTtu (ORCPT ); Tue, 8 Sep 2020 15:49:50 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33934 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731526AbgIHTsx (ORCPT ); Tue, 8 Sep 2020 15:48:53 -0400
Received: from mail-qv1-xf4a.google.com (mail-qv1-xf4a.google.com [IPv6:2607:f8b0:4864:20::f4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD383C061573 for ; Tue, 8 Sep 2020 12:48:52 -0700 (PDT)
Received: by mail-qv1-xf4a.google.com with SMTP id y2so8083qvs.14 for ; Tue, 08 Sep 2020 12:48:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=QPqg6xo/A1IkEKrq0zVtesvZ6lQPk6hHuSHsAkiOGjI=; b=hhpy1py7pBg9oIbsW+QpW9dCLk2KOw3Vo0TMZMOlSQz7nairsgodAo+wuwGqqpQxfh 15ArkXtAugMX65LCbJGP/nE2EHvdvH2YYT4bHCKyocDTZOsKoLXhUPCpzjDRpa6IzSVf
2RcRnepndaXaFzF/bqhaQCIE0EJ51RIk1P26+Qm7lq4Zw0KONwzRiegxq0cEhZ/7V10H zoIXBUuT4eCAygL/R5D+QhQN+vBJA735bKoXF21FNqFNlBGxjYSJw5J7DqBJsdOa9mY4 q7eu4sOeM++hRI3DCdl7W8203hgvlX1keYqvPi3KkqUEoPhGKNpf1txS3CGhFip5yFWw qMIg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=QPqg6xo/A1IkEKrq0zVtesvZ6lQPk6hHuSHsAkiOGjI=; b=AEMlz0olSl/ZOxwKj7pH5qEquSWh/EDgj60BggVZILfuowJxt2uqGMc24JHZrKmLk6 VfKAbl4ZRtEkrE6GBfq8xQEwP3KHUYuA1ToY0lUd0M9p2QdqEupmYRF/mmu9H0LNutX3 uG8C858Rdrb1f/rnoTc1H2dbYokVg92sfBDy2wNs/1fPCCo/BdMYGN7S7A9b4OLb7d+4 p3JE2/fosPJFjQzWQWXntMMBR3les4HPtVJR72v0wMrAUOALBiOpioJGywEtYjR8Jyt4 rLvJbVEmDEdBoZY6AKMG7/SMQ75aPbPbUsLoZoOomyks6qbJg7lLsMefisyRuXX5xAq9 K5Yg==
X-Gm-Message-State: AOAM531JOtAaxoYoHjQV3aNkenUrfD6CQJCS6vORXVZFPAgtUMIEB8S/ 6aRbR2Z7J+hS2JMi4TTjPv8/sgO5gzatupTj9SumyvHW5RTibBv7m0Y0YkCeL5/S9Mmyef0+0HJ VYJrBCF58tcYQU7TvDOn1VEeB3+RrhWA6I6dgpToEbbn4o0Nxpjg5+nW57oABgqyCmpooSTIpUI ZD
X-Google-Smtp-Source: ABdhPJxJb8WGX9WIT2oSVSzP1y6x4fs8GSZrTExzX9G/nQn+d3tDCs9QntHkVQl7w+9SZlbvyH0so4/Yat2QT3bpz/0Y
X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a05:6214:292:: with SMTP id l18mr695176qvv.3.1599594531828; Tue, 08 Sep 2020 12:48:51 -0700 (PDT)
Date: Tue, 8 Sep 2020 12:48:34 -0700
In-Reply-To:
Message-Id: <34b53f268fad32aca499f5f4b73ceedca693e9ef.1599594441.git.jonathantanmy@google.com>
Mime-Version: 1.0
References:
X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog
Subject: [PATCH v2 6/7] index-pack: make resolve_delta() assume base data
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan , gitster@pobox.com
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

A subsequent commit will make the quantum of work smaller, necessitating
more locking. This commit allows resolve_delta() to be called outside
the lock.

Signed-off-by: Jonathan Tan
Signed-off-by: Junio C Hamano
---
 builtin/index-pack.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 94d0f53b03..f69a50d46b 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -927,7 +927,7 @@ static struct base_data *make_base(struct object_entry *obj,
 static struct base_data *resolve_delta(struct object_entry *delta_obj,
 				       struct base_data *base)
 {
-	void *base_data, *delta_data, *result_data;
+	void *delta_data, *result_data;
 	struct base_data *result;
 	unsigned long result_size;
 
@@ -942,8 +942,8 @@ static struct base_data *resolve_delta(struct object_entry *delta_obj,
 		obj_stat[i].base_object_no = j;
 	}
 	delta_data = get_data_from_pack(delta_obj);
-	base_data = get_base_data(base);
-	result_data = patch_delta(base_data, base->size,
+	assert(base->data);
+	result_data = patch_delta(base->data, base->size,
 				  delta_data, delta_obj->size, &result_size);
 	free(delta_data);
 	if (!result_data)
@@ -1003,6 +1003,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 			    (uintmax_t)child->idx.offset,
 			    oid_to_hex(&base->obj->idx.oid));
 
+		get_base_data(base);
 		result = resolve_delta(child, base);
 		if (base->ref_first == base->ref_last && base->ofs_last == -1)
 			free_base_data(base);
@@ -1017,6 +1018,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base,
 
 		assert(child->real_type == OBJ_OFS_DELTA);
 		child->real_type = base->obj->real_type;
+		get_base_data(base);
 		result = resolve_delta(child, base);
 		if (base->ofs_first == base->ofs_last)
 			free_base_data(base);

From patchwork Tue Sep 8 19:48:35 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 11764157
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A5B2459D for ; Tue, 8 Sep
2020 19:49:22 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7C4E62166E for ; Tue, 8 Sep 2020 19:49:22 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=google.com header.i=@google.com header.b="oM+On1nX"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731233AbgIHTtV (ORCPT ); Tue, 8 Sep 2020 15:49:21 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33942 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731688AbgIHTsy (ORCPT ); Tue, 8 Sep 2020 15:48:54 -0400
Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com [IPv6:2607:f8b0:4864:20::84a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2EB16C061755 for ; Tue, 8 Sep 2020 12:48:54 -0700 (PDT)
Received: by mail-qt1-x84a.google.com with SMTP id m13so268579qtu.10 for ; Tue, 08 Sep 2020 12:48:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=sender:date:in-reply-to:message-id:mime-version:references:subject :from:to:cc; bh=JVrBJ2BA5Iy3Ju5YzsAoQWtgvIXfIhuFwkyNysfhCvk=; b=oM+On1nXDwpKxF+NCc3FTSPv9IqEF+6FLZ3u8+SvhrK5beyu9qduT653wbRfchgzRR GS7HVsN1yjKhRBQTj4gKEl+wgqkj+xNLhnVc8i/qoTFcf63RPJgiI3ZgzB1CE57st1M4 0u5eFmCyhVwNqXWPcfmFbWQKohb488/pI0qqy/uhFxQPYm31G5jrej66pZag8J91dVZp qh8eeIVNPOHE3AqeoDLJKHVQN9bfWaoBVf0SC/OC30pRngYXKvKndoU1n2VsiwRgfF5P 1sAngnoqoPR3wg4tCecmF2XTSi9zq5Bhn40NxQns3Afd4I75QrDdxjaEf6TY6LsibvOj OdPw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=JVrBJ2BA5Iy3Ju5YzsAoQWtgvIXfIhuFwkyNysfhCvk=; b=lFlm1M512cbYRNCFpocUw0slit00omsddAC0LPxdk81iZeTM4qM0kMILhuRQCJL38P sAzl9vtVb7ea91eQKd79CK6x4oBxDtLIBeFpFXwpKXWKriZjkp3EqfUcBv+vDG6+owzK
PwyNLlLte2m/VqEPX4UtDBQRimeH2/c1kA4nna4bH04EMpv8h7/fW5H/xwPgrSsKWeH0 WiagLc8r9YH0UBbItPLhDs4bvxjkNwZo0DPogUz7ZFbabRO0hO092pYvAZv1xqmWB4C4 bye1PmjnDZkykdN79X7aeAGhyaIJiuIC4784xI68ON6WWD5/zcDJQIV/VZh+hQHLqdDH ABkw==
X-Gm-Message-State: AOAM533FdID/V0pv426eFRgT6fdflUV9F3euiXmzdkcUP47a+q6oH8I8 pxrKHcPZ8NZNSh1u/dEPQXT/llso9hOowcSgvjn6H8D1txSgM4c96sZuqqxl6k13vZvkkE8fhhm 2FC0vp4UenCK+7uh+z2JnGcYl+FLKqDqllETHrn2TVrKWoFa87tYOJiuADQguR9RTGaAWxpQSam cC
X-Google-Smtp-Source: ABdhPJxrFvSR0bAbfgSmDq5Jt4NHOpBSL6jqxiRXxUjxLfY04VWzLrE6YRqN0iSQDwWuJPzGqdOPVn/2IpE8mpPp42YP
X-Received: from twelve4.c.googlers.com ([fda3:e722:ac3:10:24:72f4:c0a8:18d]) (user=jonathantanmy job=sendgmr) by 2002:a0c:b308:: with SMTP id s8mr784807qve.31.1599594533237; Tue, 08 Sep 2020 12:48:53 -0700 (PDT)
Date: Tue, 8 Sep 2020 12:48:35 -0700
In-Reply-To:
Message-Id: <01d6d882768127df1596e3a4c63d9b1a5a9ba093.1599594441.git.jonathantanmy@google.com>
Mime-Version: 1.0
References:
X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog
Subject: [PATCH v2 7/7] index-pack: make quantum of work smaller
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan , gitster@pobox.com
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

Currently, when index-pack resolves deltas, it does not split up delta
trees into threads: each delta base root (an object that is not a
REF_DELTA or OFS_DELTA) can go into its own thread, but all deltas on
that root (direct or indirect) are processed in the same thread. This is
a problem when a repository contains a large text file (thus,
delta-able) that is modified many times - delta resolution time during
fetching is dominated by processing the deltas corresponding to that
text file.

This patch contains a solution to that. When cloning using

  git -c core.deltabasecachelimit=1g clone \
    https://fuchsia.googlesource.com/third_party/vulkan-cts

on my laptop, clone time improved from 3m2s to 2m5s (using 3 threads,
which is the default).

The solution is to have a global work stack. This stack contains delta
bases (objects, whether appearing directly in the packfile or generated
by delta resolution, that themselves have delta children) that need to
be processed; whenever a thread needs work, it peeks at the top of the
stack and processes its next unprocessed child. If a thread finds the
stack empty, it will look for more delta base roots to push on the stack
instead.

The main weakness of having a global work stack is that more time is
spent in the mutex, but profiling has shown that most time is spent in
the resolution of the deltas themselves, so this shouldn't be an issue
in practice. In any case, experimentation (as described in the clone
command above) shows that this patch is a net improvement.

Signed-off-by: Jonathan Tan
Signed-off-by: Junio C Hamano
---
 builtin/index-pack.c | 348 +++++++++++++++++++++++++------------------
 1 file changed, 200 insertions(+), 148 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f69a50d46b..8acd078aa0 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -38,15 +38,56 @@ struct base_data {
 	struct object_entry *obj;
 	int ref_first, ref_last;
 	int ofs_first, ofs_last;
+	/*
+	 * Threads should increment retain_data if they are about to call
+	 * patch_delta() using this struct's data as a base, and decrement this
+	 * when they are done. While retain_data is nonzero, this struct's data
+	 * will not be freed even if the delta base cache limit is exceeded.
+	 */
+	int retain_data;
+	/*
+	 * The number of direct children that have not been fully processed
+	 * (entered work_head, entered done_head, left done_head). When this
+	 * number reaches zero, this struct base_data can be freed.
+	 */
+	int children_remaining;
 
 	/* Not initialized by make_base(). */
+	struct list_head list;
 	void *data;
 	unsigned long size;
 };
 
+/*
+ * Stack of struct base_data that have unprocessed children.
+ * threaded_second_pass() uses this as a source of work (the other being the
+ * objects array).
+ *
+ * Guarded by work_mutex.
+ */
+static LIST_HEAD(work_head);
+
+/*
+ * Stack of struct base_data that have children, all of whom have been
+ * processed or are being processed, and at least one child is being processed.
+ * These struct base_data must be kept around until the last child is
+ * processed.
+ *
+ * Guarded by work_mutex.
+ */
+static LIST_HEAD(done_head);
+
+/*
+ * All threads share one delta base cache.
+ *
+ * base_cache_used is guarded by work_mutex, and base_cache_limit is read-only
+ * in a thread.
+ */
+static size_t base_cache_used;
+static size_t base_cache_limit;
+
 struct thread_local {
 	pthread_t thread;
-	size_t base_cache_used;
 	int pack_fd;
 };
 
@@ -369,36 +410,38 @@ static void free_base_data(struct base_data *c)
 {
 	if (c->data) {
 		FREE_AND_NULL(c->data);
-		get_thread_data()->base_cache_used -= c->size;
+		base_cache_used -= c->size;
 	}
 }
 
-static void prune_base_data(struct base_data *youngest_child)
+static void prune_base_data(struct base_data *retain)
 {
-	struct base_data *b;
-	struct thread_local *data = get_thread_data();
-	struct base_data **ancestry = NULL;
-	size_t nr = 0, alloc = 0;
-	ssize_t i;
+	struct list_head *pos;
 
-	if (data->base_cache_used <= delta_base_cache_limit)
+	if (base_cache_used <= base_cache_limit)
 		return;
 
-	/*
-	 * Free all ancestors of youngest_child until we have enough space,
-	 * starting with the oldest. (We cannot free youngest_child itself.)
-	 */
-	for (b = youngest_child->base; b != NULL; b = b->base) {
-		ALLOC_GROW(ancestry, nr + 1, alloc);
-		ancestry[nr++] = b;
+	list_for_each_prev(pos, &done_head) {
+		struct base_data *b = list_entry(pos, struct base_data, list);
+		if (b->retain_data || b == retain)
+			continue;
+		if (b->data) {
+			free_base_data(b);
+			if (base_cache_used <= base_cache_limit)
+				return;
+		}
 	}
-	for (i = nr - 1;
-	     i >= 0 && data->base_cache_used > delta_base_cache_limit;
-	     i--) {
-		if (ancestry[i]->data)
-			free_base_data(ancestry[i]);
+
+	list_for_each_prev(pos, &work_head) {
+		struct base_data *b = list_entry(pos, struct base_data, list);
+		if (b->retain_data || b == retain)
+			continue;
+		if (b->data) {
+			free_base_data(b);
+			if (base_cache_used <= base_cache_limit)
+				return;
+		}
 	}
-	free(ancestry);
 }
 
 static int is_delta_type(enum object_type type)
@@ -851,15 +894,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
 }
 
 /*
- * This function is part of find_unresolved_deltas(). There are two
- * walkers going in the opposite ways.
- *
- * The first one in find_unresolved_deltas() traverses down from
- * parent node to children, deflating nodes along the way. However,
- * memory for deflated nodes is limited by delta_base_cache_limit, so
- * at some point parent node's deflated content may be freed.
- *
- * The second walker is this function, which goes from current node up
+ * Walk from current node up
  * to top parent if necessary to deflate the node. In normal
  * situation, its parent node would be already deflated, so it just
  * needs to apply delta.
@@ -887,7 +922,7 @@ static void *get_base_data(struct base_data *c)
 		if (!delta_nr) {
 			c->data = get_data_from_pack(obj);
 			c->size = obj->size;
-			get_thread_data()->base_cache_used += c->size;
+			base_cache_used += c->size;
 			prune_base_data(c);
 		}
 		for (; delta_nr > 0; delta_nr--) {
@@ -903,7 +938,7 @@ static void *get_base_data(struct base_data *c)
 			free(raw);
 			if (!c->data)
 				bad_object(obj->idx.offset, _("failed to apply delta"));
-			get_thread_data()->base_cache_used += c->size;
+			base_cache_used += c->size;
 			prune_base_data(c);
 		}
 		free(delta);
@@ -921,6 +956,8 @@ static struct base_data *make_base(struct object_entry *obj,
 				&base->ref_first, &base->ref_last);
 	find_ofs_delta_children(obj->idx.offset,
 				&base->ofs_first, &base->ofs_last);
+	base->children_remaining = base->ref_last - base->ref_first +
+		base->ofs_last - base->ofs_first + 2;
 	return base;
 }
 
@@ -954,14 +991,8 @@ static struct base_data *resolve_delta(struct object_entry *delta_obj,
 		    &delta_obj->idx.oid);
 
 	result = make_base(delta_obj, base);
-	if (result->ref_last == -1 && result->ofs_last == -1) {
-		free(result_data);
-	} else {
-		result->data = result_data;
-		result->size = result_size;
-		get_thread_data()->base_cache_used += result->size;
-		prune_base_data(result);
-	}
+	result->data = result_data;
+	result->size = result_size;
 
 	counter_lock();
 	nr_resolved_deltas++;
@@ -970,86 +1001,6 @@ static struct base_data *resolve_delta(struct object_entry *delta_obj,
 	return result;
 }
 
-/*
- * Standard boolean compare-and-swap: atomically check whether "*type" is
- * "want"; if so, swap in "set" and return true. Otherwise, leave it untouched
- * and return false.
- */
-static int compare_and_swap_type(signed char *type,
-				 enum object_type want,
-				 enum object_type set)
-{
-	enum object_type old;
-
-	type_cas_lock();
-	old = *type;
-	if (old == want)
-		*type = set;
-	type_cas_unlock();
-
-	return old == want;
-}
-
-static struct base_data *find_unresolved_deltas_1(struct base_data *base,
-						  struct base_data *prev_base)
-{
-	if (base->ref_first <= base->ref_last) {
-		struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no;
-		struct base_data *result;
-
-		if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA,
-					   base->obj->real_type))
-			die("REF_DELTA at offset %"PRIuMAX" already resolved (duplicate base %s?)",
-			    (uintmax_t)child->idx.offset,
-			    oid_to_hex(&base->obj->idx.oid));
-
-		get_base_data(base);
-		result = resolve_delta(child, base);
-		if (base->ref_first == base->ref_last && base->ofs_last == -1)
-			free_base_data(base);
-
-		base->ref_first++;
-		return result;
-	}
-
-	if (base->ofs_first <= base->ofs_last) {
-		struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no;
-		struct base_data *result;
-
-		assert(child->real_type == OBJ_OFS_DELTA);
-		child->real_type = base->obj->real_type;
-		get_base_data(base);
-		result = resolve_delta(child, base);
-		if (base->ofs_first == base->ofs_last)
-			free_base_data(base);
-
-		base->ofs_first++;
-		return result;
-	}
-
-	free_base_data(base);
-	return NULL;
-}
-
-static void find_unresolved_deltas(struct base_data *base)
-{
-	struct base_data *new_base, *prev_base = NULL;
-	for (;;) {
-		new_base = find_unresolved_deltas_1(base, prev_base);
-
-		if (new_base) {
-			prev_base = base;
-			base = new_base;
-		} else {
-			free(base);
-			base = prev_base;
-			if (!base)
-				return;
-			prev_base = base->base;
-		}
-	}
-}
-
 static int compare_ofs_delta_entry(const void *a, const void *b)
 {
 	const struct ofs_delta_entry *delta_a = a;
@@ -1068,33 +1019,131 @@ static int compare_ref_delta_entry(const void *a, const void *b)
 	return oidcmp(&delta_a->oid, &delta_b->oid);
 }
 
-static void resolve_base(struct object_entry *obj)
-{
-	struct base_data *base_obj = make_base(obj, NULL);
-
-	find_unresolved_deltas(base_obj);
-}
-
 static void *threaded_second_pass(void *data)
 {
-	set_thread_data(data);
+	if (data)
+		set_thread_data(data);
 	for (;;) {
-		int i;
-		counter_lock();
-		display_progress(progress, nr_resolved_deltas);
-		counter_unlock();
+		struct base_data *parent = NULL;
+		struct object_entry *child_obj;
+		struct base_data *child;
+
 		work_lock();
-		while (nr_dispatched < nr_objects &&
-		       is_delta_type(objects[nr_dispatched].type))
-			nr_dispatched++;
-		if (nr_dispatched >= nr_objects) {
-			work_unlock();
-			break;
+		if (list_empty(&work_head)) {
+			/*
+			 * Take an object from the object array.
+			 */
+			while (nr_dispatched < nr_objects &&
+			       is_delta_type(objects[nr_dispatched].type))
+				nr_dispatched++;
+			if (nr_dispatched >= nr_objects) {
+				work_unlock();
+				break;
+			}
+			child_obj = &objects[nr_dispatched++];
+		} else {
+			/*
+			 * Peek at the top of the stack, and take a child from
+			 * it.
+			 */
+			parent = list_first_entry(&work_head, struct base_data,
+						  list);
+
+			if (parent->ref_first <= parent->ref_last) {
+				int offset = ref_deltas[parent->ref_first++].obj_no;
+				child_obj = objects + offset;
+				if (child_obj->real_type != OBJ_REF_DELTA)
+					die("REF_DELTA at offset %"PRIuMAX" already resolved (duplicate base %s?)",
+					    (uintmax_t) child_obj->idx.offset,
+					    oid_to_hex(&parent->obj->idx.oid));
+				child_obj->real_type = parent->obj->real_type;
+			} else {
+				child_obj = objects +
+					ofs_deltas[parent->ofs_first++].obj_no;
+				assert(child_obj->real_type == OBJ_OFS_DELTA);
+				child_obj->real_type = parent->obj->real_type;
+			}
+
+			if (parent->ref_first > parent->ref_last &&
+			    parent->ofs_first > parent->ofs_last) {
+				/*
+				 * This parent has run out of children, so move
+				 * it to done_head.
+				 */
+				list_del(&parent->list);
+				list_add(&parent->list, &done_head);
+			}
+
+			/*
+			 * Ensure that the parent has data, since we will need
+			 * it later.
+			 *
+			 * NEEDSWORK: If parent data needs to be reloaded, this
+			 * prolongs the time that the current thread spends in
+			 * the mutex. A mitigating factor is that parent data
+			 * needs to be reloaded only if the delta base cache
+			 * limit is exceeded, so in the typical case, this does
+			 * not happen.
+			 */
+			get_base_data(parent);
+			parent->retain_data++;
 		}
-		i = nr_dispatched++;
 		work_unlock();
 
-		resolve_base(&objects[i]);
+		if (parent) {
+			child = resolve_delta(child_obj, parent);
+			if (!child->children_remaining)
+				FREE_AND_NULL(child->data);
+		} else {
+			child = make_base(child_obj, NULL);
+			if (child->children_remaining) {
+				/*
+				 * Since this child has its own delta children,
+				 * we will need this data in the future.
+				 * Inflate now so that future iterations will
+				 * have access to this object's data while
+				 * outside the work mutex.
+				 */
+				child->data = get_data_from_pack(child_obj);
+				child->size = child_obj->size;
+			}
+		}
+
+		work_lock();
+		if (parent)
+			parent->retain_data--;
+		if (child->data) {
+			/*
+			 * This child has its own children, so add it to
+			 * work_head.
+			 */
+			list_add(&child->list, &work_head);
+			base_cache_used += child->size;
+			prune_base_data(NULL);
+		} else {
+			/*
+			 * This child does not have its own children. It may be
+			 * the last descendant of its ancestors; free those
+			 * that we can.
+			 */
+			struct base_data *p = parent;
+
+			while (p) {
+				struct base_data *next_p;
+
+				p->children_remaining--;
+				if (p->children_remaining)
+					break;
+
+				next_p = p->base;
+				free_base_data(p);
+				list_del(&p->list);
+				free(p);
+
+				p = next_p;
+			}
+		}
+		work_unlock();
 	}
 	return NULL;
 }
@@ -1195,6 +1244,7 @@ static void resolve_deltas(void)
 					  nr_ref_deltas + nr_ofs_deltas);
 
 	nr_dispatched = 0;
+	base_cache_limit = delta_base_cache_limit * nr_threads;
 	if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
 		init_thread();
 		for (i = 0; i < nr_threads; i++) {
@@ -1364,10 +1414,8 @@ static void fix_unresolved_deltas(struct hashfile *f)
 	for (i = 0; i < nr_ref_deltas; i++) {
 		struct ref_delta_entry *d = sorted_by_pos[i];
 		enum object_type type;
-		struct base_data *base;
 		void *data;
 		unsigned long size;
-		struct object_entry *obj;
 
 		if (objects[d->obj_no].real_type != OBJ_REF_DELTA)
 			continue;
@@ -1379,11 +1427,15 @@ static void fix_unresolved_deltas(struct hashfile *f)
 					   data, size,
 					   type_name(type)))
 			die(_("local object %s is corrupt"), oid_to_hex(&d->oid));
-		obj = append_obj_to_pack(f, d->oid.hash, data, size, type);
-		base = make_base(obj, NULL);
-		base->data = data;
-		base->size = size;
-		find_unresolved_deltas(base);
+
+		/*
+		 * Add this as an object to the objects array and call
+		 * threaded_second_pass() (which will pick up the added
+		 * object).
+		 */
+		append_obj_to_pack(f, d->oid.hash, data, size, type);
+		threaded_second_pass(NULL);
+
 		display_progress(progress, nr_resolved_deltas);
 	}
 	free(sorted_by_pos);
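
The global work stack described in the 7/7 commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code from the patch: every toy_* name is hypothetical, the "threads" are simulated by calling toy_step() round-robin from a single thread, and the mutex, the done list and the delta base cache are left out. It shows just the scheduling idea: a worker takes the next child of whatever base is on top of the stack, and falls back to the roots only when the stack is empty.

```c
#include <stddef.h>

#define TOY_MAX_NODES 8

/* Hypothetical, simplified stand-in for struct base_data. */
struct toy_base {
	int first_child, last_child;	/* index range in toy_nodes; empty if first > last */
	struct toy_base *below;		/* next entry down the work stack */
	int worker;			/* worker that processed this node, or -1 */
};

static struct toy_base toy_nodes[TOY_MAX_NODES];
static int toy_nr_roots;		/* roots are toy_nodes[0..toy_nr_roots-1] */
static int toy_next_root;
static struct toy_base *toy_stack;	/* top of the global work stack */

/*
 * Perform one unit of work on behalf of `worker`; return 0 if none is left.
 * In a threaded version, the two sections touching toy_stack would hold a
 * mutex and the "resolve" step between them would run outside it.
 */
static int toy_step(int worker)
{
	struct toy_base *item;

	if (!toy_stack) {
		/* Stack empty: look for another root. */
		if (toy_next_root >= toy_nr_roots)
			return 0;
		item = &toy_nodes[toy_next_root++];
	} else {
		/* Peek at the top and take its next unprocessed child. */
		struct toy_base *parent = toy_stack;

		item = &toy_nodes[parent->first_child++];
		if (parent->first_child > parent->last_child)
			toy_stack = parent->below;	/* out of children */
	}

	item->worker = worker;	/* stands in for delta resolution */

	if (item->first_child <= item->last_child) {
		/* The item has children of its own: push it. */
		item->below = toy_stack;
		toy_stack = item;
	}
	return 1;
}

/* One root (node 0) with three direct children (nodes 1..3). */
static int toy_run_example(int nr_workers)
{
	int i, steps = 0;

	for (i = 0; i < 4; i++) {
		toy_nodes[i].first_child = i ? 0 : 1;
		toy_nodes[i].last_child = i ? -1 : 3;
		toy_nodes[i].below = NULL;
		toy_nodes[i].worker = -1;
	}
	toy_nr_roots = 1;
	toy_next_root = 0;
	toy_stack = NULL;

	/* Round-robin over the workers until no work is left. */
	while (toy_step(steps % nr_workers))
		steps++;
	return steps;
}
```

With two simulated workers, toy_run_example(2) hands the root to worker 0 and its three children to workers 1, 0 and 1 in turn, so the children of a single root are no longer tied to the worker that picked up the root.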