From patchwork Thu Oct 17 20:17:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197091 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3C66B1575 for ; Thu, 17 Oct 2019 20:17:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 19AD120869 for ; Thu, 17 Oct 2019 20:17:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Avvl5pNr" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2436992AbfJQURZ (ORCPT ); Thu, 17 Oct 2019 16:17:25 -0400 Received: from mail-pl1-f202.google.com ([209.85.214.202]:46612 "EHLO mail-pl1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQURY (ORCPT ); Thu, 17 Oct 2019 16:17:24 -0400 Received: by mail-pl1-f202.google.com with SMTP id o9so2194412plk.13 for ; Thu, 17 Oct 2019 13:17:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=WkUKT08bNc2kI+unhVYVIPGXpNpMOhf36018IYaugZg=; b=Avvl5pNrwiVl4pSJNxD++oK+0hGZAyTzxWf+3jULcwPaXYsozXfU03zhCLdTUsyW6M AOBeFh15KfSUtxfy76w5vTMu5VWJTKDum+pflsjNlmmcczdWMB61VtutMoC05pQuWoLL OeWgT473MzP6I3UItDdlfLqNDqs9x1JeacjXSMcLOxQYyMR9hWU+PPidNGNXRU/IY+hJ fECgLzNTh1zZtSetqq+3YHgPXu/CZVRGjEFXrrFbxOrCI6uVBOecU6QsiH7Uy1hRrdYz tDdoqNftgUBKfNSlXOxPLZtaI/dOZSjgVsOiWQRD90P/4sHKKbA9oQqL+MYcna56rcfD Nqpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=WkUKT08bNc2kI+unhVYVIPGXpNpMOhf36018IYaugZg=; 
b=LkceS7Mh+q5+y8EyafD/IpZOXmpdqlq+9gdF0TxBLkPF189J+OLP0/nWeBwJ3gxoxG GXnOXg6FzqdiANKWIcrqBJoGflQFZS05Swg0jhA2qj1IoniCAepwialwGJguSN7o/a5Y 5dPN2AzFo9e6w4jt68FDFh1RBctvC0VikHBCkkfZ2SP2vuOq50Mg31XMCbQ7TC21FXNS Ji2Xtn6e3UcG1pFm/z4LoRvR/fH1XHqzR4qILk0cwjUUugDrHLaiOT+LY7Ij+amR5pbU Jv/BKCJ7VFiGBKQhlDwyOYM4MU9mB6OOj0IkPDLy6EP8uSO0wmg4MDLWCFsfrpd/pSAn RmHg== X-Gm-Message-State: APjAAAUogP8k3Wi8tAPjsA7AaAx8jlBKuwIUd2kJPWU2OTinLZ+uAvEc SOPrOnOxbgOhKnWfvTIqC0qsXcnoxavSTh8x2gcAxldn9Aoaal97HpYGxhu6kyxhB51lMfOHdah dWGBN5SvJnx3C2o97FWmyA+hJ/3VXbfypnLqVARBu44vdMmWquPTbKuvRmIB+AVic0afm1dpNmU Zh X-Google-Smtp-Source: APXvYqw6jJ4XfreTS0JsEk8ZPkAZKQmk5/ipJ2l96qMStDOmKVKGcf4Jb31JB5ILkdC49BMnpCM4leIev0mQCKGv1RHg X-Received: by 2002:a63:5d04:: with SMTP id r4mr6123547pgb.22.1571343443858; Thu, 17 Oct 2019 13:17:23 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:10 -0700 In-Reply-To: Message-Id: <0a6777a2431e96f1e53110fa2ce2219420b0640c.1571343096.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 1/7] Documentation: deltaBaseCacheLimit is per-thread From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Clarify that core.deltaBaseCacheLimit is per-thread, as can be seen from the fact that cache usage (base_cache_used in struct thread_local in builtin/index-pack.c) is tracked individually for each thread and compared against delta_base_cache_limit. Signed-off-by: Jonathan Tan --- Documentation/config/core.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index 852d2ba37a..ce1ba8d55f 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -388,7 +388,7 @@ the largest projects. You probably do not need to adjust this value. Common unit suffixes of 'k', 'm', or 'g' are supported. 
core.deltaBaseCacheLimit:: - Maximum number of bytes to reserve for caching base objects + Maximum number of bytes per thread to reserve for caching base objects that may be referenced by multiple deltified objects. By storing the entire decompressed base objects in a cache Git is able to avoid unpacking and decompressing frequently used base From patchwork Thu Oct 17 20:17:11 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197093 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 743844B64 for ; Thu, 17 Oct 2019 20:17:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 47D4220869 for ; Thu, 17 Oct 2019 20:17:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ggm9QlXv" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437020AbfJQUR1 (ORCPT ); Thu, 17 Oct 2019 16:17:27 -0400 Received: from mail-yb1-f201.google.com ([209.85.219.201]:36501 "EHLO mail-yb1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQUR1 (ORCPT ); Thu, 17 Oct 2019 16:17:27 -0400 Received: by mail-yb1-f201.google.com with SMTP id w2so2737466ybo.3 for ; Thu, 17 Oct 2019 13:17:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=wuw9pHIBKMwZZl6lzf8ADfjXz2feiVGcfVtWSouL+7U=; b=ggm9QlXv5qVhk2DYeiXA1Llzj6OQWq03PHX3SQUAkcEGFMf9//sBnM2uKg9uo8vZgc 4dEuE1lSVq6XBAZftqM1VBbXqT/GHamZwpKZunYMbMoptuDYv2QeKCLrd7KyBzVufoEQ iLu7L8CFKdjfG7KkWIB4GlL4+MaAcDdCTKFzSH8BGrTrzz2ETGJjSZCeLPRuboZ8ThFv lSFex7Qz7vkWbuboopRN4y2geL4MSE6jyhRcZJYR6UZafOzMtQeR9w6Rg9Nn7aSNoRCv 
u/Af29MlGVOpG6QQ+Nvnf9bZRBi7+Zqz3XIrJ7XLvOsbOCT3fJIg8RgDcCut6V7QqtiJ R7tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=wuw9pHIBKMwZZl6lzf8ADfjXz2feiVGcfVtWSouL+7U=; b=m1lsaO/+qPMoZmg7BJBWRNYmPhGejN8kfeBZ00UlZjSGnXo/AKX/BYe37zLP94rmFB c7lvxkx0uca1wkPqQMf0fgrRd7gmZsC2pYbICkR7UPSToQWA84xL0NjDRMqnBZhbAozx ND/1j2OIKZT9MSzaWcf5ttCxQLNBckJGOk5d5nt691yi4ElZc2G9sgFHD9F4mPFk1b51 Lof9rmOd1/230IurIFG3w0HkrWndVIxvHKkT5YnWgAv/9TzARo1FzpM2pI62UQsvFOKH BWMNdwFUNI/7IkRxacrexe2E2j72f7RwPbnvwUNNTEuzsGBVvO6Pw6YzNfg2YVBYnR2W LCjw== X-Gm-Message-State: APjAAAWzZjXfXzAyssUNxKBiwBOi8og3Pjh+jc57q/yfDLXj8rP7o515 CuNCmLz1U/55Ab6CFy5sVATK+DDKAD5rBGxi100DK0ZUEf/zXUwWtDoN/8jZUVvr1LaSUc0ecqN Nud1/EDZR/KpsNPsymGKn4KmAjCXKcdoGsvAt2KB9ILbvBrsqo/sTZmegDdaegS5F4S8T5z0EvI aH X-Google-Smtp-Source: APXvYqw7AkKXb/6lQM+LSgUD2izGMLyXCy7iD6fD7NY9I5nyg7F8vE1yb1DrHZ/OWfspMJd4dWP90sKryqtocFtqtcc/ X-Received: by 2002:a81:6cd3:: with SMTP id h202mr4067022ywc.223.1571343446632; Thu, 17 Oct 2019 13:17:26 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:11 -0700 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 2/7] index-pack: unify threaded and unthreaded code From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 60a5591039..df6b3b8cf6 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1210,15 +1210,7 @@ static void resolve_deltas(void) cleanup_thread(); return; } - - for (i = 0; i < nr_objects; i++) { - struct object_entry *obj = &objects[i]; - - if (is_delta_type(obj->type)) - 
continue; - resolve_base(obj); - display_progress(progress, nr_resolved_deltas); - } + threaded_second_pass(&nothread_data); } /* From patchwork Thu Oct 17 20:17:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197095 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AE6ED5705 for ; Thu, 17 Oct 2019 20:17:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8158820869 for ; Thu, 17 Oct 2019 20:17:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="saS/8c82" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437080AbfJQURb (ORCPT ); Thu, 17 Oct 2019 16:17:31 -0400 Received: from mail-pg1-f201.google.com ([209.85.215.201]:55313 "EHLO mail-pg1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQURb (ORCPT ); Thu, 17 Oct 2019 16:17:31 -0400 Received: by mail-pg1-f201.google.com with SMTP id b14so2578390pgm.22 for ; Thu, 17 Oct 2019 13:17:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=xbePw5qFrrtu4A8JivCmOtF2bj2S91pEHOwOlx74Cv8=; b=saS/8c82X/5o0scb9VlPtZ/Au52oVPHAygRDlU4YiqLinKcmMM9iGF3S3Al3JIdiig +Phvs3czz017ardrx6L5IkcCSkdAxK+gAv65UlWQgeop/KAURPO0ggR6vYUTsVDL3HwR LH9YSNzEa1oX11hu4nIDI17YypjpIgi+lT+BENtraYirO0Gvz53i8X9VDSsE/+UwErs7 7zBuhcwBipfxTTadH23h6aO79L/1P3L36z802Qx3s0Itjxq3AAwHaD0nIVHBZHeKBl/a CFGcOauI0ihfnvwdUZkIblSB5izS1I7rK7XM0iDLriQHMhKUkXEwNu4HG00dIevxeWZg BABA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version 
:references:subject:from:to:cc; bh=xbePw5qFrrtu4A8JivCmOtF2bj2S91pEHOwOlx74Cv8=; b=p+2ydvOD+zNnub/L0mzaMzEb4AmcC+9jga6gp8ZxArTuwiXQ9uXcmk2fu2GRx31U6H ApltkJJhRB2ls1xZrUrGveoyguweASVMUZduCFdV4FNYQa+YbPorVkI62GZG+fQWC4M+ Ofi7O6widYeTgJ9IJv4L8lsolpOQbMs+SftcYoMQjrHbz4QQ3mKTHouQI6yhX+4YWizq Ebr9EQMPSmCF4H5U+vnwIohkNxGn5LJ9VvQ/yVk7xvPXIK8YGqMUNULULfaNh+GF5vok CrU9koBOdgfYAEopHcxUrdMoicxfMZZzJX/ZELpG1io1ykEyddWUXyYcwHRUXmz3KJk0 jJ/g== X-Gm-Message-State: APjAAAUhXM5M5SH1rJ/W/BjHYtKFdUYilyfLqO50kRiJ8/4aSJt+zJEB vTbZ9eggvtM0WjSVbt8QDxz6RTOlqBINM5Hbz8J91StikSzPkXyGNTG0dAj5rilOllIHDmvKhs5 QEvYzxez43IbpOTfB4gFcF9BT4eZYXvRGFTuGUZXZun28TqiLUCC67/I1Xy0aUvgr55c7CVEEdu 1m X-Google-Smtp-Source: APXvYqyHO/0+Mts9pMY506xn5a/JF6KsrlxJg/AaYhHz2p69tSIDio2MbJWHJLDmscb8sMZcR2aqIFyWeAISeEAX1eaR X-Received: by 2002:a63:e148:: with SMTP id h8mr5906629pgk.297.1571343448939; Thu, 17 Oct 2019 13:17:28 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:12 -0700 In-Reply-To: Message-Id: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 3/7] index-pack: remove redundant parameter From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org find_{ref,ofs}_delta_{,children} take an enum object_type parameter, but the object type is already present in the name of the function. Remove that parameter from these functions. 
Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 26 ++++++++++++-------------- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index df6b3b8cf6..296804230c 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -614,7 +614,7 @@ static int compare_ofs_delta_bases(off_t offset1, off_t offset2, 0; } -static int find_ofs_delta(const off_t offset, enum object_type type) +static int find_ofs_delta(const off_t offset) { int first = 0, last = nr_ofs_deltas; @@ -624,7 +624,8 @@ static int find_ofs_delta(const off_t offset, enum object_type type) int cmp; cmp = compare_ofs_delta_bases(offset, delta->offset, - type, objects[delta->obj_no].type); + OBJ_OFS_DELTA, + objects[delta->obj_no].type); if (!cmp) return next; if (cmp < 0) { @@ -637,10 +638,9 @@ static int find_ofs_delta(const off_t offset, enum object_type type) } static void find_ofs_delta_children(off_t offset, - int *first_index, int *last_index, - enum object_type type) + int *first_index, int *last_index) { - int first = find_ofs_delta(offset, type); + int first = find_ofs_delta(offset); int last = first; int end = nr_ofs_deltas - 1; @@ -668,7 +668,7 @@ static int compare_ref_delta_bases(const struct object_id *oid1, return oidcmp(oid1, oid2); } -static int find_ref_delta(const struct object_id *oid, enum object_type type) +static int find_ref_delta(const struct object_id *oid) { int first = 0, last = nr_ref_deltas; @@ -678,7 +678,8 @@ static int find_ref_delta(const struct object_id *oid, enum object_type type) int cmp; cmp = compare_ref_delta_bases(oid, &delta->oid, - type, objects[delta->obj_no].type); + OBJ_REF_DELTA, + objects[delta->obj_no].type); if (!cmp) return next; if (cmp < 0) { @@ -691,10 +692,9 @@ static int find_ref_delta(const struct object_id *oid, enum object_type type) } static void find_ref_delta_children(const struct object_id *oid, - int *first_index, int *last_index, - enum object_type type) + int *first_index, int 
*last_index) { - int first = find_ref_delta(oid, type); + int first = find_ref_delta(oid); int last = first; int end = nr_ref_deltas - 1; @@ -982,12 +982,10 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base, { if (base->ref_last == -1 && base->ofs_last == -1) { find_ref_delta_children(&base->obj->idx.oid, - &base->ref_first, &base->ref_last, - OBJ_REF_DELTA); + &base->ref_first, &base->ref_last); find_ofs_delta_children(base->obj->idx.offset, - &base->ofs_first, &base->ofs_last, - OBJ_OFS_DELTA); + &base->ofs_first, &base->ofs_last); if (base->ref_last == -1 && base->ofs_last == -1) { free(base->data); From patchwork Thu Oct 17 20:17:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197099 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6DD101575 for ; Thu, 17 Oct 2019 20:17:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4B96820869 for ; Thu, 17 Oct 2019 20:17:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="nO05626i" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437218AbfJQURd (ORCPT ); Thu, 17 Oct 2019 16:17:33 -0400 Received: from mail-vk1-f201.google.com ([209.85.221.201]:38142 "EHLO mail-vk1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQURd (ORCPT ); Thu, 17 Oct 2019 16:17:33 -0400 Received: by mail-vk1-f201.google.com with SMTP id k132so1412042vka.5 for ; Thu, 17 Oct 2019 13:17:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; 
bh=SkWoFSW9/ZnGBNqh9eBUlPVU68AMoA32DFzs1tDlqsg=; b=nO05626in75120Wd8npkKPuwJzxImKKqqAbbVwFtPzmY70Yjp3sDjDGPpLqk0qxEvK aJGrGPXmR05Ei8UGks+9NMdrDhoGlGmUB7rC+5yeTE+T2Zu+/BgOt4C/c21WzJtIt0bE umJiX1vVQmZxwyDB0+VzLfEFSTjUZEQjS1vJM2jIrWxZvJM2I5X2pd+7+LwKZPn45vuF lAxylsBjUO8kIlUkN63LYNFW1tKh977nLt7S/bbETpejg2QBXAdoQ9robrUT9wCit1ow I1aDnI8PHeVK22LZiy3yFhh7PyLx7ph6ztg4p4AXd/l/45ZGUYOwmYGk02MYT9m7o6U/ I6jg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=SkWoFSW9/ZnGBNqh9eBUlPVU68AMoA32DFzs1tDlqsg=; b=QDDPGDNp5v84k957DvtAN885WbAj7UjWDHCrVInOrxVTSxuRs6RQNH6NtEDHxfSrkV O3lo/vjUcFtHGwi526Q9ydgaMjsVDYVSi8jJfgiy7zAu3KLqXCTbh4x4CAzTLXLBSXO2 Db8ZVoNAoWBk/7YRyBb77Gct2Iyz6Xu4xmBSucmp+Twi/6i7RczeTOwb6+yztApsVfM1 OM0zjC0SFTNLBIj8/Zd2m2BhJ69aw1dUb/PYbF8a5CZxpbtVgLFMewlxLcZHTOGnfN7W 9exKsMueiH0K6y4UJnFN9+IF6x+AECnOttr7VWjmZ57Q6Ik/oGXiVQBEBG0JRCnssOjI 9tAA== X-Gm-Message-State: APjAAAV7Y42K4tpkvYd5kyu5NI7OF4tDHSGGV1acAisEDLPz6iQuA+Ys 85IgXUmrBhBDcpM71Wj0xi80vwo+rXURPovynTslytUO6R8TnTGyLZggeXL4CJ6HA70SC87Lwwg ByXJWnIK20MiDdQPaaT0sqws+PPALYBkrD/Z1gT9tGPRt8ky8h0rgDQOa6gDFd9CB69k0pAespe dD X-Google-Smtp-Source: APXvYqyKLGcLNwLaAB8ka+0W4Mx06xipzUfUwuEOKIC9MOlMOXQH2kDn6CR/C+eq+C42mE6Oeo/9sDqCR6Ffijm/2M33 X-Received: by 2002:ab0:55c8:: with SMTP id w8mr3288316uaa.66.1571343451775; Thu, 17 Oct 2019 13:17:31 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:13 -0700 In-Reply-To: Message-Id: <3359b66b841e7eabdc45d3d937e97208b22e2901.1571343096.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 4/7] index-pack: remove redundant child field From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This is refactoring 1 of 2 to simplify struct base_data. 
In index-pack, each thread maintains a doubly-linked list of the delta chain that it is currently processing (the "base" and "child" pointers in struct base_data). When a thread exceeds the delta base cache limit and needs to reclaim memory, it uses the "child" pointers to traverse the lineage, reclaiming the memory of the eldest delta bases first. A subsequent patch will perform memory reclaiming in a different way and will thus no longer need the "child" pointer. Because the "child" pointer is redundant even now, remove it so that the aforementioned subsequent patch will be clearer. In the meantime, reclaim memory in the reverse order of the "base" pointers. Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 41 ++++++++++++++++++++++------------------- 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 296804230c..220e1e3693 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -34,7 +34,6 @@ struct object_stat { struct base_data { struct base_data *base; - struct base_data *child; struct object_entry *obj; void *data; unsigned long size; @@ -44,7 +43,6 @@ struct base_data { struct thread_local { pthread_t thread; - struct base_data *base_cache; size_t base_cache_used; int pack_fd; }; @@ -380,27 +378,37 @@ static void free_base_data(struct base_data *c) } } -static void prune_base_data(struct base_data *retain) +static void prune_base_data(struct base_data *youngest_child) { struct base_data *b; struct thread_local *data = get_thread_data(); - for (b = data->base_cache; - data->base_cache_used > delta_base_cache_limit && b; - b = b->child) { - if (b->data && b != retain) - free_base_data(b); + struct base_data **ancestry = NULL; + size_t nr = 0, alloc = 0; + ssize_t i; + + if (data->base_cache_used <= delta_base_cache_limit) + return; + + /* + * Free all ancestors of youngest_child until we have enough space, + * starting with the oldest. (We cannot free youngest_child itself.) 
+ */ + for (b = youngest_child->base; b != NULL; b = b->base) { + ALLOC_GROW(ancestry, nr + 1, alloc); + ancestry[nr++] = b; } + for (i = nr - 1; + i >= 0 && data->base_cache_used > delta_base_cache_limit; + i--) { + if (ancestry[i]->data) + free_base_data(ancestry[i]); + } + free(ancestry); } static void link_base_data(struct base_data *base, struct base_data *c) { - if (base) - base->child = c; - else - get_thread_data()->base_cache = c; - c->base = base; - c->child = NULL; if (c->data) get_thread_data()->base_cache_used += c->size; prune_base_data(c); @@ -408,11 +416,6 @@ static void link_base_data(struct base_data *base, struct base_data *c) static void unlink_base_data(struct base_data *c) { - struct base_data *base = c->base; - if (base) - base->child = NULL; - else - get_thread_data()->base_cache = NULL; free_base_data(c); } From patchwork Thu Oct 17 20:17:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197101 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9940F4B64 for ; Thu, 17 Oct 2019 20:17:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 78A4720869 for ; Thu, 17 Oct 2019 20:17:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="g0zr3Ax2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437324AbfJQURh (ORCPT ); Thu, 17 Oct 2019 16:17:37 -0400 Received: from mail-yb1-f201.google.com ([209.85.219.201]:45129 "EHLO mail-yb1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQURh (ORCPT ); Thu, 17 Oct 2019 16:17:37 -0400 Received: by mail-yb1-f201.google.com with SMTP id y6so2722098ybm.12 for ; Thu, 17 Oct 2019 13:17:35 
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=4Uo3j7qHAfJI4XLY4La7B3jbu9qzZn2r38Y90Vqt5I4=; b=g0zr3Ax2fN7Da6Ey/9d98AFYpBfC3K2O2KeoxerhKDaVlb51v59UVe4lQkYQ+VqSCs 3FnMFH9YcTfRaLFMh34OQhXpYFX+hKKXLrKK+2yIvzbpkTv+mz2qx3e9AY+H0Xt/XY/Y Oeg+7HULYW8TG74lwRnlEkPW5ubo1wmAY/lyYomY61Uw4dSj/ABHull88f24i9P7m5GS nWT3rvABWbmEt6VAo7WaNtFwRLHNHgxNZ/IQ3DgagzfQgTrSfynMRfqlMC8d6vjWvZlI WIGUVIB1e+STcgfsc5KyCtEfUnaKcemGEL4s6lxFMg3OfG/CqyBJVzF4WgLuNZRw65Ov /AQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=4Uo3j7qHAfJI4XLY4La7B3jbu9qzZn2r38Y90Vqt5I4=; b=SZRDxFMt5LDOmNhvolCsSbcONHfBzdDrCdgS0rn2cu+pLSTze7upNLPun3ilsIWXy4 52DH4AzQKjLMZd3BrvZHkdKPS7+0okcSR3zSRwYK34aD7eSgd+j2zrhwbqxLD7DSOGnL Bxi4r/airW0uXyEettfO4r+xTVNQxVwMNQLxbNDYkfg8OvMGXWbwkpfRnEK1OVkreOLW ckgy2ZYCZ2uIBj9pvVKkqvXQSPQqI85cWRGb9tLaDzFHNrhQ9qat6dbiaCepaHyRAZc/ xhjGKq19YiB5GG/ule3KDaRZiSs0lnoyceVXU3N+egajU8lSWmoxLzMSnwH+yiDnSk+g zAVg== X-Gm-Message-State: APjAAAVwx/0hnUMShPUt4hHd+gMuVjAbq1r7axFlLQDKcTd+ifZozXiK OPPFf7Y7d37ycVJSE+gR8MPHXO7PLPkeFHV1GbmMMhQ96eolDNVcC2PDRLdDbdpDXuvBXJAXc0k /pSYh91NuQFQY8oj55JonIlBqQTpKq3+sl9ufQf035sOwKKZIO1m96FPCjp2QNG/P37jKnVeUVk JT X-Google-Smtp-Source: APXvYqyrK0BNpc8tgBVbD4XRDrS+HZnpYtPfMfz2D/8zFd+uta8gqStxTxnlmTkFAqK7e2wZsXJNUpCu99dbnPWUmWvF X-Received: by 2002:a25:d048:: with SMTP id h69mr3435849ybg.458.1571343454716; Thu, 17 Oct 2019 13:17:34 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:14 -0700 In-Reply-To: Message-Id: <7f18480c45193d2a54832705cd29353911ad5b83.1571343096.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 5/7] index-pack: calculate {ref,ofs}_{first,last} early From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, 
peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This is refactoring 2 of 2 to simplify struct base_data. Whenever we make a struct base_data, immediately calculate its delta children. This eliminates confusion as to when the {ref,ofs}_{first,last} fields are initialized. Before this patch, the delta children were calculated at the last possible moment. This allowed the members of struct base_data to be populated in any order, superficially useful when we have the object contents before the struct object_entry. But this makes reasoning about the state of struct base_data more complicated, hence this patch. Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 125 +++++++++++++++++++++---------------------- 1 file changed, 61 insertions(+), 64 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 220e1e3693..d21353757d 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -33,12 +33,15 @@ struct object_stat { }; struct base_data { + /* Initialized by make_base(). */ struct base_data *base; struct object_entry *obj; - void *data; - unsigned long size; int ref_first, ref_last; int ofs_first, ofs_last; + + /* Not initialized by make_base(). 
*/ + void *data; + unsigned long size; }; struct thread_local { @@ -362,14 +365,6 @@ static void set_thread_data(struct thread_local *data) pthread_setspecific(key, data); } -static struct base_data *alloc_base_data(void) -{ - struct base_data *base = xcalloc(1, sizeof(struct base_data)); - base->ref_last = -1; - base->ofs_last = -1; - return base; -} - static void free_base_data(struct base_data *c) { if (c->data) { @@ -406,19 +401,6 @@ static void prune_base_data(struct base_data *youngest_child) free(ancestry); } -static void link_base_data(struct base_data *base, struct base_data *c) -{ - c->base = base; - if (c->data) - get_thread_data()->base_cache_used += c->size; - prune_base_data(c); -} - -static void unlink_base_data(struct base_data *c) -{ - free_base_data(c); -} - static int is_delta_type(enum object_type type) { return (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA); @@ -928,10 +910,25 @@ static void *get_base_data(struct base_data *c) return c->data; } -static void resolve_delta(struct object_entry *delta_obj, - struct base_data *base, struct base_data *result) +static struct base_data *make_base(struct object_entry *obj, + struct base_data *parent) { - void *base_data, *delta_data; + struct base_data *base = xcalloc(1, sizeof(struct base_data)); + base->base = parent; + base->obj = obj; + find_ref_delta_children(&obj->idx.oid, + &base->ref_first, &base->ref_last); + find_ofs_delta_children(obj->idx.offset, + &base->ofs_first, &base->ofs_last); + return base; +} + +static struct base_data *resolve_delta(struct object_entry *delta_obj, + struct base_data *base) +{ + void *base_data, *delta_data, *result_data; + struct base_data *result; + unsigned long result_size; if (show_stat) { int i = delta_obj - objects; @@ -945,19 +942,31 @@ static void resolve_delta(struct object_entry *delta_obj, } delta_data = get_data_from_pack(delta_obj); base_data = get_base_data(base); - result->obj = delta_obj; - result->data = patch_delta(base_data, base->size, - 
delta_data, delta_obj->size, &result->size); + result_data = patch_delta(base_data, base->size, + delta_data, delta_obj->size, &result_size); free(delta_data); - if (!result->data) + if (!result_data) bad_object(delta_obj->idx.offset, _("failed to apply delta")); - hash_object_file(result->data, result->size, + hash_object_file(result_data, result_size, type_name(delta_obj->real_type), &delta_obj->idx.oid); - sha1_object(result->data, NULL, result->size, delta_obj->real_type, + sha1_object(result_data, NULL, result_size, delta_obj->real_type, &delta_obj->idx.oid); + + result = make_base(delta_obj, base); + if (result->ref_last == -1 && result->ofs_last == -1) { + free(result_data); + } else { + result->data = result_data; + result->size = result_size; + get_thread_data()->base_cache_used += result->size; + prune_base_data(result); + } + counter_lock(); nr_resolved_deltas++; counter_unlock(); + + return result; } /* @@ -983,30 +992,15 @@ static int compare_and_swap_type(signed char *type, static struct base_data *find_unresolved_deltas_1(struct base_data *base, struct base_data *prev_base) { - if (base->ref_last == -1 && base->ofs_last == -1) { - find_ref_delta_children(&base->obj->idx.oid, - &base->ref_first, &base->ref_last); - - find_ofs_delta_children(base->obj->idx.offset, - &base->ofs_first, &base->ofs_last); - - if (base->ref_last == -1 && base->ofs_last == -1) { - free(base->data); - return NULL; - } - - link_base_data(prev_base, base); - } - if (base->ref_first <= base->ref_last) { struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no; - struct base_data *result = alloc_base_data(); + struct base_data *result; if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA, base->obj->real_type)) BUG("child->real_type != OBJ_REF_DELTA"); - resolve_delta(child, base, result); + result = resolve_delta(child, base); if (base->ref_first == base->ref_last && base->ofs_last == -1) free_base_data(base); @@ -1016,11 +1010,11 @@ static struct 
base_data *find_unresolved_deltas_1(struct base_data *base, if (base->ofs_first <= base->ofs_last) { struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no; - struct base_data *result = alloc_base_data(); + struct base_data *result; assert(child->real_type == OBJ_OFS_DELTA); child->real_type = base->obj->real_type; - resolve_delta(child, base, result); + result = resolve_delta(child, base); if (base->ofs_first == base->ofs_last) free_base_data(base); @@ -1028,7 +1022,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base, return result; } - unlink_base_data(base); + free_base_data(base); return NULL; } @@ -1071,9 +1065,8 @@ static int compare_ref_delta_entry(const void *a, const void *b) static void resolve_base(struct object_entry *obj) { - struct base_data *base_obj = alloc_base_data(); - base_obj->obj = obj; - base_obj->data = NULL; + struct base_data *base_obj = make_base(obj, NULL); + find_unresolved_deltas(base_obj); } @@ -1367,21 +1360,25 @@ static void fix_unresolved_deltas(struct hashfile *f) for (i = 0; i < nr_ref_deltas; i++) { struct ref_delta_entry *d = sorted_by_pos[i]; enum object_type type; - struct base_data *base_obj = alloc_base_data(); + struct base_data *base; + void *data; + unsigned long size; + struct object_entry *obj; if (objects[d->obj_no].real_type != OBJ_REF_DELTA) continue; - base_obj->data = read_object_file(&d->oid, &type, - &base_obj->size); - if (!base_obj->data) + data = read_object_file(&d->oid, &type, &size); + if (!data) continue; - if (check_object_signature(&d->oid, base_obj->data, - base_obj->size, type_name(type))) + if (check_object_signature(&d->oid, data, + size, type_name(type))) die(_("local object %s is corrupt"), oid_to_hex(&d->oid)); - base_obj->obj = append_obj_to_pack(f, d->oid.hash, - base_obj->data, base_obj->size, type); - find_unresolved_deltas(base_obj); + obj = append_obj_to_pack(f, d->oid.hash, data, size, type); + base = make_base(obj, NULL); + base->data = data; + 
base->size = size; + find_unresolved_deltas(base); display_progress(progress, nr_resolved_deltas); } free(sorted_by_pos); From patchwork Thu Oct 17 20:17:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197103 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E441C1575 for ; Thu, 17 Oct 2019 20:17:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C38EF20869 for ; Thu, 17 Oct 2019 20:17:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="UmCeCcfu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437384AbfJQURk (ORCPT ); Thu, 17 Oct 2019 16:17:40 -0400 Received: from mail-pg1-f201.google.com ([209.85.215.201]:33161 "EHLO mail-pg1-f201.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQURj (ORCPT ); Thu, 17 Oct 2019 16:17:39 -0400 Received: by mail-pg1-f201.google.com with SMTP id f10so2610596pgj.0 for ; Thu, 17 Oct 2019 13:17:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=q72mRa+DvPKvsvKcKotCMdsPLS1mELrWagxkQdQbawU=; b=UmCeCcfuMcdzxJS98vsy6/PVj1ecX0yfpWR6+Tb0VorTWa1rr8KvzsJ1eUuZlhjmBT 6HYQ/ljMIXBeOWlAK/wU9ZFBFQub8K+nPKvyAz7P7iTbrIsUnMQpiEEqTjqsioTysuMb PXAtTn+wzqxSgGJyGVxLjitPl4+NppK1G5tGcNoE3R5nlSOgOIVKQ32+w3ImYfonyhK7 8S7aK5dG3KdVvfTtN8/OCx6ThPOFu2XMn4ZK/kOV8YPWhXhI2PLtF72AnwP6Imhhuus9 ks5jpyb+Z0MzE4W6JeuD8glpxya6HljTPWBZSe+9Wm4vSfTXVMMoi78WulzZgYCuj0ro sQKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version 
:references:subject:from:to:cc; bh=q72mRa+DvPKvsvKcKotCMdsPLS1mELrWagxkQdQbawU=; b=RhXDvg4ukhtbKw1YBFh0ETOXTn6cRtLUzacOVQqe8n9Kd4DlQQEjHFJBLERs09sGAK NcoD27hD2emF5Dd+TkVQ1L2QJ5CMygElXXSOoBFCSXQ1dDDW860jQrQ09JRol6R+RQm0 RpoZJ7HTQixdAQn7n44WulPQb1R/76YJme+qb16rNUQSLoCZ0HgoFFsBZxSaYQ+j3NKS 1tn1skg0E7u8iEA6L+EDJ3wIPf8mDMUlhSlo6ZEb8g4l6l4Q1Ms52G+V0+d/kh0+GhG5 bVc8ypT60DPH+euRG3apgaTtG+bJxNtxvrQKXDhGqPa+lrPj1MUPOSOYYYx/hDzT7Uw3 DSPw== X-Gm-Message-State: APjAAAV7Q7WU3GlCfgT0U8VALzkjurUrMc5ge8yNXftSHbdk2SgvE5kP jj2pJtcWOU/ZunBfN8+xz5T4tRngRbImOQiWPH8GACwG4CvP/hQ3dLP2u0R4+amB8mb4c5tH+sE Id9TF9asbJ/bw2gNb2lBjtmvvtaV8zRGIEs3pUV38WyC3u7lerUWBxjUg3GtEl1t3aNEOwjRaVH b0 X-Google-Smtp-Source: APXvYqyR/+JGp9ERPcwpqn0icjhvqyT5VsFzbRIXVSg9tbK3bRd9hKHAoyw2dw8KaFazVaV8MKVQ79bGRBwUrBwa1SXq X-Received: by 2002:a63:5712:: with SMTP id l18mr5863278pgb.197.1571343457148; Thu, 17 Oct 2019 13:17:37 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:15 -0700 In-Reply-To: Message-Id: <910b15219f0875663953e0087529ab19815ef3f4.1571343096.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 6/7] index-pack: make resolve_delta() assume base data From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A subsequent commit will make the quantum of work smaller, necessitating more locking. This commit allows resolve_delta() to be called outside the lock. 
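For readers following along, here is a minimal sketch of the calling convention this commit moves toward. It assumes the caller populates the base's data while holding a mutex and then runs the expensive resolution without it. Every name prefixed with sketch_ is hypothetical and does not exist in builtin/index-pack.c, and the toy "delta" is plain string concatenation standing in for patch_delta().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct base_data: only the fields this sketch needs. */
struct sketch_base {
	char *data;	/* NULL until loaded */
	size_t size;
};

static pthread_mutex_t sketch_work_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Allocate or abort, so the sketch never dereferences NULL. */
static void *sketch_alloc(size_t n)
{
	void *p = malloc(n);
	if (!p)
		abort();
	return p;
}

/* Stand-in for get_base_data(): expected to run with the mutex held. */
static void sketch_load_base(struct sketch_base *base, const char *src)
{
	if (!base->data) {
		base->size = strlen(src);
		base->data = sketch_alloc(base->size + 1);
		memcpy(base->data, src, base->size + 1);
	}
}

/*
 * Stand-in for resolve_delta(): it assumes base->data is already set
 * and takes no lock itself. The "delta" here just appends a suffix.
 */
static char *sketch_resolve(const struct sketch_base *base, const char *suffix)
{
	size_t suffix_len = strlen(suffix);
	char *result;

	assert(base->data);
	result = sketch_alloc(base->size + suffix_len + 1);
	memcpy(result, base->data, base->size);
	memcpy(result + base->size, suffix, suffix_len + 1);
	return result;
}

/* Load under the lock, then resolve outside it. */
static char *sketch_process_child(struct sketch_base *base,
				  const char *src, const char *suffix)
{
	pthread_mutex_lock(&sketch_work_mutex);
	sketch_load_base(base, src);
	pthread_mutex_unlock(&sketch_work_mutex);

	return sketch_resolve(base, suffix);
}
```

The point of the split is that only the cheap "make sure the base is loaded" step needs the lock. The resolution step reads the base and allocates its own result, so several threads could run it at once as long as nothing frees the base in the meantime.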
Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index d21353757d..31607a77fc 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -926,7 +926,7 @@ static struct base_data *make_base(struct object_entry *obj, static struct base_data *resolve_delta(struct object_entry *delta_obj, struct base_data *base) { - void *base_data, *delta_data, *result_data; + void *delta_data, *result_data; struct base_data *result; unsigned long result_size; @@ -941,8 +941,8 @@ static struct base_data *resolve_delta(struct object_entry *delta_obj, obj_stat[i].base_object_no = j; } delta_data = get_data_from_pack(delta_obj); - base_data = get_base_data(base); - result_data = patch_delta(base_data, base->size, + assert(base->data); + result_data = patch_delta(base->data, base->size, delta_data, delta_obj->size, &result_size); free(delta_data); if (!result_data) @@ -1000,6 +1000,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base, base->obj->real_type)) BUG("child->real_type != OBJ_REF_DELTA"); + get_base_data(base); result = resolve_delta(child, base); if (base->ref_first == base->ref_last && base->ofs_last == -1) free_base_data(base); @@ -1014,6 +1015,7 @@ static struct base_data *find_unresolved_deltas_1(struct base_data *base, assert(child->real_type == OBJ_OFS_DELTA); child->real_type = base->obj->real_type; + get_base_data(base); result = resolve_delta(child, base); if (base->ofs_first == base->ofs_last) free_base_data(base); From patchwork Thu Oct 17 20:17:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 11197105 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03EF04B64 for ; Thu, 17 Oct 2019 20:17:44 +0000 (UTC) 
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B3EEA2089C for ; Thu, 17 Oct 2019 20:17:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="TCZ6Js6R" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437543AbfJQURm (ORCPT ); Thu, 17 Oct 2019 16:17:42 -0400 Received: from mail-pl1-f202.google.com ([209.85.214.202]:40110 "EHLO mail-pl1-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbfJQURm (ORCPT ); Thu, 17 Oct 2019 16:17:42 -0400 Received: by mail-pl1-f202.google.com with SMTP id f10so2195854plr.7 for ; Thu, 17 Oct 2019 13:17:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=eozVfvq4ip4nW178ugLyVb+jkghpn2B5c9Z7OGvbiWA=; b=TCZ6Js6RW8EzjOjVQV76QaU1DhbjYYu/vPl8pbxsl9QcINfcBP8Cw4BjEjW7Sj3INW B7n5jtljNbqn8IqeBnjzTR4WHQS+Mwj6X84Nw3IZfOGF8dGTR1bt+gNsnLB88bvz127p 5xKygme4OuokGW1UnrgWSeTGckGUIrVH/ilh8JuQG5Tx/c/3YyAD/Moc4tArCv5eJMS1 iaW+7EVf5NEroSJc3NuBa1bqjX4m0ui3wiasM6hOUujO63AP04lRE43IsBqwkAkUZKmH kWpqcbMfh7h30vO3SLFjBEYrnPTBlK2Il+Nffi3HtWVO8X3muY1HDue2/zB7O5Ww+vku axfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=eozVfvq4ip4nW178ugLyVb+jkghpn2B5c9Z7OGvbiWA=; b=lhQvq/VJCLVjs8dNItxOP1yjyR9Lk4m/VzNdJ4YYEzDPmGErwbuua4JkgTqXISyVsT qvj8JNoHdbFgFJdMhBXyCKILnAE7+WA8Yh2FmxD9oZwWcWkUDtNU7ftRq+7wsWv1xyBo OYmU8zN0s1Do8rK57glsG7Mrt8BPoqPe6KIL07ukaswLTEWKjI/5iI4YXS9WUU5ThMA7 7h68DhS240OZGAfwYq60y2b0YuO9TAm84wZbZjLnZPqYaVOuR3Zb/d4zhQq0yFyMfo4o mSe6qr/1G8l/TDYfDh1Vq/fUL6qWIUM+/kwAlxQBgWNQKH27LurmdJltiwWEB1AYbZjw yTFQ== X-Gm-Message-State: APjAAAVswAUPY2Sl/kFV4y+aSGhGHJ9AdvfWIzCmmAc8ZKgUo7Nm1f+Y 
EUUJ/X+ZfV2ZP/wAT+XxgORreuqDh+p1oBqvlJKiDv3uNRtnm+/DwkJTLegqgdDBPquWuWqgkbe 3M47sA+TB2DlQ+o3ZA13toF0QSKhbzGEbhONHkXLXZ3l27RCWkg9jWtnP3tsGDC7Mf3nN6eGCzX X3 X-Google-Smtp-Source: APXvYqz+xXHEHGk33F13pvRSylMJfhJ4l7L/r4xMoXkPB3UkZ48CsDIQBV/jClPOMf4YkXJhiA9gdptig2j3L4dzxMxe X-Received: by 2002:a63:cc4a:: with SMTP id q10mr6128133pgi.221.1571343459569; Thu, 17 Oct 2019 13:17:39 -0700 (PDT) Date: Thu, 17 Oct 2019 13:17:16 -0700 In-Reply-To: Message-Id: <2f2e36d3efede79f55347ce9d80d453bb05a4e15.1571343096.git.jonathantanmy@google.com> Mime-Version: 1.0 References: X-Mailer: git-send-email 2.23.0.866.gb869b98d4c-goog Subject: [PATCH v2 7/7] index-pack: make quantum of work smaller From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , stolee@gmail.com, peff@peff.net Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Currently, when index-pack resolves deltas, it does not split up delta trees into threads: each delta base root (an object that is not a REF_DELTA or OFS_DELTA) can go into its own thread, but all deltas on that root (direct or indirect) are processed in the same thread. This is a problem when a repository contains a large text file (thus, delta-able) that is modified many times - delta resolution time during fetching is dominated by processing the deltas corresponding to that text file. This patch contains a solution to that. When cloning using git -c core.deltabasecachelimit=1g clone \ https://fuchsia.googlesource.com/third_party/vulkan-cts on my laptop, clone time improved from 3m2s to 2m5s (using 3 threads, which is the default). The solution is to have a global work stack. This stack contains delta bases (objects, whether appearing directly in the packfile or generated by delta resolution, that themselves have delta children) that need to be processed; whenever a thread needs work, it peeks at the top of the stack and processes its next unprocessed child. 
If a thread finds the stack empty, it will look for more delta base roots to push on the stack instead. The main weakness of having a global work stack is that more time is spent in the mutex, but profiling has shown that most time is spent in the resolution of the deltas themselves, so this shouldn't be an issue in practice. In any case, experimentation (as described in the clone command above) shows that this patch is a net improvement. Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 336 ++++++++++++++++++++++++------------------- 1 file changed, 190 insertions(+), 146 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 31607a77fc..072592a35d 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -38,15 +38,49 @@ struct base_data { struct object_entry *obj; int ref_first, ref_last; int ofs_first, ofs_last; + /* + * Threads should increment retain_data if they are about to call + * patch_delta() using this struct's data as a base, and decrement this + * when they are done. While retain_data is nonzero, this struct's data + * will not be freed even if the delta base cache limit is exceeded. + */ + int retain_data; + /* + * The number of direct children that have not been fully processed + * (entered work_head, entered done_head, left done_head). When this + * number reaches zero, this struct base_data can be freed. + */ + int children_remaining; /* Not initialized by make_base(). */ + struct list_head list; void *data; unsigned long size; }; +/* + * Stack of struct base_data that have unprocessed children. + * threaded_second_pass() uses this as a source of work (the other being the + * objects array). + */ +LIST_HEAD(work_head); + +/* + * Stack of struct base_data that have children, all of whom have been + * processed or are being processed, and at least one child is being processed. + * These struct base_data must be kept around until the last child is + * processed. 
+ */ +LIST_HEAD(done_head); + +/* + * All threads share one delta base cache. + */ +size_t base_cache_used; +size_t base_cache_limit; + struct thread_local { pthread_t thread; - size_t base_cache_used; int pack_fd; }; @@ -369,36 +403,38 @@ static void free_base_data(struct base_data *c) { if (c->data) { FREE_AND_NULL(c->data); - get_thread_data()->base_cache_used -= c->size; + base_cache_used -= c->size; } } -static void prune_base_data(struct base_data *youngest_child) +static void prune_base_data(struct base_data *retain) { - struct base_data *b; - struct thread_local *data = get_thread_data(); - struct base_data **ancestry = NULL; - size_t nr = 0, alloc = 0; - ssize_t i; + struct list_head *pos; - if (data->base_cache_used <= delta_base_cache_limit) + if (base_cache_used <= base_cache_limit) return; - /* - * Free all ancestors of youngest_child until we have enough space, - * starting with the oldest. (We cannot free youngest_child itself.) - */ - for (b = youngest_child->base; b != NULL; b = b->base) { - ALLOC_GROW(ancestry, nr + 1, alloc); - ancestry[nr++] = b; + list_for_each_prev(pos, &done_head) { + struct base_data *b = list_entry(pos, struct base_data, list); + if (b->retain_data || b == retain) + continue; + if (b->data) { + free_base_data(b); + if (base_cache_used <= base_cache_limit) + return; + } } - for (i = nr - 1; - i >= 0 && data->base_cache_used > delta_base_cache_limit; - i--) { - if (ancestry[i]->data) - free_base_data(ancestry[i]); + + list_for_each_prev(pos, &work_head) { + struct base_data *b = list_entry(pos, struct base_data, list); + if (b->retain_data || b == retain) + continue; + if (b->data) { + free_base_data(b); + if (base_cache_used <= base_cache_limit) + return; + } } - free(ancestry); } static int is_delta_type(enum object_type type) @@ -850,15 +886,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, } /* - * This function is part of find_unresolved_deltas(). 
There are two - * walkers going in the opposite ways. - * - * The first one in find_unresolved_deltas() traverses down from - * parent node to children, deflating nodes along the way. However, - * memory for deflated nodes is limited by delta_base_cache_limit, so - * at some point parent node's deflated content may be freed. - * - * The second walker is this function, which goes from current node up + * Walk from current node up * to top parent if necessary to deflate the node. In normal * situation, its parent node would be already deflated, so it just * needs to apply delta. @@ -886,7 +914,7 @@ static void *get_base_data(struct base_data *c) if (!delta_nr) { c->data = get_data_from_pack(obj); c->size = obj->size; - get_thread_data()->base_cache_used += c->size; + base_cache_used += c->size; prune_base_data(c); } for (; delta_nr > 0; delta_nr--) { @@ -902,7 +930,7 @@ static void *get_base_data(struct base_data *c) free(raw); if (!c->data) bad_object(obj->idx.offset, _("failed to apply delta")); - get_thread_data()->base_cache_used += c->size; + base_cache_used += c->size; prune_base_data(c); } free(delta); @@ -920,6 +948,8 @@ static struct base_data *make_base(struct object_entry *obj, &base->ref_first, &base->ref_last); find_ofs_delta_children(obj->idx.offset, &base->ofs_first, &base->ofs_last); + base->children_remaining = base->ref_last - base->ref_first + + base->ofs_last - base->ofs_first + 2; return base; } @@ -953,14 +983,8 @@ static struct base_data *resolve_delta(struct object_entry *delta_obj, &delta_obj->idx.oid); result = make_base(delta_obj, base); - if (result->ref_last == -1 && result->ofs_last == -1) { - free(result_data); - } else { - result->data = result_data; - result->size = result_size; - get_thread_data()->base_cache_used += result->size; - prune_base_data(result); - } + result->data = result_data; + result->size = result_size; counter_lock(); nr_resolved_deltas++; @@ -969,84 +993,6 @@ static struct base_data *resolve_delta(struct 
object_entry *delta_obj, return result; } -/* - * Standard boolean compare-and-swap: atomically check whether "*type" is - * "want"; if so, swap in "set" and return true. Otherwise, leave it untouched - * and return false. - */ -static int compare_and_swap_type(signed char *type, - enum object_type want, - enum object_type set) -{ - enum object_type old; - - type_cas_lock(); - old = *type; - if (old == want) - *type = set; - type_cas_unlock(); - - return old == want; -} - -static struct base_data *find_unresolved_deltas_1(struct base_data *base, - struct base_data *prev_base) -{ - if (base->ref_first <= base->ref_last) { - struct object_entry *child = objects + ref_deltas[base->ref_first].obj_no; - struct base_data *result; - - if (!compare_and_swap_type(&child->real_type, OBJ_REF_DELTA, - base->obj->real_type)) - BUG("child->real_type != OBJ_REF_DELTA"); - - get_base_data(base); - result = resolve_delta(child, base); - if (base->ref_first == base->ref_last && base->ofs_last == -1) - free_base_data(base); - - base->ref_first++; - return result; - } - - if (base->ofs_first <= base->ofs_last) { - struct object_entry *child = objects + ofs_deltas[base->ofs_first].obj_no; - struct base_data *result; - - assert(child->real_type == OBJ_OFS_DELTA); - child->real_type = base->obj->real_type; - get_base_data(base); - result = resolve_delta(child, base); - if (base->ofs_first == base->ofs_last) - free_base_data(base); - - base->ofs_first++; - return result; - } - - free_base_data(base); - return NULL; -} - -static void find_unresolved_deltas(struct base_data *base) -{ - struct base_data *new_base, *prev_base = NULL; - for (;;) { - new_base = find_unresolved_deltas_1(base, prev_base); - - if (new_base) { - prev_base = base; - base = new_base; - } else { - free(base); - base = prev_base; - if (!base) - return; - prev_base = base->base; - } - } -} - static int compare_ofs_delta_entry(const void *a, const void *b) { const struct ofs_delta_entry *delta_a = a; @@ -1065,33 
+1011,128 @@ static int compare_ref_delta_entry(const void *a, const void *b) return oidcmp(&delta_a->oid, &delta_b->oid); } -static void resolve_base(struct object_entry *obj) -{ - struct base_data *base_obj = make_base(obj, NULL); - - find_unresolved_deltas(base_obj); -} - static void *threaded_second_pass(void *data) { - set_thread_data(data); + if (data) + set_thread_data(data); for (;;) { - int i; - counter_lock(); - display_progress(progress, nr_resolved_deltas); - counter_unlock(); + struct base_data *parent = NULL; + struct object_entry *child_obj; + struct base_data *child; + work_lock(); - while (nr_dispatched < nr_objects && - is_delta_type(objects[nr_dispatched].type)) - nr_dispatched++; - if (nr_dispatched >= nr_objects) { - work_unlock(); - break; + if (list_empty(&work_head)) { + /* + * Take an object from the object array. + */ + while (nr_dispatched < nr_objects && + is_delta_type(objects[nr_dispatched].type)) + nr_dispatched++; + if (nr_dispatched >= nr_objects) { + work_unlock(); + break; + } + child_obj = &objects[nr_dispatched++]; + } else { + /* + * Peek at the top of the stack, and take a child from + * it. + */ + parent = list_first_entry(&work_head, struct base_data, + list); + + if (parent->ref_first <= parent->ref_last) { + child_obj = objects + + ref_deltas[parent->ref_first++].obj_no; + assert(child_obj->real_type == OBJ_REF_DELTA); + child_obj->real_type = parent->obj->real_type; + } else { + child_obj = objects + + ofs_deltas[parent->ofs_first++].obj_no; + assert(child_obj->real_type == OBJ_OFS_DELTA); + child_obj->real_type = parent->obj->real_type; + } + + if (parent->ref_first > parent->ref_last && + parent->ofs_first > parent->ofs_last) { + /* + * This parent has run out of children, so move + * it to done_head. + */ + list_del(&parent->list); + list_add(&parent->list, &done_head); + } + + /* + * Ensure that the parent has data, since we will need + * it later. 
+ * + * NEEDSWORK: If parent data needs to be reloaded, this + * prolongs the time that the current thread spends in + * the mutex. A mitigating factor is that parent data + * needs to be reloaded only if the delta base cache + * limit is exceeded, so in the typical case, this does + * not happen. + */ + get_base_data(parent); + parent->retain_data++; } - i = nr_dispatched++; work_unlock(); - resolve_base(&objects[i]); + if (parent) { + child = resolve_delta(child_obj, parent); + if (!child->children_remaining) + FREE_AND_NULL(child->data); + } else { + child = make_base(child_obj, NULL); + if (child->children_remaining) { + /* + * Since this child has its own delta children, + * we will need this data in the future. + * Inflate now so that future iterations will + * have access to this object's data while + * outside the work mutex. + */ + child->data = get_data_from_pack(child_obj); + child->size = child_obj->size; + } + } + + work_lock(); + if (parent) + parent->retain_data--; + if (child->data) { + /* + * This child has its own children, so add it to + * work_head. + */ + list_add(&child->list, &work_head); + base_cache_used += child->size; + prune_base_data(NULL); + } else { + /* + * This child does not have its own children. It may be + * the last descendant of its ancestors; free those + * that we can. 
+ */ + struct base_data *p = parent; + + while (p) { + struct base_data *next_p; + + p->children_remaining--; + if (p->children_remaining) + break; + + next_p = p->base; + free_base_data(p); + list_del(&p->list); + free(p); + + p = next_p; + } + } + work_unlock(); } return NULL; } @@ -1192,6 +1233,7 @@ static void resolve_deltas(void) nr_ref_deltas + nr_ofs_deltas); nr_dispatched = 0; + base_cache_limit = delta_base_cache_limit * nr_threads; if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) { init_thread(); for (i = 0; i < nr_threads; i++) { @@ -1362,10 +1404,8 @@ static void fix_unresolved_deltas(struct hashfile *f) for (i = 0; i < nr_ref_deltas; i++) { struct ref_delta_entry *d = sorted_by_pos[i]; enum object_type type; - struct base_data *base; void *data; unsigned long size; - struct object_entry *obj; if (objects[d->obj_no].real_type != OBJ_REF_DELTA) continue; @@ -1376,11 +1416,15 @@ static void fix_unresolved_deltas(struct hashfile *f) if (check_object_signature(&d->oid, data, size, type_name(type))) die(_("local object %s is corrupt"), oid_to_hex(&d->oid)); - obj = append_obj_to_pack(f, d->oid.hash, data, size, type); - base = make_base(obj, NULL); - base->data = data; - base->size = size; - find_unresolved_deltas(base); + + /* + * Add this as an object to the objects array and call + * threaded_second_pass() (which will pick up the added + * object). + */ + append_obj_to_pack(f, d->oid.hash, data, size, type); + threaded_second_pass(NULL); + display_progress(progress, nr_resolved_deltas); } free(sorted_by_pos);
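The children_remaining bookkeeping in the patch above can be checked with a small worked sketch. It assumes that an empty child range is encoded as first = 0 and last = -1 (the removed code tests ref_last == -1 and ofs_last == -1, but the value of first is an assumption here). The names sketch_children_remaining, sketch_node and sketch_release_ancestors are hypothetical, and the freed flag replaces the real free_base_data(), list_del() and free() calls.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mirrors the expression added to make_base():
 * (ref_last - ref_first + 1) + (ofs_last - ofs_first + 1).
 */
static int sketch_children_remaining(int ref_first, int ref_last,
				     int ofs_first, int ofs_last)
{
	return ref_last - ref_first + ofs_last - ofs_first + 2;
}

/* Hypothetical stand-in for struct base_data. */
struct sketch_node {
	struct sketch_node *base;	/* parent, or NULL for a root */
	int children_remaining;
	int freed;			/* set instead of actually freeing */
};

/*
 * Mirrors the loop run when a child has no children of its own:
 * walk up from the parent, decrementing each counter, and release
 * every ancestor whose counter reaches zero. Stops at the first
 * ancestor that still has children outstanding.
 * Returns the number of ancestors released.
 */
static int sketch_release_ancestors(struct sketch_node *p)
{
	int released = 0;

	while (p) {
		p->children_remaining--;
		if (p->children_remaining)
			break;
		p->freed = 1;
		released++;
		p = p->base;
	}
	return released;
}
```

Under the stated assumption, a base with no children gets a count of 0 (-1 + -1 + 2), and one with ref children at indexes 3..5 plus a single ofs child gets 4. A root with two children keeps its data until the second child's subtree has been fully processed.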