From patchwork Sat Oct 19 10:35:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11200109 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3B64D17EE for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1A44D2064A for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jD/P+iw8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725995AbfJSKfx (ORCPT ); Sat, 19 Oct 2019 06:35:53 -0400 Received: from mail-wm1-f67.google.com ([209.85.128.67]:37713 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725812AbfJSKfx (ORCPT ); Sat, 19 Oct 2019 06:35:53 -0400 Received: by mail-wm1-f67.google.com with SMTP id f22so8338808wmc.2 for ; Sat, 19 Oct 2019 03:35:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=p7QrJej4PYv8ghfmPIe507A12jbjBI0M2hwH9c1bPdM=; b=jD/P+iw8fm2U/gWF8GkohESc+hSzKipG1ro1nfBoUYIio9JUcJgGJsJJ2AF/GUULG1 QJAm+eCB9GUiAV0rNyvAYFm4PF83Ymzu4isW/NniRhYath/AGx85pCqf4YWNMHWVOsjv CAzOsLpz1fBLYapi03dODnmO0krqX7HUZ/pfsEoO7+ce2EZNFEFTf7dz6cWYk/vhVSbg tP0CjAhbeTyWmS+Fb/vb/qMq6LI+8OmG+L8vEbofNp1l03F9Lf0Z/CmARJPUK99/WFG4 YDlrmSaZVArqCxt/oCRp7L5ULYtoXzwWsROJAAdBBjaFemIA+CuvZbqU6wuu3Vbw1AuT AT2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=p7QrJej4PYv8ghfmPIe507A12jbjBI0M2hwH9c1bPdM=; 
        b=LPJAqmlrJQOqtfeWu3AS/Xt4Ula+F4DuTk/14fNHORPfLVFPaqyz3wq2kmlreuNzRI
         og22Qf9CJwgzGh//UD41DaKy6eb5F5fuZgQqJwGZCMeCbcmjBsqVc/BIXreZcqp2VsbL
         /NyjnLs79NHjyos+b30JClz3/wbEbwtdxus6TQamf8aY/e4r5fnh9yJi3b8tgRlHm4Vf
         JC7WMqlKhXJUCAJsIRJ/FxROlOjZNusOOeRcLaT4oNpG4MLWJCL7BoZsM5XXJSzKFkiL
         OmyYvdTu0Iqvv8NtOHSn5woxJMQ3za9LqPoiRlzS18+MHEHwsk9CH2S693XLlwbm++9V
         Q0Aw==
X-Gm-Message-State: APjAAAV9m4dzIyg3n9xWgHzx78lB2QP5HKtaIqhUPK0JNAhsw4Z8gFOr
        UUiDfQhWktHpm8SX7+DG7X30nu5A2nmc9A==
X-Google-Smtp-Source: APXvYqxjg6fE6FG51JJWJRzC+OiIkXbiOCyDhFZcYM3jRp+81hFeB4/1iUH9ztQyA+z3Ch63Hb0jUQ==
X-Received: by 2002:a1c:5409:: with SMTP id i9mr11496820wmb.120.1571481350918; Sat, 19 Oct 2019 03:35:50 -0700 (PDT)
Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:50 -0700 (PDT)
From: Christian Couder
X-Google-Original-From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan , James Ramsay
Subject: [PATCH v2 1/9] builtin/pack-objects: report reused packfile objects
Date: Sat, 19 Oct 2019 12:35:23 +0200
Message-Id: <20191019103531.23274-2-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>
MIME-Version: 1.0
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

To see when packfile reuse kicks in or not, it is useful to show
reused packfile objects statistics in the output of upload-pack.

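For illustration only (this is not part of the patch): the extended progress line relies on the compiler concatenating adjacent string literals and PRIu32 macros into a single format string. The sketch below assumes a hypothetical helper, format_pack_summary(), that writes the same text into a caller-supplied buffer with snprintf(); fprintf_ln() and _() are Git internals and are deliberately left out.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in Git): formats the summary line, including
 * the new "pack-reused" counter, into buf. Returns what snprintf() returns.
 */
static int format_pack_summary(char *buf, size_t len,
			       uint32_t written, uint32_t written_delta,
			       uint32_t reused, uint32_t reused_delta,
			       uint32_t pack_reused)
{
	assert(buf && len);
	/* Adjacent literals and PRIu32 expansions form one format string. */
	return snprintf(buf, len,
			"Total %" PRIu32 " (delta %" PRIu32 "),"
			" reused %" PRIu32 " (delta %" PRIu32 "),"
			" pack-reused %" PRIu32,
			written, written_delta, reused, reused_delta,
			pack_reused);
}
```

Called with the values 10, 2, 5, 1 and 3, the helper produces "Total 10 (delta 2), reused 5 (delta 1), pack-reused 3".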
Helped-by: James Ramsay
Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 builtin/pack-objects.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5876583220..f2c2703090 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3509,7 +3509,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (progress)
 		fprintf_ln(stderr,
 			   _("Total %"PRIu32" (delta %"PRIu32"),"
-			     " reused %"PRIu32" (delta %"PRIu32")"),
-			   written, written_delta, reused, reused_delta);
+			     " reused %"PRIu32" (delta %"PRIu32"),"
+			     " pack-reused %"PRIu32),
+			   written, written_delta, reused, reused_delta,
+			   reuse_packfile_objects);
 	return 0;
 }

From patchwork Sat Oct 19 10:35:24 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200111
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 664E869B1 for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 451822064A for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="fumtNEiJ"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726008AbfJSKfy (ORCPT ); Sat, 19 Oct 2019 06:35:54 -0400
Received: from mail-wr1-f66.google.com ([209.85.221.66]:34527 "EHLO mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725940AbfJSKfy (ORCPT ); Sat, 19 Oct 2019 06:35:54 -0400
Received: by mail-wr1-f66.google.com with SMTP id t16so3608875wrr.1 for ; Sat, 19 Oct 2019 03:35:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oK2HzqZ7FBuvcugKHc3SgF2RSQEtsp+Lq2fa+TU6eF4=; b=fumtNEiJ2e4xH1wBcBSiL2S3/GeV6g++FIiqFzyRrbhTTeXRwjrNDlHXZSjGcf21tq YC3frSwp1TQ1q+1jjcD0tkjiO4VPJuUWLVhhuuTrNHdYmnmkvd4x+b7wt5DdI9WW9D6L cGCP+LM2bFOq3rvSz/5dkgFs6QWK+GbqtkIhwxTgDIaunXFv23m4Duh7WvTA9/YgM6vQ bEVVqdOUtLoY92dovHxDZ4NBGCOdTezkUTYRNuR0KQHLfHQ2bKQd/Sx67Htwzbebn06u o27OhZUdt22nCtOASKgmENyuyPOBAgqJshUvyOJIYaHcp7d7PHY214Iv8BIqMqzPnW+a Yb1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oK2HzqZ7FBuvcugKHc3SgF2RSQEtsp+Lq2fa+TU6eF4=; b=BZFL0O65mmeVK/L7qb6pLyNRVKtuK/DmT2W8P5GM/bBpt+UcSQ1kjAoMm+zlrIcHcS wBWIhcmNd+qyts962DelZgA3AxCFE+PGv9u7OGMDe7JcoOQ0IbtbG/1/IPGbU8HlLA6O ijgGOJ4ODV12m7ToaLs9TAcwWYjhhHNm1YBDAwUtbe4LvGczD5WlsEp+40f3W94IfRyp AgcJ6UTc4xJQ8N+PSd9G3fNw1GuzR7r0SMnd+pyBEtaoikpk8uF4UFXqzNcE9vj0o/ov Uf4Arp1Aav6FS7ed8wne9Szqt3cutKsgDeIpKHBI2QDp9RwXeAdpbeDI0sOXw1Fe/lqC w9HQ== X-Gm-Message-State: APjAAAWeCK8+TGNOZQVybQfaVJeZROopFP8d6l9bsfuYRrkLCp4OdUu6 GJM9nwIWhleQYhZq+MsMJxAfKrZpBlf/PQ== X-Google-Smtp-Source: APXvYqxEYKGDv83kUuTyuZ55B9Ky1YBAu7mNtXku6qaukSGhGmcDuKtr4eqo/Ttuo83UWKv8vwPGbw== X-Received: by 2002:adf:f004:: with SMTP id j4mr12372879wro.68.1571481352293; Sat, 19 Oct 2019 03:35:52 -0700 (PDT) Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:51 -0700 (PDT) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v2 2/9] packfile: expose get_delta_base() Date: Sat, 19 Oct 2019 12:35:24 +0200 Message-Id: <20191019103531.23274-3-chriscool@tuxfamily.org> 
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>
MIME-Version: 1.0
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

In a following commit get_delta_base() will be used outside
packfile.c, so let's make it non static and declare it in packfile.h.

Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 packfile.c | 10 +++++-----
 packfile.h |  3 +++
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/packfile.c b/packfile.c
index 355066de17..81e66847bf 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1173,11 +1173,11 @@ const struct packed_git *has_packed_and_bad(struct repository *r,
 	return NULL;
 }

-static off_t get_delta_base(struct packed_git *p,
-				    struct pack_window **w_curs,
-				    off_t *curpos,
-				    enum object_type type,
-				    off_t delta_obj_offset)
+off_t get_delta_base(struct packed_git *p,
+		     struct pack_window **w_curs,
+		     off_t *curpos,
+		     enum object_type type,
+		     off_t delta_obj_offset)
 {
 	unsigned char *base_info = use_pack(p, w_curs, *curpos, NULL);
 	off_t base_offset;
diff --git a/packfile.h b/packfile.h
index fc7904ec81..ec536a4ae5 100644
--- a/packfile.h
+++ b/packfile.h
@@ -151,6 +151,9 @@ void *unpack_entry(struct repository *r, struct packed_git *, off_t, enum object
 unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
+off_t get_delta_base(struct packed_git *p, struct pack_window **w_curs,
+		     off_t *curpos, enum object_type type,
+		     off_t delta_obj_offset);

 void release_pack_memory(size_t);

From patchwork Sat Oct 19 10:35:25 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 11200113 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 90DAD1951 for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6FD492064A for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mJHJeU1U" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726052AbfJSKf4 (ORCPT ); Sat, 19 Oct 2019 06:35:56 -0400 Received: from mail-wr1-f49.google.com ([209.85.221.49]:36723 "EHLO mail-wr1-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725940AbfJSKfz (ORCPT ); Sat, 19 Oct 2019 06:35:55 -0400 Received: by mail-wr1-f49.google.com with SMTP id w18so8185670wrt.3 for ; Sat, 19 Oct 2019 03:35:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6VTcwaoI1go6JkzckWqozzEySf/t0UZ7pe3UqqHDKsI=; b=mJHJeU1UXfB/Qi9FQrsu3mcVuC+rdUn4ueYnTts6CUNyyeK3Y5obXI7NRRip/yonx3 +6f9dywy89k9eDHHeyM4j9Rb45hj/IYZxnXbmyrhbiSYMCN/+Nxti2fxWyjwFftAUWUk 6UzBrWrtNEqMjPut/wOJQRzE0GV+piytSn07UxuBa1GNtWEP/E6UO+nZw7LdO1qVIiB9 H5TonE7Hb5EI406YYwQL8hPymn4AgyxMERpbNsoYMQsgM4StzUD+IJGmBDuGwqrvQVPW rabf0XDFwg0FxoxGTwY7Bale2ZsRYAxMAPoDZH5slPTj60c5PmAoetYusk9PExTZbRFM Ozhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6VTcwaoI1go6JkzckWqozzEySf/t0UZ7pe3UqqHDKsI=; b=m8ywbQXH4y/EEfLo68Qlr5o6JSEqdnWIth+H7FQWuSjhQNMxjnEf0YU8tugN79ow7T 
         XBMsNcaGbSRhwrGADlUUb/3Qmya9hjcliQnYBWalABNlKqZOoCRL/b4j+T92hdBQoXN3
         8/B85YwowZpFjl3nDjr0DCZ7ccXNYiRSfekYjseFko1h1kPmlKyxzTjAVDkLvofWx5dx
         NrpyPcvReiXnCnesPbAKC4/lNy/Ai3wyTi0FSe4YAD0+wEAh8zNUj5LMzRRDwu7A6jgm
         72u8D44O7aFq+pvTKbUMXqpA+fYjSlYjkAxZgYMiofPIGTk41PJRYxRrzVi+bzSiPS5p
         7Xxg==
X-Gm-Message-State: APjAAAVmlZ/CsiHhuYSJPHaNKacCQY1cAs7oDg5COh/GWK6epKIlaVq/
        PoHB7igRBUThOc0oZetdGdRK8OqssEp90A==
X-Google-Smtp-Source: APXvYqwuTZnVAhf8GvFrXo1S2+SFu1FSB5hF4iuOaFljwi4e+L3zFMQJ+zdRiM7MteMpgQUNKkV4ig==
X-Received: by 2002:adf:e5c4:: with SMTP id a4mr11828205wrn.334.1571481353614; Sat, 19 Oct 2019 03:35:53 -0700 (PDT)
Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:53 -0700 (PDT)
From: Christian Couder
X-Google-Original-From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan
Subject: [PATCH v2 3/9] ewah/bitmap: introduce bitmap_word_alloc()
Date: Sat, 19 Oct 2019 12:35:25 +0200
Message-Id: <20191019103531.23274-4-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>
MIME-Version: 1.0
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

In a following commit we will need to allocate a variable
number of bitmap words, instead of always 32, so let's add
bitmap_word_alloc() for this purpose.

We will also always access at least one word for each bitmap,
so we want to make sure that at least one is always allocated.

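For illustration only (not part of the patch): a minimal, self-contained sketch of the growth rule described above. The names mirror the ewah code, but malloc(), calloc() and realloc() stand in for Git's xmalloc(), xcalloc() and REALLOC_ARRAY(), and aborting on allocation failure is an assumption of this sketch. It shows why doubling alone is not enough: for block 0, "block * 2" is 0, so a bitmap that starts with no words would never get the one word it needs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t eword_t;
#define BITS_IN_EWORD (sizeof(eword_t) * 8)

struct bitmap {
	eword_t *words;
	size_t word_alloc;
};

/* Allocate a bitmap with room for word_alloc zeroed words (0 is allowed here). */
static struct bitmap *bitmap_word_alloc(size_t word_alloc)
{
	struct bitmap *b = malloc(sizeof(*b));

	if (!b)
		abort();
	b->words = word_alloc ? calloc(word_alloc, sizeof(eword_t)) : NULL;
	if (word_alloc && !b->words)
		abort();
	b->word_alloc = word_alloc;
	return b;
}

static void bitmap_set(struct bitmap *self, size_t pos)
{
	size_t block = pos / BITS_IN_EWORD;

	if (block >= self->word_alloc) {
		size_t old_size = self->word_alloc;
		eword_t *words;

		/* block * 2 would be 0 for block == 0, so fall back to 1 word. */
		self->word_alloc = block ? block * 2 : 1;
		words = realloc(self->words, self->word_alloc * sizeof(eword_t));
		if (!words)
			abort();
		self->words = words;
		/* Zero only the newly added words. */
		memset(self->words + old_size, 0,
		       (self->word_alloc - old_size) * sizeof(eword_t));
	}
	assert(block < self->word_alloc);
	self->words[block] |= (eword_t)1 << (pos % BITS_IN_EWORD);
}

static int bitmap_get(const struct bitmap *self, size_t pos)
{
	size_t block = pos / BITS_IN_EWORD;

	return block < self->word_alloc &&
	       (self->words[block] & ((eword_t)1 << (pos % BITS_IN_EWORD))) != 0;
}
```

Starting from bitmap_word_alloc(0), setting bit 0 grows the bitmap to one word, and setting bit 130 (block 2) then grows it to four words.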
Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 ewah/bitmap.c | 13 +++++++++----
 ewah/ewok.h   |  1 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 52f1178db4..b5fed9621f 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -22,21 +22,26 @@
 #define EWAH_MASK(x) ((eword_t)1 << (x % BITS_IN_EWORD))
 #define EWAH_BLOCK(x) (x / BITS_IN_EWORD)

-struct bitmap *bitmap_new(void)
+struct bitmap *bitmap_word_alloc(size_t word_alloc)
 {
 	struct bitmap *bitmap = xmalloc(sizeof(struct bitmap));
-	bitmap->words = xcalloc(32, sizeof(eword_t));
-	bitmap->word_alloc = 32;
+	bitmap->words = xcalloc(word_alloc, sizeof(eword_t));
+	bitmap->word_alloc = word_alloc;
 	return bitmap;
 }

+struct bitmap *bitmap_new(void)
+{
+	return bitmap_word_alloc(32);
+}
+
 void bitmap_set(struct bitmap *self, size_t pos)
 {
 	size_t block = EWAH_BLOCK(pos);

 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
-		self->word_alloc = block * 2;
+		self->word_alloc = block ? block * 2 : 1;
 		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + old_size, 0x0,
 		       (self->word_alloc - old_size) * sizeof(eword_t));
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 84b2a29faa..1b98b57c8b 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -172,6 +172,7 @@ struct bitmap {
 };

 struct bitmap *bitmap_new(void);
+struct bitmap *bitmap_word_alloc(size_t word_alloc);
 void bitmap_set(struct bitmap *self, size_t pos);
 int bitmap_get(struct bitmap *self, size_t pos);
 void bitmap_reset(struct bitmap *self);

From patchwork Sat Oct 19 10:35:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200115
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C5E6A92DD for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9A7AA2064A for ; Sat, 19 Oct 2019 10:35:58 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cF/VkOBP"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726078AbfJSKf5 (ORCPT ); Sat, 19 Oct 2019 06:35:57 -0400
Received: from mail-wr1-f68.google.com ([209.85.221.68]:44257 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726026AbfJSKf4 (ORCPT ); Sat, 19 Oct 2019 06:35:56 -0400
Received: by mail-wr1-f68.google.com with SMTP id z9so8723862wrl.11 for ; Sat, 19 Oct 2019 03:35:56 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
        h=from:to:cc:subject:date:message-id:in-reply-to:references
         :mime-version:content-transfer-encoding;
        bh=mZD6CHAoeLktNUBmB4u3H0zNOFeqE0gRxk0+vJ91mGM=;
        b=cF/VkOBPW3wohjWbXwJL9ebyEXaEp6kCzaP9aql5t8Eb7HxW02nnUC6kOVA6F4hr0d
zUnO3S0PDIJuEepfDohpTO4BM7hKNiU3N2huyiyA3eyXRfzYP+XJKV4maotQX2IUUwOm 4BFW2bm6QG1/sUwfE7ZHYn4bCr+ghT6OsauxuNjnVKMugVecv6XDtHSACez+zBWhoqtB 3o5eeBl83QQL1s+3oP3eV1as3bwlkrlUxAGqIBzmdoW+5TXb2PdJkdQ1EdYezvMaAKxL ou1pWdxpo3mh3yIsvWyRWfhJeFepUqiP0RpwS/PKPt2kbXEa+RM9c2zXRZL7U8FkdHAU rCyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mZD6CHAoeLktNUBmB4u3H0zNOFeqE0gRxk0+vJ91mGM=; b=ICpywrE2+oEZqj7w8NJmy/5hLjpI2la1VXwdLjTeo9XFtwUGH82Syua7KmB2pRjF+S 24oe4d1xHfxoCMdj3HF9hP177Xtfw/aPbLDyRWTegn7+Rkm32cVH61svHVJ/2EdUPHF0 gPlzsM1rf8o2z50UpOhJDWONz4/n9tcQ7MLzyZbOx26BZbLXydnVW/65k3cl95OfLU8E dJH0KzwznsnQXf4EPkcpkyKjiPO1M69qOWieBQkH2mlYxPduKFwwCNH20zA3LPGGUgjb MC7ROBoFgbq9ZMCOdUhyvXCvlj/UMFjavXHequx2ivQ6beFkaNlgEsyEKUwumzzYD0v3 8jWw== X-Gm-Message-State: APjAAAVAAOWdls3lrayHzNoKVLe8LTV1uA1Vdn9UW4LgUim4L6pi9Ck3 T4mMcU4MqrKm1taT/UAJrdG8R8gVojI2cg== X-Google-Smtp-Source: APXvYqwtzJekCAdiTzH6/sr975hBHCgt6zegEUs8dRnH+IyVT2cDtr2OrK4YHoBtVr6V11mLtbHewA== X-Received: by 2002:adf:8481:: with SMTP id 1mr1687696wrg.189.1571481355007; Sat, 19 Oct 2019 03:35:55 -0700 (PDT) Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:54 -0700 (PDT) From: Christian Couder X-Google-Original-From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan Subject: [PATCH v2 4/9] pack-bitmap: don't rely on bitmap_git->reuse_objects Date: Sat, 19 Oct 2019 12:35:26 +0200 Message-Id: <20191019103531.23274-5-chriscool@tuxfamily.org> X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2 In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org> References: <20191019103531.23274-1-chriscool@tuxfamily.org> MIME-Version: 1.0 Sender: 
        git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

We will no longer compute bitmap_git->reuse_objects in a
following commit, so we cannot rely on it anymore to
terminate the loop early; we have to iterate to the end.

Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 pack-bitmap.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index e07c798879..016d0319fc 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -622,7 +622,7 @@ static void show_objects_for_type(
 	enum object_type object_type,
 	show_reachable_fn show_reach)
 {
-	size_t pos = 0, i = 0;
+	size_t i = 0;
 	uint32_t offset;

 	struct ewah_iterator it;
@@ -630,13 +630,15 @@ static void show_objects_for_type(

 	struct bitmap *objects = bitmap_git->result;

-	if (bitmap_git->reuse_objects == bitmap_git->pack->num_objects)
-		return;
-
 	ewah_iterator_init(&it, type_filter);

-	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
+	for (i = 0; i < objects->word_alloc &&
+			ewah_iterator_next(&filter, &it); i++) {
 		eword_t word = objects->words[i] & filter;
+		size_t pos = (i * BITS_IN_EWORD);
+
+		if (!word)
+			continue;

 		for (offset = 0; offset < BITS_IN_EWORD; ++offset) {
 			struct object_id oid;
@@ -648,9 +650,6 @@ static void show_objects_for_type(

 			offset += ewah_bit_ctz64(word >> offset);

-			if (pos + offset < bitmap_git->reuse_objects)
-				continue;
-
 			entry = &bitmap_git->pack->revindex[pos + offset];
 			nth_packed_object_oid(&oid, bitmap_git->pack, entry->nr);

@@ -659,9 +658,6 @@ static void show_objects_for_type(

 			show_reach(&oid, object_type, 0, hash, bitmap_git->pack, entry->offset);
 		}
-
-		pos += BITS_IN_EWORD;
-		i++;
 	}
 }

From patchwork Sat Oct 19 10:35:27 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200117
Return-Path:
Received: from mail.kernel.org
(pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6A39A13BD for ; Sat, 19 Oct 2019 10:36:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 46DE020869 for ; Sat, 19 Oct 2019 10:36:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Z93phz6/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726090AbfJSKf7 (ORCPT ); Sat, 19 Oct 2019 06:35:59 -0400 Received: from mail-wm1-f44.google.com ([209.85.128.44]:38403 "EHLO mail-wm1-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726077AbfJSKf6 (ORCPT ); Sat, 19 Oct 2019 06:35:58 -0400 Received: by mail-wm1-f44.google.com with SMTP id 3so8339218wmi.3 for ; Sat, 19 Oct 2019 03:35:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gFGwwi+GBO0lN0KYb6ZvLRcrKMzEqi9SVRxjC5jMOfY=; b=Z93phz6/E/F6ILBlbJEGpeYaS6JsJC368CLEptjWX518QRJMfSbiPnR+YoHcqD3ceK JSz6p0qNmDNiZE/xnO9zS376szHN9KsZSpZ1MhqYPtaiqSVVZrrsB+aAbV/74kd0O3V0 mTknE2EPha6Y2gCPwUNH6njUAr9PN91zX9nN+W/iGn5OtWwHPmQ+9aa4nC1TkQfQ7COf HYMTRpfgKiYSLfMQKBFfAHbDxzy8H2wKlzBZbxKO1cpnSYaWl7w0GsVPH+cEvlHjZjmM laeeLyQBmiUrBEhQvOkl4KtdRX7SzjDeBDUPLjve/9cAAIkatBp9WeUjBXeuXdN/H7XC NkLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gFGwwi+GBO0lN0KYb6ZvLRcrKMzEqi9SVRxjC5jMOfY=; b=QE3DLc3SL0gAS9aiv54Jx/UaWwMVnTWG/jBt765SO32hTZoqRNJv74lCbF69a31MAE 9usf2R/NnxU/RzDcbsCoFZCm2KZ0YDVzEH5C8tJuE+Fx9HyxGacIiuTpXhvri5TKZfal lzg9XVSPiUypPq5VSRc43ht5F7aBaW1NfA5UqpTttMgZxXYN0zpdNN7ocFI9+uXkpHfw 
         ZWF2pH7A1H7rftfQATHZEUmxhhqZucx9NhId1y2fpATNKmWjRcHdw9xaWlByawBW0Zox
         vutIbQNGmIeifn17XfJlnlIa/9xusAi3Kwcd7TSBT80wID97Jqpt9bsUoLy5xJToxGv2
         sq+Q==
X-Gm-Message-State: APjAAAWjOy1W3qscBV7eVT/cErHcawqQroD//I4ahby0CAiTROyhl6l1
        r92UcSUpYwCsMqVMxRNunmtSlcXeLb/AFg==
X-Google-Smtp-Source: APXvYqxUNAFJDr2y3xOdVzzT+DeO5hik0HgYmV8Grd9ZmEhDhIhMBvQ8e57zfDg63HVtxKQX1bov9w==
X-Received: by 2002:a1c:2604:: with SMTP id m4mr12053918wmm.112.1571481356218; Sat, 19 Oct 2019 03:35:56 -0700 (PDT)
Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:55 -0700 (PDT)
From: Christian Couder
X-Google-Original-From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan
Subject: [PATCH v2 5/9] pack-bitmap: introduce bitmap_walk_contains()
Date: Sat, 19 Oct 2019 12:35:27 +0200
Message-Id: <20191019103531.23274-6-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>
MIME-Version: 1.0
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

We will use this helper function in a following commit to
tell us if an object is packed.

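For illustration only (not part of the patch): the helper follows a "look up the position, then test the bit" pattern, treating both a missing bitmap and a negative position as "not contained". The sketch below is a toy model of that pattern. struct toy_index, toy_position() and toy_walk_contains() are hypothetical names; a linear search over plain integer ids stands in for bitmap_position(), and a raw word array stands in for struct bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitmap index: ids listed in bit order. */
struct toy_index {
	const uint32_t *ids;
	size_t nr;
};

/* Returns the bit position of id, or -1 if the index does not know it. */
static int toy_position(const struct toy_index *idx, uint32_t id)
{
	size_t i;

	for (i = 0; i < idx->nr; i++)
		if (idx->ids[i] == id)
			return (int)i;
	return -1;
}

/* 1 if id has its bit set in bits, 0 otherwise (including "no bitmap"). */
static int toy_walk_contains(const struct toy_index *idx,
			     const uint64_t *bits, uint32_t id)
{
	int pos;

	assert(idx != NULL);
	if (!bits)
		return 0;

	pos = toy_position(idx, id);
	return pos >= 0 && ((bits[pos / 64] >> (pos % 64)) & 1);
}
```

The order of the checks matters: the position is only used as a bit index once it is known to be non-negative.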
Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 pack-bitmap.c | 12 ++++++++++++
 pack-bitmap.h |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 016d0319fc..8a51302a1a 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -826,6 +826,18 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
 	return 0;
 }

+int bitmap_walk_contains(struct bitmap_index *bitmap_git,
+			 struct bitmap *bitmap, const struct object_id *oid)
+{
+	int idx;
+
+	if (!bitmap)
+		return 0;
+
+	idx = bitmap_position(bitmap_git, oid);
+	return idx >= 0 && bitmap_get(bitmap, idx);
+}
+
 void traverse_bitmap_commit_list(struct bitmap_index *bitmap_git,
 				 show_reachable_fn show_reachable)
 {
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 466c5afa09..6ab6033dbe 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -3,6 +3,7 @@

 #include "ewah/ewok.h"
 #include "khash.h"
+#include "pack.h"
 #include "pack-objects.h"

 struct commit;
@@ -53,6 +54,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *,
 int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping,
 			     kh_oid_map_t *reused_bitmaps, int show_progress);
 void free_bitmap_index(struct bitmap_index *);
+int bitmap_walk_contains(struct bitmap_index *,
+			 struct bitmap *bitmap, const struct object_id *oid);

 /*
  * After a traversal has been performed by prepare_bitmap_walk(), this can be

From patchwork Sat Oct 19 10:35:28 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200119
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 958271951 for ; Sat, 19 Oct 2019 10:36:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 735DB2064A for ; Sat, 19 Oct 2019 10:36:02 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ZYMmdsxd" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726106AbfJSKgB (ORCPT ); Sat, 19 Oct 2019 06:36:01 -0400 Received: from mail-wr1-f66.google.com ([209.85.221.66]:41579 "EHLO mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726026AbfJSKf7 (ORCPT ); Sat, 19 Oct 2019 06:35:59 -0400 Received: by mail-wr1-f66.google.com with SMTP id p4so8729227wrm.8 for ; Sat, 19 Oct 2019 03:35:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bR5KB9UHFGldisBxxZKAgVdv++NTWWdMv0wZ83KQU3E=; b=ZYMmdsxdf03NkFR0DsHPIjwfAoesjHGB1t5n0DQmpd4CPHb1wmV/3NDaIXuzXzj/mL jfL/Usp7BFqry7MPq8tMSrmHTWeXFQj3BlkV74eanGF3MxD3BH341kOZmvLMcHL7iAWt aZNyxXe1qtWsxXA7ZZjExdYU1RAtiP7lwY+fRUEKgbC1FHYLl9j49lq+HBW/gnWrsZN2 nSp1XlNB2JfmYQKIbXFiUwSZp9ffRw+t1oDslEWD4t6k1qT5VVUuYSPE2MTt4PHUvJg1 iYpmq+5VrNoj2j1JDl9PhoaEx7crZ8tLlwWD/hv60AExbsnp48j0/ngSaHZ0bfrEZbrL F4nA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bR5KB9UHFGldisBxxZKAgVdv++NTWWdMv0wZ83KQU3E=; b=lJfAlMwdOrBqkJVCaNu2Dfq/hVLtXb8cxYO8KMtY8q101Vzmzt1P3UUJ+ULbvT7zgJ TepMW6CaQif1nBmXwq/rd8c6qPnFk9i2JJOQMP6lZkKK5d4NwqPfT7qNTmF8UPgDipzT TaTxSHQeuGmZZjU2oe2VvN7HUMh0V9cfWJGzrfCoeR95ZmesfYW5vQW7IEpUiBUX5w+y s0hSerJENZbVyLAr/V7wl9cD+ZwJLpbZE4/arJICsSqk3p4WtxaqpIHlnUqk6LYqFUUG WKK2XhDl/NuXQVVcYELS/hAkQmv4yNZnSoXGGb00E++ufIWAQRmvrir/Y9ILgXjF8h2j 78Gg== X-Gm-Message-State: APjAAAWAxHCeOkWg6Uel9ABeCMOg6bvVFTWh3qKAms25Z0AOW0NxUiZl quPxPMonDA7juxw6ikggU9J+IpO/JeO3MA== X-Google-Smtp-Source: APXvYqzpmHU+ObA5cafZo0WUNx9fz+dDUPWj317Je9aQejLqPsA5fvJXIwBI1v3LQqPq+BY3QmVjxg== X-Received: 
        by 2002:adf:f482:: with SMTP id l2mr937654wro.256.1571481357463; Sat, 19 Oct 2019 03:35:57 -0700 (PDT)
Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:57 -0700 (PDT)
From: Christian Couder
X-Google-Original-From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan
Subject: [PATCH v2 6/9] csum-file: introduce hashfile_total()
Date: Sat, 19 Oct 2019 12:35:28 +0200
Message-Id: <20191019103531.23274-7-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>
MIME-Version: 1.0
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

We will need this helper function in a following commit
to give us the total number of bytes fed to the hashfile so far.

Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 csum-file.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/csum-file.h b/csum-file.h
index a98b1eee53..f9cbd317fb 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -42,6 +42,15 @@ void hashflush(struct hashfile *f);
 void crc32_begin(struct hashfile *);
 uint32_t crc32_end(struct hashfile *);

+/*
+ * Returns the total number of bytes fed to the hashfile so far (including ones
+ * that have not been written out to the descriptor yet).
+ */
+static inline off_t hashfile_total(struct hashfile *f)
+{
+	return f->total + f->offset;
+}
+
 static inline void hashwrite_u8(struct hashfile *f, uint8_t data)
 {
 	hashwrite(f, &data, sizeof(data));

From patchwork Sat Oct 19 10:35:29 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200121
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EFAB113BD for ; Sat, 19 Oct 2019 10:36:04 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CE59C20869 for ; Sat, 19 Oct 2019 10:36:04 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Cg6fec1M"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726128AbfJSKgD (ORCPT ); Sat, 19 Oct 2019 06:36:03 -0400
Received: from mail-wm1-f65.google.com ([209.85.128.65]:51630 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726112AbfJSKgC (ORCPT ); Sat, 19 Oct 2019 06:36:02 -0400
Received: by mail-wm1-f65.google.com with SMTP id q70so1368321wme.1 for ; Sat, 19 Oct 2019 03:35:59 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
        h=from:to:cc:subject:date:message-id:in-reply-to:references
         :mime-version:content-transfer-encoding;
        bh=kJFoM6gd2X7T/Qy34F3uqPb94w5OpKQr+aW9LuJgN18=;
        b=Cg6fec1MGSLkQvXKX4XJ9yrq6YAXm10HvHwqfcd0DkpXXrE8M9IhPxpswBOv3Zzgt7
         I9s6Yn9lHuBq//ROnFnMApdya0GctoZluX99z/mopiEWTshOIPVrWHvB+XT4YcbPJc4r
         WDmbqdaG26k6hWAAnMYUQApcYyvF+g6OF6maHqI3/mViKqowzdSzuhIj9txTxRmx5OZJ
         DB+17HV2iSMSFqBco3F4AV/j1Smf7SBOHlLTo2/cYbAGUSABdBuTKZBUcuBowajiWRhB
         anSqSRfCEeyJbQ+0sSylW7p1vJeF2ZxV8YmR8xM6dIXoYXkStEHF2sN4ajSKHhhswkZs
         3fYg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256;
        c=relaxed/relaxed; d=1e100.net; s=20161025;
        h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
         :references:mime-version:content-transfer-encoding;
        bh=kJFoM6gd2X7T/Qy34F3uqPb94w5OpKQr+aW9LuJgN18=;
        b=XYXEMe5QfXTAL8ozXwIGJsLPd6Vzt98xtjWnNok+DLkTQYg7afqJmUS6m5xWAdGpaZ
         XH6qYh0XnuRwkCW9uyPG2dYzaI7KJMNZHD/HVOY1CFhjmBcONZH1IzF+IkUMsCkKnxVT
         srYvmpMs3BTrgesD+qCxVRu3rlWB3zx+t8mWxpwnmT2RLW/v7Hm5FIrHsM+DjuwhlZT3
         sSVBGyM+fCcyc8lCdpyYJe1TxF1r/LZUIGNph2OGYRIeKxXUhdokxnVZsKA+Erm3yPUu
         cR6rf7ZI5l/nVzyDYOCIKC5pnSb6d1I1XmC7/BaSHnGXXHKMWg3n6hn33iZFY3Gah2dH
         u+NQ==
X-Gm-Message-State: APjAAAUghYnrGKr4b33hnfo7mFqjLcxaRVlVzZSzEVCpspeGd1kgZRy/
        yUeORMsYPp7nm/+QQoyGS/lA2x7CFiubcQ==
X-Google-Smtp-Source: APXvYqxGLKHvi6qbBnqGWIvSA2Uko8QdX1FAOT6ZAUESaqkuCwpPRrOxA/i0upqxSf9XbCErkBRWiQ==
X-Received: by 2002:a05:600c:2107:: with SMTP id u7mr11694684wml.86.1571481358694; Sat, 19 Oct 2019 03:35:58 -0700 (PDT)
Received: from localhost.localdomain ([80.214.68.206]) by smtp.gmail.com with ESMTPSA id p68sm6383086wme.0.2019.10.19.03.35.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 19 Oct 2019 03:35:58 -0700 (PDT)
From: Christian Couder
X-Google-Original-From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano , Jeff King , Christian Couder , Ramsay Jones , Jonathan Tan
Subject: [PATCH v2 7/9] pack-objects: introduce pack.allowPackReuse
Date: Sat, 19 Oct 2019 12:35:29 +0200
Message-Id: <20191019103531.23274-8-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>
MIME-Version: 1.0
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Jeff King

Let's make it possible to configure if we want pack reuse or not.

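For illustration only (not part of the patch): a toy model of how a boolean setting that defaults to true can gate a feature alongside the existing conditions. toy_parse_bool(), toy_pack_config() and toy_options_allow_reuse() are hypothetical names; the boolean parser is a deliberately simplified stand-in for git_config_bool(), which accepts more spellings, and the gate checks only two of the real conditions. The key is compared in lowercase on the assumption that the callback receives variable names already lowercased.

```c
#include <assert.h>
#include <string.h>

/* Defaults to enabled, matching the documented default of the new setting. */
static int allow_pack_reuse = 1;

/* Simplified stand-in for a boolean parser: only a few spellings are true. */
static int toy_parse_bool(const char *v)
{
	return !strcmp(v, "true") || !strcmp(v, "yes") ||
	       !strcmp(v, "on") || !strcmp(v, "1");
}

/* Returns 0 if the key was handled here, 1 if it belongs to someone else. */
static int toy_pack_config(const char *key, const char *value)
{
	assert(key && value);
	if (!strcmp(key, "pack.allowpackreuse")) {
		allow_pack_reuse = toy_parse_bool(value);
		return 0;
	}
	return 1;
}

/* Reuse is allowed only if the setting and every other condition agree. */
static int toy_options_allow_reuse(int pack_to_stdout, int allow_ofs_delta)
{
	return allow_pack_reuse && pack_to_stdout && allow_ofs_delta;
}
```

Putting the new flag first in the && chain means a disabled setting short-circuits the remaining checks.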

Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 Documentation/config/pack.txt | 4 ++++
 builtin/pack-objects.c        | 8 +++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt
index 1d66f0c992..58323a351f 100644
--- a/Documentation/config/pack.txt
+++ b/Documentation/config/pack.txt
@@ -27,6 +27,10 @@ Note that changing the compression level will not automatically recompress
 all existing objects. You can force recompression by passing the -F option
 to linkgit:git-repack[1].
 
+pack.allowPackReuse::
+	When true, which is the default, Git will try to reuse parts
+	of existing packfiles when preparing new packfiles.
+
 pack.island::
 	An extended regular expression configuring a set of delta
 	islands. See "DELTA ISLANDS" in linkgit:git-pack-objects[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f2c2703090..4fcfcf6097 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -96,6 +96,7 @@ static off_t reuse_packfile_offset;
 
 static int use_bitmap_index_default = 1;
 static int use_bitmap_index = -1;
+static int allow_pack_reuse = 1;
 static enum {
 	WRITE_BITMAP_FALSE = 0,
 	WRITE_BITMAP_QUIET,
@@ -2699,6 +2700,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		use_bitmap_index_default = git_config_bool(k, v);
 		return 0;
 	}
+	if (!strcmp(k, "pack.allowpackreuse")) {
+		allow_pack_reuse = git_config_bool(k, v);
+		return 0;
+	}
 	if (!strcmp(k, "pack.threads")) {
 		delta_search_threads = git_config_int(k, v);
 		if (delta_search_threads < 0)
@@ -3030,7 +3035,8 @@ static void loosen_unused_packed_objects(void)
  */
 static int pack_options_allow_reuse(void)
 {
-	return pack_to_stdout &&
+	return allow_pack_reuse &&
+	       pack_to_stdout &&
 	       allow_ofs_delta &&
 	       !ignore_packed_keep_on_disk &&
 	       !ignore_packed_keep_in_core &&

From patchwork Sat Oct 19 10:35:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200123
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, Jeff King, Christian Couder, Ramsay Jones, Jonathan Tan
Subject: [PATCH v2 8/9] builtin/pack-objects: introduce obj_is_packed()
Date: Sat, 19 Oct 2019 12:35:30 +0200
Message-Id: <20191019103531.23274-9-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>

From: Jeff King

Let's refactor the way we check if an object is packed by introducing
obj_is_packed(). This function is now a simple wrapper around
packlist_find(), but it will evolve in a following commit.

Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 builtin/pack-objects.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4fcfcf6097..08898331ef 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2553,6 +2553,11 @@ static void ll_find_deltas(struct object_entry **list, unsigned list_size,
 	free(p);
 }
 
+static int obj_is_packed(const struct object_id *oid)
+{
+	return !!packlist_find(&to_pack, oid);
+}
+
 static void add_tag_chain(const struct object_id *oid)
 {
 	struct tag *tag;
@@ -2564,7 +2569,7 @@ static void add_tag_chain(const struct object_id *oid)
 	 * it was included via bitmaps, we would not have parsed it
 	 * previously).
 	 */
-	if (packlist_find(&to_pack, oid))
+	if (obj_is_packed(oid))
 		return;
 
 	tag = lookup_tag(the_repository, oid);
@@ -2588,7 +2593,7 @@ static int add_ref_tag(const char *path, const struct object_id *oid, int flag,
 
 	if (starts_with(path, "refs/tags/") && /* is a tag? */
 	    !peel_ref(path, &peeled) && /* peelable? */
-	    packlist_find(&to_pack, &peeled)) /* object packed? */
+	    obj_is_packed(&peeled)) /* object packed?
*/
 		add_tag_chain(oid);
 	return 0;
 }

From patchwork Sat Oct 19 10:35:31 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 11200125
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, Jeff King, Christian Couder, Ramsay Jones, Jonathan Tan
Subject: [PATCH v2 9/9] pack-objects: improve partial packfile reuse
Date: Sat, 19 Oct 2019 12:35:31 +0200
Message-Id: <20191019103531.23274-10-chriscool@tuxfamily.org>
X-Mailer: git-send-email 2.24.0.rc0.9.gef620577e2
In-Reply-To: <20191019103531.23274-1-chriscool@tuxfamily.org>
References: <20191019103531.23274-1-chriscool@tuxfamily.org>

From: Jeff King

Let's store the chunks of the packfile that we reuse in a dynamic array
of `struct reused_chunk`, and let's use a reuse_packfile_bitmap to speed
up reusing parts of packfiles.

The dynamic array of `struct reused_chunk` is useful because we need to
know not just the number of zero bits, but the accumulated offset due to
missing objects.
So without the array we'd end up having to walk over the revindex
for that set of objects. The array is basically caching those
accumulated offsets (for the parts we _do_ include), so we don't have
to compute them repeatedly.

The old code just tried to dump a whole segment of the pack verbatim.
That's faster than the traditional way of actually adding objects to
the packing list, but it didn't kick in very often. This new code is
really going for a middle ground: do _some_ per-object work, but way
less than we'd traditionally do. For instance, packing torvalds/linux
on GitHub servers just now reused 6.5M objects, but only needed ~50k
chunks.

Additional checks are added in have_duplicate_entry() and
obj_is_packed() to avoid duplicate objects in the reuse bitmap. It was
probably buggy to not have such a check before.

If a client both asks for a tag by sha1 and specifies "include-tag", we
may end up including the tag in the reuse bitmap (due to the first
thing), and then later adding it to the packlist (due to the second).
This results in duplicate objects in the pack, which git chokes on. We
should notice that we are already including it when doing the
include-tag portion, and avoid adding it to the packlist.

The simplest place to fix this is right in add_ref_tag, where we could
avoid peeling the tag at all if we know that we are already including
it. However, I've pushed the check instead into have_duplicate_entry().
This fixes not only this case, but also means that we cannot have any
similar problems lurking in other code.

No tests, because git does not actually exhibit this "ask for it and
also include-tag" behavior. We do one or the other on clone, depending
on whether --single-branch is set. However, libgit2 does both.
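
To make the chunk bookkeeping described above concrete, here is a
minimal standalone sketch. It is not the code from this patch: the names
`chunk`, `record_chunk`, `find_shift` and `rewritten_delta_distance` are
made up for illustration, plain `long` stands in for `off_t`, and
`realloc()` stands in for ALLOC_GROW. The idea it tries to show is that
each chunk remembers how far a run of reused objects has shifted toward
the start of the output pack, and that an OFS_DELTA distance is corrected
by the difference between the shift at the delta and the shift at its
base.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Illustrative only: one entry per run of consecutively reused
 * objects. "start" is where the run begins in the source pack;
 * "offset" is how many bytes earlier the run lands in the output
 * pack because of objects omitted before it.
 */
struct chunk {
	long start;
	long offset;
};

static struct chunk *chunks;
static size_t chunks_nr, chunks_alloc;

/* Append a chunk only when the accumulated shift changes. */
static void record_chunk(long start, long offset)
{
	if (chunks_nr && chunks[chunks_nr - 1].offset == offset)
		return;
	if (chunks_nr == chunks_alloc) {
		chunks_alloc = chunks_alloc ? 2 * chunks_alloc : 16;
		chunks = realloc(chunks, chunks_alloc * sizeof(*chunks));
		if (!chunks)
			abort();
	}
	chunks[chunks_nr].start = start;
	chunks[chunks_nr].offset = offset;
	chunks_nr++;
}

/* Binary search for the last chunk whose start is <= where. */
static long find_shift(long where)
{
	size_t lo = 0, hi = chunks_nr;

	while (lo < hi) {
		size_t mi = lo + (hi - lo) / 2;
		if (where < chunks[mi].start)
			hi = mi;
		else
			lo = mi + 1;
	}
	assert(lo > 0); /* callers never ask below the first chunk */
	return chunks[lo - 1].offset;
}

/*
 * Distance to encode in a rewritten OFS_DELTA header: the distance in
 * the source pack minus the bytes dropped between base and delta.
 */
static long rewritten_delta_distance(long delta_at, long base_at)
{
	long fixup = find_shift(delta_at) - find_shift(base_at);
	return delta_at - base_at - fixup;
}
```

As a worked case under these assumptions: objects sit at source offsets
12, 62, 112 and 162, each 50 bytes long, and the one at 112 is omitted.
The reused objects are recorded as (12, 0), (62, 0) and (162, 50); the
second call adds no chunk because the shift is unchanged. A delta at 162
whose base is at 12 then gets a distance of 162 - 12 - 50 = 100, which
matches the output pack, where the delta lands at 112 and its base stays
at 12.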

Signed-off-by: Jeff King
Signed-off-by: Christian Couder
---
 builtin/pack-objects.c | 214 ++++++++++++++++++++++++++++++++---------
 pack-bitmap.c          | 150 +++++++++++++++++++++--------
 pack-bitmap.h          |   3 +-
 3 files changed, 280 insertions(+), 87 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 08898331ef..f710f8dde3 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -92,7 +92,7 @@ static struct progress *progress_state;
 
 static struct packed_git *reuse_packfile;
 static uint32_t reuse_packfile_objects;
-static off_t reuse_packfile_offset;
+static struct bitmap *reuse_packfile_bitmap;
 
 static int use_bitmap_index_default = 1;
 static int use_bitmap_index = -1;
@@ -785,57 +785,177 @@ static struct object_entry **compute_write_order(void)
 	return wo;
 }
 
-static off_t write_reused_pack(struct hashfile *f)
+/*
+ * Record the offsets needed in our reused packfile chunks due to
+ * "gaps" where we omitted some objects.
+ */
+static struct reused_chunk {
+	off_t start;
+	off_t offset;
+} *reused_chunks;
+static int reused_chunks_nr;
+static int reused_chunks_alloc;
+
+static void record_reused_object(off_t where, off_t offset)
 {
-	unsigned char buffer[8192];
-	off_t to_write, total;
-	int fd;
+	if (reused_chunks_nr && reused_chunks[reused_chunks_nr-1].offset == offset)
+		return;
 
-	if (!is_pack_valid(reuse_packfile))
-		die(_("packfile is invalid: %s"), reuse_packfile->pack_name);
+	ALLOC_GROW(reused_chunks, reused_chunks_nr + 1,
+		   reused_chunks_alloc);
+	reused_chunks[reused_chunks_nr].start = where;
+	reused_chunks[reused_chunks_nr].offset = offset;
+	reused_chunks_nr++;
+}
 
-	fd = git_open(reuse_packfile->pack_name);
-	if (fd < 0)
-		die_errno(_("unable to open packfile for reuse: %s"),
-			  reuse_packfile->pack_name);
+/*
+ * Binary search to find the chunk that "where" is in. Note
+ * that we're not looking for an exact match, just the first
+ * chunk that contains it (which implicitly ends at the start
+ * of the next chunk).
+ */
+static off_t find_reused_offset(off_t where)
+{
+	int lo = 0, hi = reused_chunks_nr;
+	while (lo < hi) {
+		int mi = lo + ((hi - lo) / 2);
+		if (where == reused_chunks[mi].start)
+			return reused_chunks[mi].offset;
+		if (where < reused_chunks[mi].start)
+			hi = mi;
+		else
+			lo = mi + 1;
+	}
 
-	if (lseek(fd, sizeof(struct pack_header), SEEK_SET) == -1)
-		die_errno(_("unable to seek in reused packfile"));
+	/*
+	 * The first chunk starts at zero, so we can't have gone below
+	 * there.
+	 */
+	assert(lo);
+	return reused_chunks[lo-1].offset;
+}
+
+static void write_reused_pack_one(size_t pos, struct hashfile *out,
+				  struct pack_window **w_curs)
+{
+	off_t offset, next, cur;
+	enum object_type type;
+	unsigned long size;
+
+	offset = reuse_packfile->revindex[pos].offset;
+	next = reuse_packfile->revindex[pos + 1].offset;
 
-	if (reuse_packfile_offset < 0)
-		reuse_packfile_offset = reuse_packfile->pack_size - the_hash_algo->rawsz;
+	record_reused_object(offset, offset - hashfile_total(out));
 
-	total = to_write = reuse_packfile_offset - sizeof(struct pack_header);
+	cur = offset;
+	type = unpack_object_header(reuse_packfile, w_curs, &cur, &size);
+	assert(type >= 0);
 
-	while (to_write) {
-		int read_pack = xread(fd, buffer, sizeof(buffer));
+	if (type == OBJ_OFS_DELTA) {
+		off_t base_offset;
+		off_t fixup;
+
+		unsigned char header[MAX_PACK_OBJECT_HEADER];
+		unsigned len;
+
+		base_offset = get_delta_base(reuse_packfile, w_curs, &cur, type, offset);
+		assert(base_offset != 0);
+
+		/* Convert to REF_DELTA if we must...
*/
+		if (!allow_ofs_delta) {
+			int base_pos = find_revindex_position(reuse_packfile, base_offset);
+			const unsigned char *base_sha1 =
+				nth_packed_object_sha1(reuse_packfile,
+						       reuse_packfile->revindex[base_pos].nr);
+
+			len = encode_in_pack_object_header(header, sizeof(header),
+							   OBJ_REF_DELTA, size);
+			hashwrite(out, header, len);
+			hashwrite(out, base_sha1, 20);
+			copy_pack_data(out, reuse_packfile, w_curs, cur, next - cur);
+			return;
+		}
 
-		if (read_pack <= 0)
-			die_errno(_("unable to read from reused packfile"));
+		/* Otherwise see if we need to rewrite the offset... */
+		fixup = find_reused_offset(offset) -
+			find_reused_offset(base_offset);
+		if (fixup) {
+			unsigned char ofs_header[10];
+			unsigned i, ofs_len;
+			off_t ofs = offset - base_offset - fixup;
 
-		if (read_pack > to_write)
-			read_pack = to_write;
+			len = encode_in_pack_object_header(header, sizeof(header),
+							   OBJ_OFS_DELTA, size);
 
-		hashwrite(f, buffer, read_pack);
-		to_write -= read_pack;
+			i = sizeof(ofs_header) - 1;
+			ofs_header[i] = ofs & 127;
+			while (ofs >>= 7)
+				ofs_header[--i] = 128 | (--ofs & 127);
+
+			ofs_len = sizeof(ofs_header) - i;
+
+			hashwrite(out, header, len);
+			hashwrite(out, ofs_header + sizeof(ofs_header) - ofs_len, ofs_len);
+			copy_pack_data(out, reuse_packfile, w_curs, cur, next - cur);
+			return;
+		}
+
+		/* ...otherwise we have no fixup, and can write it verbatim */
+	}
+
+	copy_pack_data(out, reuse_packfile, w_curs, offset, next - offset);
+}
+
+static size_t write_reused_pack_verbatim(struct hashfile *out,
+					 struct pack_window **w_curs)
+{
+	size_t pos = 0;
+
+	while (pos < reuse_packfile_bitmap->word_alloc &&
+			reuse_packfile_bitmap->words[pos] == (eword_t)~0)
+		pos++;
+
+	if (pos) {
+		off_t to_write;
+
+		written = (pos * BITS_IN_EWORD);
+		to_write = reuse_packfile->revindex[written].offset
+			- sizeof(struct pack_header);
+
+		record_reused_object(sizeof(struct pack_header), 0);
+		hashflush(out);
+		copy_pack_data(out, reuse_packfile, w_curs,
+			sizeof(struct pack_header),
to_write);
 
-		/*
-		 * We don't know the actual number of objects written,
-		 * only how many bytes written, how many bytes total, and
-		 * how many objects total. So we can fake it by pretending all
-		 * objects we are writing are the same size. This gives us a
-		 * smooth progress meter, and at the end it matches the true
-		 * answer.
-		 */
-		written = reuse_packfile_objects *
-				(((double)(total - to_write)) / total);
 		display_progress(progress_state, written);
 	}
+	return pos;
+}
+
+static void write_reused_pack(struct hashfile *f)
+{
+	size_t i = 0;
+	uint32_t offset;
+	struct pack_window *w_curs = NULL;
+
+	if (allow_ofs_delta)
+		i = write_reused_pack_verbatim(f, &w_curs);
+
+	for (; i < reuse_packfile_bitmap->word_alloc; ++i) {
+		eword_t word = reuse_packfile_bitmap->words[i];
+		size_t pos = (i * BITS_IN_EWORD);
+
+		for (offset = 0; offset < BITS_IN_EWORD; ++offset) {
+			if ((word >> offset) == 0)
+				break;
+
+			offset += ewah_bit_ctz64(word >> offset);
+			write_reused_pack_one(pos + offset, f, &w_curs);
+			display_progress(progress_state, ++written);
+		}
+	}
 
-	close(fd);
-	written = reuse_packfile_objects;
-	display_progress(progress_state, written);
-	return reuse_packfile_offset - sizeof(struct pack_header);
+	unuse_pack(&w_curs);
 }
 
 static const char no_split_warning[] = N_(
@@ -868,11 +988,9 @@ static void write_pack_file(void)
 		offset = write_pack_header(f, nr_remaining);
 
 		if (reuse_packfile) {
-			off_t packfile_size;
 			assert(pack_to_stdout);
-
-			packfile_size = write_reused_pack(f);
-			offset += packfile_size;
+			write_reused_pack(f);
+			offset = hashfile_total(f);
 		}
 
 		nr_written = 0;
@@ -1001,6 +1119,10 @@ static int have_duplicate_entry(const struct object_id *oid,
 {
 	struct object_entry *entry;
 
+	if (reuse_packfile_bitmap &&
+	    bitmap_walk_contains(bitmap_git, reuse_packfile_bitmap, oid))
+		return 1;
+
 	entry = packlist_find(&to_pack, oid);
 	if (!entry)
 		return 0;
@@ -2555,7 +2677,9 @@ static void ll_find_deltas(struct object_entry **list, unsigned list_size,
 
 static int
obj_is_packed(const struct object_id *oid)
 {
-	return !!packlist_find(&to_pack, oid);
+	return packlist_find(&to_pack, oid) ||
+		(reuse_packfile_bitmap &&
+		 bitmap_walk_contains(bitmap_git, reuse_packfile_bitmap, oid));
 }
 
 static void add_tag_chain(const struct object_id *oid)
@@ -2661,6 +2785,7 @@ static void prepare_pack(int window, int depth)
 
 	if (nr_deltas && n > 1) {
 		unsigned nr_done = 0;
+
 		if (progress)
 			progress_state = start_progress(_("Compressing objects"),
 							nr_deltas);
@@ -3042,7 +3167,6 @@ static int pack_options_allow_reuse(void)
 {
 	return allow_pack_reuse &&
 	       pack_to_stdout &&
-	       allow_ofs_delta &&
 	       !ignore_packed_keep_on_disk &&
 	       !ignore_packed_keep_in_core &&
 	       (!local || !have_non_local_packs) &&
@@ -3059,7 +3183,7 @@ static int get_object_list_from_bitmap(struct rev_info *revs)
 				bitmap_git,
 				&reuse_packfile,
 				&reuse_packfile_objects,
-				&reuse_packfile_offset)) {
+				&reuse_packfile_bitmap)) {
 		assert(reuse_packfile_objects);
 		nr_result += reuse_packfile_objects;
 		display_progress(progress_state, nr_result);
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 8a51302a1a..cbfc544411 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -326,6 +326,13 @@ static int load_pack_bitmap(struct bitmap_index *bitmap_git)
 	munmap(bitmap_git->map, bitmap_git->map_size);
 	bitmap_git->map = NULL;
 	bitmap_git->map_size = 0;
+
+	kh_destroy_oid_map(bitmap_git->bitmaps);
+	bitmap_git->bitmaps = NULL;
+
+	kh_destroy_oid_pos(bitmap_git->ext_index.positions);
+	bitmap_git->ext_index.positions = NULL;
+
 	return -1;
 }
 
@@ -764,65 +771,126 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs)
 	return NULL;
 }
 
-int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
-				       struct packed_git **packfile,
-				       uint32_t *entries,
-				       off_t *up_to)
+static void try_partial_reuse(struct bitmap_index *bitmap_git,
+			      size_t pos,
+			      struct bitmap *reuse,
+			      struct pack_window **w_curs)
 {
+	struct revindex_entry *revidx;
+	off_t offset;
+	enum object_type type;
+	unsigned long size;
+
+	if (pos >=
bitmap_git->pack->num_objects)
+		return; /* not actually in the pack */
+
+	revidx = &bitmap_git->pack->revindex[pos];
+	offset = revidx->offset;
+	type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size);
+	if (type < 0)
+		return; /* broken packfile, punt */
+
+	if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) {
+		off_t base_offset;
+		int base_pos;
+
+		/*
+		 * Find the position of the base object so we can look it up
+		 * in our bitmaps. If we can't come up with an offset, or if
+		 * that offset is not in the revidx, the pack is corrupt.
+		 * There's nothing we can do, so just punt on this object,
+		 * and the normal slow path will complain about it in
+		 * more detail.
+		 */
+		base_offset = get_delta_base(bitmap_git->pack, w_curs,
+					     &offset, type, revidx->offset);
+		if (!base_offset)
+			return;
+		base_pos = find_revindex_position(bitmap_git->pack, base_offset);
+		if (base_pos < 0)
+			return;
+
+		/*
+		 * We assume delta dependencies always point backwards. This
+		 * lets us do a single pass, and is basically always true
+		 * due to the way OFS_DELTAs work. You would not typically
+		 * find REF_DELTA in a bitmapped pack, since we only bitmap
+		 * packs we write fresh, and OFS_DELTA is the default. But
+		 * let's double check to make sure the pack wasn't written with
+		 * odd parameters.
+		 */
+		if (base_pos >= pos)
+			return;
+
+		/*
+		 * And finally, if we're not sending the base as part of our
+		 * reuse chunk, then don't send this object either. The base
+		 * would come after us, along with other objects not
+		 * necessarily in the pack, which means we'd need to convert
+		 * to REF_DELTA on the fly. Better to just let the normal
+		 * object_entry code path handle it.
+		 */
+		if (!bitmap_get(reuse, base_pos))
+			return;
+	}
+
 	/*
-	 * Reuse the packfile content if we need more than
-	 * 90% of its objects
+	 * If we got here, then the object is OK to reuse. Mark it.
 	 */
-	static const double REUSE_PERCENT = 0.9;
+	bitmap_set(reuse, pos);
+}
 
+int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
+				       struct packed_git **packfile_out,
+				       uint32_t *entries,
+				       struct bitmap **reuse_out)
+{
 	struct bitmap *result = bitmap_git->result;
-	uint32_t reuse_threshold;
-	uint32_t i, reuse_objects = 0;
+	struct bitmap *reuse;
+	struct pack_window *w_curs = NULL;
+	size_t i = 0;
+	uint32_t offset;
 
 	assert(result);
 
-	for (i = 0; i < result->word_alloc; ++i) {
-		if (result->words[i] != (eword_t)~0) {
-			reuse_objects += ewah_bit_ctz64(~result->words[i]);
-			break;
-		}
-
-		reuse_objects += BITS_IN_EWORD;
-	}
+	while (i < result->word_alloc && result->words[i] == (eword_t)~0)
+		i++;
 
-#ifdef GIT_BITMAP_DEBUG
-	{
-		const unsigned char *sha1;
-		struct revindex_entry *entry;
+	/* Don't mark objects not in the packfile */
+	if (i > bitmap_git->pack->num_objects / BITS_IN_EWORD)
+		i = bitmap_git->pack->num_objects / BITS_IN_EWORD;
 
-		entry = &bitmap_git->reverse_index->revindex[reuse_objects];
-		sha1 = nth_packed_object_sha1(bitmap_git->pack, entry->nr);
+	reuse = bitmap_word_alloc(i);
+	memset(reuse->words, 0xFF, i * sizeof(eword_t));
 
-		fprintf(stderr, "Failed to reuse at %d (%016llx)\n",
-			reuse_objects, result->words[i]);
-		fprintf(stderr, " %s\n", hash_to_hex(sha1));
-	}
-#endif
+	for (; i < result->word_alloc; ++i) {
+		eword_t word = result->words[i];
+		size_t pos = (i * BITS_IN_EWORD);
 
-	if (!reuse_objects)
-		return -1;
+		for (offset = 0; offset < BITS_IN_EWORD; ++offset) {
+			if ((word >> offset) == 0)
+				break;
 
-	if (reuse_objects >= bitmap_git->pack->num_objects) {
-		bitmap_git->reuse_objects = *entries = bitmap_git->pack->num_objects;
-		*up_to = -1; /* reuse the full pack */
-		*packfile = bitmap_git->pack;
-		return 0;
+			offset += ewah_bit_ctz64(word >> offset);
+			try_partial_reuse(bitmap_git, pos + offset, reuse, &w_curs);
+		}
 	}
 
-	reuse_threshold = bitmap_popcount(bitmap_git->result) * REUSE_PERCENT;
+	unuse_pack(&w_curs);
 
-	if
(reuse_objects < reuse_threshold)
+	*entries = bitmap_popcount(reuse);
+	if (!*entries) {
+		bitmap_free(reuse);
 		return -1;
+	}
 
-	bitmap_git->reuse_objects = *entries = reuse_objects;
-	*up_to = bitmap_git->pack->revindex[reuse_objects].offset;
-	*packfile = bitmap_git->pack;
-
+	/*
+	 * Drop any reused objects from the result, since they will not
+	 * need to be handled separately.
+	 */
+	bitmap_and_not(result, reuse);
+	*packfile_out = bitmap_git->pack;
+	*reuse_out = reuse;
 	return 0;
 }
 
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 6ab6033dbe..bcd03b8993 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -50,7 +50,8 @@ void test_bitmap_walk(struct rev_info *revs);
 struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs);
 int reuse_partial_packfile_from_bitmap(struct bitmap_index *,
 				       struct packed_git **packfile,
-				       uint32_t *entries, off_t *up_to);
+				       uint32_t *entries,
+				       struct bitmap **reuse_out);
 int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping,
 			     kh_oid_map_t *reused_bitmaps, int show_progress);
 void free_bitmap_index(struct bitmap_index *);