From patchwork Mon Nov 2 18:55:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874985 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EAF5FC00A89 for ; Mon, 2 Nov 2020 18:55:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8AFFB22268 for ; Mon, 2 Nov 2020 18:55:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="guAH0j0/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726099AbgKBSzS (ORCPT ); Mon, 2 Nov 2020 13:55:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725797AbgKBSzS (ORCPT ); Mon, 2 Nov 2020 13:55:18 -0500 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F3D7AC0617A6 for ; Mon, 2 Nov 2020 10:55:17 -0800 (PST) Received: by mail-wr1-x442.google.com with SMTP id k10so14503058wrw.13 for ; Mon, 02 Nov 2020 10:55:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=oFAswi3o4d7CVNEpLSnDCYqFONh+d8GGqoJ1wyK3dI8=; b=guAH0j0//guIUNaZy52kYTA10vkZbIrOn4gPR/KwNYRbk3cTXujlGQA1g/+KKp8Voy B3qYJLXYnxla1tOCrbyB3y2gEPREJzJradlZXc8O6+UYKwJlArkrbYL9zEuLA00H7cfK 499aIYBNHoi/9+PbE7y1uAuhe2vo23rLq43dIOKcLE10IdJBGBvTpx4RSytzAcqeYOGZ gAUxu1WtEfpewJjLQ3AHFLCw5mkZRwScYwve8mVf7z5r5X2oPbvcZmdJ3aKBkNKwk8Rt BoTK95IYiZU/qaNaXy3JKTf0St8RMUNobrZgXEuDesLZdIS5V1+YDafY/oQen5GjpvqY 2Z8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=oFAswi3o4d7CVNEpLSnDCYqFONh+d8GGqoJ1wyK3dI8=; b=P2uIx8+wp489ee+0Ln6i35XDZhJeGI5lk1jZ5q8uAh1Kgv7hrrnBlkfuVYwWLTYHMy d/G62uQSnPsK+sSkNnvZLg7gHroWvqMovahwNNreOH8quRy8mwRlfmZfx9d4Z3o9fL+y Fld/WLi9OYLt+ubgFOO3p+pk4SdrkIoqlh/btYv9MPqXptZbSDekRLPx7AyfIbZcYsDQ ddl7jLsl3+GdPyAa5czQXfuIYVssD0RgWSxsLvM+mVr4Y7Cxx4dGUMhVBqkt9KZcWNgy lnKJeLwR0OKiYdBVpDzKQV/1F7z4EKBX0aGTpsAlHfXEM/VLnAPlQr+n2EZewGNxPfKv hjlQ== X-Gm-Message-State: AOAM531iD5Fyu3BA9A/S336dp1E2hfBCounM/dj/vfTkUA0aLID247uT JxX6qcH6qvnXTXedc9QAnDmrr1AHfb4= X-Google-Smtp-Source: ABdhPJxvLiNcgUVtMPT8Xx9FF19YNIv3+VRk29vLKCvf0Tt086kqBrMCZhwFhCTBGEbQyf5CbitgKw== X-Received: by 2002:adf:9bc9:: with SMTP id e9mr20870942wrc.94.1604343316514; Mon, 02 Nov 2020 10:55:16 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id v14sm23838905wrq.46.2020.11.02.10.55.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:16 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:01 +0000 Subject: [PATCH v3 01/13] hashmap: add usage documentation explaining hashmap_free[_entries]() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren The existence of hashmap_free() and hashmap_free_entries() confused me, and the docs weren't clear enough. We are dealing with a map table, entries in that table, and possibly also things each of those entries point to. I had to consult other source code examples and the implementation. Add a brief note to clarify the differences. This will become even more important once we introduce a new hashmap_partial_clear() function which will add the question of whether the table itself has been freed. Signed-off-by: Elijah Newren --- hashmap.h | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/hashmap.h b/hashmap.h index b011b394fe..2994dc7a9c 100644 --- a/hashmap.h +++ b/hashmap.h @@ -236,13 +236,40 @@ void hashmap_init(struct hashmap *map, void hashmap_free_(struct hashmap *map, ssize_t offset); /* - * Frees a hashmap structure and allocated memory, leaves entries undisturbed + * Frees a hashmap structure and allocated memory for the table, but does not + * free the entries nor anything they point to. + * + * Usage note: + * + * Many callers will need to iterate over all entries and free the data each + * entry points to; in such a case, they can free the entry itself while at it. + * Thus, you might see: + * + * hashmap_for_each_entry(map, hashmap_iter, e, hashmap_entry_name) { + * free(e->somefield); + * free(e); + * } + * hashmap_free(map); + * + * instead of + * + * hashmap_for_each_entry(map, hashmap_iter, e, hashmap_entry_name) { + * free(e->somefield); + * } + * hashmap_free_entries(map, struct my_entry_struct, hashmap_entry_name); + * + * to avoid the implicit extra loop over the entries. However, if there are + * no special fields in your entry that need to be freed beyond the entry + * itself, it is probably simpler to avoid the explicit loop and just call + * hashmap_free_entries(). */ #define hashmap_free(map) hashmap_free_(map, -1) /* * Frees @map and all entries. @type is the struct type of the entry - * where @member is the hashmap_entry struct used to associate with @map + * where @member is the hashmap_entry struct used to associate with @map. + * + * See usage note above hashmap_free(). */ #define hashmap_free_entries(map, type, member) \ hashmap_free_(map, offsetof(type, member)); From patchwork Mon Nov 2 18:55:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874983 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D3E5C00A89 for ; Mon, 2 Nov 2020 18:55:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CEDA322268 for ; Mon, 2 Nov 2020 18:55:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cBRTIITU" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726277AbgKBSzV (ORCPT ); Mon, 2 Nov 2020 13:55:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34298 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726162AbgKBSzU (ORCPT ); Mon, 2 Nov 2020 13:55:20 -0500 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CEEBEC061A04 for ; Mon, 2 Nov 2020 10:55:18 -0800 (PST) Received: by mail-wm1-x336.google.com with SMTP id e2so10514102wme.1 for ; Mon, 02 Nov 2020 10:55:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=lfeFxSmzBLKnFekm6mzwPvSqu+S1DEbaBDz97hV1DAw=; b=cBRTIITUjdBIc28SfTElVgoBWeUe/7eqG7emK2+Lo4HfQVYpu+6NVUbxMYEyTuMBsZ rbFhiOXStvIgYFK2cv3gysc7tmm6bNtkdVFBlrYeSIkiSADm09YublTZGtmOFVuqzjId 8dleE/9YUFy823zvwCn0YT8YpGLf6uTUscM8FFLnoof+VwHh8o+I5Wi0UJIu/A72NX45 yfEGUoyTrKGn0j0S5lr6+1NgYATTT4wVir0+v2ADgDHWvRUGKvPmD31MI+AFWEsZbyw6 vnP6cfk9z50shiLTgU2Dy/Br/2+WNgCXBxqPakHEcMg21E8jSiGsHoeR9i3L37dGq41u 6pJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=lfeFxSmzBLKnFekm6mzwPvSqu+S1DEbaBDz97hV1DAw=; b=fbXCM90L6XvaceR9ivZ+qW9HL921+5PdL97ZHVvHmmmdmsIaniQgF3B4VTThk8lpYD N+71AamkVBivZBp5XcWS9Uqol9f+B4E8gwZHLyb5VrebmwX1B/pXOY7bBd/hgV5BpsuP qEf3YMaOJcF6Zk9yLBzexFkxKnw8prHY8r2xbrb0baSps/bRIvz8WNhaAUHSeB4TrF0U c7GnSEXRXRZUOLwzzQqP6tNWNfpo8vNHAaKGd3R8MDzCMhVa5E+YqgixZhUtkZ6e4OBy V6OEL6xeD1JfRwqod7dmJRn6ABKTm8CdRgjfGBm/Tv69H76GFclKcr+EVnBGL6blwlt2 RflA== X-Gm-Message-State: AOAM530/k4iEKRS9QTCwSPTfIvXfsqY8/Nn77tuznOGw7QJfoTg70ZwP 7GHrV83o138bkBmvM09mRrqM3kG/w4g= X-Google-Smtp-Source: ABdhPJxmy3TGGzzfDpM2w6R1ZyzuFCDHNr5D6ik3yCkhO/8ap7B7FtK08vNgEccC6vW6Hr08EowevA== X-Received: by 2002:a1c:490b:: with SMTP id w11mr9680533wma.101.1604343317454; Mon, 02 Nov 2020 10:55:17 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c129sm401027wmd.7.2020.11.02.10.55.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:16 -0800 (PST) Message-Id: <591161fd781b7666ddaa45694eb20610cc359741.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:02 +0000 Subject: [PATCH v3 02/13] hashmap: adjust spacing to fix argument alignment Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren No actual code changes; just whitespace adjustments. Signed-off-by: Elijah Newren --- hashmap.c | 17 +++++++++-------- hashmap.h | 22 +++++++++++----------- 2 files changed, 20 insertions(+), 19 deletions(-) diff --git a/hashmap.c b/hashmap.c index 09813e1a46..e44d8a3e85 100644 --- a/hashmap.c +++ b/hashmap.c @@ -92,8 +92,9 @@ static void alloc_table(struct hashmap *map, unsigned int size) } static inline int entry_equals(const struct hashmap *map, - const struct hashmap_entry *e1, const struct hashmap_entry *e2, - const void *keydata) + const struct hashmap_entry *e1, + const struct hashmap_entry *e2, + const void *keydata) { return (e1 == e2) || (e1->hash == e2->hash && @@ -101,7 +102,7 @@ static inline int entry_equals(const struct hashmap *map, } static inline unsigned int bucket(const struct hashmap *map, - const struct hashmap_entry *key) + const struct hashmap_entry *key) { return key->hash & (map->tablesize - 1); } @@ -148,7 +149,7 @@ static int always_equal(const void *unused_cmp_data, } void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, - const void *cmpfn_data, size_t initial_size) + const void *cmpfn_data, size_t initial_size) { unsigned int size = HASHMAP_INITIAL_SIZE; @@ -199,7 +200,7 @@ struct hashmap_entry *hashmap_get(const struct hashmap *map, } struct hashmap_entry *hashmap_get_next(const struct hashmap *map, - const struct hashmap_entry *entry) + const struct hashmap_entry *entry) { struct hashmap_entry *e = entry->next; for (; e; e = e->next) @@ -225,8 +226,8 @@ void hashmap_add(struct hashmap *map, struct hashmap_entry *entry) } struct hashmap_entry *hashmap_remove(struct hashmap *map, - const struct hashmap_entry *key, - const void *keydata) + const struct hashmap_entry *key, + const void *keydata) { struct hashmap_entry *old; struct hashmap_entry **e = find_entry_ptr(map, key, keydata); @@ -249,7 +250,7 @@ struct hashmap_entry *hashmap_remove(struct hashmap *map, } struct hashmap_entry *hashmap_put(struct hashmap *map, - struct hashmap_entry *entry) + struct hashmap_entry *entry) { struct hashmap_entry *old = hashmap_remove(map, entry, NULL); hashmap_add(map, entry); diff --git a/hashmap.h b/hashmap.h index 2994dc7a9c..904f61d6e1 100644 --- a/hashmap.h +++ b/hashmap.h @@ -228,9 +228,9 @@ struct hashmap { * prevent expensive resizing. If 0, the table is dynamically resized. */ void hashmap_init(struct hashmap *map, - hashmap_cmp_fn equals_function, - const void *equals_function_data, - size_t initial_size); + hashmap_cmp_fn equals_function, + const void *equals_function_data, + size_t initial_size); /* internal function for freeing hashmap */ void hashmap_free_(struct hashmap *map, ssize_t offset); @@ -288,7 +288,7 @@ void hashmap_free_(struct hashmap *map, ssize_t offset); * and if it is on stack, you can just let it go out of scope). */ static inline void hashmap_entry_init(struct hashmap_entry *e, - unsigned int hash) + unsigned int hash) { e->hash = hash; e->next = NULL; @@ -330,8 +330,8 @@ static inline unsigned int hashmap_get_size(struct hashmap *map) * to `hashmap_cmp_fn` to decide whether the entry matches the key. */ struct hashmap_entry *hashmap_get(const struct hashmap *map, - const struct hashmap_entry *key, - const void *keydata); + const struct hashmap_entry *key, + const void *keydata); /* * Returns the hashmap entry for the specified hash code and key data, @@ -364,7 +364,7 @@ static inline struct hashmap_entry *hashmap_get_from_hash( * call to `hashmap_get` or `hashmap_get_next`. */ struct hashmap_entry *hashmap_get_next(const struct hashmap *map, - const struct hashmap_entry *entry); + const struct hashmap_entry *entry); /* * Adds a hashmap entry. This allows to add duplicate entries (i.e. @@ -384,7 +384,7 @@ void hashmap_add(struct hashmap *map, struct hashmap_entry *entry); * Returns the replaced entry, or NULL if not found (i.e. the entry was added). */ struct hashmap_entry *hashmap_put(struct hashmap *map, - struct hashmap_entry *entry); + struct hashmap_entry *entry); /* * Adds or replaces a hashmap entry contained within @keyvar, @@ -406,8 +406,8 @@ struct hashmap_entry *hashmap_put(struct hashmap *map, * Argument explanation is the same as in `hashmap_get`. */ struct hashmap_entry *hashmap_remove(struct hashmap *map, - const struct hashmap_entry *key, - const void *keydata); + const struct hashmap_entry *key, + const void *keydata); /* * Removes a hashmap entry contained within @keyvar, @@ -449,7 +449,7 @@ struct hashmap_entry *hashmap_iter_next(struct hashmap_iter *iter); /* Initializes the iterator and returns the first entry, if any. */ static inline struct hashmap_entry *hashmap_iter_first(struct hashmap *map, - struct hashmap_iter *iter) + struct hashmap_iter *iter) { hashmap_iter_init(map, iter); return hashmap_iter_next(iter); From patchwork Mon Nov 2 18:55:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874973 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD9B6C2D0A3 for ; Mon, 2 Nov 2020 18:55:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 63DAE22268 for ; Mon, 2 Nov 2020 18:55:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MmVpuPlt" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726433AbgKBSzX (ORCPT ); Mon, 2 Nov 2020 13:55:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726162AbgKBSzV (ORCPT ); Mon, 2 Nov 2020 13:55:21 -0500 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B29E4C061A47 for ; Mon, 2 Nov 2020 10:55:19 -0800 (PST) Received: by mail-wr1-x443.google.com with SMTP id b3so9926937wrx.11 for ; Mon, 02 Nov 2020 10:55:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=IEPGTilWFeb/gYnUiQHEDPP0R8dCTuzsEUTPxGrkZm8=; b=MmVpuPltKb7eA6tjoT/JFGHI30J3q6pQzrOvYQyU1alDdeD0r5Egsla351n7/Zv8kl mhshrusduW/DAkf1hmKb+zel5q5xD0aBsuBl5h20pKyVbdrGoIR54h/SctqX2OucBiIC Dc1yygq+EldB7j9FeL9kBGE13IXaSeUhvJ32TSEcOSNtlhG2gDhBEq1G8TT2glPIf8Jh AlbEoN6bI5+i9mEuqnrXMbONj+6VCLwG6hrtiH9NX2iCcmqFtgfbRFZ9wIYWNrRAijvK IPS5fpGkK4MMEnV8+Zhk2vIkLQvFAt0l/aDZuOCPDUtuJzCNS0wlbih3hAb0WDrpQJZV SbGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=IEPGTilWFeb/gYnUiQHEDPP0R8dCTuzsEUTPxGrkZm8=; b=X6bghLz/DFfpZ5/tTcqR/jxzHW+a7FT/PhqQ3zHnY0SCH27feGVgoWTnqHjQ0Ptg0c +B7/lbIev/RVV6/Ul6sAfSBp5N8KyStwSkluX1fMBhFQOqN5OOhXBH0Gq0DV2rBHoAMM ceYcQJ/JPKwn2d0bjcD3dfUf4XHpAdm6aAUuE8D8YqyVwvzXHYCXpEYRTK07pakDwdE+ blOVF7+CU2YqGYo65K1nuxVnMhBIAHZP+kdA26BykA+jA463zvres7qq6t4U9gY83lk+ 4CHrzzpvcdL10o7V+hpZBHS8k9/o8viHPIBL9uFjwXOxYEzjt28OvXYo+h+7jlx1SS5t 0Hag== X-Gm-Message-State: AOAM533MRP+570pEBqPuLE7G7UY31oJtXGFjI6APxkhGwNHceEmRjVSv RrufcobUc+npIVQDazcbrQ/ogaCWP4c= X-Google-Smtp-Source: ABdhPJw7xCAmEQXjQJJmOSQa9FZK8C6deNx1wGda0zAcxcIN4Qy6fAAup/Qvi+J2st9RkbhFlV23xA== X-Received: by 2002:adf:fc07:: with SMTP id i7mr16712331wrr.223.1604343318334; Mon, 02 Nov 2020 10:55:18 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f13sm23930021wrp.12.2020.11.02.10.55.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:17 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:03 +0000 Subject: [PATCH v3 03/13] hashmap: allow re-use after hashmap_free() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Previously, once map->table had been freed, any calls to hashmap_put(), hashmap_get(), or hashmap_remove() would cause a NULL pointer dereference (since hashmap_free_() also zeros the memory; without that zeroing, calling these functions would cause a use-after-free problem). Modify these functions to check for a NULL table and automatically allocate as needed. Also add a HASHMAP_INIT(fn, data) macro for initializing hashmaps on the stack without calling hashmap_init(). Signed-off-by: Elijah Newren --- hashmap.c | 16 ++++++++++++++-- hashmap.h | 3 +++ 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/hashmap.c b/hashmap.c index e44d8a3e85..bb7c9979b8 100644 --- a/hashmap.c +++ b/hashmap.c @@ -114,6 +114,7 @@ int hashmap_bucket(const struct hashmap *map, unsigned int hash) static void rehash(struct hashmap *map, unsigned int newsize) { + /* map->table MUST NOT be NULL when this function is called */ unsigned int i, oldsize = map->tablesize; struct hashmap_entry **oldtable = map->table; @@ -134,6 +135,7 @@ static void rehash(struct hashmap *map, unsigned int newsize) static inline struct hashmap_entry **find_entry_ptr(const struct hashmap *map, const struct hashmap_entry *key, const void *keydata) { + /* map->table MUST NOT be NULL when this function is called */ struct hashmap_entry **e = &map->table[bucket(map, key)]; while (*e && !entry_equals(map, *e, key, keydata)) e = &(*e)->next; @@ -196,6 +198,8 @@ struct hashmap_entry *hashmap_get(const struct hashmap *map, const struct hashmap_entry *key, const void *keydata) { + if (!map->table) + return NULL; return *find_entry_ptr(map, key, keydata); } @@ -211,8 +215,12 @@ struct hashmap_entry *hashmap_get_next(const struct hashmap *map, void hashmap_add(struct hashmap *map, struct hashmap_entry *entry) { - unsigned int b = bucket(map, entry); + unsigned int b; + + if (!map->table) + alloc_table(map, HASHMAP_INITIAL_SIZE); + b = bucket(map, entry); /* add entry */ entry->next = map->table[b]; map->table[b] = entry; @@ -230,7 +238,11 @@ struct hashmap_entry *hashmap_remove(struct hashmap *map, const void *keydata) { struct hashmap_entry *old; - struct hashmap_entry **e = find_entry_ptr(map, key, keydata); + struct hashmap_entry **e; + + if (!map->table) + return NULL; + e = find_entry_ptr(map, key, keydata); if (!*e) return NULL; diff --git a/hashmap.h b/hashmap.h index 904f61d6e1..3b0f2bcade 100644 --- a/hashmap.h +++ b/hashmap.h @@ -210,6 +210,9 @@ struct hashmap { /* hashmap functions */ +#define HASHMAP_INIT(fn, data) { .cmpfn = fn, .cmpfn_data = data, \ + .do_count_items = 1 } + /* * Initializes a hashmap structure. * From patchwork Mon Nov 2 18:55:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CB88C388F9 for ; Mon, 2 Nov 2020 18:55:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 45F8822268 for ; Mon, 2 Nov 2020 18:55:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="uhPN5FZy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726366AbgKBSzV (ORCPT ); Mon, 2 Nov 2020 13:55:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726236AbgKBSzV (ORCPT ); Mon, 2 Nov 2020 13:55:21 -0500 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8EDFFC061A48 for ; Mon, 2 Nov 2020 10:55:20 -0800 (PST) Received: by mail-wm1-x342.google.com with SMTP id k18so10512004wmj.5 for ; Mon, 02 Nov 2020 10:55:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=hz04PDbA9UEQiHo/9pZvZ4q4Euew0clND1poGrjjarc=; b=uhPN5FZyV1pkfpMasGAV6hIs6WfZHKI6ZSj3Cj0ce7mSV79D/92Oz2XLbBUQkGB0p6 Y7kXt5fcMpfeCHt4ajTGs0OQu6dEVxt/Sl9MGCxJXDwefcbmLYkxEJBbtYcV3kWXqhbo 7UAfVydVbwxlkZoxJHLBqwOjKjaREiD+Yy7LYbSYUQb7T/G51YUcJk0AWJ19Klz5dE95 0IUgSxCHhN2HSE/rpThpZIZsD8KGTg8X2pACDNGSZyYthHP7w9udz8MS+JB6FzjfdYXz XarJWOas8AwzLA+d4l29M5gPp5Xp7WBpC2hUt8WW0HnjHbr4CLmsu3seB1ZQMbEFqbzv Xd7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=hz04PDbA9UEQiHo/9pZvZ4q4Euew0clND1poGrjjarc=; b=F/wpJbiLd4IdxeS7bLziYS6d5V7HyAsRIwnxglhjrwnmVK0IFLicLhDEtCF2nOLl6L Qbx2/Em7x5JblPzSU2VuJiaDjhWtGo19INiV4+XOGt6NeSvy37SYXqo0KFKOC0AdzXr/ aM1VwFOF01XSQUg/qZxy46rFyicRYNtFkPC1Ucd6fcgnkRGj35+6larzpKVwSDpOqxPz Ou5t3u4YGpUZP5gkhOuLtXXXBrdrswrzhgu+8s9zCOW7aNqOt/13JsJKBHcx/5jkkDZt 3eeuGSLY1Epig0+rq3FFawqSee5+2PZZMQdaYZsgxR9XcaxhhKumwGtGEp/RRHsYXZLQ vcfg== X-Gm-Message-State: AOAM532bJ41aErl6BEjh08ekj64HqfIidHjtvzZEU2tptSXL1B323zUf 5du9PP7l+zkPZFpnw/kSqQAjbAMSNr8= X-Google-Smtp-Source: ABdhPJxkueQRgNPtLI/AYkPHvni9PJebmphDeScuSGItJnCP1T7Fr2+OQ8GpI/aAVFzNAD+y+FI36Q== X-Received: by 2002:a1c:3103:: with SMTP id x3mr19147958wmx.107.1604343319133; Mon, 02 Nov 2020 10:55:19 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id p4sm24633338wrf.67.2020.11.02.10.55.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:18 -0800 (PST) Message-Id: <61f1da3c51b521035ba728c025cd6a28397f8051.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:04 +0000 Subject: [PATCH v3 04/13] hashmap: introduce a new hashmap_partial_clear() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren merge-ort is a heavy user of strmaps, which are built on hashmap.[ch]. clear_or_reinit_internal_opts() in merge-ort was taking about 12% of overall runtime in my testcase involving rebasing 35 patches of linux.git across a big rename. clear_or_reinit_internal_opts() was calling hashmap_free() followed by hashmap_init(), meaning that not only was it freeing all the memory associated with each of the strmaps just to immediately allocate a new array again, it was allocating a new array that was likely smaller than needed (thus resulting in later need to rehash things). The ending size of the map table on the previous commit was likely almost perfectly sized for the next commit we wanted to pick, and not dropping and reallocating the table immediately is a win. Add some new API to hashmap to clear a hashmap of entries without freeing map->table (and instead only zeroing it out like alloc_table() would do, along with zeroing the count of items in the table and the shrink_at field). Signed-off-by: Elijah Newren --- hashmap.c | 39 +++++++++++++++++++++++++++------------ hashmap.h | 13 ++++++++++++- 2 files changed, 39 insertions(+), 13 deletions(-) diff --git a/hashmap.c b/hashmap.c index bb7c9979b8..922ed07954 100644 --- a/hashmap.c +++ b/hashmap.c @@ -174,22 +174,37 @@ void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, map->do_count_items = 1; } +static void free_individual_entries(struct hashmap *map, ssize_t entry_offset) +{ + struct hashmap_iter iter; + struct hashmap_entry *e; + + hashmap_iter_init(map, &iter); + while ((e = hashmap_iter_next(&iter))) + /* + * like container_of, but using caller-calculated + * offset (caller being hashmap_free_entries) + */ + free((char *)e - entry_offset); +} + +void hashmap_partial_clear_(struct hashmap *map, ssize_t entry_offset) +{ + if (!map || !map->table) + return; + if (entry_offset >= 0) /* called by hashmap_clear_entries */ + free_individual_entries(map, entry_offset); + memset(map->table, 0, map->tablesize * sizeof(struct hashmap_entry *)); + map->shrink_at = 0; + map->private_size = 0; +} + void hashmap_free_(struct hashmap *map, ssize_t entry_offset) { if (!map || !map->table) return; - if (entry_offset >= 0) { /* called by hashmap_free_entries */ - struct hashmap_iter iter; - struct hashmap_entry *e; - - hashmap_iter_init(map, &iter); - while ((e = hashmap_iter_next(&iter))) - /* - * like container_of, but using caller-calculated - * offset (caller being hashmap_free_entries) - */ - free((char *)e - entry_offset); - } + if (entry_offset >= 0) /* called by hashmap_free_entries */ + free_individual_entries(map, entry_offset); free(map->table); memset(map, 0, sizeof(*map)); } diff --git a/hashmap.h b/hashmap.h index 3b0f2bcade..e9430d582a 100644 --- a/hashmap.h +++ b/hashmap.h @@ -235,7 +235,8 @@ void hashmap_init(struct hashmap *map, const void *equals_function_data, size_t initial_size); -/* internal function for freeing hashmap */ +/* internal functions for clearing or freeing hashmap */ +void hashmap_partial_clear_(struct hashmap *map, ssize_t offset); void hashmap_free_(struct hashmap *map, ssize_t offset); /* @@ -268,6 +269,16 @@ void hashmap_free_(struct hashmap *map, ssize_t offset); */ #define hashmap_free(map) hashmap_free_(map, -1) +/* + * Basically the same as calling hashmap_free() followed by hashmap_init(), + * but doesn't incur the overhead of deallocating and reallocating + * map->table; it leaves map->table allocated and the same size but zeroes + * it out so it's ready for use again as an empty map. As with + * hashmap_free(), you may need to free the entries yourself before calling + * this function. + */ +#define hashmap_partial_clear(map) hashmap_partial_clear_(map, -1) + /* * Frees @map and all entries. @type is the struct type of the entry * where @member is the hashmap_entry struct used to associate with @map. From patchwork Mon Nov 2 18:55:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874987 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CC42C2D0A3 for ; Mon, 2 Nov 2020 18:55:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C30F22268 for ; Mon, 2 Nov 2020 18:55:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="d5wUG/MU" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726088AbgKBSz1 (ORCPT ); Mon, 2 Nov 2020 13:55:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726369AbgKBSzW (ORCPT ); Mon, 2 Nov 2020 13:55:22 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A89A4C0617A6 for ; Mon, 2 Nov 2020 10:55:21 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id w14so15818170wrs.9 for ; Mon, 02 Nov 2020 10:55:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=eD1ODDn8YazYa2TxKvej2ageSWRW9od1dWKHaSlyndc=; b=d5wUG/MU2pRZ8XpHAj7Y+AEdsInDnuP3JWlNblQE7qyldhVDGt//jaAE4n0GHyMS6O nlG1DWa5cajKKAf39OPMG3Ri6xcwrQAU7fZUu+NhdNADtviX2oMC1bovXpLRgDVZn68H xhOFk7wWTJSokIKZfxkfH24n0kDiYjfagvByA9i6oyA/9e0BWmxnZVx0+Fj8ige+Eq+5 5+m426XbPxGk6R1TXWkzM0F3y9tivVaz6YyZ6hy1GTVgQpktWNn6VioG9j0xR73HidfX zr01iHTXR4ZX+hiyEl30nGsi5jMq3LaIMbvYUFRm9G1qBY+qHEOzuLAv9I1vBe/0U0cD SPKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=eD1ODDn8YazYa2TxKvej2ageSWRW9od1dWKHaSlyndc=; b=TKrQ3OH2afvHuJ6rK841qzL95KcUlRP6X6DkIYe+JF60Y3wUs5+xqhrP/9ESZZrhHE KdPN0Dd0ZP6a/gSZcSHJtKfaiTW8y/8IKf+O0sEBHgGNQnbI5ilYO+oD7qOI463Y9fqA +0XKEDn+TkJSmI2YLexYFVz0iA7Q1llKULa1lxi1/JG9cWfBLrdEUJ08IzqCnU65N0Z3 hG6gneqAq5JpZkU1VQJ8YA9NJxcXBOjzAfaNViscAcm5FrMBAOrpNV/9YacAYC07A8S3 misRqu71pBmN82EX19/kzoXmPyy16ny9jn6qnNRAJ5OdVmje7+mYuJcVIpSE1fdzvf2A c2nw== X-Gm-Message-State: AOAM532M6nNJkBc/1NZXB23Zp0agFwAmc8K2FVxaGIfGfws/ApuXIohW RCM6IU9H1eUBolLDY0iNBwMUE1+VSSY= X-Google-Smtp-Source: ABdhPJxyuTmgf3aC/UsZV/FYf3SlzOnuaetYbTCN7ni+FpSAV0PSbOh3UqjrS/w5nhSkLgZlqcbp/A== X-Received: by 2002:a05:6000:10e:: with SMTP id o14mr22276276wrx.225.1604343319959; Mon, 02 Nov 2020 10:55:19 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l3sm442526wmg.32.2020.11.02.10.55.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:19 -0800 (PST) Message-Id: <861e8d65ae8065595d9d4ccff5f70155fec408c9.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:05 +0000 Subject: [PATCH v3 05/13] hashmap: provide deallocation function names Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren --- add-interactive.c | 2 +- blame.c | 2 +- bloom.c | 2 +- builtin/fetch.c | 6 +++--- builtin/shortlog.c | 2 +- config.c | 2 +- diff.c | 4 ++-- diffcore-rename.c | 2 +- dir.c | 8 ++++---- hashmap.c | 6 +++--- hashmap.h | 44 +++++++++++++++++++++++++---------------- merge-recursive.c | 6 +++--- name-hash.c | 4 ++-- object.c | 2 +- oidmap.c | 2 +- patch-ids.c | 2 +- range-diff.c | 2 +- ref-filter.c | 2 +- revision.c | 2 +- sequencer.c | 4 ++-- submodule-config.c | 4 ++-- t/helper/test-hashmap.c | 6 +++--- 22 files changed, 63 insertions(+), 53 deletions(-) diff --git a/add-interactive.c b/add-interactive.c index 555c4abf32..a14c0feaa2 100644 --- a/add-interactive.c +++ b/add-interactive.c @@ -557,7 +557,7 @@ static int get_modified_files(struct repository *r, if (ps) clear_pathspec(&rev.prune_data); } - hashmap_free_entries(&s.file_map, struct pathname_entry, ent); + hashmap_clear_and_free(&s.file_map, struct pathname_entry, ent); if (unmerged_count) *unmerged_count = s.unmerged_count; if (binary_count) diff --git a/blame.c b/blame.c index 686845b2b4..229beb6452 100644 --- a/blame.c +++ b/blame.c @@ -435,7 +435,7 @@ static void get_fingerprint(struct fingerprint *result, static void free_fingerprint(struct fingerprint *f) { - hashmap_free(&f->map); + hashmap_clear(&f->map); free(f->entries); } diff --git a/bloom.c b/bloom.c index 68c73200a5..719c313a1c 100644 --- a/bloom.c +++ b/bloom.c @@ -287,7 +287,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, } cleanup: - hashmap_free_entries(&pathmap, struct pathmap_hash_entry, entry); + hashmap_clear_and_free(&pathmap, struct pathmap_hash_entry, entry); } else { for (i = 0; i < diff_queued_diff.nr; i++) diff_free_filepair(diff_queued_diff.queue[i]); diff --git a/builtin/fetch.c b/builtin/fetch.c index f9c3c49f14..ecf8537605 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -393,7 +393,7 @@ static void find_non_local_tags(const struct ref *refs, item = refname_hash_add(&remote_refs, ref->name, &ref->old_oid); string_list_insert(&remote_refs_list, ref->name); } - hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent); + hashmap_clear_and_free(&existing_refs, struct refname_hash_entry, ent); /* * We may have a final lightweight tag that needs to be @@ -428,7 +428,7 @@ static void find_non_local_tags(const struct ref *refs, **tail = rm; *tail = &rm->next; } - hashmap_free_entries(&remote_refs, struct refname_hash_entry, ent); + hashmap_clear_and_free(&remote_refs, struct refname_hash_entry, ent); string_list_clear(&remote_refs_list, 0); oidset_clear(&fetch_oids); } @@ -573,7 +573,7 @@ static struct ref *get_ref_map(struct remote *remote, } } if (existing_refs_populated) - hashmap_free_entries(&existing_refs, struct refname_hash_entry, ent); + hashmap_clear_and_free(&existing_refs, struct refname_hash_entry, ent); return ref_map; } diff --git a/builtin/shortlog.c b/builtin/shortlog.c index 0a5c4968f6..83f0a739b4 100644 --- a/builtin/shortlog.c +++ b/builtin/shortlog.c @@ -220,7 +220,7 @@ static void strset_clear(struct strset *ss) { if (!ss->map.table) return; - hashmap_free_entries(&ss->map, struct strset_item, ent); + hashmap_clear_and_free(&ss->map, struct strset_item, ent); } static void insert_records_from_trailers(struct shortlog *log, diff --git a/config.c b/config.c index 2bdff4457b..8f324ed3a6 100644 --- a/config.c +++ b/config.c @@ -1963,7 +1963,7 @@ void git_configset_clear(struct config_set *cs) free(entry->key); string_list_clear(&entry->value_list, 1); } - hashmap_free_entries(&cs->config_hash, struct config_set_element, ent); + hashmap_clear_and_free(&cs->config_hash, struct config_set_element, ent); cs->hash_initialized = 0; free(cs->list.items); cs->list.nr = 0; diff --git a/diff.c b/diff.c index 2bb2f8f57e..8e0e59f5cf 100644 --- a/diff.c +++ b/diff.c @@ -6289,9 +6289,9 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) if (o->color_moved == COLOR_MOVED_ZEBRA_DIM) dim_moved_lines(o); - hashmap_free_entries(&add_lines, struct moved_entry, + hashmap_clear_and_free(&add_lines, struct moved_entry, ent); - hashmap_free_entries(&del_lines, struct moved_entry, + hashmap_clear_and_free(&del_lines, struct moved_entry, ent); } diff --git a/diffcore-rename.c b/diffcore-rename.c index 99e63e90f8..d367a6d244 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -407,7 +407,7 @@ static int find_exact_renames(struct diff_options *options) renames += find_identical_files(&file_table, i, options); /* Free the hash data structure and entries */ - hashmap_free_entries(&file_table, struct file_similarity, entry); + hashmap_clear_and_free(&file_table, struct file_similarity, entry); return renames; } diff --git a/dir.c b/dir.c index 78387110e6..161dce121e 100644 --- a/dir.c +++ b/dir.c @@ -817,8 +817,8 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern clear_hashmaps: warning(_("disabling cone pattern matching")); - hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent); - hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent); + hashmap_clear_and_free(&pl->parent_hashmap, struct pattern_entry, ent); + hashmap_clear_and_free(&pl->recursive_hashmap, struct pattern_entry, ent); pl->use_cone_patterns = 0; } @@ -921,8 +921,8 @@ void clear_pattern_list(struct pattern_list *pl) free(pl->patterns[i]); free(pl->patterns); free(pl->filebuf); - hashmap_free_entries(&pl->recursive_hashmap, struct pattern_entry, ent); - hashmap_free_entries(&pl->parent_hashmap, struct pattern_entry, ent); + hashmap_clear_and_free(&pl->recursive_hashmap, struct pattern_entry, ent); + hashmap_clear_and_free(&pl->parent_hashmap, struct pattern_entry, ent); memset(pl, 0, sizeof(*pl)); } diff --git a/hashmap.c b/hashmap.c index 922ed07954..5009471800 100644 --- a/hashmap.c +++ b/hashmap.c @@ -183,7 +183,7 @@ static void free_individual_entries(struct hashmap *map, ssize_t entry_offset) while ((e = hashmap_iter_next(&iter))) /* * like container_of, but using caller-calculated - * offset (caller being hashmap_free_entries) + * offset (caller being hashmap_clear_and_free) */ free((char *)e - entry_offset); } @@ -199,11 +199,11 @@ void hashmap_partial_clear_(struct hashmap *map, ssize_t entry_offset) map->private_size = 0; } -void hashmap_free_(struct hashmap *map, ssize_t entry_offset) +void hashmap_clear_(struct hashmap *map, ssize_t entry_offset) { if (!map || !map->table) return; - if (entry_offset >= 0) /* called by hashmap_free_entries */ + if (entry_offset >= 0) /* called by hashmap_clear_and_free */ free_individual_entries(map, entry_offset); free(map->table); memset(map, 0, sizeof(*map)); diff --git a/hashmap.h b/hashmap.h index e9430d582a..7251687d73 100644 --- a/hashmap.h +++ b/hashmap.h @@ -96,7 +96,7 @@ * } * * if (!strcmp("end", action)) { - * hashmap_free_entries(&map, struct long2string, ent); + * hashmap_clear_and_free(&map, struct long2string, ent); * break; * } * } @@ -237,7 +237,7 @@ void hashmap_init(struct hashmap *map, /* internal functions for clearing or freeing hashmap */ void hashmap_partial_clear_(struct hashmap *map, ssize_t offset); -void hashmap_free_(struct hashmap *map, ssize_t offset); +void hashmap_clear_(struct hashmap *map, ssize_t offset); /* * Frees a hashmap structure and allocated memory for the table, but does not @@ -253,40 +253,50 @@ void hashmap_free_(struct hashmap *map, ssize_t offset); * free(e->somefield); * free(e); * } - * hashmap_free(map); + * hashmap_clear(map); * * instead of * * hashmap_for_each_entry(map, hashmap_iter, e, hashmap_entry_name) { * free(e->somefield); * } - * hashmap_free_entries(map, struct my_entry_struct, hashmap_entry_name); + * hashmap_clear_and_free(map, struct my_entry_struct, hashmap_entry_name); * * to avoid the implicit extra loop over the entries. However, if there are * no special fields in your entry that need to be freed beyond the entry * itself, it is probably simpler to avoid the explicit loop and just call - * hashmap_free_entries(). + * hashmap_clear_and_free(). */ -#define hashmap_free(map) hashmap_free_(map, -1) +#define hashmap_clear(map) hashmap_clear_(map, -1) /* - * Basically the same as calling hashmap_free() followed by hashmap_init(), - * but doesn't incur the overhead of deallocating and reallocating - * map->table; it leaves map->table allocated and the same size but zeroes - * it out so it's ready for use again as an empty map. As with - * hashmap_free(), you may need to free the entries yourself before calling - * this function. + * Similar to hashmap_clear(), except that the table is no deallocated; it + * is merely zeroed out but left the same size as before. If the hashmap + * will be reused, this avoids the overhead of deallocating and + * reallocating map->table. As with hashmap_clear(), you may need to free + * the entries yourself before calling this function. */ #define hashmap_partial_clear(map) hashmap_partial_clear_(map, -1) /* - * Frees @map and all entries. @type is the struct type of the entry - * where @member is the hashmap_entry struct used to associate with @map. + * Similar to hashmap_clear() but also frees all entries. @type is the + * struct type of the entry where @member is the hashmap_entry struct used + * to associate with @map. * - * See usage note above hashmap_free(). + * See usage note above hashmap_clear(). */ -#define hashmap_free_entries(map, type, member) \ - hashmap_free_(map, offsetof(type, member)); +#define hashmap_clear_and_free(map, type, member) \ + hashmap_clear_(map, offsetof(type, member)) + +/* + * Similar to hashmap_partial_clear() but also frees all entries. @type is + * the struct type of the entry where @member is the hashmap_entry struct + * used to associate with @map. + * + * See usage note above hashmap_clear(). + */ +#define hashmap_partial_clear_and_free(map, type, member) \ + hashmap_partial_clear_(map, offsetof(type, member)) /* hashmap_entry functions */ diff --git a/merge-recursive.c b/merge-recursive.c index d0214335a7..f736a0f632 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -2651,7 +2651,7 @@ static struct string_list *get_renames(struct merge_options *opt, free(e->target_file); string_list_clear(&e->source_files, 0); } - hashmap_free_entries(&collisions, struct collision_entry, ent); + hashmap_clear_and_free(&collisions, struct collision_entry, ent); return renames; } @@ -2870,7 +2870,7 @@ static void initial_cleanup_rename(struct diff_queue_struct *pairs, strbuf_release(&e->new_dir); /* possible_new_dirs already cleared in get_directory_renames */ } - hashmap_free_entries(dir_renames, struct dir_rename_entry, ent); + hashmap_clear_and_free(dir_renames, struct dir_rename_entry, ent); free(dir_renames); free(pairs->queue); @@ -3497,7 +3497,7 @@ static int merge_trees_internal(struct merge_options *opt, string_list_clear(entries, 1); free(entries); - hashmap_free_entries(&opt->priv->current_file_dir_set, + hashmap_clear_and_free(&opt->priv->current_file_dir_set, struct path_hashmap_entry, e); if (clean < 0) { diff --git a/name-hash.c b/name-hash.c index fb526a3775..5d3c7b12c1 100644 --- a/name-hash.c +++ b/name-hash.c @@ -726,6 +726,6 @@ void free_name_hash(struct index_state *istate) return; istate->name_hash_initialized = 0; - hashmap_free(&istate->name_hash); - hashmap_free_entries(&istate->dir_hash, struct dir_entry, ent); + hashmap_clear(&istate->name_hash); + hashmap_clear_and_free(&istate->dir_hash, struct dir_entry, ent); } diff --git a/object.c b/object.c index 3257518656..b8406409d5 100644 --- a/object.c +++ b/object.c @@ -532,7 +532,7 @@ void raw_object_store_clear(struct raw_object_store *o) close_object_store(o); o->packed_git = NULL; - hashmap_free(&o->pack_map); + hashmap_clear(&o->pack_map); } void parsed_object_pool_clear(struct parsed_object_pool *o) diff --git a/oidmap.c b/oidmap.c index 423aa014a3..286a04a53c 100644 --- a/oidmap.c +++ b/oidmap.c @@ -27,7 +27,7 @@ void oidmap_free(struct oidmap *map, int free_entries) return; /* TODO: make oidmap itself not depend on struct layouts */ - hashmap_free_(&map->map, free_entries ? 0 : -1); + hashmap_clear_(&map->map, free_entries ? 0 : -1); } void *oidmap_get(const struct oidmap *map, const struct object_id *key) diff --git a/patch-ids.c b/patch-ids.c index 12aa6d494b..21973e4933 100644 --- a/patch-ids.c +++ b/patch-ids.c @@ -71,7 +71,7 @@ int init_patch_ids(struct repository *r, struct patch_ids *ids) int free_patch_ids(struct patch_ids *ids) { - hashmap_free_entries(&ids->patches, struct patch_id, ent); + hashmap_clear_and_free(&ids->patches, struct patch_id, ent); return 0; } diff --git a/range-diff.c b/range-diff.c index 24dc435e48..befeecae44 100644 --- a/range-diff.c +++ b/range-diff.c @@ -266,7 +266,7 @@ static void find_exact_matches(struct string_list *a, struct string_list *b) } } - hashmap_free(&map); + hashmap_clear(&map); } static void diffsize_consume(void *data, char *line, unsigned long len) diff --git a/ref-filter.c b/ref-filter.c index c62f6b4822..5e66b8cd76 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2222,7 +2222,7 @@ void ref_array_clear(struct ref_array *array) used_atom_cnt = 0; if (ref_to_worktree_map.worktrees) { - hashmap_free_entries(&(ref_to_worktree_map.map), + hashmap_clear_and_free(&(ref_to_worktree_map.map), struct ref_to_worktree_entry, ent); free_worktrees(ref_to_worktree_map.worktrees); ref_to_worktree_map.worktrees = NULL; diff --git a/revision.c b/revision.c index aa62212040..f27649d45d 100644 --- a/revision.c +++ b/revision.c @@ -139,7 +139,7 @@ static void paths_and_oids_clear(struct hashmap *map) free(entry->path); } - hashmap_free_entries(map, struct path_and_oids_entry, ent); + hashmap_clear_and_free(map, struct path_and_oids_entry, ent); } static void paths_and_oids_insert(struct hashmap *map, diff --git a/sequencer.c b/sequencer.c index 00acb12496..23a09c3e7a 100644 --- a/sequencer.c +++ b/sequencer.c @@ -5058,7 +5058,7 @@ static int make_script_with_merges(struct pretty_print_context *pp, oidmap_free(&commit2todo, 1); oidmap_free(&state.commit2label, 1); - hashmap_free_entries(&state.labels, struct labels_entry, entry); + hashmap_clear_and_free(&state.labels, struct labels_entry, entry); strbuf_release(&state.buf); return 0; @@ -5577,7 +5577,7 @@ int todo_list_rearrange_squash(struct todo_list *todo_list) for (i = 0; i < todo_list->nr; i++) free(subjects[i]); free(subjects); - hashmap_free_entries(&subject2item, struct subject2item_entry, entry); + hashmap_clear_and_free(&subject2item, struct subject2item_entry, entry); clear_commit_todo_item(&commit_todo); diff --git a/submodule-config.c b/submodule-config.c index c569e22aa3..f502505566 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -103,8 +103,8 @@ static void submodule_cache_clear(struct submodule_cache *cache) ent /* member name */) free_one_config(entry); - hashmap_free_entries(&cache->for_path, struct submodule_entry, ent); - hashmap_free_entries(&cache->for_name, struct submodule_entry, ent); + hashmap_clear_and_free(&cache->for_path, struct submodule_entry, ent); + hashmap_clear_and_free(&cache->for_name, struct submodule_entry, ent); cache->initialized = 0; cache->gitmodules_read = 0; } diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c index f38706216f..2475663b49 100644 --- a/t/helper/test-hashmap.c +++ b/t/helper/test-hashmap.c @@ -110,7 +110,7 @@ static void perf_hashmap(unsigned int method, unsigned int rounds) hashmap_add(&map, &entries[i]->ent); } - hashmap_free(&map); + hashmap_clear(&map); } } else { /* test map lookups */ @@ -130,7 +130,7 @@ static void perf_hashmap(unsigned int method, unsigned int rounds) } } - hashmap_free(&map); + hashmap_clear(&map); } } @@ -262,6 +262,6 @@ int cmd__hashmap(int argc, const char **argv) } strbuf_release(&line); - hashmap_free_entries(&map, struct test_entry, ent); + hashmap_clear_and_free(&map, struct test_entry, ent); return 0; } From patchwork Mon Nov 2 18:55:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E62B2C56202 for ; Mon, 2 Nov 2020 18:55:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 93F7B2225E for ; Mon, 2 Nov 2020 18:55:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="M72sFb/7" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726594AbgKBSzh (ORCPT ); Mon, 2 Nov 2020 13:55:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726236AbgKBSzW (ORCPT ); Mon, 2 Nov 2020 13:55:22 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29183C061A04 for ; Mon, 2 Nov 2020 10:55:22 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id x7so15867288wrl.3 for ; Mon, 02 Nov 2020 10:55:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=oEYc2S4ksSfmtDUY2461bdvZotWwPmh9voOUaSelCBs=; b=M72sFb/77Y6ZmuFVCCyRmF7XLt+hfA433cxJejnib0s7C+QarbDRlh0L0ldWXElXC4 KJF0H00+Zwv0H5zvjeJdB4/PpspyUioOs9aNXKvqXok8uK3U7OwZ+XzJBq/D2nrQuNqN T/YywIpOf55548XxLSrEhIE2an8w4IOxpAWKLHhyEGIkEarw2fBJWWW7aMm4yZFh0jhb 9rXlzi82LPKMVcuAA3kO9YRVwvF3kdLoyltV3JNmHnaKp7BOUrlKqbEmtYfTcM7NbQlu phONxcUyxFetEB4UetBFuPvgO8BNYdRnd5wIkdCiALCVRy1sgy4R1aFMs8lrL6bp5Jkn gf0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=oEYc2S4ksSfmtDUY2461bdvZotWwPmh9voOUaSelCBs=; b=A+DCRQogq7xvYLoYHTIxAT3iYaKj9WlKaUxRYUfMewRoA9LS7pC3yeL03EdWah5XOt krgWIZupHlWmLijpxvqsrHQotNVSH1nXkkLcmQa4Z4j+qd9xgNp/zGETPflkopDKdGYN Q59U1ADvRT0uqqKcDcXwVI2/L7wU4zngPWxSy8MExyWdDnKfRyRZaDeVceWHdk2UROng oNT+HZ2ATQz19L1/+2dvYUeEMPk8Sm2Oj+NF2GivFBarew9uR8wrmAD+IgheH0JGRsJm qerAg0T4lnrIzo9QNYaQ0glD8x/PwFoARxbJ+GStcVR+e4W+OJvzp50u6+7ECmFtf102 IQzQ== X-Gm-Message-State: AOAM532cZG9ZLK/sEyvakXwYAopId3ds86R4njdjcTe2Z2Eq2SAh2zJu EZiM14BsQF6Cd1IsgP5Gd+g1Ruy/6XM= X-Google-Smtp-Source: ABdhPJxO1b2vd44mPbc0KsbUBWoruDbCjPFGwr7PP4Z01k3ns+IYFzyD76bDTSsf0Fi9/9QvByxmCQ== X-Received: by 2002:a05:6000:1050:: with SMTP id c16mr21950966wrx.400.1604343320655; Mon, 02 Nov 2020 10:55:20 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n8sm377160wmc.11.2020.11.02.10.55.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:20 -0800 (PST) Message-Id: <448d3b219ffebbc0daa4ef033d78fd45693c5ccd.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:06 +0000 Subject: [PATCH v3 06/13] strmap: new utility functions Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Add strmap as a new struct and associated utility functions, specifically for hashmaps that map strings to some value. The API is taken directly from Peff's proposal at https://lore.kernel.org/git/20180906191203.GA26184@sigill.intra.peff.net/ Note that similar string-list, I have a strdup_strings setting. However, unlike string-list, strmap_init() does not take a parameter for this setting and instead automatically sets it to 1; callers who want to control this detail need to instead call strmap_init_with_options(). (Future patches will add additional parameters to strmap_init_with_options()). Signed-off-by: Elijah Newren --- Makefile | 1 + strmap.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ strmap.h | 65 +++++++++++++++++++++++++++++++++++++ 3 files changed, 165 insertions(+) create mode 100644 strmap.c create mode 100644 strmap.h diff --git a/Makefile b/Makefile index 95571ee3fc..777a34c01c 100644 --- a/Makefile +++ b/Makefile @@ -1000,6 +1000,7 @@ LIB_OBJS += stable-qsort.o LIB_OBJS += strbuf.o LIB_OBJS += streaming.o LIB_OBJS += string-list.o +LIB_OBJS += strmap.o LIB_OBJS += strvec.o LIB_OBJS += sub-process.o LIB_OBJS += submodule-config.o diff --git a/strmap.c b/strmap.c new file mode 100644 index 0000000000..53f284eb20 --- /dev/null +++ b/strmap.c @@ -0,0 +1,99 @@ +#include "git-compat-util.h" +#include "strmap.h" + +int cmp_strmap_entry(const void *hashmap_cmp_fn_data, + const struct hashmap_entry *entry1, + const struct hashmap_entry *entry2, + const void *keydata) +{ + const struct strmap_entry *e1, *e2; + + e1 = container_of(entry1, const struct strmap_entry, ent); + e2 = container_of(entry2, const struct strmap_entry, ent); + return strcmp(e1->key, e2->key); +} + +static struct strmap_entry *find_strmap_entry(struct strmap *map, + const char *str) +{ + struct strmap_entry entry; + hashmap_entry_init(&entry.ent, strhash(str)); + entry.key = str; + return hashmap_get_entry(&map->map, &entry, ent, NULL); +} + +void strmap_init(struct strmap *map) +{ + strmap_init_with_options(map, 1); +} + +void strmap_init_with_options(struct strmap *map, + int strdup_strings) +{ + hashmap_init(&map->map, cmp_strmap_entry, NULL, 0); + map->strdup_strings = strdup_strings; +} + +static void strmap_free_entries_(struct strmap *map, int free_values) +{ + struct hashmap_iter iter; + struct strmap_entry *e; + + if (!map) + return; + + /* + * We need to iterate over the hashmap entries and free + * e->key and e->value ourselves; hashmap has no API to + * take care of that for us. Since we're already iterating over + * the hashmap, though, might as well free e too and avoid the need + * to make some call into the hashmap API to do that. + */ + hashmap_for_each_entry(&map->map, &iter, e, ent) { + if (free_values) + free(e->value); + if (map->strdup_strings) + free((char*)e->key); + free(e); + } +} + +void strmap_clear(struct strmap *map, int free_values) +{ + strmap_free_entries_(map, free_values); + hashmap_clear(&map->map); +} + +void *strmap_put(struct strmap *map, const char *str, void *data) +{ + struct strmap_entry *entry = find_strmap_entry(map, str); + void *old = NULL; + + if (entry) { + old = entry->value; + entry->value = data; + } else { + const char *key = str; + + entry = xmalloc(sizeof(*entry)); + hashmap_entry_init(&entry->ent, strhash(str)); + + if (map->strdup_strings) + key = xstrdup(str); + entry->key = key; + entry->value = data; + hashmap_add(&map->map, &entry->ent); + } + return old; +} + +void *strmap_get(struct strmap *map, const char *str) +{ + struct strmap_entry *entry = find_strmap_entry(map, str); + return entry ? entry->value : NULL; +} + +int strmap_contains(struct strmap *map, const char *str) +{ + return find_strmap_entry(map, str) != NULL; +} diff --git a/strmap.h b/strmap.h new file mode 100644 index 0000000000..96888c23ad --- /dev/null +++ b/strmap.h @@ -0,0 +1,65 @@ +#ifndef STRMAP_H +#define STRMAP_H + +#include "hashmap.h" + +struct strmap { + struct hashmap map; + unsigned int strdup_strings:1; +}; + +struct strmap_entry { + struct hashmap_entry ent; + const char *key; + void *value; +}; + +int cmp_strmap_entry(const void *hashmap_cmp_fn_data, + const struct hashmap_entry *entry1, + const struct hashmap_entry *entry2, + const void *keydata); + +#define STRMAP_INIT { \ + .map = HASHMAP_INIT(cmp_strmap_entry, NULL), \ + .strdup_strings = 1, \ + } + +/* + * Initialize the members of the strmap. Any keys added to the strmap will + * be strdup'ed with their memory managed by the strmap. + */ +void strmap_init(struct strmap *map); + +/* + * Same as strmap_init, but for those who want to control the memory management + * carefully instead of using the default of strdup_strings=1. + */ +void strmap_init_with_options(struct strmap *map, + int strdup_strings); + +/* + * Remove all entries from the map, releasing any allocated resources. + */ +void strmap_clear(struct strmap *map, int free_values); + +/* + * Insert "str" into the map, pointing to "data". + * + * If an entry for "str" already exists, its data pointer is overwritten, and + * the original data pointer returned. Otherwise, returns NULL. + */ +void *strmap_put(struct strmap *map, const char *str, void *data); + +/* + * Return the data pointer mapped by "str", or NULL if the entry does not + * exist. + */ +void *strmap_get(struct strmap *map, const char *str); + +/* + * Return non-zero iff "str" is present in the map. This differs from + * strmap_get() in that it can distinguish entries with a NULL data pointer. + */ +int strmap_contains(struct strmap *map, const char *str); + +#endif /* STRMAP_H */ From patchwork Mon Nov 2 18:55:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874981 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1772AC00A89 for ; Mon, 2 Nov 2020 18:55:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B18CC22268 for ; Mon, 2 Nov 2020 18:55:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="sJ/OtHZc" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726586AbgKBSze (ORCPT ); Mon, 2 Nov 2020 13:55:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726423AbgKBSzX (ORCPT ); Mon, 2 Nov 2020 13:55:23 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC280C061A47 for ; Mon, 2 Nov 2020 10:55:22 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id i16so10430444wrv.1 for ; Mon, 02 Nov 2020 10:55:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=wCvsDyKAwt2mhxtzknMOHiGx7z0ADi24Dwp/8iGpbIo=; b=sJ/OtHZcFIFTq4oPl92U5PGf1dZPFzEfdxnJst5IJ9KeH7C3Ku2nJg/kglLseOJA3r tVtDiP8D47MY09W0F9VLQsLTJRaPDvt/AbAc4ILoBJ0lEIbU1lAwndlWQKxnvqD1pHN6 hO/tCv1PLpNsxjyAGoDnxUAOfgnsMXb5r4QZzYWSRI5xgDapu4pw02qPzCrBp4PFR3UK XI2MaftEwdPvxfH/eQTrXJDK1Q6tuALxtx1RpqGkxJU1pkm7ejMZPk8iNhpRKFEqb9Uu o67t6WAaSk1EiU0qC4cnOjEJ3VWIjOPuVghUAVgjNBxAafAXM26U1K4x086W7ZW19bca GFJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=wCvsDyKAwt2mhxtzknMOHiGx7z0ADi24Dwp/8iGpbIo=; b=HrQSrRfYZZY8gFRAGyg+6L7qKkRG3X/tbvmbDML+RrT0pbhU1W5y3nIQDNgFl0mR3e m2iEm1zsNcgfDYNtqhJkqaz/DzlV3NIYmliT0sogSjT1SY6XFIWxGhznHqewzf6EZUns V0JeZQC/9J5+rAFMgeRO+/HtTQMWTRJXth3qWuiM/MQNLGnIYT8i6udZzjD/dKHf9lX2 OWc/TsKL5G5Ry4Zjly/Rw35mZYtGAi+UiLN5ADnZ9EsAuR8cCBB/XSjMt3b/jiLR0smE q7rjOqc/7Fli1+xZAnX/IzONWz/27lNfVv2iS9cDL02EZK+oMhXccWXvcmFRLVvNGee4 heYA== X-Gm-Message-State: AOAM531gfzP9hTC8uGskfBQyiJAN5+9TjkkS+04USFIUbU1QPnPM3XIh BoZDCJzhy58yTUvkxsNJE9zvHN2ExF4= X-Google-Smtp-Source: ABdhPJy+i2P9pGJRABipou9/0Vf/yHyEDIjnP+W6JyvHrIlcQ/ZjMQMb5u4sZeomJvWA0+GuDzBHSA== X-Received: by 2002:adf:cd01:: with SMTP id w1mr21914575wrm.298.1604343321449; Mon, 02 Nov 2020 10:55:21 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d2sm23549565wrq.34.2020.11.02.10.55.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:20 -0800 (PST) Message-Id: <42633b8d03008a159bc42bde319f50e87ddb00f6.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:07 +0000 Subject: [PATCH v3 07/13] strmap: add more utility functions Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren This adds a number of additional convienence functions I want/need: * strmap_get_size() * strmap_empty() * strmap_remove() * strmap_for_each_entry() * strmap_get_entry() I suspect the first four are self-explanatory. strmap_get_entry() is similar to strmap_get() except that instead of just returning the void* value that the string maps to, it returns the strmap_entry that contains both the string and the void* value (or NULL if the string isn't in the map). This is helpful because it avoids multiple lookups, e.g. in some cases a caller would need to call: * strmap_contains() to check that the map has an entry for the string * strmap_get() to get the void* value * * strmap_put() to update/overwrite the value If the void* pointer returned really is a pointer, then the last step is unnecessary, but if the void* pointer is just cast to an integer then strmap_put() will be needed. In contrast, one can call strmap_get_entry() and then: * check if the string was in the map by whether the pointer is NULL * access the value via entry->value * directly update entry->value meaning that we can replace two or three hash table lookups with one. Signed-off-by: Elijah Newren --- strmap.c | 20 ++++++++++++++++++++ strmap.h | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+) diff --git a/strmap.c b/strmap.c index 53f284eb20..829f1bc095 100644 --- a/strmap.c +++ b/strmap.c @@ -87,6 +87,11 @@ void *strmap_put(struct strmap *map, const char *str, void *data) return old; } +struct strmap_entry *strmap_get_entry(struct strmap *map, const char *str) +{ + return find_strmap_entry(map, str); +} + void *strmap_get(struct strmap *map, const char *str) { struct strmap_entry *entry = find_strmap_entry(map, str); @@ -97,3 +102,18 @@ int strmap_contains(struct strmap *map, const char *str) { return find_strmap_entry(map, str) != NULL; } + +void strmap_remove(struct strmap *map, const char *str, int free_value) +{ + struct strmap_entry entry, *ret; + hashmap_entry_init(&entry.ent, strhash(str)); + entry.key = str; + ret = hashmap_remove_entry(&map->map, &entry, ent, NULL); + if (!ret) + return; + if (free_value) + free(ret->value); + if (map->strdup_strings) + free((char*)ret->key); + free(ret); +} diff --git a/strmap.h b/strmap.h index 96888c23ad..ee4307cca5 100644 --- a/strmap.h +++ b/strmap.h @@ -50,6 +50,12 @@ void strmap_clear(struct strmap *map, int free_values); */ void *strmap_put(struct strmap *map, const char *str, void *data); +/* + * Return the strmap_entry mapped by "str", or NULL if there is not such + * an item in map. + */ +struct strmap_entry *strmap_get_entry(struct strmap *map, const char *str); + /* * Return the data pointer mapped by "str", or NULL if the entry does not * exist. @@ -62,4 +68,34 @@ void *strmap_get(struct strmap *map, const char *str); */ int strmap_contains(struct strmap *map, const char *str); +/* + * Remove the given entry from the strmap. If the string isn't in the + * strmap, the map is not altered. + */ +void strmap_remove(struct strmap *map, const char *str, int free_value); + +/* + * Return how many entries the strmap has. + */ +static inline unsigned int strmap_get_size(struct strmap *map) +{ + return hashmap_get_size(&map->map); +} + +/* + * Return whether the strmap is empty. + */ +static inline int strmap_empty(struct strmap *map) +{ + return strmap_get_size(map) == 0; +} + +/* + * iterate through @map using @iter, @var is a pointer to a type strmap_entry + */ +#define strmap_for_each_entry(mystrmap, iter, var) \ + for (var = hashmap_iter_first_entry_offset(&(mystrmap)->map, iter, 0); \ + var; \ + var = hashmap_iter_next_entry_offset(iter, 0)) + #endif /* STRMAP_H */ From patchwork Mon Nov 2 18:55:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874989 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68584C56201 for ; Mon, 2 Nov 2020 18:55:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1DAED22268 for ; Mon, 2 Nov 2020 18:55:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ts8scDof" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726588AbgKBSzf (ORCPT ); Mon, 2 Nov 2020 13:55:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726470AbgKBSzX (ORCPT ); Mon, 2 Nov 2020 13:55:23 -0500 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E57FC0617A6 for ; Mon, 2 Nov 2020 10:55:23 -0800 (PST) Received: by mail-wm1-x341.google.com with SMTP id c18so10441558wme.2 for ; Mon, 02 Nov 2020 10:55:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=HHeM+c7NkTWs7nebSEJ3cvO5n6VeizXf/xOxj+EbJxc=; b=ts8scDofaqQ+2sNfBR7EznkK8RKDGWdUPg0wuojl964DQRI4yDerD3xnR7WZFn04XO HPkwkRZTEhRTFoaRji5F3sdrwx0cCV/DzlzEW59Z441IOFfC3EWmqhazMsTE7pj9evyZ AcLK/0dr7ccmrKag8n9YpBcfiA54DNRCF+3j5fEg/zIJdoWEHP1OFkIW4C0ZLCsu8+d/ y93eJEf3tRsqDy2qkZYG0e5YMYMKb7dRmiI5aiLldFPCO2WnJG16ot4nlkz4bj5pqmxl uIKBpVU8xXv+9oAR3LFi5hqsvCC0tIW+AYBcDcekcP5nHB0YKPyw5H2jSuZa3LobWB9B vpmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=HHeM+c7NkTWs7nebSEJ3cvO5n6VeizXf/xOxj+EbJxc=; b=ammSsvm4yqsuxXxNbQMQok/n6ZmK3WYOxMqG/ui8z2rpCHKhxW4qIcwLC+bIdC6vto nepFz4X0rdYJP72Sa0evmi7cqYaLUKTGYn+AfeAUCICIeEv9XuVe1VB0CxxkOR+SDVbc LwzMfmNZ6G4LuLza1PxqjTzEE2koH53mgCrJiWUmTvX9HQY2MPEOf7opVH29O698PisK 4JFQ2gWkwrz9gG27EEzUtGzGHG7IEUMGT3He1UK9aWMRJIyKUIkxcFI3wn0ipCtqRID2 T7m7/Eebf04WYS0TrqcZ18Cox7EKJs4KNfeHNN2lbK/tTdo/9c0DAjtxbFzDasQEVj6k OisA== X-Gm-Message-State: AOAM5303IaEzR1I4EASXN2RtUmKfBajO1Ru9L455mt7Tg8t2v3S+BoMw /rGhUOD65JOGS0BW9NjnVkoFfkP3M9U= X-Google-Smtp-Source: ABdhPJw1rvXCWdveqNEpL56n/CLTJ8Orq9KqCwdZJnIH7hjc+pn+N0kS0gfGWW8+paF41hrPUquk7w== X-Received: by 2002:a1c:e482:: with SMTP id b124mr15764632wmh.25.1604343322283; Mon, 02 Nov 2020 10:55:22 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u10sm23084120wrw.36.2020.11.02.10.55.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:21 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:08 +0000 Subject: [PATCH v3 08/13] strmap: enable faster clearing and reusing of strmaps Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren When strmaps are used heavily, such as is done by my new merge-ort algorithm, and strmaps need to be cleared but then re-used (because of e.g. picking multiple commits to cherry-pick, or due to a recursive merge having several different merges while recursing), free-ing and reallocating map->table repeatedly can add up in time, especially since it will likely be reallocated to a much smaller size but the previous merge provides a good guide to the right size to use for the next merge. Introduce strmap_partial_clear() to take advantage of this type of situation; it will act similar to strmap_clear() except that map->table's entries are zeroed instead of map->table being free'd. Making use of this function reduced the cost of clear_or_reinit_internal_opts() by about 20% in mert-ort, and dropped the overall runtime of my rebase testcase by just under 2%. Signed-off-by: Elijah Newren --- strmap.c | 6 ++++++ strmap.h | 6 ++++++ 2 files changed, 12 insertions(+) diff --git a/strmap.c b/strmap.c index 829f1bc095..c410c5241a 100644 --- a/strmap.c +++ b/strmap.c @@ -64,6 +64,12 @@ void strmap_clear(struct strmap *map, int free_values) hashmap_clear(&map->map); } +void strmap_partial_clear(struct strmap *map, int free_values) +{ + strmap_free_entries_(map, free_values); + hashmap_partial_clear(&map->map); +} + void *strmap_put(struct strmap *map, const char *str, void *data) { struct strmap_entry *entry = find_strmap_entry(map, str); diff --git a/strmap.h b/strmap.h index ee4307cca5..10b4642860 100644 --- a/strmap.h +++ b/strmap.h @@ -42,6 +42,12 @@ void strmap_init_with_options(struct strmap *map, */ void strmap_clear(struct strmap *map, int free_values); +/* + * Similar to strmap_clear() but leaves map->map->table allocated and + * pre-sized so that subsequent uses won't need as many rehashings. + */ +void strmap_partial_clear(struct strmap *map, int free_values); + /* * Insert "str" into the map, pointing to "data". * From patchwork Mon Nov 2 18:55:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEC4BC388F9 for ; Mon, 2 Nov 2020 18:55:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5FE172225E for ; Mon, 2 Nov 2020 18:55:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Y7nDxDLE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725833AbgKBSz3 (ORCPT ); Mon, 2 Nov 2020 13:55:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726525AbgKBSz0 (ORCPT ); Mon, 2 Nov 2020 13:55:26 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73D5CC061A48 for ; Mon, 2 Nov 2020 10:55:24 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id 33so5051805wrl.7 for ; Mon, 02 Nov 2020 10:55:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=xYsZusN3B7Ocz/+8CSNj5PLz4D9pdVqEn3wjc3y9pYs=; b=Y7nDxDLEUMUOaEnOo/qQbwopr8/7B+7uTk/yicx4Vx44YMNx5dLcavCbjH7LDvOBad j7y8ZuwG28OaMTcpO85QIFtaxsuEuHY3cT7C4hxmM5z6EBnXPq0YVQThmUb5q1oxhzix AqM46Vn0iLODtmVU1+6uHKEFDTLUws5r6J/zDMAsNt6BtjCO3DZqz+SXhPR99J05Fhtd KGw9rfAQxu9GUU5MUg0QutfLURbZBH2rRUE7AOtODb3NYr3iYlz1bGO00ljpM25BB4HD L/YiTzqYhffu/+1vhJ6tEwSkfTERMeslkGFf54oIEYvz3OENOOr55glDDlL8zMRQxycI 1rIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=xYsZusN3B7Ocz/+8CSNj5PLz4D9pdVqEn3wjc3y9pYs=; b=L43VwzRKq49rrT5kdLhEPs7ISc7G4GtRkSy+1pKN/ErSSrHRYZPI8sxQdhfmG2hnho rj0wMh6CTb/kcx9/2FgO6JhRWsh6eL8FHaoLFE9u/jlFg3UsbuQpH7WlqASCsDAkzwVM T4tiZeykrmnM/vRHjYv4qwr47aJ18qu4HyTwqgognWIS4l+htFty0Xt6TnfLtIjt3kdV vB/J/pOmSVIycW6YAREss4pz/I9wO2WyxoD8Fg4+OluejZJrQqFtOHytDxVwYVkK7YMq By2h5qehE0vr1kLxGtYgbdsK4/b0OibpCAGz7E4oc6wRBBlLlq6BtJZSSpqhnRQDPSFF WdXQ== X-Gm-Message-State: AOAM533HnsFGwuIY2xz3tJe2fX7bOc86VByRvWEFuIURw/sm4qMVqXUg Ij1lLv4eqNlzmfl2yI8bzn0mXS7kJio= X-Google-Smtp-Source: ABdhPJwhGhMSlfPLB2ksqaBxgsxbo8ULF9uG4J/mlOUZj/RHz0ZCwmr772dz5QIYvxxxr1PSx7p07g== X-Received: by 2002:a5d:6783:: with SMTP id v3mr21244478wru.385.1604343323046; Mon, 02 Nov 2020 10:55:23 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o3sm22512394wru.15.2020.11.02.10.55.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:22 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:09 +0000 Subject: [PATCH v3 09/13] strmap: add functions facilitating use as a string->int map Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Although strmap could be used as a string->int map, one either had to allocate an int for every entry and then deallocate later, or one had to do a bunch of casting between (void*) and (intptr_t). Add some special functions that do the casting. Also, rename put->set for such wrapper functions since 'put' implied there may be some deallocation needed if the string was already found in the map, which isn't the case when we're storing an int value directly in the void* slot instead of using the void* slot as a pointer to data. A note on the name: if anyone has a better name suggestion than strintmap, I'm happy to take it. It seems slightly unwieldy, but I have not been able to come up with a better name. Signed-off-by: Elijah Newren --- strmap.c | 11 +++++++ strmap.h | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 107 insertions(+) diff --git a/strmap.c b/strmap.c index c410c5241a..0d10a884b5 100644 --- a/strmap.c +++ b/strmap.c @@ -123,3 +123,14 @@ void strmap_remove(struct strmap *map, const char *str, int free_value) free((char*)ret->key); free(ret); } + +void strintmap_incr(struct strintmap *map, const char *str, intptr_t amt) +{ + struct strmap_entry *entry = find_strmap_entry(&map->map, str); + if (entry) { + intptr_t *whence = (intptr_t*)&entry->value; + *whence += amt; + } + else + strintmap_set(map, str, map->default_value + amt); +} diff --git a/strmap.h b/strmap.h index 10b4642860..31474f781e 100644 --- a/strmap.h +++ b/strmap.h @@ -23,6 +23,11 @@ int cmp_strmap_entry(const void *hashmap_cmp_fn_data, .map = HASHMAP_INIT(cmp_strmap_entry, NULL), \ .strdup_strings = 1, \ } +#define STRINTMAP_INIT { \ + .map.map = HASHMAP_INIT(cmp_strmap_entry, NULL), \ + .map.strdup_strings = 1, \ + .default_value = 0, \ + } /* * Initialize the members of the strmap. Any keys added to the strmap will @@ -104,4 +109,95 @@ static inline int strmap_empty(struct strmap *map) var; \ var = hashmap_iter_next_entry_offset(iter, 0)) + +/* + * strintmap: + * A map of string -> int, typecasting the void* of strmap to an int. + * + * Primary differences: + * 1) Since the void* value is just an int in disguise, there is no value + * to free. (Thus one fewer argument to strintmap_clear) + * 2) strintmap_get() returns an int; it also requires an extra parameter to + * be specified so it knows what value to return if the underlying strmap + * has not key matching the given string. + * 3) No strmap_put() equivalent; strintmap_set() and strintmap_incr() + * instead. + */ + +struct strintmap { + struct strmap map; + int default_value; +}; + +#define strintmap_for_each_entry(mystrmap, iter, var) \ + strmap_for_each_entry(&(mystrmap)->map, iter, var) + +static inline void strintmap_init(struct strintmap *map, int default_value) +{ + strmap_init(&map->map); + map->default_value = default_value; +} + +static inline void strintmap_init_with_options(struct strintmap *map, + int default_value, + int strdup_strings) +{ + strmap_init_with_options(&map->map, strdup_strings); + map->default_value = default_value; +} + +static inline void strintmap_clear(struct strintmap *map) +{ + strmap_clear(&map->map, 0); +} + +static inline void strintmap_partial_clear(struct strintmap *map) +{ + strmap_partial_clear(&map->map, 0); +} + +static inline int strintmap_contains(struct strintmap *map, const char *str) +{ + return strmap_contains(&map->map, str); +} + +static inline void strintmap_remove(struct strintmap *map, const char *str) +{ + return strmap_remove(&map->map, str, 0); +} + +static inline int strintmap_empty(struct strintmap *map) +{ + return strmap_empty(&map->map); +} + +static inline unsigned int strintmap_get_size(struct strintmap *map) +{ + return strmap_get_size(&map->map); +} + +/* + * Returns the value for str in the map. If str isn't found in the map, + * the map's default_value is returned. + */ +static inline int strintmap_get(struct strintmap *map, const char *str) +{ + struct strmap_entry *result = strmap_get_entry(&map->map, str); + if (!result) + return map->default_value; + return (intptr_t)result->value; +} + +static inline void strintmap_set(struct strintmap *map, const char *str, + intptr_t v) +{ + strmap_put(&map->map, str, (void *)v); +} + +/* + * Increment the value for str by amt. If str isn't in the map, add it and + * set its value to default_value + amt. + */ +void strintmap_incr(struct strintmap *map, const char *str, intptr_t amt); + #endif /* STRMAP_H */ From patchwork Mon Nov 2 18:55:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874991 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 353A9C55179 for ; Mon, 2 Nov 2020 18:55:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D339B2225E for ; Mon, 2 Nov 2020 18:55:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="TZnCbySF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726579AbgKBSzb (ORCPT ); Mon, 2 Nov 2020 13:55:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726503AbgKBSz0 (ORCPT ); Mon, 2 Nov 2020 13:55:26 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 588B3C061A49 for ; Mon, 2 Nov 2020 10:55:25 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id 23so1640615wmg.1 for ; Mon, 02 Nov 2020 10:55:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=AkbmOxlqD+5g+uBtBbf9cb6JjA2jR78TMumBFI/Q+i8=; b=TZnCbySFS6+c5zI9hX+B218h0YeKM7bhRqTNj5uI+oM5dhgsBmzgdym5Yhfl78vJ9Y jRL+WtsQkDjpjeIXheKZXMlLNY7YGdMLDo8RE5B1OfZ4SAMU8pEHP7EPQkZycM5Ufeqj l8B8GLCV7ffIRCRmT5ZZccT46VpOjEzeLZvRGv3nVKbFTXzYUTRP7PcxOlQVENHZ6azK DGEMEjHhr13+lppnKf4BMXBUgx3en4Ua07Pu8Lq/UIef9RFRJGQ8hy6eKGTTBdunbl86 Q634IjQYIuQZ0oS3FqJZvtq6vuqZD8DCU3GwLKgH2bIQjdBBSnqP9U/i2BbIRRHv1cgj oENA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=AkbmOxlqD+5g+uBtBbf9cb6JjA2jR78TMumBFI/Q+i8=; b=FX9mLKbasoxeWT9aqF32uUIiwnp28HevLO6uO5dAR8BPbirT03Dr7VryPEfmDeLTv/ K7GmYYOaoDjB96gzKlFChVwOIwpV1FHFMte8RQX2LAMmiL8mKWU/p94mstc+zOgdKyIm zk2YdGOVlYpJZ0YYNpIC4fNFoVoKNSaD/eBtkVAGbLNOxLBKOiM0+J6pWKM4dBnfsMub gMnnXpkplri548caN4GT6AygM6ZyE2gauzVZ6tQfo18r5MM9c75Towu8kt4Sn7TFidZa bpl8IzSWHjLAs9zDxcuZLjZs5c+YE/PoEXerCimSTnZP+hWKybDTPhSzA7TSY63KIMjZ PeiQ== X-Gm-Message-State: AOAM532p0qNWwuyTYhWmznSBmJs9ehH3wYUra6+QzAtsQV+LJvCHTUyw HSzhuxF/cIkPQoLgW56a0PpU9zi4O7E= X-Google-Smtp-Source: ABdhPJyOlS57o4XlQpLa4oUk7wN4tBm36O5YjaQ5aYQYLDMOmqg3pzX/PPGsUipxTTbWjPNjxsWaJg== X-Received: by 2002:a1c:6843:: with SMTP id d64mr19907220wmc.131.1604343323915; Mon, 02 Nov 2020 10:55:23 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x7sm20622317wrt.78.2020.11.02.10.55.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:23 -0800 (PST) Message-Id: <0f57735f5e30ad61a2e6fdb118067afbcea69660.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:10 +0000 Subject: [PATCH v3 10/13] strmap: add a strset sub-type Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Similar to adding strintmap for special-casing a string -> int mapping, add a strset type for cases where we really are only interested in using strmap for storing a set rather than a mapping. In this case, we'll always just store NULL for the value but the different struct type makes it clearer than code comments how a variable is intended to be used. The difference in usage also results in some differences in API: a few things that aren't necessary or meaningful are dropped (namely, the free_values argument to *_clear(), and the *_get() function), and strset_add() is chosen as the API instead of strset_put(). Finally, shortlog already had a more minimal strset API; so this adds a strset_check_and_add() function for its benefit to allow it to switch over to this strset implementation. Signed-off-by: Elijah Newren --- strmap.c | 8 +++++++ strmap.h | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 79 insertions(+) diff --git a/strmap.c b/strmap.c index 0d10a884b5..2aff985f40 100644 --- a/strmap.c +++ b/strmap.c @@ -134,3 +134,11 @@ void strintmap_incr(struct strintmap *map, const char *str, intptr_t amt) else strintmap_set(map, str, map->default_value + amt); } + +int strset_check_and_add(struct strset *set, const char *str) +{ + if (strset_contains(set, str)) + return 1; + strset_add(set, str); + return 0; +} diff --git a/strmap.h b/strmap.h index 31474f781e..fca1e9f639 100644 --- a/strmap.h +++ b/strmap.h @@ -28,6 +28,10 @@ int cmp_strmap_entry(const void *hashmap_cmp_fn_data, .map.strdup_strings = 1, \ .default_value = 0, \ } +#define STRSET_INIT { \ + .map.map = HASHMAP_INIT(cmp_strmap_entry, NULL), \ + .map.strdup_strings = 1, \ + } /* * Initialize the members of the strmap. Any keys added to the strmap will @@ -200,4 +204,71 @@ static inline void strintmap_set(struct strintmap *map, const char *str, */ void strintmap_incr(struct strintmap *map, const char *str, intptr_t amt); +/* + * strset: + * A set of strings. + * + * Primary differences with strmap: + * 1) The value is always NULL, and ignored. As there is no value to free, + * there is one fewer argument to strset_clear + * 2) No strset_get() because there is no value. + * 3) No strset_put(); use strset_add() instead. + */ + +struct strset { + struct strmap map; +}; + +#define strset_for_each_entry(mystrset, iter, var) \ + strmap_for_each_entry(&(mystrset)->map, iter, var) + +static inline void strset_init(struct strset *set) +{ + strmap_init(&set->map); +} + +static inline void strset_init_with_options(struct strset *set, + int strdup_strings) +{ + strmap_init_with_options(&set->map, strdup_strings); +} + +static inline void strset_clear(struct strset *set) +{ + strmap_clear(&set->map, 0); +} + +static inline void strset_partial_clear(struct strset *set) +{ + strmap_partial_clear(&set->map, 0); +} + +static inline int strset_contains(struct strset *set, const char *str) +{ + return strmap_contains(&set->map, str); +} + +static inline void strset_remove(struct strset *set, const char *str) +{ + return strmap_remove(&set->map, str, 0); +} + +static inline int strset_empty(struct strset *set) +{ + return strmap_empty(&set->map); +} + +static inline unsigned int strset_get_size(struct strset *set) +{ + return strmap_get_size(&set->map); +} + +static inline void strset_add(struct strset *set, const char *str) +{ + strmap_put(&set->map, str, NULL); +} + +/* Returns 1 if str already in set. Otherwise adds str to set and returns 0 */ +int strset_check_and_add(struct strset *set, const char *str); + #endif /* STRMAP_H */ From patchwork Mon Nov 2 18:55:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93D49C4742C for ; Mon, 2 Nov 2020 18:55:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3046E22268 for ; Mon, 2 Nov 2020 18:55:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="I7iVWWdZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726573AbgKBSza (ORCPT ); Mon, 2 Nov 2020 13:55:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34338 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726512AbgKBSz0 (ORCPT ); Mon, 2 Nov 2020 13:55:26 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34CEEC061A4A for ; Mon, 2 Nov 2020 10:55:26 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id y12so15836905wrp.6 for ; Mon, 02 Nov 2020 10:55:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ajkc8/WOW9jIlOT43OaYG5Cl5zTQVr0y/ErISb/22MA=; b=I7iVWWdZV4CmCTVvnTQ+xrBigvi73YenJgI+zCG6A3g+0TqZ1myLphjHA1FWMKMW62 uzaoOysdCWHOlksfIjGPZHn8HzVH1t+8LKLgRT1veZYJEYBGukcYy/bWLvdXtFX6KLBm nJtyYZQXGhiW3GhxdJPzdqjtKNYnUgEIKQnaRZnHl3giWqxAgdwlyQVoFa6LaaItLZ23 3UIqf5jvtrCcWC6y383+eeQg/VJEbjdieB8GR6QLgplBruOVEe29NLYHyjDsEZbl+0su ypRCQGWnMxQ1qSvuw/aIjVqN5nW8+iQrxOiaKVQ3MaJ6mXh4YW6jsmF7NJW26LrdBHZo jIVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ajkc8/WOW9jIlOT43OaYG5Cl5zTQVr0y/ErISb/22MA=; b=FJaXxyGkevFvGEJuv7C57lg0YZ6RFjWHQBd2EAWxxzvSy2NUrY6c5YMckuDjSgepFa LBuVbkB3rGw+zOjJ8mx5eVPjr8rCJzkOxWmzPNsZ2w/00fcM1SdSPT25Ia6fQ4by1DM6 ItDkuJwyGayDGI/R2iC1kODUntUAtKjl6ikABW6bvO7SpybbdgM9XncpgU/cCXnLC7LF /won7xfMYZKFRT03Ar4Hd/gLgeKIvtPdVe6NXq4FKpqjoxI3Peid0w6Zx4LMg2WlI3dw 1l4HsLgcvlwo9McdOfTNgUuyosNT7ihQJydWqMuQXUVc5DhcZlWEYRdcGjFrAoPCzZHP 0jSg== X-Gm-Message-State: AOAM53336idTPOeKKklJ4x1VTAKGUZgOkhTIzeteKcSxpxryy0x43bAV j98n6V9mnMjX6hmzs9geidGx7xQE9Qo= X-Google-Smtp-Source: ABdhPJxUWEkqv5Le40NqxKbF3VehPF9tV0zq30wljFQLU82W0+Y5+AJMinLTfs9XXkAuZ0MsApAcUw== X-Received: by 2002:a5d:4d0d:: with SMTP id z13mr20947432wrt.23.1604343324736; Mon, 02 Nov 2020 10:55:24 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d134sm358293wmd.8.2020.11.02.10.55.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:24 -0800 (PST) Message-Id: <980537e877a3e690d451a574593b26f827b0d33d.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:11 +0000 Subject: [PATCH v3 11/13] strmap: enable allocations to come from a mem_pool Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren For heavy users of strmaps, allowing the keys and entries to be allocated from a memory pool can provide significant overhead savings. Add an option to strmap_init_with_options() to specify a memory pool. Signed-off-by: Elijah Newren --- strmap.c | 31 ++++++++++++++++++++++--------- strmap.h | 11 ++++++++--- 2 files changed, 30 insertions(+), 12 deletions(-) diff --git a/strmap.c b/strmap.c index 2aff985f40..34bca92522 100644 --- a/strmap.c +++ b/strmap.c @@ -1,5 +1,6 @@ #include "git-compat-util.h" #include "strmap.h" +#include "mem-pool.h" int cmp_strmap_entry(const void *hashmap_cmp_fn_data, const struct hashmap_entry *entry1, @@ -24,13 +25,15 @@ static struct strmap_entry *find_strmap_entry(struct strmap *map, void strmap_init(struct strmap *map) { - strmap_init_with_options(map, 1); + strmap_init_with_options(map, NULL, 1); } void strmap_init_with_options(struct strmap *map, + struct mem_pool *pool, int strdup_strings) { hashmap_init(&map->map, cmp_strmap_entry, NULL, 0); + map->pool = pool; map->strdup_strings = strdup_strings; } @@ -42,6 +45,10 @@ static void strmap_free_entries_(struct strmap *map, int free_values) if (!map) return; + if (!free_values && map->pool) + /* Memory other than util is owned by and freed with the pool */ + return; + /* * We need to iterate over the hashmap entries and free * e->key and e->value ourselves; hashmap has no API to @@ -52,9 +59,11 @@ static void strmap_free_entries_(struct strmap *map, int free_values) hashmap_for_each_entry(&map->map, &iter, e, ent) { if (free_values) free(e->value); - if (map->strdup_strings) - free((char*)e->key); - free(e); + if (!map->pool) { + if (map->strdup_strings) + free((char*)e->key); + free(e); + } } } @@ -81,11 +90,13 @@ void *strmap_put(struct strmap *map, const char *str, void *data) } else { const char *key = str; - entry = xmalloc(sizeof(*entry)); + entry = map->pool ? mem_pool_alloc(map->pool, sizeof(*entry)) + : xmalloc(sizeof(*entry)); hashmap_entry_init(&entry->ent, strhash(str)); if (map->strdup_strings) - key = xstrdup(str); + key = map->pool ? mem_pool_strdup(map->pool, str) + : xstrdup(str); entry->key = key; entry->value = data; hashmap_add(&map->map, &entry->ent); @@ -119,9 +130,11 @@ void strmap_remove(struct strmap *map, const char *str, int free_value) return; if (free_value) free(ret->value); - if (map->strdup_strings) - free((char*)ret->key); - free(ret); + if (!map->pool) { + if (map->strdup_strings) + free((char*)ret->key); + free(ret); + } } void strintmap_incr(struct strintmap *map, const char *str, intptr_t amt) diff --git a/strmap.h b/strmap.h index fca1e9f639..6ffa6afb6a 100644 --- a/strmap.h +++ b/strmap.h @@ -3,8 +3,10 @@ #include "hashmap.h" +struct mempool; struct strmap { struct hashmap map; + struct mem_pool *pool; unsigned int strdup_strings:1; }; @@ -41,9 +43,10 @@ void strmap_init(struct strmap *map); /* * Same as strmap_init, but for those who want to control the memory management - * carefully instead of using the default of strdup_strings=1. + * carefully instead of using the default of strdup_strings=1 and pool=NULL. */ void strmap_init_with_options(struct strmap *map, + struct mem_pool *pool, int strdup_strings); /* @@ -144,9 +147,10 @@ static inline void strintmap_init(struct strintmap *map, int default_value) static inline void strintmap_init_with_options(struct strintmap *map, int default_value, + struct mem_pool *pool, int strdup_strings) { - strmap_init_with_options(&map->map, strdup_strings); + strmap_init_with_options(&map->map, pool, strdup_strings); map->default_value = default_value; } @@ -228,9 +232,10 @@ static inline void strset_init(struct strset *set) } static inline void strset_init_with_options(struct strset *set, + struct mem_pool *pool, int strdup_strings) { - strmap_init_with_options(&set->map, strdup_strings); + strmap_init_with_options(&set->map, pool, strdup_strings); } static inline void strset_clear(struct strset *set) From patchwork Mon Nov 2 18:55:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874995 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03A65C388F9 for ; Mon, 2 Nov 2020 18:55:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AB8F122268 for ; Mon, 2 Nov 2020 18:55:40 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SQHByRsc" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726600AbgKBSzj (ORCPT ); Mon, 2 Nov 2020 13:55:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725801AbgKBSz3 (ORCPT ); Mon, 2 Nov 2020 13:55:29 -0500 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1AFE7C061A4B for ; Mon, 2 Nov 2020 10:55:27 -0800 (PST) Received: by mail-wr1-x42c.google.com with SMTP id w1so15868549wrm.4 for ; Mon, 02 Nov 2020 10:55:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Oxdzbg13YO8U12UWlzUNnE5Q1NvcmWK9n+pCvBDtaLE=; b=SQHByRsc7tESNX590LsRxOQgAUY1klxKjdNYpK3e2OUMfdE65VjAWbxvipP7ZleaGA RQngreGCXCfH0eomBJBhDgMyGm4+L9CU42WW1m0TrestPQ1fEPKXSiH88i1bkKC0jxtY UeB+rIBaLclE+iwLD5YuUhaVf0vXc7lgpTE+35fkyZ9+mzwVIhjH98p0v8XittNSLPkW MgvWLvB+FHjNNq6lepF3Duzb0PBqUg5zYVuyexLyLVepKki7Q3A4PR5RbVyuQGhR9jlI yIeRAbgNDmpw9Ya1N0inkyNaY1cFOkUiGgmOWR+dhflhp2aAINpQWDbU9Hr4IcbampOe PSbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Oxdzbg13YO8U12UWlzUNnE5Q1NvcmWK9n+pCvBDtaLE=; b=qYm2VwA4c6C639/RfWsQtrVygr+uOgdCacjoGdRlw7BE0eNQg8EePOs89T4QLuh1pG 4PY09anqNQkjwfQTr0tqsWHNKvHrdbb5BsMoTXvD9qPq/Cr6do32pA4E7lUBOyV35G8A DmCVB+PZPqLDFxVSbXSN7YMztmtfjW26fD4czd7xB4DOgtq22EESumBVuffgqPq1LoBh XPVqgvJy5Vm7Z9eo0t9G+nIdDoM5HWVDIBE58nUczxYFCkkgxGvUmG0+KnxwSdjr8HgN EVYpAFdeQgA7k+kyjJmrKMDr1KwLc/6hBpRwnM6vBYH2F9I5TC9oIGEV5hVUTFSqoo8/ dkng== X-Gm-Message-State: AOAM533TTkSc2MVxNRvgzte014+VfgC0TrvpUluurWavI5WGpklcritq m6kKarWWZ9uI05MTYaacGCbMTlzZ+DE= X-Google-Smtp-Source: ABdhPJzvE/sFEwQK1ZA1thimKsUzU965KnGKKtRX/DxYiceM2tZXJcWQB5v9cULqfE4qqa/GrsO7lw== X-Received: by 2002:adf:e610:: with SMTP id p16mr15447335wrm.302.1604343325643; Mon, 02 Nov 2020 10:55:25 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x22sm427042wmj.25.2020.11.02.10.55.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:25 -0800 (PST) Message-Id: <7f93cbb525704c0bd254181082e3ed1a2782a2d2.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:12 +0000 Subject: [PATCH v3 12/13] strmap: take advantage of FLEXPTR_ALLOC_STR when relevant Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren By default, we do not use a mempool and strdup_strings is true; in this case, we can avoid both an extra allocation and an extra free by just over-allocating for the strmap_entry leaving enough space at the end to copy the key. FLEXPTR_ALLOC_STR exists for exactly this purpose, so make use of it. Also, adjust the case when we are using a memory pool and strdup_strings is true to just do one allocation from the memory pool instead of two so that the strmap_clear() and strmap_remove() code can just avoid freeing the key in all cases. Signed-off-by: Elijah Newren --- strmap.c | 35 ++++++++++++++++++----------------- strmap.h | 1 + 2 files changed, 19 insertions(+), 17 deletions(-) diff --git a/strmap.c b/strmap.c index 34bca92522..9abd47fd4b 100644 --- a/strmap.c +++ b/strmap.c @@ -59,11 +59,8 @@ static void strmap_free_entries_(struct strmap *map, int free_values) hashmap_for_each_entry(&map->map, &iter, e, ent) { if (free_values) free(e->value); - if (!map->pool) { - if (map->strdup_strings) - free((char*)e->key); + if (!map->pool) free(e); - } } } @@ -88,16 +85,23 @@ void *strmap_put(struct strmap *map, const char *str, void *data) old = entry->value; entry->value = data; } else { - const char *key = str; - - entry = map->pool ? mem_pool_alloc(map->pool, sizeof(*entry)) - : xmalloc(sizeof(*entry)); + if (map->strdup_strings) { + if (!map->pool) { + FLEXPTR_ALLOC_STR(entry, key, str); + } else { + /* Remember +1 for nul byte twice below */ + size_t len = strlen(str); + entry = mem_pool_alloc(map->pool, + st_add3(sizeof(*entry), len, 1)); + memcpy(entry->keydata, str, len+1); + } + } else if (!map->pool) { + entry = xmalloc(sizeof(*entry)); + } else { + entry = mem_pool_alloc(map->pool, sizeof(*entry)); + } hashmap_entry_init(&entry->ent, strhash(str)); - - if (map->strdup_strings) - key = map->pool ? mem_pool_strdup(map->pool, str) - : xstrdup(str); - entry->key = key; + entry->key = map->strdup_strings ? entry->keydata : str; entry->value = data; hashmap_add(&map->map, &entry->ent); } @@ -130,11 +134,8 @@ void strmap_remove(struct strmap *map, const char *str, int free_value) return; if (free_value) free(ret->value); - if (!map->pool) { - if (map->strdup_strings) - free((char*)ret->key); + if (!map->pool) free(ret); - } } void strintmap_incr(struct strintmap *map, const char *str, intptr_t amt) diff --git a/strmap.h b/strmap.h index 6ffa6afb6a..0dd80b276e 100644 --- a/strmap.h +++ b/strmap.h @@ -14,6 +14,7 @@ struct strmap_entry { struct hashmap_entry ent; const char *key; void *value; + char keydata[FLEX_ARRAY]; /* if strdup_strings=1, key == &keydata[0] */ }; int cmp_strmap_entry(const void *hashmap_cmp_fn_data, From patchwork Mon Nov 2 18:55:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11874993 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.6 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B275C5517A for ; Mon, 2 Nov 2020 18:55:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0DAAE22275 for ; Mon, 2 Nov 2020 18:55:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="AMFbcPMM" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726584AbgKBSzc (ORCPT ); Mon, 2 Nov 2020 13:55:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34346 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726369AbgKBSz2 (ORCPT ); Mon, 2 Nov 2020 13:55:28 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E77A5C061A4C for ; Mon, 2 Nov 2020 10:55:27 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id e2so10514485wme.1 for ; Mon, 02 Nov 2020 10:55:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=OA4wQ/JUZvjUd14KqKKplav5D5ohsMHFqJIjlPS/lz8=; b=AMFbcPMMD4pXCZiwE/8jgc0AXLrlUnsIxdbm/K7BTsNYAUDpAo8VZpyPJ+M4qmdCT8 +F/c+yOLh70//MfW2lxfYuaVLoGzYJdjOxVPDwF9U3QpLR+nae/72PJmkIrhK748aOUj KHD7bICQFh8hECmdYefb/ybCxYmcq3w7FYfigNWPNBQ9dCAn3LdhKedoPV7ftkM7wwcV 8n+lfjI+PbQuiCa8ypSgjqkD4EVMt6lTza0iy/annImO/7eTRGzZu2ckGXKRxIfaOK98 xKh23Co+NTWJKz0+8CiP6bMukQ9/SRSxOEClZQPV42uwvZoGfzkv9EHIRmOxkGs/1cJw fcQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=OA4wQ/JUZvjUd14KqKKplav5D5ohsMHFqJIjlPS/lz8=; b=SuaPhh5+KsGGMnz72Zypt5HfhhwmT1tZqh4lNHREOC1JOibCBkFIVyv7/qyQewXZl9 X2n6BTXJPJWagA/94Thy19sTQScPMNRQeRSkBVobQbeEOGPD7NAzwzcYTgzsA5+ILbni 2j3nlszEX9ZvXZmfX8x3MI7lUVuLq88spEHat386JZOKzlTULwc562nWvTnJ4YLKLj5a kbkp31rgj9owHY3Fy9mNJ6urlqNOPM1yTeWRqg/iVE4Da20pBWQSmtFm7AzxmOFZFIJI DfgQFKrWkXQT4aTznXNiqYPqoUanfM25BA3DXra6SCV9uTNHugNOHlYKKIXO9uTJekTJ 20rg== X-Gm-Message-State: AOAM531RDNsWXbc8flRADCfQ4SN1+mvZlkOU++QI/47n0EVBA7/UG7Ud WJx9Gx8dJxA+ySyc6nyPKerA03Jv4r0= X-Google-Smtp-Source: ABdhPJyowC8nLL3TknD6HAaJrhLbcVHteEiU2lcVN5VevArVEi0vfTp9fV3zcbU1xwehGM1P8TrRLg== X-Received: by 2002:a1c:4888:: with SMTP id v130mr17979207wma.84.1604343326427; Mon, 02 Nov 2020 10:55:26 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id e3sm23540507wrn.32.2020.11.02.10.55.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Nov 2020 10:55:26 -0800 (PST) Message-Id: <5f41fc63e5355121cf882f63e1b38b8df1948df8.1604343314.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 02 Nov 2020 18:55:13 +0000 Subject: [PATCH v3 13/13] Use new HASHMAP_INIT macro to simplify hashmap initialization Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Now that hashamp has lazy initialization and a HASHMAP_INIT macro, hashmaps allocated on the stack can be initialized without a call to hashmap_init() and in some cases makes the code a bit shorter. Convert some callsites over to take advantage of this. Signed-off-by: Elijah Newren --- attr.c | 26 ++++++++------------------ bloom.c | 3 +-- builtin/difftool.c | 9 ++++----- range-diff.c | 4 +--- revision.c | 9 +-------- t/helper/test-hashmap.c | 3 +-- 6 files changed, 16 insertions(+), 38 deletions(-) diff --git a/attr.c b/attr.c index a826b2ef1f..4ef85d668b 100644 --- a/attr.c +++ b/attr.c @@ -52,13 +52,6 @@ static inline void hashmap_unlock(struct attr_hashmap *map) pthread_mutex_unlock(&map->mutex); } -/* - * The global dictionary of all interned attributes. This - * is a singleton object which is shared between threads. - * Access to this dictionary must be surrounded with a mutex. - */ -static struct attr_hashmap g_attr_hashmap; - /* The container for objects stored in "struct attr_hashmap" */ struct attr_hash_entry { struct hashmap_entry ent; @@ -80,11 +73,14 @@ static int attr_hash_entry_cmp(const void *unused_cmp_data, return (a->keylen != b->keylen) || strncmp(a->key, b->key, a->keylen); } -/* Initialize an 'attr_hashmap' object */ -static void attr_hashmap_init(struct attr_hashmap *map) -{ - hashmap_init(&map->map, attr_hash_entry_cmp, NULL, 0); -} +/* + * The global dictionary of all interned attributes. This + * is a singleton object which is shared between threads. + * Access to this dictionary must be surrounded with a mutex. + */ +static struct attr_hashmap g_attr_hashmap = { + HASHMAP_INIT(attr_hash_entry_cmp, NULL) +}; /* * Retrieve the 'value' stored in a hashmap given the provided 'key'. @@ -96,9 +92,6 @@ static void *attr_hashmap_get(struct attr_hashmap *map, struct attr_hash_entry k; struct attr_hash_entry *e; - if (!map->map.tablesize) - attr_hashmap_init(map); - hashmap_entry_init(&k.ent, memhash(key, keylen)); k.key = key; k.keylen = keylen; @@ -114,9 +107,6 @@ static void attr_hashmap_add(struct attr_hashmap *map, { struct attr_hash_entry *e; - if (!map->map.tablesize) - attr_hashmap_init(map); - e = xmalloc(sizeof(struct attr_hash_entry)); hashmap_entry_init(&e->ent, memhash(key, keylen)); e->key = key; diff --git a/bloom.c b/bloom.c index 719c313a1c..b176f28f53 100644 --- a/bloom.c +++ b/bloom.c @@ -229,10 +229,9 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, diffcore_std(&diffopt); if (diff_queued_diff.nr <= settings->max_changed_paths) { - struct hashmap pathmap; + struct hashmap pathmap = HASHMAP_INIT(pathmap_cmp, NULL); struct pathmap_hash_entry *e; struct hashmap_iter iter; - hashmap_init(&pathmap, pathmap_cmp, NULL, 0); for (i = 0; i < diff_queued_diff.nr; i++) { const char *path = diff_queued_diff.queue[i]->two->path; diff --git a/builtin/difftool.c b/builtin/difftool.c index 7ac432b881..6e18e623fd 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -342,7 +342,10 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, const char *workdir, *tmp; int ret = 0, i; FILE *fp; - struct hashmap working_tree_dups, submodules, symlinks2; + struct hashmap working_tree_dups = HASHMAP_INIT(working_tree_entry_cmp, + NULL); + struct hashmap submodules = HASHMAP_INIT(pair_cmp, NULL); + struct hashmap symlinks2 = HASHMAP_INIT(pair_cmp, NULL); struct hashmap_iter iter; struct pair_entry *entry; struct index_state wtindex; @@ -383,10 +386,6 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, rdir_len = rdir.len; wtdir_len = wtdir.len; - hashmap_init(&working_tree_dups, working_tree_entry_cmp, NULL, 0); - hashmap_init(&submodules, pair_cmp, NULL, 0); - hashmap_init(&symlinks2, pair_cmp, NULL, 0); - child.no_stdin = 1; child.git_cmd = 1; child.use_shell = 0; diff --git a/range-diff.c b/range-diff.c index befeecae44..b9950f10c8 100644 --- a/range-diff.c +++ b/range-diff.c @@ -232,11 +232,9 @@ static int patch_util_cmp(const void *dummy, const struct patch_util *a, static void find_exact_matches(struct string_list *a, struct string_list *b) { - struct hashmap map; + struct hashmap map = HASHMAP_INIT((hashmap_cmp_fn)patch_util_cmp, NULL); int i; - hashmap_init(&map, (hashmap_cmp_fn)patch_util_cmp, NULL, 0); - /* First, add the patches of a to a hash map */ for (i = 0; i < a->nr; i++) { struct patch_util *util = a->items[i].util; diff --git a/revision.c b/revision.c index f27649d45d..c6e169e3eb 100644 --- a/revision.c +++ b/revision.c @@ -124,11 +124,6 @@ static int path_and_oids_cmp(const void *hashmap_cmp_fn_data, return strcmp(e1->path, e2->path); } -static void paths_and_oids_init(struct hashmap *map) -{ - hashmap_init(map, path_and_oids_cmp, NULL, 0); -} - static void paths_and_oids_clear(struct hashmap *map) { struct hashmap_iter iter; @@ -213,7 +208,7 @@ void mark_trees_uninteresting_sparse(struct repository *r, struct oidset *trees) { unsigned has_interesting = 0, has_uninteresting = 0; - struct hashmap map; + struct hashmap map = HASHMAP_INIT(path_and_oids_cmp, NULL); struct hashmap_iter map_iter; struct path_and_oids_entry *entry; struct object_id *oid; @@ -237,8 +232,6 @@ void mark_trees_uninteresting_sparse(struct repository *r, if (!has_uninteresting || !has_interesting) return; - paths_and_oids_init(&map); - oidset_iter_init(trees, &iter); while ((oid = oidset_iter_next(&iter))) { struct tree *tree = lookup_tree(r, oid); diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c index 2475663b49..36ff07bd4b 100644 --- a/t/helper/test-hashmap.c +++ b/t/helper/test-hashmap.c @@ -151,12 +151,11 @@ static void perf_hashmap(unsigned int method, unsigned int rounds) int cmd__hashmap(int argc, const char **argv) { struct strbuf line = STRBUF_INIT; - struct hashmap map; int icase; + struct hashmap map = HASHMAP_INIT(test_entry_cmp, &icase); /* init hash map */ icase = argc > 1 && !strcmp("ignorecase", argv[1]); - hashmap_init(&map, test_entry_cmp, &icase, 0); /* process commands from stdin */ while (strbuf_getline(&line, stdin) != EOF) {