From patchwork Thu Feb 4 21:05:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068777 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 971C4C433DB for ; Thu, 4 Feb 2021 21:07:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4CDAE64FA7 for ; Thu, 4 Feb 2021 21:07:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229979AbhBDVGz (ORCPT ); Thu, 4 Feb 2021 16:06:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229750AbhBDVGw (ORCPT ); Thu, 4 Feb 2021 16:06:52 -0500 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E0F1C061786 for ; Thu, 4 Feb 2021 13:06:12 -0800 (PST) Received: by mail-wm1-x331.google.com with SMTP id w4so4312630wmi.4 for ; Thu, 04 Feb 2021 13:06:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=HP0yyJ4j7RACAlRFx8OBETIiftoiMIfiR5q9Tabuz8w=; b=b/aeuef1XntTT0mrnRQ6xSn18zX9n37yTSt+FXsbMWcjf7AiE3ja6Kzu2+kKYzwNQJ 
rodHER4RNY9ne4H8YskaNO9kN89uYW9Y9kDaJMNzGfF/mRBGKIEa3RU0IDIIAMhMxERz RArfSYm3AzX6qp7vXLPAdXVSIeTLtR+YSwVPFFnPqeqaGs4FF/F3ox2fWpNMb4U/QbC1 IRSDDIq9dt1Qg3Hh/63KrGXTQivH6MROH00kiLPWs/+x/VRhfG/7fUgjZhCaJZ2f/S9h cD6JwQVCF1Qgmp3ku93fKMx9c+AAnEcHLl+8cUHS242CejR7tEEyAxrWch/EOn/ImHG2 oMlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=HP0yyJ4j7RACAlRFx8OBETIiftoiMIfiR5q9Tabuz8w=; b=bJOgOT92yM2YOOIQZV/Yeq6YKqmqQgmE34svG19lnI+3QLBBJKPyKNI7QJoiNetsTl QtSCYUEAyeSRd7aAsl2ZAKz9iRzJA7DUe8VG3FL661U0tuoA3BCrd9iMOUNoGU02DlHA Kudds/aKk413nNGGEw2eo+adTcUGRRhIT8vrutMpaE+9zIXwmM6wXXA9JFbrKLBAg5jf 5R0YXV3UovJ5fKA3rAcgs8tW+VVjlOfmOVnd3x8SjidFCrTcl92df9abwt1/P5voSMJ7 wwzsCgTv1O3/WXn8eKlenrLZRf5EhaIEEwUNRTjujAT/IHwzsaGCWAlbi+CgRTYqCUqU 8sow== X-Gm-Message-State: AOAM533F2dXnMHvH5H5WdH87q//8Ep8bcTHpIEJcgdbf+tNbW55/+R5w l58lFulfC7jn8vqMerEBdEFpffHfep4= X-Google-Smtp-Source: ABdhPJxtMn6G0slgEV/KKEKqW2FCzb8tWodf75MI0I2oF4dDlbmJiJy6xdfo6D2lwePrJaugfn9Vfg== X-Received: by 2002:a1c:bc46:: with SMTP id m67mr898485wmf.82.1612472770993; Thu, 04 Feb 2021 13:06:10 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:10 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Date: Thu, 4 Feb 2021 22:05:47 +0100 Message-Id: <20210204210556.25242-2-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Drop an assignment added in b65abcafc7a (grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) and the overly cautious assert() I added in 94da9193a6e (grep: add support for PCRE v2, 2017-06-01). There was never a good reason for this, it's just a relic from when I initially wrote the PCREv2 support. We're not going to have confusion about compile_pcre2_pattern() being called when it shouldn't just because we forgot to cargo-cult this opt->pcre2 option, and "opt" is (mostly) used for the options the user supplied, let's avoid the pattern of needlessly assigning to it. With my in-flight removal of PCREv1 [1] ("Remove support for v1 of the PCRE library", 2021-01-24) there'll be even less confusion around what we call where in these codepaths, which is one more reason to remove this. 1. 
https://lore.kernel.org/git/xmqqmtwy29x8.fsf@gitster.c.googlers.com/ Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/grep.c b/grep.c index aabfaaa4c3..816e23f17e 100644 --- a/grep.c +++ b/grep.c @@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt int patinforet; size_t jitsizearg; - assert(opt->pcre2); - p->pcre2_compile_context = NULL; /* pcre2_global_context is initialized in append_grep_pattern */ @@ -555,7 +553,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt) #endif if (p->fixed || p->is_fixed) { #ifdef USE_LIBPCRE2 - opt->pcre2 = 1; if (p->is_fixed) { compile_pcre2_pattern(p, opt); } else { From patchwork Thu Feb 4 21:05:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76E1EC433DB for ; Thu, 4 Feb 2021 21:07:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1B8A164FA7 for ; Thu, 4 Feb 2021 21:07:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229834AbhBDVHC (ORCPT ); Thu, 4 Feb 2021 16:07:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org 
with ESMTP id S229596AbhBDVGy (ORCPT ); Thu, 4 Feb 2021 16:06:54 -0500 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9CEF3C061788 for ; Thu, 4 Feb 2021 13:06:13 -0800 (PST) Received: by mail-wm1-x32b.google.com with SMTP id f16so4304140wmq.5 for ; Thu, 04 Feb 2021 13:06:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nb2aFZd97c/wh37Gd4ijmsU13GYAH3g2f72fzvQUjWA=; b=F5IEN0BNEB75rxE/slhRUsXsL/La/MDuuA7A1wUyo2j1Ue4fDrgPHp/WQWtPhorRIz 2yxIOEecHq/GoBs+RPU750zA6Dmo3Uxk+Dp8plaJi44gId9+SH1wNB+31/GXOT0oF8MP VhcJ+iYhoCPTpZKHPQWd6F3Epi2/mbZ+cQqHYUADYrisDDPaIbb2jaoT4PDEadNLVCup bDTOBhUVR0FDn55xnQysrKGsQeJO2Gt6zA42ysWy4dzgC0CPfZGLGjMhKeqfmfVWDsce 0/lecOhObpTPB7fEBOqIg/U8Z4Ywe4H31QgX4gKnYwN6KVXzrQ4TnslUDvhn+4nu9kxK ioAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nb2aFZd97c/wh37Gd4ijmsU13GYAH3g2f72fzvQUjWA=; b=cS4BMo7qSfIfu6pSpzQIRlSNMUkuTfftIzHByNCqNR5gAZrlsFWQBI8St+aU4mTGQq rknEIFydVCQRFqZ0iYDiJjnpeanj5yHhtMNK75iuLpyVEIqJnAZp6BFLyx6PTfP+NfOs 8C5x6i488rFGeGfJYuPJxWLn51/tb3pGMmqwz8w5mOd/NLqO9SDd+3AS1UHpjrcZpfNu 2+Dh69z1flWJTMFULom6shTt4Rk2v+/jyKMSvQHeU6tDRpF9ScU1XX4ZcHtBOLkuhT5D T4u0WfIub2D7PNSU9cKKb1SkNMXjJmcfcV0IvO60e/o3EfktH4jL4Cvidu8TAk7IzWqq 0Bag== X-Gm-Message-State: AOAM532HGQuQcS6dhe7oS1emtH4DfpJOti44tXb8ZBTShOsvL5pVu0K0 aGy2Ne65yUMfpU8T+sAFF5PIPprPctGWxw== X-Google-Smtp-Source: ABdhPJyaEOjpwdskl1lfyym0d8XaI6yHWING38r/wRAbUpjEtkdfWwnGNqghs9s62xoFCSLmgMdEGw== X-Received: by 2002:a1c:a553:: with SMTP id o80mr874438wme.20.1612472772169; Thu, 04 Feb 2021 13:06:12 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:11 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 02/10] grep/pcre2: drop needless assignment to NULL Date: Thu, 4 Feb 2021 22:05:48 +0100 Message-Id: <20210204210556.25242-3-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Remove a redundant assignment of pcre2_compile_context dating back to my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01). In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no need to NULL out individual members here. I think this was probably something left over from an earlier development version of mine. 
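A minimal sketch of why the explicit NULL assignment is redundant, assuming a calloc()-style allocator that hands back zero-filled memory. The names demo_pat, demo_xcalloc() and demo_create_pat() are hypothetical stand-ins for illustration only; they are not git's actual grep_pat, xcalloc() or create_grep_pat(). It also assumes, as holds on mainstream platforms, that an all-zero-bits pointer compares equal to NULL.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's "struct grep_pat"; illustration only. */
struct demo_pat {
	const char *pattern;
	void *compile_context; /* plays the role of pcre2_compile_context */
};

/* Hypothetical stand-in for xcalloc(): calloc() that exits on failure. */
void *demo_xcalloc(size_t nmemb, size_t size)
{
	void *ret = calloc(nmemb, size);
	if (!ret) {
		fprintf(stderr, "out of memory\n");
		exit(128);
	}
	return ret;
}

/* Allocate a zero-filled pattern and set only the member we care about. */
struct demo_pat *demo_create_pat(const char *pattern)
{
	struct demo_pat *p = demo_xcalloc(1, sizeof(*p));
	p->pattern = pattern;
	/* No "p->compile_context = NULL" needed: the memory is already zeroed. */
	return p;
}
```

A caller that does "struct demo_pat *p = demo_create_pat("foo");" can rely on p->compile_context being NULL without any further assignment, which is the property the commit message above depends on.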
Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/grep.c b/grep.c index 816e23f17e..f27c5de7f5 100644 --- a/grep.c +++ b/grep.c @@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt int patinforet; size_t jitsizearg; - p->pcre2_compile_context = NULL; - /* pcre2_global_context is initialized in append_grep_pattern */ if (opt->ignore_case) { if (!opt->ignore_locale && has_non_ascii(p->pattern)) { From patchwork Thu Feb 4 21:05:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D669C433DB for ; Thu, 4 Feb 2021 21:07:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4F03F64FA7 for ; Thu, 4 Feb 2021 21:07:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230070AbhBDVHE (ORCPT ); Thu, 4 Feb 2021 16:07:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229988AbhBDVGz (ORCPT ); Thu, 4 Feb 2021 16:06:55 -0500 Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
D78E3C06178A for ; Thu, 4 Feb 2021 13:06:14 -0800 (PST) Received: by mail-wm1-x333.google.com with SMTP id t142so2150875wmt.1 for ; Thu, 04 Feb 2021 13:06:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2x+6fBGe6X4YeDVGOhgxbAWi+iNw2jGFwEwAGRWfI4M=; b=obSUYekUqUfgGxt7DPbqn+4HtcM2eeXzRfXPT40rPc6iw/mwAmFjLbZ7SMfGKWBY9K Ubu/2rupc4eK9kf6fPLxa2JYJOr2BBPnmVAkDuKOhzjGCAqEvcTr22WIVmnEaWH8wXcf 5zQlINhYVnbWsj1LLUxSwUc3+xOLDuCZR3xdD1kE4DlwauHvowrXdl6jXlTFi1EP5pBD ohIIGanHhCkDD3mI3kMRL5OB7D2eIQJHD7k7/oI+ByJW7n0GrVeirWrL8J9h6nNeVxry R+ur+9NmmBydjv5SL6Qx9qIzFcrX2XDDZjBsMi3o1+sCKSiimyte+BTEJk1cPzUPR8gU O6KA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2x+6fBGe6X4YeDVGOhgxbAWi+iNw2jGFwEwAGRWfI4M=; b=dWBtPG9c0foEJHInpt55C2nLg1yv0x5rU5P4WX6b/LgIoi6FMl6K+Il2KEyGHOAY89 jmkhbToJ5m/hrvdXf9kaNGEZRA0QcSDBugYrMPjFaFD+HfI2hGtsxuBM1+lfUDfcemgQ 2kqRTLY8YzyPrb+YB5cDUorCqbsWtsU9E0GyGM0KNP25UkJmXGJWxzoucn5SKP+hyHAD E/kSFFpQSlXCXTfxq0pvVISZIJ1canmzk8cc3xa9Fwrq+7mPqTq/3wwf5iV24ZzEB8vx ac081+t0ERjoMSWGdyoq+Whk1FMYV443cbOvUx5mZkciwXA2j8max3HOr4ytvdxnTZpO xnvw== X-Gm-Message-State: AOAM532nUxbGXOJ+IVgp78OV87mluseb3TwpNPL2ZTkfvhBeTuuSocLd ot7o6joJ9uK9ssqkkN3P/d/2gkKC6RnbTQ== X-Google-Smtp-Source: ABdhPJy3irpTjxI+AF7U3o/nLl2bk9YQKpMriUgFIa1dKFnRTfo3DJxTgZ69/bSw0q2+dOrhS761+w== X-Received: by 2002:a1c:750e:: with SMTP id o14mr898146wmc.60.1612472773297; Thu, 04 Feb 2021 13:06:13 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:12 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 03/10] grep/pcre2: correct reference to grep_init() in comment Date: Thu, 4 Feb 2021 22:05:49 +0100 Message-Id: <20210204210556.25242-4-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Correct a comment added in 513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16). This comment was never correct in git.git, but was consistent with an older version of the patch[1]. 1. 
https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/grep.c b/grep.c index f27c5de7f5..b9adcd83e7 100644 --- a/grep.c +++ b/grep.c @@ -373,7 +373,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt int patinforet; size_t jitsizearg; - /* pcre2_global_context is initialized in append_grep_pattern */ + /* pcre2_global_context is initialized in grep_init */ if (opt->ignore_case) { if (!opt->ignore_locale && has_non_ascii(p->pattern)) { if (!pcre2_global_context) From patchwork Thu Feb 4 21:05:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068783 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2F41C433DB for ; Thu, 4 Feb 2021 21:07:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 659F464FA7 for ; Thu, 4 Feb 2021 21:07:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230086AbhBDVHN (ORCPT ); Thu, 4 Feb 2021 16:07:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230013AbhBDVG4 (ORCPT ); Thu, 4 Feb 2021 16:06:56 -0500 Received: from 
mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D33C9C06178B for ; Thu, 4 Feb 2021 13:06:15 -0800 (PST) Received: by mail-wr1-x42a.google.com with SMTP id 7so5277244wrz.0 for ; Thu, 04 Feb 2021 13:06:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LDnOFW7p/ky7+KVyj8lT1tPoESnNZ9bzBGmd+kLhGR4=; b=clMWBvHxP8PByXcof+P2od5q2Hkt17niVHDbTESnbNMu0dGjrasGB9L6zMffp50Evv 1tUQGBciDqsvPuVtm+8Ay/gGRruPmPfKPwfQupWeg/6QcTNdH3hAVJvv9IQ/P0YiRDfj mjNwaMoaBDYmeYOjkJAkaFjEaHcddq8YjU2Gno0bNcgXxLrOioOLQ2ccoPAVcNBKDBI8 Kx+yttXifQV7n7a7bwwOLzFy7qlJXo5dgpQE49N6iiVnFCAW86pvaa14+OTF0oIh5Thg j+E0A1GRz49DMZTmpalsEJUtwc/y9MqrWnzq/EAh3Mzj4WIR9/09bykfux7CjVgjJ6w9 E7hQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LDnOFW7p/ky7+KVyj8lT1tPoESnNZ9bzBGmd+kLhGR4=; b=KWYeMQaeCNjznb0y3dl8IfmshH8bNerVtCXznxqfMaTG0FriHmCAfVUpTU/K1+9EJo trx+s0yP/x86yGAyXnlRgh9Ubh2lsGz4olX830i4UK52j7FPozpJ97Fc4eiMTLs9oASM fWEujcpwszi+KAvrDbpDt+n8AjQ9xJFgyk+K4yKdOYJJZoVDUosTwQyBkZC+xCZlBaS2 xsdBL5r3x8VfGmToUe6llGKuoXOkxJEkptXDFShr8IkdvhWT1MEZjTXfhDPxTbnnz7On AJJyFw3LBsGxJKY+DXyuKIKNyQ7aJXHLiTM4eWi2TvWnrU31oic07WybXEEG6dshBOWU a4Kw== X-Gm-Message-State: AOAM531JKadlqsx1aEp35GQ/jgSVcjfzS8gXHaWQPCPgkZzaZzZ9hTju NBNpS/9dieb0XL7xb1UEVRdkwRRxZ4SNlQ== X-Google-Smtp-Source: ABdhPJyLhkIi5XT0SdRbvhK1BETT/cGi5H3j+phCzMf3x8kCZFfWB2foxg9q13tFdK7GUz/5jxqK2A== X-Received: by 2002:a5d:690b:: with SMTP id t11mr1371640wru.12.1612472774351; Thu, 04 Feb 2021 13:06:14 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:13 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Date: Thu, 4 Feb 2021 22:05:50 +0100 Message-Id: <20210204210556.25242-5-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Change pcre2_malloc() in a way that'll make it easier for a debugging fprintf() to spew out the allocated pointer. This doesn't introduce any functional change, it just makes a subsequent commit's diff easier to read. Changes code added in 513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16). 
Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index b9adcd83e7..f96d86c929 100644 --- a/grep.c +++ b/grep.c @@ -45,7 +45,8 @@ static pcre2_general_context *pcre2_global_context; static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data) { - return malloc(size); + void *pointer = malloc(size); + return pointer; } static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data) From patchwork Thu Feb 4 21:05:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068785 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BDE2C433E0 for ; Thu, 4 Feb 2021 21:07:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D66FD64FA4 for ; Thu, 4 Feb 2021 21:07:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230120AbhBDVHf (ORCPT ); Thu, 4 Feb 2021 16:07:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229813AbhBDVHc (ORCPT ); Thu, 4 Feb 2021 16:07:32 -0500 Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F1D35C06178C for ; 
Thu, 4 Feb 2021 13:06:16 -0800 (PST) Received: by mail-wm1-x32f.google.com with SMTP id 190so4330980wmz.0 for ; Thu, 04 Feb 2021 13:06:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=UViOvZ6ebifCYycIoB4e1ICF+utBA/Mwn0gFFMwUjoM=; b=ojHLDaRF6pNp7g9rpIKhA//Oc7ysRTBIFZrhBDfnxtzJxFtoJOFTDBQgfG3UJMlVzH jeX9ncHhfafMfYIEo5U2P+4QHa1nJacztHMN2lBT1G40zWIym696M+MtBiC1Jl+YC8P9 rQQTgEOLkMOYPFr58z/psnQa2FfwLQyWDKVzojw1fShXynV8gj6vSWL6B2WrmPeOdmri W00mSr+Svnl1Mb0Vzb8qsoAv8TNar8ObbY7CU75KcgeKnJHADzVL8VcWLaiFzpRpgV4V MG89+q2y/3hekVcfpa7bmmKyF+mXc12dj8bFVVhF8mqP3btkAy2P1TyNADVdcewPsm4f a+Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=UViOvZ6ebifCYycIoB4e1ICF+utBA/Mwn0gFFMwUjoM=; b=T7BjMBTufJFbdg7xmA0IJXd2XgZObsofrCLLXWEZIgFoxsRKouHjGeIZav5wBSNs1E C3zsmoeJD7InuWAUT0lP1wD76KNTrNKx1GGhVypnetonadunDNcnr7XXJpajqAcU4N0t A9DpITsDPJ8NOSZomHCXyw7TImXvneq+iRVui+msibli7nY7CUYD6ofTFnRh+1iO2caT U4Us8sNp4FwLdcCLV9Z6suL74SOJc3L9CUAx9jipBB3y3fn/wDIscLreLUG6hO3XacDe oK9UiOrGzMFG8TIsp2xiKG9jf+xl6V3fyzQrwIE981gDyWfK47KdtoJA4fvPZfS5do4D v9Jw== X-Gm-Message-State: AOAM53125ctAeuzseJ2Ed8f06EY3mNLk5T559LVq8tGg5L6QRg4wVEKN vbFhb5nIxxtlnuRTsWIrm5e3JvgXr04GpQ== X-Google-Smtp-Source: ABdhPJxpqFM0g6wh3ektsp5Yqr/KwOLIMvmn2exVZjxFhtiO95psDsMpvQN576z5RLd7/NxIqelBWw== X-Received: by 2002:a7b:c0d8:: with SMTP id s24mr907651wmh.4.1612472775481; Thu, 04 Feb 2021 13:06:15 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:14 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Date: Thu, 4 Feb 2021 22:05:51 +0100 Message-Id: <20210204210556.25242-6-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add optional printing of PCREv2 allocations to stderr for a developer who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1". This will be referenced by a subsequent commit, and is generally useful for manually seeing what's going on with PCREv2 allocations while working on that code. 
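A minimal sketch of the compile-time debug toggle described above, written against plain malloc()/free() so it builds without PCRE2. The names DEMO_DEBUG_MALLOC, demo_malloc() and demo_free() are hypothetical and only mirror the shape of the change. As an assumption of this sketch (not something the patch does), the size is printed via uintmax_t and PRIuMAX, which avoids depending on "unsigned long" being as wide as size_t.

```c
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical toggle; change to 1 by hand to trace allocations on stderr. */
#define DEMO_DEBUG_MALLOC 0

void *demo_malloc(size_t size)
{
	void *pointer = malloc(size);
#if DEMO_DEBUG_MALLOC
	static int count = 1; /* running number of allocations seen so far */
	fprintf(stderr, "DEMO:%p -> #%02d: alloc(%" PRIuMAX ")\n",
		pointer, count++, (uintmax_t)size);
#endif
	return pointer;
}

void demo_free(void *pointer)
{
#if DEMO_DEBUG_MALLOC
	static int count = 1; /* running number of non-NULL frees seen so far */
	if (pointer)
		fprintf(stderr, "DEMO:%p -> #%02d: free()\n", pointer, count++);
#endif
	free(pointer);
}
```

With the toggle left at 0 the preprocessor drops the tracing lines entirely, so the wrappers behave exactly like malloc() and free(); that is why such a mode can stay in the tree without any runtime cost.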
Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/grep.c b/grep.c index f96d86c929..7d262a23d8 100644 --- a/grep.c +++ b/grep.c @@ -42,15 +42,25 @@ static struct grep_opt grep_defaults = { #ifdef USE_LIBPCRE2 static pcre2_general_context *pcre2_global_context; +#define GREP_PCRE2_DEBUG_MALLOC 0 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data) { void *pointer = malloc(size); +#if GREP_PCRE2_DEBUG_MALLOC + static int count = 1; + fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%"PRIuMAX")\n", pointer, count++, (uintmax_t)size); +#endif return pointer; } static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data) { +#if GREP_PCRE2_DEBUG_MALLOC + static int count = 1; + if (pointer) + fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++); +#endif free(pointer); } #endif From patchwork Thu Feb 4 21:05:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC4BDC433DB for ; Thu, 4 Feb 2021 21:07:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7867064FA4 for ; Thu, 4 Feb 2021 21:07:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230127AbhBDVHi (ORCPT ); Thu, 4 Feb 2021 16:07:38 
-0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229981AbhBDVHc (ORCPT ); Thu, 4 Feb 2021 16:07:32 -0500 Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com [IPv6:2a00:1450:4864:20::429]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E5AA6C061793 for ; Thu, 4 Feb 2021 13:06:17 -0800 (PST) Received: by mail-wr1-x429.google.com with SMTP id q7so5150338wre.13 for ; Thu, 04 Feb 2021 13:06:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hN5eqxAbcgOVluoOGksJ+3Djft9wOOrqeKVpO1EdDw4=; b=Km0tCbhKkV2peaiFGgGwAD2TkBsSCS/8txdk2A4zwHvBiWU816BqtXbLIHGP2N5Ibq WO4cUJRz7mCfq3oNzqemTMPIw36Z/PE0PedQPcyUJ7DpPWIn/gUtz2Qx6Tf+rz2LAqfD tm1llxK9J6TAu3nzZw2d9lp7NqI4tBe7NXim7I2sOsXKw7WAxuMJD4esa1OINGN2WAmR iYSBJa87aGICsvHtE5WjRYjLAHRv9DdtaXmKPSCRNlh3E4XG2+BdZkY9/woO5TyL971Q Coqio5Sa5esVtDycH7ItT5rMHTaQuzSJ7H2Ttodo49Pl8ekER3QrVC2WNQLELwYYivEn 6kmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hN5eqxAbcgOVluoOGksJ+3Djft9wOOrqeKVpO1EdDw4=; b=OnDbiVVbvMfysd35utjxhSDQsA06yIYGD0TjxUoEEmV/qKN7cdsLz2B+5HtY6OVm4i TX9r+6jH7wsulfFFlc1qPxk6pG3Qxsd75Xnv7cG1s5YfvvRCn+ENCEZKOIpOAG9Y388U lkfih3BtH7ZHkBGj/1/4w4AzNNybHK6mFkD9ajzqSaXDna8XVaGLCcYDpZWGWeRb68CK a1RfmI2c9OX27IR5gp0/CV86ffu0Z0/51K7kGMcBEZG/n/q4DSV5+ITcxPXTZdfprm/d W4Lb0SLWdt0VO2S3EKPBHvAfM9CIAoAyTOyXAOgV3WB5COlb3xxVpj7R2at20wrHd15E 8NcQ== X-Gm-Message-State: AOAM533kzyeivV1cUmopZGFoebbc5NfeJ5hnXt6u5PaoItN0oZj1+3Ks npV2Yrt4HvQJckx7h0IiBNnpAR6sWqTE2Q== X-Google-Smtp-Source: ABdhPJwfxXJT58wAFLLI/WnleZf09l6ze2ZK0PAsQHSdjLlNfT1/BwbmSOzPnVA0QZoSK1PpHHBpYQ== X-Received: by 2002:adf:f009:: with SMTP id 
j9mr1299943wro.35.1612472776451; Thu, 04 Feb 2021 13:06:16 -0800 (PST) Received: from vm.nix.is (vm.nix.is. [2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:15 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 06/10] grep/pcre2: use compile-time PCREv2 version test Date: Thu, 4 Feb 2021 22:05:52 +0100 Message-Id: <20210204210556.25242-7-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added in 95ca1f987ed (grep/pcre2: better support invalid UTF-8 haystacks, 2021-01-24) with the same test done at compile-time. It might be cuter to do this at runtime, since then we wouldn't have to do the "major >= 11 || (major >= 10 && ...)" test. But in the next commit we'll add another version comparison that absolutely needs to be done at compile-time, so we're better off being consistent across the board. 
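A minimal sketch of a compile-time "major.minor or higher" test of the kind described above. DEMO_MAJOR, DEMO_MINOR, DEMO_VERSION_10_36_OR_HIGHER and demo_needs_workaround() are hypothetical names standing in for the library's version macros and git's feature macro, and the 10.35 value is picked only to exercise the "older than 10.36" branch. Since the workaround targets a bug fixed in 10.36, the sketch enables it when the "or higher" macro is not defined.

```c
/* Hypothetical version macros; a real build would get these from the library header. */
#define DEMO_MAJOR 10
#define DEMO_MINOR 35

/* Defined only for 10.36 and later, or any 11.x and later release. */
#if DEMO_MAJOR >= 11 || (DEMO_MAJOR == 10 && DEMO_MINOR >= 36)
#define DEMO_VERSION_10_36_OR_HIGHER
#endif

/* Returns 1 when building against a version that still needs the workaround. */
int demo_needs_workaround(void)
{
#ifndef DEMO_VERSION_10_36_OR_HIGHER
	return 1; /* older than 10.36: the bug is present */
#else
	return 0; /* 10.36 or newer: the bug is fixed upstream */
#endif
}
```

Compared with a runtime pcre2_config() query, the decision here is made by the preprocessor, so it reflects the headers git was built against rather than the library loaded at runtime; that trade-off is the price of staying consistent with checks that can only be done at compile-time.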
Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 18 ++++-------------- grep.h | 3 +++ 2 files changed, 7 insertions(+), 14 deletions(-) diff --git a/grep.c b/grep.c index 7d262a23d8..e58044474d 100644 --- a/grep.c +++ b/grep.c @@ -400,21 +400,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt !(!opt->ignore_case && (p->fixed || p->is_fixed))) options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF); +#ifndef GIT_PCRE2_VERSION_10_36_OR_HIGHER /* Work around https://bugs.exim.org/show_bug.cgi?id=2642 fixed in 10.36 */ - if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS)) { - struct strbuf buf; - int len; - int err; - - if ((len = pcre2_config(PCRE2_CONFIG_VERSION, NULL)) < 0) - BUG("pcre2_config(..., NULL) failed: %d", len); - strbuf_init(&buf, len + 1); - if ((err = pcre2_config(PCRE2_CONFIG_VERSION, buf.buf)) < 0) - BUG("pcre2_config(..., buf.buf) failed: %d", err); - if (versioncmp(buf.buf, "10.36") < 0) - options |= PCRE2_NO_START_OPTIMIZE; - strbuf_release(&buf); - } + if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS)) + options |= PCRE2_NO_START_OPTIMIZE; +#endif p->pcre2_pattern = pcre2_compile((PCRE2_SPTR)p->pattern, p->patternlen, options, &error, &erroffset, diff --git a/grep.h b/grep.h index ae89d6254b..54e52042cb 100644 --- a/grep.h +++ b/grep.h @@ -4,6 +4,9 @@ #ifdef USE_LIBPCRE2 #define PCRE2_CODE_UNIT_WIDTH 8 #include <pcre2.h> +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11 +#define GIT_PCRE2_VERSION_10_36_OR_HIGHER +#endif #else typedef int pcre2_code; typedef int pcre2_match_data; From patchwork Thu Feb 4 21:05:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068791 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 
required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66E41C433E0 for ; Thu, 4 Feb 2021 21:07:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2553A64FA7 for ; Thu, 4 Feb 2021 21:07:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230165AbhBDVHp (ORCPT ); Thu, 4 Feb 2021 16:07:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230088AbhBDVHd (ORCPT ); Thu, 4 Feb 2021 16:07:33 -0500 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24DFEC061794 for ; Thu, 4 Feb 2021 13:06:19 -0800 (PST) Received: by mail-wm1-x334.google.com with SMTP id 190so4331039wmz.0 for ; Thu, 04 Feb 2021 13:06:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JrxhJITvh/JFLu/TQm4W9HxXuDchoeDmPm9ANtG+qqY=; b=o9T78UxClv6BuYu0KTODfdBQQXtNHkzyxCVfi/Oe/wSa/RYMJNBAgJ/xxkGKPzWAE3 OVYany0c5LNZhMexWOzp60gLmE3X5gS/uqqpy53jCqSIUqV1CbXcwCwixSHkF89RakU+ Um+f9mKETtUpU/BRwAWIMlcs29pSKEjzwobS8VFTeciwLlQ9C0X7F9mSRWVpEpupmTk5 NTuFrdXgghPKZhwAPb3pLckDJ5Kl9znSIXOgNiIUvw7w125we+UCGSJSjkiACCIROKkC PDow8afoqaoLGO9iE7VhCPKWBXOdGgqQVh8M0D3aVgHywqffHo7GesA3pjgyYnb0AceY +fNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JrxhJITvh/JFLu/TQm4W9HxXuDchoeDmPm9ANtG+qqY=; b=WWnhuVLJf79ULBtbScaL0HUtu1ZLvO34V1UoSYsdWexZW9B2lelJ4M575mlMD2ppt3 BsNgCYGUsmS8G/VykDNVIlEuP3Q/bjHyfGu1Hdzez2Wcw1psH0ylyFD8jCf+ZZmP98an mzVeaFkMRaD2SWmEKh/048kDP6KxiFeuEx9jBlNl26FtunJwLaZNSMzbW1YqJgg7i9wK pWjDOEz8hNMg7VJNnv07gDg2wRqvh52huf/qY3t+a2WhGcLNER9JYYsTSU9NBz0Jhe6W rK2pE43CYq1qXpBLbSUAProHZq9DcdtZjAo1N14Myrlu0Wcdc59W4UrjHcAvIHoyXvKO XKPA== X-Gm-Message-State: AOAM533FKgmyUWfnu4yK4lHbeNccQfR/ivMZZHf3I2BRn1U6Fu3j8Eto qgwEBlajBFPfDWspLHZQzqzxcK5M1pqQLg== X-Google-Smtp-Source: ABdhPJxpRBeAYEi9J/GRNzI053dFTkY3hVyHLYcpT30TNDYUE8Gaayl8Ura8wmUkir/E8t5/Zk4fZQ== X-Received: by 2002:a1c:7e4e:: with SMTP id z75mr883295wmc.168.1612472777629; Thu, 04 Feb 2021 13:06:17 -0800 (PST) Received: from vm.nix.is (vm.nix.is. [2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:16 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Date: Thu, 4 Feb 2021 22:05:53 +0100 Message-Id: <20210204210556.25242-8-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Make use of the pcre2_maketables_free() function to free the memory allocated by pcre2_maketables(). 
At first sight it's strange that 10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16) which added the free() call here doesn't make use of the pcre2_free() the author introduced in the preceding commit in 513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16). The reason is that at the time the function didn't exist. It was first introduced in PCREv2 version 10.34, released on 2019-11-21. Let's make use of it behind a macro. I don't think this matters for anything to do with custom allocators, but it makes our use of PCREv2 more discoverable. At some distant point in the future we'll be able to drop the version guard, as nobody will be running a version older than 10.34. Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 4 ++++ grep.h | 3 +++ 2 files changed, 7 insertions(+) diff --git a/grep.c b/grep.c index e58044474d..c63dbff4b2 100644 --- a/grep.c +++ b/grep.c @@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p) pcre2_compile_context_free(p->pcre2_compile_context); pcre2_code_free(p->pcre2_pattern); pcre2_match_data_free(p->pcre2_match_data); +#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER + pcre2_maketables_free(pcre2_global_context, p->pcre2_tables); +#else free((void *)p->pcre2_tables); +#endif } #else /* !USE_LIBPCRE2 */ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt) diff --git a/grep.h b/grep.h index 54e52042cb..64666e9204 100644 --- a/grep.h +++ b/grep.h @@ -7,6 +7,9 @@ #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11 #define GIT_PCRE2_VERSION_10_36_OR_HIGHER #endif +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11 +#define GIT_PCRE2_VERSION_10_34_OR_HIGHER +#endif #else typedef int pcre2_code; typedef int pcre2_match_data; From patchwork Thu Feb 4 21:05:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068789 
Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6F3FC433E0 for ; Thu, 4 Feb 2021 21:07:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9141F64FA7 for ; Thu, 4 Feb 2021 21:07:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230134AbhBDVHk (ORCPT ); Thu, 4 Feb 2021 16:07:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230075AbhBDVHd (ORCPT ); Thu, 4 Feb 2021 16:07:33 -0500 Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 528D1C061797 for ; Thu, 4 Feb 2021 13:06:20 -0800 (PST) Received: by mail-wr1-x42b.google.com with SMTP id c12so5194195wrc.7 for ; Thu, 04 Feb 2021 13:06:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9fpIbgVB1ioLE6p5hinYpbuExDDHGymIPBD7V65ryiY=; b=tBC1ddeZSDjPhSyQrmBbveUdV6IU0GY+ns7M85+TmCmKj5iajIoAXrf3tmyB0Ko8Ce iCzEm+aYYUj34iyTnCzHuPLSsVZOILLzDwW/USUStf7mvTy4KAKDumvGiSyQ5GdMzvgf 9bPhJtsruvOEF1hjVlzozYmJAuHliGeySXHawXlhImBuwbR0+yShrqjQ9MbLy28NlukG bcdzLZWcNVwccxIm2Lku4rr51lCRHZteUHxqJLO0ualy2REi9q5tIs7muoJNs075Hh8R 
7abVR3kfAL6NzZHUSkVj4qD36nfyS4+Y6Rj5sfkhT1PVqEwpx6DLyVNnib2SdC4Z2D/u UnXg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9fpIbgVB1ioLE6p5hinYpbuExDDHGymIPBD7V65ryiY=; b=HmN4ha6RqBJ7AUQdaO53T0y9C9gUpbpQZ26y5K82oMHc8nkXZSIvpwt0zHG7A/ArXJ 1rjizqlJZOnzJmraeQtiRkzf51AX0cFILENKFHMlX8TG9K/8JYO4ukwIO09ICZowi5uv p5IFPoEsdYvI3KIY00Ngz8z5UVMqVnoYlLETYL3btnLBrz489yL8UWnrPfo+gGWVqNHj y78H0qTxloVMdP2m9B345/Pqx/ChDcSWydHIa07QYJqPe48E+vllwfuMzpBdZdrmAGyT 3aHOhuyl4J00fiUZoovyO+QaZpKLpdlFygiQA15I/DeyOnYekRpBpGMviANqRp5RXnKo XpCw== X-Gm-Message-State: AOAM532D4N/A1bAUEz+2PQykQv+HSPz5g8LP5I6hjtuLO3V7XkFhYP+B apnIvVtr9kH4NgvLiedkhHLJhfedkzdiog== X-Google-Smtp-Source: ABdhPJwOW37K8h4XLgzatJXFHqbja6JNuLw9L7a8m0OjZYPWRQlJ+jSGqtEnm4yM6PmR/xLfB9Z78Q== X-Received: by 2002:adf:c109:: with SMTP id r9mr1290246wre.261.1612472778785; Thu, 04 Feb 2021 13:06:18 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:17 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator Date: Thu, 4 Feb 2021 22:05:54 +0100 Message-Id: <20210204210556.25242-9-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}() functions for allocation. We'll now use them for all PCREv2 allocations. The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR issue is that it managed to target pretty much the one allocation freed via free(), as opposed to by a pcre2_*free() function. I.e. the pcre2_maketables() and pcre2_maketables_free() pair. For most of the rest we continued allocating with stock malloc() inside PCREv2 itself, but didn't segfault because we'd use its corresponding free(). In a preceding commit of mine I changed the free() to pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as far as fixing the segfault goes we could revert 513f2b0bbd4. But then we wouldn't use the desired allocator, so let's just use it instead. Before this patch, running e.g.: grep --threads=1 -iP æ.*var.*xyz would only use pcre2_{malloc,free}() for 2 malloc() calls and 2 corresponding free() calls. Now it's 12 calls to each.
This can be observed with the GREP_PCRE2_DEBUG_MALLOC debug mode. Reading the history of how this bug got introduced, it wasn't present in Johannes's original patch[1] to fix the issue. My reading of that thread is that the approach the follow-up patches to Johannes's original pursued was based on a misunderstanding of how the PCREv2 API works. In particular this part of [2]: "most of the time (like when using UTF-8) the chartable (and therefore the global context) is not needed (even when using alternate allocators)" That's simply not how PCREv2 memory allocation works. It's easy to see how the misunderstanding came about. It's because (as noted above) the issue was noticed because of our use of free() in our own grep.c for freeing the memory allocated by pcre2_maketables(). Thus the misunderstanding that PCREv2's general context is something only needed for pcre2_maketables(), and e.g. an aborted earlier attempt[3] to only set it up when we ourselves called pcre2_maketables(). That's not what PCREv2's general context is. To quote PCREv2's documentation: "This context just contains pointers to (and data for) external memory management functions that are called from several places in the PCRE2 library." Thus the failed attempts to go down the route of only creating the general context in cases where we ourselves call pcre2_maketables(), before finally settling on the approach 513f2b0bbd4 took of always creating it. Instead we should always create it, and then pass the general context to every function that accepts it, so that they'll consistently use our preferred memory allocation functions. 1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/ 2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/ 3.
https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/grep.c b/grep.c index c63dbff4b2..0116ff5f09 100644 --- a/grep.c +++ b/grep.c @@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt if (!pcre2_global_context) BUG("pcre2_global_context uninitialized"); p->pcre2_tables = pcre2_maketables(pcre2_global_context); - p->pcre2_compile_context = pcre2_compile_context_create(NULL); + p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context); pcre2_set_character_tables(p->pcre2_compile_context, p->pcre2_tables); } @@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt p->pcre2_compile_context); if (p->pcre2_pattern) { - p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL); + p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context); if (!p->pcre2_match_data) die("Couldn't allocate PCRE2 match data"); } else { From patchwork Thu Feb 4 21:05:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18EF7C433DB for ; Thu, 4 Feb 2021 21:07:57 +0000 (UTC) Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D3E2664FA7 for ; Thu, 4 Feb 2021 21:07:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230171AbhBDVHy (ORCPT ); Thu, 4 Feb 2021 16:07:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230106AbhBDVHe (ORCPT ); Thu, 4 Feb 2021 16:07:34 -0500 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81CCFC0617A7 for ; Thu, 4 Feb 2021 13:06:21 -0800 (PST) Received: by mail-wm1-x336.google.com with SMTP id j21so1508617wmj.0 for ; Thu, 04 Feb 2021 13:06:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/xIet10MhWiHzzWeiSUKVPoxFWBjUs8GWyU+LV8FRZc=; b=LCEV2aemf1213mPC7mEhf6USd0uIpCNLSZrkuJIjnVVWPreXK2lw3NPXM3w1e/2W7R L+/xfXNgNKHFjstzLKURofXdh3nmn3kHsWr2KlbUNAe9ihIk9Znl9Bx0ZT8A8p6smCWM RRrDen0JBatpAkkKpGt8J5gs0ljeWuTsh2WnbFm+WyFFIypYVFYMFDX5YUki5Er3i4hI CU1Yh5g7xlQgR+UOZ+tAoqDr21X3GN/u/D9klEZOWHrafOHsieD4P9Yl1lXwEbuzvGmv IoKUSeAK0xdo6LWVvIR5Wz/2/LlEKQPrR6JkA91Rza7djWxAGlAbdr9/GD38oxy3VWUg OKlw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/xIet10MhWiHzzWeiSUKVPoxFWBjUs8GWyU+LV8FRZc=; b=bFwUxDZZk9UlAATgQNNcRUZMTRu8Fy8c/BOUKA3PZ7mDn7idIBS9yH/svoUQYPnihN /gZrapFKXvbklU2pPgw6xWRj2r6bOhseTUOtW6UfkPVMtPpRu1LV0OYTGv+96CUrRXYd B9xJw+BlaBFmQuj3hdmirmFR0+hK8bmUFdihFBAm/Tm6CW/LPwNOaeQxpor9e5dZGM8X i1vBe1UznLwUCxhdb2SoAYClMPzNHIYxt+Xybpr7oGh0HyWJUNa5X0EFpBdqrcpGkfng kHO8MwiaITgdxYOVCpFFprhbn8tPT3rAKlLzz44Rsy/fMnLa/uWXrkU3cSgjtzi+1wIF C4Zw== 
X-Gm-Message-State: AOAM530ItYAgWnkWtRuojOn66tpCEMuzw+9YYAd+ZeX+fU+K6KopJ6EL uqYT0aetc0C9fx3xk/YOm2sCyNtGmiZAVQ== X-Google-Smtp-Source: ABdhPJxPUC9Jv3zZC1WlSkchJqQrs9se+Cleu5jFoYlKWRZcq3c3fbr0GL4+ZH8K5AELBFqSKvlIZQ== X-Received: by 2002:a1c:ab88:: with SMTP id u130mr847803wme.185.1612472779862; Thu, 04 Feb 2021 13:06:19 -0800 (PST) Received: from vm.nix.is (vm.nix.is. [2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:19 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 09/10] grep/pcre2: move back to thread-only PCREv2 structures Date: Thu, 4 Feb 2021 22:05:55 +0100 Message-Id: <20210204210556.25242-10-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Change the setup of the "pcre2_general_context" to happen per-thread in compile_pcre2_pattern() instead of in grep_init(), as already happens with all the rest of the pcre2_* members of the grep_pat structure. As noted in the preceding commit the approach 513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16) took to allocate the pcre2_general_context seems to have been initially based on a misunderstanding of how PCREv2 memory allocation works. This approach of creating a global context is just added complexity for almost zero gain. On my system it's 24 bytes saved per-thread; for context, PCREv2 will then go on to allocate some kilobytes for its own thread-local state.
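The ownership model described above can be sketched without PCRE2 itself. This is a minimal sketch under stated assumptions, not the code in this patch: struct general_context and struct pattern are hypothetical stand-ins for pcre2_general_context and the PCRE2 members of struct grep_pat, and counting_malloc()/counting_free() play the role of pcre2_malloc()/pcre2_free(). The point is only the lifecycle: each pattern creates its own context first, allocates everything else through it, and frees it last.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for pcre2_general_context: allocator callbacks. */
struct general_context {
	void *(*alloc)(size_t size, void *data);
	void (*release)(void *ptr, void *data);
	void *data;
};

/* Hypothetical stand-in for the PCRE2 members of struct grep_pat. */
struct pattern {
	struct general_context *gctx;	/* owned by this pattern */
	char *compiled;			/* placeholder for compiled state */
};

static int live_allocations;	/* allocations made minus allocations freed */

static void *counting_malloc(size_t size, void *data)
{
	void *ptr = malloc(size);
	(void)data;
	if (ptr)
		live_allocations++;
	return ptr;
}

static void counting_free(void *ptr, void *data)
{
	(void)data;
	if (ptr)
		live_allocations--;
	free(ptr);
}

/* Create the context first; everything else is allocated through it. */
static int pattern_init(struct pattern *p, size_t size)
{
	p->compiled = NULL;
	p->gctx = counting_malloc(sizeof(*p->gctx), NULL);
	if (!p->gctx)
		return -1;
	p->gctx->alloc = counting_malloc;
	p->gctx->release = counting_free;
	p->gctx->data = NULL;

	p->compiled = p->gctx->alloc(size, p->gctx->data);
	if (!p->compiled) {
		counting_free(p->gctx, NULL);
		p->gctx = NULL;
		return -1;
	}
	return 0;
}

/* Free dependent state first and the context last. */
static void pattern_release(struct pattern *p)
{
	assert(p->gctx);
	p->gctx->release(p->compiled, p->gctx->data);
	counting_free(p->gctx, NULL);
	p->gctx = NULL;
	p->compiled = NULL;
}
```

Because no state is shared between patterns, there is nothing to initialize up front and nothing to tear down globally, which is what lets grep_destroy() go away.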
As noted in 6d423dd542f (grep: don't redundantly compile throwaway patterns under threading, 2017-05-25) the grep code is intentionally not trying to micro-optimize allocations by e.g. sharing some PCREv2 structures globally, while making others thread-local. So let's remove this special case and make all of them thread-local for simplicity again. See also the discussion in 94da9193a6 (grep: add support for PCRE v2, 2017-06-01) about thread safety, and Johannes's comments[1] to the effect that we should be doing what this patch is doing. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason --- builtin/grep.c | 1 - grep.c | 41 +++++++++++++++-------------------------- grep.h | 3 ++- 3 files changed, 17 insertions(+), 28 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 55d06c9513..c69fe99340 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1175,6 +1175,5 @@ int cmd_grep(int argc, const char **argv, const char *prefix) run_pager(&opt, prefix); clear_pathspec(&pathspec); free_grep_patterns(&opt); - grep_destroy(); return !hit; } diff --git a/grep.c b/grep.c index 0116ff5f09..2599f329cd 100644 --- a/grep.c +++ b/grep.c @@ -41,7 +41,6 @@ static struct grep_opt grep_defaults = { }; #ifdef USE_LIBPCRE2 -static pcre2_general_context *pcre2_global_context; #define GREP_PCRE2_DEBUG_MALLOC 0 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data) @@ -163,20 +162,9 @@ int grep_config(const char *var, const char *value, void *cb) * Initialize one instance of grep_opt and copy the * default values from the template we read the configuration * information in an earlier call to git_config(grep_config). - * - * If using PCRE, make sure that the library is configured - * to use the same allocator as Git (e.g. nedmalloc on Windows). - * - * Any allocated memory needs to be released in grep_destroy(). 
*/ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix) { -#if defined(USE_LIBPCRE2) - if (!pcre2_global_context) - pcre2_global_context = pcre2_general_context_create( - pcre2_malloc, pcre2_free, NULL); -#endif - *opt = grep_defaults; opt->repo = repo; @@ -186,13 +174,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix opt->header_tail = &opt->header_list; } -void grep_destroy(void) -{ -#ifdef USE_LIBPCRE2 - pcre2_general_context_free(pcre2_global_context); -#endif -} - static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt) { /* @@ -384,13 +365,20 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt int patinforet; size_t jitsizearg; - /* pcre2_global_context is initialized in grep_init */ + /* + * Call pcre2_general_context_create() before calling any + * other pcre2_*(). It sets up our malloc()/free() functions + * with which everything else is allocated. 
+ */ + p->pcre2_general_context = pcre2_general_context_create( + pcre2_malloc, pcre2_free, NULL); + if (!p->pcre2_general_context) + die("Couldn't allocate PCRE2 general context"); + if (opt->ignore_case) { if (!opt->ignore_locale && has_non_ascii(p->pattern)) { - if (!pcre2_global_context) - BUG("pcre2_global_context uninitialized"); - p->pcre2_tables = pcre2_maketables(pcre2_global_context); - p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context); + p->pcre2_tables = pcre2_maketables(p->pcre2_general_context); + p->pcre2_compile_context = pcre2_compile_context_create(p->pcre2_general_context); pcre2_set_character_tables(p->pcre2_compile_context, p->pcre2_tables); } @@ -411,7 +399,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt p->pcre2_compile_context); if (p->pcre2_pattern) { - p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context); + p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, p->pcre2_general_context); if (!p->pcre2_match_data) die("Couldn't allocate PCRE2 match data"); } else { @@ -491,10 +479,11 @@ static void free_pcre2_pattern(struct grep_pat *p) pcre2_code_free(p->pcre2_pattern); pcre2_match_data_free(p->pcre2_match_data); #ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER - pcre2_maketables_free(pcre2_global_context, p->pcre2_tables); + pcre2_maketables_free(p->pcre2_general_context, p->pcre2_tables); #else free((void *)p->pcre2_tables); #endif + pcre2_general_context_free(p->pcre2_general_context); } #else /* !USE_LIBPCRE2 */ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt) diff --git a/grep.h b/grep.h index 64666e9204..72f82b1e30 100644 --- a/grep.h +++ b/grep.h @@ -14,6 +14,7 @@ typedef int pcre2_code; typedef int pcre2_match_data; typedef int pcre2_compile_context; +typedef int pcre2_general_context; #endif #ifndef PCRE2_MATCH_INVALID_UTF /* PCRE2_MATCH_* dummy also with !USE_LIBPCRE2, for 
test-pcre2-config.c */ @@ -75,6 +76,7 @@ struct grep_pat { pcre2_code *pcre2_pattern; pcre2_match_data *pcre2_match_data; pcre2_compile_context *pcre2_compile_context; + pcre2_general_context *pcre2_general_context; const uint8_t *pcre2_tables; uint32_t pcre2_jit_on; unsigned fixed:1; @@ -167,7 +169,6 @@ struct grep_opt { int grep_config(const char *var, const char *value, void *); void grep_init(struct grep_opt *, struct repository *repo, const char *prefix); -void grep_destroy(void); void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt); void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t); From patchwork Thu Feb 4 21:05:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= X-Patchwork-Id: 12068795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78381C433E0 for ; Thu, 4 Feb 2021 21:08:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3122E64FA7 for ; Thu, 4 Feb 2021 21:08:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230172AbhBDVH6 (ORCPT ); Thu, 4 Feb 2021 16:07:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55468 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhBDVHf 
(ORCPT ); Thu, 4 Feb 2021 16:07:35 -0500 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A9C11C0617A9 for ; Thu, 4 Feb 2021 13:06:22 -0800 (PST) Received: by mail-wm1-x331.google.com with SMTP id l12so4328079wmq.2 for ; Thu, 04 Feb 2021 13:06:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Cfc9HKftHNiMcn9e6q5gY0WzMkB21tulNMrnFpqEr90=; b=ZciB0oft0mcKiFPiX44cNn7gnIMVjP3Lo7lSUIiv6Cyk7YDwb+am+1TCO1E3dhzCyQ 92vHZYvF1Vq7k3qs6Jg92lh59rY9wbb/jUX5BClsegcJLtO4+p8gJIqTcMm9asorAU5u rDOmxtsKvSKx4aVjZlNcovMtcmPC1FYano3/MT/HP/kp4v5UC25SvLC/pcI1P7HH9N8P y1AIuZE8f+McPHvQkkTyKLCq+lUR+x1N15I3TglgdMzBmfXRjYF55Z1hkcYfvudy6BeR 8LrCqgbr6Vkt28XzdTUz0ZDXeuR9XIpS96bXCr+VDxx/RgEoJfTstPFn+HsEmM0Y0l/n HVtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Cfc9HKftHNiMcn9e6q5gY0WzMkB21tulNMrnFpqEr90=; b=J0Kqt//DQ3LFkjxQP8otckE+i39WO+mdKJjP7KtqdSdJGaVkVsZB4f0dpRExs4qiS6 VYK1Rab5xTa9mvDQZblYSSef4xzJQ5nJ2MmKwKsn5tjn98Q8m/fSrxVOYArFcTE4u5TO OIxb4wMdwVJsayxBK7Y65ejTQWzjIACKyUs01RHrjf+kS5407g0m/q+Hrbn0d1DiFw6w RWfvsWdKl37BYpCHnettUfTgWbmaCHJDSBcwsiWwIZl+cmVcUOPSw84zhY19KdRaqx6n ldH2zeRm6pHO7EKfF/2sh+BPlcV5ZzZDdMK3MUovXdO6o6HUwFalggQyMMKwoqGZGXke fnxw== X-Gm-Message-State: AOAM530YLnhkmUhJ/my2/XuBGuWSjnEsuT519+blumYWueOJIxjZiGW2 Irwgb77QU/IBehYpk0Z28bkJbhMIfaUgbQ== X-Google-Smtp-Source: ABdhPJwKXMbuYhzzvtK14SQ87m0GMRXu1iTbEQbqKT/20xlWdjwTQ6Ur4x879J6ZqLIUpOi6gOwcjA== X-Received: by 2002:a1c:356:: with SMTP id 83mr883700wmd.31.1612472781206; Thu, 04 Feb 2021 13:06:21 -0800 (PST) Received: from vm.nix.is (vm.nix.is. 
[2a01:4f8:120:2468::2]) by smtp.gmail.com with ESMTPSA id n5sm6779318wmq.7.2021.02.04.13.06.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Feb 2021 13:06:20 -0800 (PST) From: =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= To: git@vger.kernel.org Cc: Junio C Hamano , Jeff King , Johannes Schindelin , =?utf-8?q?Carlo_Marcelo_A?= =?utf-8?q?renas_Bel=C3=B3n?= , =?utf-8?b?w4Z2YXIgQXJu?= =?utf-8?b?ZmrDtnLDsCBCamFybWFzb24=?= Subject: [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Date: Thu, 4 Feb 2021 22:05:56 +0100 Message-Id: <20210204210556.25242-11-avarab@gmail.com> X-Mailer: git-send-email 2.30.0.284.gd98b1dd5eaa7 In-Reply-To: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> References: <191d3a2280232ff98964fd42bfe0bc85ee3708f5.1571227824.git.gitgitgadget@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Move the definitions of the pcre2_{malloc,free} functions to just above the compile_pcre2_pattern() function they're used in. Before the preceding commit they needed to be defined earlier in the file, but now we can move them to be adjacent to the other PCREv2 functions.
Signed-off-by: Ævar Arnfjörð Bjarmason --- grep.c | 46 ++++++++++++++++++++++------------------------ 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/grep.c b/grep.c index 2599f329cd..636ac48bf0 100644 --- a/grep.c +++ b/grep.c @@ -40,30 +40,6 @@ static struct grep_opt grep_defaults = { .output = std_output, }; -#ifdef USE_LIBPCRE2 -#define GREP_PCRE2_DEBUG_MALLOC 0 - -static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data) -{ - void *pointer = malloc(size); -#if GREP_PCRE2_DEBUG_MALLOC - static int count = 1; - fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size); -#endif - return pointer; -} - -static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data) -{ -#if GREP_PCRE2_DEBUG_MALLOC - static int count = 1; - if (pointer) - fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++); -#endif - free(pointer); -} -#endif - static const char *color_grep_slots[] = { [GREP_COLOR_CONTEXT] = "context", [GREP_COLOR_FILENAME] = "filename", @@ -355,6 +331,28 @@ static int is_fixed(const char *s, size_t len) } #ifdef USE_LIBPCRE2 +#define GREP_PCRE2_DEBUG_MALLOC 0 + +static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data) +{ + void *pointer = malloc(size); +#if GREP_PCRE2_DEBUG_MALLOC + static int count = 1; + fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size); +#endif + return pointer; +} + +static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data) +{ +#if GREP_PCRE2_DEBUG_MALLOC + static int count = 1; + if (pointer) + fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++); +#endif + free(pointer); +} + static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt) { int error;