From patchwork Mon Sep 30 01:50:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165905 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 15CFE76 for ; Mon, 30 Sep 2019 01:51:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E920821835 for ; Mon, 30 Sep 2019 01:51:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="nU+vo6Ex" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729280AbfI3BvP (ORCPT ); Sun, 29 Sep 2019 21:51:15 -0400 Received: from mail-qt1-f194.google.com ([209.85.160.194]:40160 "EHLO mail-qt1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726360AbfI3BvP (ORCPT ); Sun, 29 Sep 2019 21:51:15 -0400 Received: by mail-qt1-f194.google.com with SMTP id f7so14987019qtq.7 for ; Sun, 29 Sep 2019 18:51:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GuBB+1Z30y1D2AQnWS40HJipgE2bRJGCPbGP610hHKQ=; b=nU+vo6Ex7J4/qVhkBrA2Lt3GjKGvhncKjgorzYTUW4sICCkwDPyNK0rStYah61k93Y kIYQW5OmZbahZiLmNqfIh9SFChqDVygWeoYMFgIv4NxIEHWFplD79YGWKNS0ucOLIrYc dIoAVLFnbsQggVUttfBQlVDJrbZR4MrTbtSkxh63Gzc8LF2Zxi77r3BU74YQvajnc6U9 9n7/SPgohAO2MUySiiJri8B3+eq5snKR8S7BEi0DrhanAOd9QsukJ/E+XUzYYyTosQ2l EHUWVJgmjgwUgrpKYFyLOIdolujiwfgWlw5BRd61fIDKMOOskZP6Bu/bSop7RH4RqJKk Ss2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GuBB+1Z30y1D2AQnWS40HJipgE2bRJGCPbGP610hHKQ=; b=WGUSw4a4pBQ6xJGcp5f9VDn8AngfM5al+Di05LNwYVpLWXmkrwQWKN8FJN2csc7r0p L8eVBxNP2jhnqltT9A334ZpwyJA/6HpnQQMyh+Fp5QiSNFJftZGFy0waP/WClRENbBkG /sZmyFms5pwwHV1N5R6yvHbavOsi+O6OznCnfF8ij+ntteOXfM+82lQdgpHqLFRNWEmw tq9cPefQPxPG8LQCyWT4XsQFP1k6tkadmYpbsstfd9mHlZe2hOSBNhMIgHYmHfa+KPjV IYuNfK7neM/Zttz5lpqK5JCyG3CDKLGINIWs040UM9urzQM4wne+kFTOwEmdHRosfutf EZ/Q== X-Gm-Message-State: APjAAAXs3iHlqu+7tC7qlpsSwRkOGkynNOofKD09zTFnRBBM5ex6q9+I XxHkaWPPAk+PfLa170QwehZYngzviXY= X-Google-Smtp-Source: APXvYqxZpbu3COuOBfQ5BNPqrhWtQb7FjzttUdQ0Tgoa815boj0T6DnfECDfblF6hLZGoBzyq8GqZw== X-Received: by 2002:aed:3be3:: with SMTP id s32mr22422387qte.156.1569808274294; Sun, 29 Sep 2019 18:51:14 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:13 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com Subject: [PATCH v2 01/11] grep: fix race conditions on userdiff calls Date: Sun, 29 Sep 2019 22:50:47 -0300 Message-Id: <0f31cb0c126e824008d35d5cba52dd1c3c115c00.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org git-grep uses an internal grep_read_mutex to protect object reading operations. Similarly, there's a grep_attr_mutex to protect calls to the gitattributes machinery. However, two of the three functions protected by the last mutex may also perform object reading, as seen bellow: - userdiff_get_textconv() > notes_cache_init() > notes_cache_match_validity() > lookup_commit_reference_gently() > parse_object() > repo_has_object_file() > repo_has_object_file_with_flags() > oid_object_info_extended() - userdiff_find_by_path() > git_check_attr() > collect_some_attrs() > prepare_attr_stack() > read_attr() > read_attr_from_index() > read_blob_data_from_index() > read_object_file() As these calls are not protected by grep_read_mutex, there might be race conditions with other threads performing object reading (e.g. threads calling fill_textconv() at grep.c:fill_textconv_grep()). To prevent that, let's make sure to acquire the lock before both of these calls. Note: this patch might slow down the threaded grep in worktree, for the sake of thread-safeness. However, in the following patches we should regain performance by replacing grep_read_mutex for an internal object reading lock and allowing parallel inflation during object reading. Signed-off-by: Matheus Tavares --- grep.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/grep.c b/grep.c index cd952ef5d3..b29946def2 100644 --- a/grep.c +++ b/grep.c @@ -1809,7 +1809,9 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle * is not thread-safe. */ grep_attr_lock(); + grep_read_lock(); textconv = userdiff_get_textconv(opt->repo, gs->driver); + grep_read_unlock(); grep_attr_unlock(); } @@ -2177,8 +2179,11 @@ void grep_source_load_driver(struct grep_source *gs, return; grep_attr_lock(); - if (gs->path) + if (gs->path) { + grep_read_lock(); gs->driver = userdiff_find_by_path(istate, gs->path); + grep_read_unlock(); + } if (!gs->driver) gs->driver = userdiff_find_by_name("default"); grep_attr_unlock(); From patchwork Mon Sep 30 01:50:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165909 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F23B1599 for ; Mon, 30 Sep 2019 01:51:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7DC5F2082F for ; Mon, 30 Sep 2019 01:51:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="RkUNq/LN" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729326AbfI3BvT (ORCPT ); Sun, 29 Sep 2019 21:51:19 -0400 Received: from mail-qk1-f196.google.com ([209.85.222.196]:39492 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726360AbfI3BvT (ORCPT ); Sun, 29 Sep 2019 21:51:19 -0400 Received: by mail-qk1-f196.google.com with SMTP id 4so6410954qki.6 for ; Sun, 29 Sep 2019 18:51:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=HEIHFvzdwAhSMqiGSbq8PfzCVlfGdZT5BATSO8gHVmU=; b=RkUNq/LNdMHm7HRBzzVc0GmdEDNOEMWmJSWIUYu71hW5qOgdac0+6u2AEe4jp/ybo8 Mm8Xq/wEqcHgANA8/vxGg9QB4kP5P1HtDzp6OB2vF7tTmYQ7Yyp1rRCuI7TF34FvTlps fiU67c2RZTSEIG4mLjlf0xVXzy7uDuBvPixtlkXIuBmqBn6UlbLgk//T3nw+7C6Lr8+u bUbjmZBXwSLGVdkLokdlEEy32vBUzdql1Be0Tk5Qv4ovQQknTno9NPl/bs4g2EvTatyZ TXgJbrSVG96tvM7NNS8LyhOLklE9E2553uqiw8GCQkBciC8jyvkM3scZjet2sMTwQBMv g1dA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=HEIHFvzdwAhSMqiGSbq8PfzCVlfGdZT5BATSO8gHVmU=; b=GJ+PO/7Nw3wcpIFpCWDhBUK/k6d/clVUV+VqaMRSpJjRMiIPxRgPN/8/2YH3v7/41C /89brz5GbJ8MHiJFF2oigjjnswTRInwlMlZdNHoSfdyqjyiMUfpzvgAEKVHRJWNzxQ2I rZjqng7CeaCKFmZjIe1n2HWOuI+g03vEZ0tH3pPnGjLffTYmFAGfxhInKToY9T2Qbz8t dyfPBz5gmdbOBmlo9Jjv10ETwkoMFia82Tl6L5jWgo21ezeZQnyqyJ0t7cmFMP5OGQrq LTMcQRCr0twduG/CImrfSbpu0rUE8kiBTty5lpw5cgYE7sjwv7oZmxFuUWxQcQl2MDFU QZRA== X-Gm-Message-State: APjAAAVCgLuMNo+uTPxOZ6hmL9FdTfbvG+JCcg/IHLVZYlYUxBsik2OK ZrqY1p71Xvz80IbYI8A1W/VPNMSCvsE= X-Google-Smtp-Source: APXvYqxCbuflG8utemjZgjgqsKX6isF07KHjzDqNYGWjVnzQx4QQc7IThsVnFyUsnVx8auRJT+gUQA== X-Received: by 2002:a37:dc06:: with SMTP id v6mr16508690qki.163.1569808277832; Sun, 29 Sep 2019 18:51:17 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:17 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Antonio Ospite , Brandon Williams , Jonathan Tan , Stefan Beller Subject: [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Date: Sun, 29 Sep 2019 22:50:48 -0300 Message-Id: X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org There're currently two function calls in builtin/grep.c:grep_submodule() which might result in race conditions: - submodule_from_path(): it has config_with_options() in its call stack which, in turn, may have read_object_file() in its own. Therefore, calling the first function without acquiring grep_read_mutex may end up causing a race condition with other object read operations performed by worker threads (for example, at the fill_textconv() call in grep.c:fill_textconv_grep()). - parse_object_or_die(): it falls into the same problem, having repo_has_object_file(the_repository, ...) in its call stack. Besides that, parse_object(), which is also called by parse_object_or_die(), is thread-unsafe and also called by object reading functions. It's unlikely to really fall into a data race with these operations as the volume of calls to them is usually very low. But we better protect ourselves against this possibility, anyway. So, to solve these issues, move both of these function calls into the critical section of grep_read_mutex. Signed-off-by: Matheus Tavares --- builtin/grep.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 2699001fbd..626dbe554d 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -407,8 +407,7 @@ static int grep_submodule(struct grep_opt *opt, { struct repository subrepo; struct repository *superproject = opt->repo; - const struct submodule *sub = submodule_from_path(superproject, - &null_oid, path); + const struct submodule *sub; struct grep_opt subopt; int hit; @@ -419,6 +418,7 @@ static int grep_submodule(struct grep_opt *opt, * object. */ grep_read_lock(); + sub = submodule_from_path(superproject, &null_oid, path); if (!is_submodule_active(superproject, path)) { grep_read_unlock(); @@ -455,9 +455,8 @@ static int grep_submodule(struct grep_opt *opt, unsigned long size; struct strbuf base = STRBUF_INIT; - object = parse_object_or_die(oid, oid_to_hex(oid)); - grep_read_lock(); + object = parse_object_or_die(oid, oid_to_hex(oid)); data = read_object_with_reference(&subrepo, &object->oid, tree_type, &size, NULL); From patchwork Mon Sep 30 01:50:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165911 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C9FB417E1 for ; Mon, 30 Sep 2019 01:51:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A928F2082F for ; Mon, 30 Sep 2019 01:51:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="0QH8YtUj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729331AbfI3BvX (ORCPT ); Sun, 29 Sep 2019 21:51:23 -0400 Received: from mail-qk1-f194.google.com ([209.85.222.194]:45424 "EHLO mail-qk1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726360AbfI3BvW (ORCPT ); Sun, 29 Sep 2019 21:51:22 -0400 Received: by mail-qk1-f194.google.com with SMTP id z67so6386270qkb.12 for ; Sun, 29 Sep 2019 18:51:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=x2GhhSRV6LXMMvgatnOqDZDnEP37kWP7zdaFkJZKwHE=; b=0QH8YtUjsmESg8NCanQ5mphknGlgX2VipNTvzb6gncT+NOEx3TAkEPWh2UtOpCmelX awTzmdOo/+xyKJ59vOrbtkMEf+T0RYyPSCeSlQyp9b/aoylU0DpCFpcTdjuZSnQPqULR lZ8xOH25FUbGQZWzky5SQNzJ9zpQfqgNU26+knQSfkOAH/qLX7nmedFXo0e9sF59xoqC 0Sz33fnciru1InetNOtuZj9wk9/gqLuqODitPYx9EXMLoYvoCNpyE9+qLqXvmcrTTEcq dJO2M1JVeEFKRKJA4s/PTjhkPGl7kxMYK/WCCOqcUiLHWJkDeHnceym9mmG7NxpNpZnM i3KQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=x2GhhSRV6LXMMvgatnOqDZDnEP37kWP7zdaFkJZKwHE=; b=sPvOR2m1bDL9A4sB7ekLExetzZgPjeIKbVyOpoBTRlDvNPbcfzfKCdWiJsqsto2gor I15WAxlKiENs4JB5uS+Rt6ZPaF901RBOIQWn0lwdl4P5wxOdnNNP5jbskyJFSE+eqgjB CqakOoh/HxMAeBp/WjrFJ0tPWQab8qMTOBOlpfD8sk+I6AWKbb8ZNDlSeSJvaVlxf16R kIwMD0/HZAEyyEHYdBN/hXxsQB63dRNWr1dtXOGcs5GNNcmBfxQjyRHqzTYgcdwlivsc Z4+kk/xRLeeZdOIPDLpOL4ckrWKaJ42J5OlenRBdR5IMKjYOrsGwpp0AworA8pUvehFf VoFA== X-Gm-Message-State: APjAAAWGbisX3bO9Cv0jB/F5rUryukFB7fjmrkCsfh5cT02KH7U85oQX sDa6/e8OOrIEpAsqSN/2DGhfuwGxuos= X-Google-Smtp-Source: APXvYqxX7U6u6qIkljl+jwKds3nndM5rvBcoJk41ex8JeNziSStY9J+9LIxZIrYLVCRoWc+ivm2Olg== X-Received: by 2002:a05:620a:6cd:: with SMTP id 13mr16793534qky.266.1569808281282; Sun, 29 Sep 2019 18:51:21 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:20 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Jonathan Tan , Stefan Beller , Brandon Williams , "brian m. carlson" Subject: [PATCH v2 03/11] grep: fix racy calls in grep_objects() Date: Sun, 29 Sep 2019 22:50:49 -0300 Message-Id: <34aeb218bf266ac1a8fabcd9e8b307130d31eb0b.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org deref_tag() calls is_promisor_object() and parse_object(), both of which perform lazy initializations and other thread-unsafe operations. If it was only called by grep_objects() this wouldn't be a problem as the latter is only executed by the main thread. However, deref_tag() is also present in read_object_file()'s call stack. So calling deref_tag() in grep_objects() without acquiring the grep_read_mutex may incur in a race condition with object reading operations (such as the ones internally performed by fill_textconv(), called at fill_textconv_grep()). The same problem happens with the call to gitmodules_config_oid() which also has parse_object() in its call stack. Fix that protecting both call with the said grep_read_mutex. Signed-off-by: Matheus Tavares --- builtin/grep.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/builtin/grep.c b/builtin/grep.c index 626dbe554d..fa8b9996d1 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -658,13 +658,18 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i < nr; i++) { struct object *real_obj; + + grep_read_lock(); real_obj = deref_tag(opt->repo, list->objects[i].item, NULL, 0); + grep_read_unlock(); /* load the gitmodules file for this rev */ if (recurse_submodules) { submodule_free(opt->repo); + grep_read_lock(); gitmodules_config_oid(&real_obj->oid); + grep_read_unlock(); } if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) { From patchwork Mon Sep 30 01:50:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165913 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B173E76 for ; Mon, 30 Sep 2019 01:51:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 861B62082F for ; Mon, 30 Sep 2019 01:51:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="J5bdIGRI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729337AbfI3Bv1 (ORCPT ); Sun, 29 Sep 2019 21:51:27 -0400 Received: from mail-qk1-f195.google.com ([209.85.222.195]:43399 "EHLO mail-qk1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729332AbfI3Bv0 (ORCPT ); Sun, 29 Sep 2019 21:51:26 -0400 Received: by mail-qk1-f195.google.com with SMTP id h126so6394544qke.10 for ; Sun, 29 Sep 2019 18:51:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XjgY7U/fvsDJln7bP14c0fp0bf0y0uP4VAMF2X8hlH4=; b=J5bdIGRIoAdsHb4iIZBVofV8+gz9Pp89U1SUCfR1+6MEZrF/Ca6sYy9tnxQyomPYpV AlOYYFJJ+zmWTd3xyquUrIhJtYW/iPtYAadCScBhRmPEU1nK8lx5jlC30bPQqF20dluw sXSBDrM9VlUO0+k03zB/RPu04miZeaJk9GLhLdBPUcvUadH99Gy5S86kYxRw9n2TsvRX oc/tbM3PaBeSDoGg29JVaveBxh+8y+WTxeblKFJ4Wl3enT83mLULAKkvYOcgU54yoJgR dN11kqbPu/L0kq4bHjAtsE7y0GxhBnvU2HDVf4gM/9z9aQXBORki7PMC67Yu4h+Ilntu yZig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XjgY7U/fvsDJln7bP14c0fp0bf0y0uP4VAMF2X8hlH4=; b=k25ThJfZT8e94FY3IlluDImyik3U/gJCn0eVakUY4VoqaJDY+kBA6NAXR5nuim16aE FfZqpLTq+tEIeVPPmvUFlwv/VNbTw2RCE2Ei7JBcOhXSfzz5EDszIkhTJcqakXT6h8ku K0hgSGF0LRJyT3/8IW1zY+FqCsgDX6nWCLgMOR1pjkXxcr45a+IglOSwbT275UZa4HAu CdmdbDqjLrRMZgdoqqgGLRbX99FxBsiR1Xov0m4y9uiAwLm2ITqmv7BqR6LWq6QgbML0 eiAdz7W3G4QBIivtTjTtvBO8Q8NGbmYBbNGmR6NVt3yMOj9ow5tHO8XdP/5UyLqixv5f cAyQ== X-Gm-Message-State: APjAAAWWB7GME6NImWrztgmWlRzpNDflMNLBpUiWdo4Avw5fhEv/2QQX WOrGduMaNvA3cP7WHMM05oX4WOXjfzI= X-Google-Smtp-Source: APXvYqzIE6TafNEQyekya/UwhGp+csMJvh8myje4UFybMwt0txPuaI7iuLSKIfh9LuEKYMkfxz+ykg== X-Received: by 2002:a37:bd5:: with SMTP id 204mr15002726qkl.330.1569808284813; Sun, 29 Sep 2019 18:51:24 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:24 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Stefan Beller Subject: [PATCH v2 04/11] replace-object: make replace operations thread-safe Date: Sun, 29 Sep 2019 22:50:50 -0300 Message-Id: <5deee3cf11e6f67c696617a9264395fb1ab04f73.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org replace-object functions are very close to being thread-safe: the only current racy section is the lazy initialization at prepare_replace_object(). The following patches will protect some object reading operations to be called threaded, but before that, replace functions must be protected. To do so, add a mutex to struct raw_object_store and acquire it before lazy initializing the replace_map. This won't cause any noticeable performance drop as the mutex will no longer be used after the replace_map is initialized. Later, when the replace functions are called in parallel, thread debuggers might point our use of the added replace_map_initialized flag as a data race. However, as this boolean variable is initialized as false and it's only updated once, there's no real harm. It's perfectly fine if the value is updated right after a thread read it in replace-map.h:lookup_replace_object() (there'll only be a performance penalty for the affected threads at that moment). We could cease the debugger warning protecting the variable reading at the said function. However, this would negatively affect performance for all threads calling it, at any time, so it's not really worthy since the warning doesn't represent a real problem. Instead, to make sure we don't get false positives (at ThreadSanitizer, at least) an entry for the respective function is added to .tsan-suppressions. Signed-off-by: Matheus Tavares --- .tsan-suppressions | 6 ++++++ object-store.h | 2 ++ object.c | 2 ++ replace-object.c | 11 ++++++++++- replace-object.h | 7 ++++++- 5 files changed, 26 insertions(+), 2 deletions(-) diff --git a/.tsan-suppressions b/.tsan-suppressions index 8c85014a0a..5ba86d6845 100644 --- a/.tsan-suppressions +++ b/.tsan-suppressions @@ -8,3 +8,9 @@ # in practice it (hopefully!) doesn't matter. race:^want_color$ race:^transfer_debug$ + +# A boolean value, which tells whether the replace_map has been initialized or +# not, is read racily with an update. As this variable is written to only once, +# and it's OK if the value change right after reading it, this shouldn't be a +# problem. +race:^lookup_replace_object$ diff --git a/object-store.h b/object-store.h index 7f7b3cdd80..b22e20ad7d 100644 --- a/object-store.h +++ b/object-store.h @@ -110,6 +110,8 @@ struct raw_object_store { * (see git-replace(1)). */ struct oidmap *replace_map; + unsigned replace_map_initialized : 1; + pthread_mutex_t replace_mutex; /* protect object replace functions */ struct commit_graph *commit_graph; unsigned commit_graph_attempted : 1; /* if loading has been attempted */ diff --git a/object.c b/object.c index 07bdd5b26e..7ef5856f57 100644 --- a/object.c +++ b/object.c @@ -480,6 +480,7 @@ struct raw_object_store *raw_object_store_new(void) memset(o, 0, sizeof(*o)); INIT_LIST_HEAD(&o->packed_git_mru); + pthread_mutex_init(&o->replace_mutex, NULL); return o; } @@ -507,6 +508,7 @@ void raw_object_store_clear(struct raw_object_store *o) oidmap_free(o->replace_map, 1); FREE_AND_NULL(o->replace_map); + pthread_mutex_destroy(&o->replace_mutex); free_commit_graph(o->commit_graph); o->commit_graph = NULL; diff --git a/replace-object.c b/replace-object.c index e295e87943..7bd9aba6ee 100644 --- a/replace-object.c +++ b/replace-object.c @@ -34,14 +34,23 @@ static int register_replace_ref(struct repository *r, void prepare_replace_object(struct repository *r) { - if (r->objects->replace_map) + if (r->objects->replace_map_initialized) return; + pthread_mutex_lock(&r->objects->replace_mutex); + if (r->objects->replace_map_initialized) { + pthread_mutex_unlock(&r->objects->replace_mutex); + return; + } + r->objects->replace_map = xmalloc(sizeof(*r->objects->replace_map)); oidmap_init(r->objects->replace_map, 0); for_each_replace_ref(r, register_replace_ref, NULL); + r->objects->replace_map_initialized = 1; + + pthread_mutex_unlock(&r->objects->replace_mutex); } /* We allow "recursive" replacement. Only within reason, though */ diff --git a/replace-object.h b/replace-object.h index 04ed7a85a2..3fbc32eb7b 100644 --- a/replace-object.h +++ b/replace-object.h @@ -24,12 +24,17 @@ const struct object_id *do_lookup_replace_object(struct repository *r, * name (replaced recursively, if necessary). The return value is * either sha1 or a pointer to a permanently-allocated value. When * object replacement is suppressed, always return sha1. + * + * Note: some thread debuggers might point a data race on the + * replace_map_initialized reading in this function. However, we know there's no + * problem in the value being updated by one thread right after another one read + * it here (and it should be written to only once, anyway). */ static inline const struct object_id *lookup_replace_object(struct repository *r, const struct object_id *oid) { if (!read_replace_refs || - (r->objects->replace_map && + (r->objects->replace_map_initialized && r->objects->replace_map->map.tablesize == 0)) return oid; return do_lookup_replace_object(r, oid); From patchwork Mon Sep 30 01:50:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165915 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2171B76 for ; Mon, 30 Sep 2019 01:51:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E15552082F for ; Mon, 30 Sep 2019 01:51:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="ToSfrbTE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729346AbfI3Bva (ORCPT ); Sun, 29 Sep 2019 21:51:30 -0400 Received: from mail-qt1-f196.google.com ([209.85.160.196]:37827 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729332AbfI3Bva (ORCPT ); Sun, 29 Sep 2019 21:51:30 -0400 Received: by mail-qt1-f196.google.com with SMTP id l3so14998555qtr.4 for ; Sun, 29 Sep 2019 18:51:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=xgiB40YHaRVet1Sl9BloM9o+wnaTv1b6Qxmf3/gnBVI=; b=ToSfrbTEgOhkOg2tinGqYc3vLv5sdaDmnDVrlpJVQHFdiZIBIfEf1yb57hzO/2eHz5 N4rnHN4Ig1O9qCYR0agidAtmpurCaZZOktmMPqUP1yOsidvlVpUBFCVATtREAnJDQBrD ZfHl4HuB7M3JaUAyoEjQtNNe1Ldlco9O1iocZbdCaKB/tIZ3r0Fn50qTF/YfCJc5mKHv PuLrmINEHfQe0QfPFAVHbU9R6Y5eQoQGnoaNDplfYLnyztpDRZf63f6oquGn1QYR5+Iu Z5tGndsuk2tPoZ4JEHpyAVuXLcsT/bo7UbASYhhYLDFu8hSDrAe3b9Vzx8B/05lk/sOP deIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xgiB40YHaRVet1Sl9BloM9o+wnaTv1b6Qxmf3/gnBVI=; b=Ju6YNeWEM8o/SkUarzOOJnr5zvG2ioKjbK7b2Bs1Xz8UthU0D1c4+x8/QKwBQ4/PWQ DjgubfGzSCrF+K95oUqGta8N9BCxqT9o1h95WJRjIQyq7aduMKrIOLangCAENhrMFXyC yzsbqV4fmjcBb4JWD5XsySJ8J4pRIyE8OFvo6PDgxOOf5BBMLNtOLs4Uzc/Yo3Cga3tT QrDrYmbXbSYnRQo8t94ypmK7AayYwR8ZvBjCskDAEpLVEXv/VJ4LQCaSOShDmFcgyxu7 ViDI7XEy2Lj9u1jjrnX3AbXrQDy5MMS1egXtv3iBj+VWrQsbxFZX5hxGdR3GEhQeV8SU crYQ== X-Gm-Message-State: APjAAAUhUuc25/lE5zYF0lwklU3qq2FIvF9GxSfLX7zVOnkxvLgyW56T 7/hyMqjOUstFGe7QfSUGj3BIEdjgxFE= X-Google-Smtp-Source: APXvYqyV188mKs7XrkpyMjBZeUM7NHgY6Wg859aZT0jdsorRLuSUudPKnrHJ5V7N8JgGCbWYify+DQ== X-Received: by 2002:ac8:50c:: with SMTP id u12mr22205686qtg.322.1569808288773; Sun, 29 Sep 2019 18:51:28 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:28 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Stefan Beller , Jonathan Tan , Jeff King Subject: [PATCH v2 05/11] object-store: allow threaded access to object reading Date: Sun, 29 Sep 2019 22:50:51 -0300 Message-Id: <4c5652ab34f0989856aba919ca84b2b091dcad98.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Allow object reading to be performed by multiple threads protecting it with an internal lock. The lock usage can be toggled with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the functions which can be safely called in parallel are: read_object_file_extended(), repo_read_object_file(), read_object_file(), read_object_with_reference(), read_object(), oid_object_info() and oid_object_info_extended(). It's also possible to use obj_read_lock() and obj_read_unlock() to protect other sections that cannot execute in parallel with object reading. Probably there are many spots in the functions listed above that could be executed unlocked (and thus, in parallel). But, for now, we are most interested in allowing parallel access to zlib inflation. This is one of the sections where object reading spends most of the time and it's already thread-safe. So, to take advantage of that, the respective lock is released when calling git_inflate() and re-acquired right after, for every calling spot in oid_object_info_extended()'s call chain. We may refine the lock to also exploit other possible parallel spots in the future, but threaded zlib inflation should already give great speedups for now. Note that add_delta_base_cache() was also modified to skip adding already present entries to the cache. This wasn't possible before, but now it is since phase I and phase III of unpack_entry() may execute concurrently. Another important thing to notice is that the object reading lock only works in conjunction with the 'struct raw_object_store's replace_mutex. Otherwise, there would still be racy spots in object reading functions. Signed-off-by: Matheus Tavares --- object-store.h | 35 +++++++++++++++++++++++++++++++ packfile.c | 7 +++++++ sha1-file.c | 57 +++++++++++++++++++++++++++++++++++++++++++++----- 3 files changed, 94 insertions(+), 5 deletions(-) diff --git a/object-store.h b/object-store.h index b22e20ad7d..8f63f21ad2 100644 --- a/object-store.h +++ b/object-store.h @@ -6,6 +6,7 @@ #include "list.h" #include "sha1-array.h" #include "strbuf.h" +#include "thread-utils.h" struct object_directory { struct object_directory *next; @@ -230,6 +231,40 @@ int has_loose_object_nonlocal(const struct object_id *); void assert_oid_type(const struct object_id *oid, enum object_type expect); +/* + * Enabling the object read lock allows multiple threads to safely call the + * following functions in parallel: repo_read_object_file(), read_object_file(), + * read_object_file_extended(), read_object_with_reference(), read_object(), + * oid_object_info() and oid_object_info_extended(). + * + * obj_read_lock() and obj_read_unlock() may also be used to protect other + * section which cannot execute in parallel with object reading. Since the used + * lock is a recursive mutex, these sections can even contain calls to object + * reading functions. However, beware that in these cases zlib inflation won't + * be performed in parallel, losing performance. + * + * TODO: oid_object_info_extended()'s call stack has a recursive behavior. If + * any of its callees end up calling it, this recursive call won't benefit from + * parallel inflation. + */ +void enable_obj_read_lock(void); +void disable_obj_read_lock(void); + +extern int obj_read_use_lock; +extern pthread_mutex_t obj_read_mutex; + +static inline void obj_read_lock(void) +{ + if(obj_read_use_lock) + pthread_mutex_lock(&obj_read_mutex); +} + +static inline void obj_read_unlock(void) +{ + if(obj_read_use_lock) + pthread_mutex_unlock(&obj_read_mutex); +} + struct object_info { /* Request */ enum object_type *typep; diff --git a/packfile.c b/packfile.c index 1a7d69fe32..a336972174 100644 --- a/packfile.c +++ b/packfile.c @@ -1098,7 +1098,9 @@ unsigned long get_size_from_delta(struct packed_git *p, do { in = use_pack(p, w_curs, curpos, &stream.avail_in); stream.next_in = in; + obj_read_unlock(); st = git_inflate(&stream, Z_FINISH); + obj_read_lock(); curpos += stream.next_in - in; } while ((st == Z_OK || st == Z_BUF_ERROR) && stream.total_out < sizeof(delta_head)); @@ -1451,6 +1453,9 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset, struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent)); struct list_head *lru, *tmp; + if (get_delta_base_cache_entry(p, base_offset)) + return; + delta_base_cached += base_size; list_for_each_safe(lru, tmp, &delta_base_cache_lru) { @@ -1580,7 +1585,9 @@ static void *unpack_compressed_entry(struct packed_git *p, do { in = use_pack(p, w_curs, curpos, &stream.avail_in); stream.next_in = in; + obj_read_unlock(); st = git_inflate(&stream, Z_FINISH); + obj_read_lock(); if (!stream.avail_out) break; /* the payload is larger than it should be */ curpos += stream.next_in - in; diff --git a/sha1-file.c b/sha1-file.c index e85f249a5d..b4f2f5cb94 100644 --- a/sha1-file.c +++ b/sha1-file.c @@ -1148,6 +1148,8 @@ static int unpack_loose_short_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz) { + int ret; + /* Get the data stream */ memset(stream, 0, sizeof(*stream)); stream->next_in = map; @@ -1156,7 +1158,11 @@ static int unpack_loose_short_header(git_zstream *stream, stream->avail_out = bufsiz; git_inflate_init(stream); - return git_inflate(stream, 0); + obj_read_unlock(); + ret = git_inflate(stream, 0); + obj_read_lock(); + + return ret; } int unpack_loose_header(git_zstream *stream, @@ -1201,7 +1207,9 @@ static int unpack_loose_header_to_strbuf(git_zstream *stream, unsigned char *map stream->avail_out = bufsiz; do { + obj_read_unlock(); status = git_inflate(stream, 0); + obj_read_lock(); strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer); if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer)) return 0; @@ -1241,8 +1249,11 @@ static void *unpack_loose_rest(git_zstream *stream, */ stream->next_out = buf + bytes; stream->avail_out = size - bytes; - while (status == Z_OK) + while (status == Z_OK) { + obj_read_unlock(); status = git_inflate(stream, Z_FINISH); + obj_read_lock(); + } } if (status == Z_STREAM_END && !stream->avail_in) { git_inflate_end(stream); @@ -1412,10 +1423,32 @@ static int loose_object_info(struct repository *r, return (status < 0) ? status : 0; } +int obj_read_use_lock = 0; +pthread_mutex_t obj_read_mutex; + +void enable_obj_read_lock(void) +{ + if (obj_read_use_lock) + return; + + obj_read_use_lock = 1; + init_recursive_mutex(&obj_read_mutex); +} + +void disable_obj_read_lock(void) +{ + if (!obj_read_use_lock) + return; + + obj_read_use_lock = 0; + pthread_mutex_destroy(&obj_read_mutex); +} + int fetch_if_missing = 1; -int oid_object_info_extended(struct repository *r, const struct object_id *oid, - struct object_info *oi, unsigned flags) +static int do_oid_object_info_extended(struct repository *r, + const struct object_id *oid, + struct object_info *oi, unsigned flags) { static struct object_info blank_oi = OBJECT_INFO_INIT; struct pack_entry e; @@ -1423,6 +1456,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid, const struct object_id *real = oid; int already_retried = 0; + if (flags & OBJECT_INFO_LOOKUP_REPLACE) real = lookup_replace_object(r, oid); @@ -1498,7 +1532,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid, rtype = packed_object_info(r, e.p, e.offset, oi); if (rtype < 0) { mark_bad_packed_object(e.p, real->hash); - return oid_object_info_extended(r, real, oi, 0); + return do_oid_object_info_extended(r, real, oi, 0); } else if (oi->whence == OI_PACKED) { oi->u.packed.offset = e.offset; oi->u.packed.pack = e.p; @@ -1509,6 +1543,17 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid, return 0; } +int oid_object_info_extended(struct repository *r, const struct object_id *oid, + struct object_info *oi, unsigned flags) +{ + int ret; + obj_read_lock(); + ret = do_oid_object_info_extended(r, oid, oi, flags); + obj_read_unlock(); + return ret; +} + + /* returns enum object_type or negative */ int oid_object_info(struct repository *r, const struct object_id *oid, @@ -1581,6 +1626,7 @@ void *read_object_file_extended(struct repository *r, if (data) return data; + obj_read_lock(); if (errno && errno != ENOENT) die_errno(_("failed to read object %s"), oid_to_hex(oid)); @@ -1596,6 +1642,7 @@ void *read_object_file_extended(struct repository *r, if ((p = has_packed_and_bad(r, repl->hash)) != NULL) die(_("packed object %s (stored in %s) is corrupt"), oid_to_hex(repl), p->pack_name); + obj_read_unlock(); return NULL; } From patchwork Mon Sep 30 01:50:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165917 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2257F76 for ; Mon, 30 Sep 2019 01:51:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E23A62082F for ; Mon, 30 Sep 2019 01:51:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="olhFkpZ9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729351AbfI3Bve (ORCPT ); Sun, 29 Sep 2019 21:51:34 -0400 Received: from mail-qk1-f194.google.com ([209.85.222.194]:37431 "EHLO mail-qk1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729332AbfI3Bve (ORCPT ); Sun, 29 Sep 2019 21:51:34 -0400 Received: by mail-qk1-f194.google.com with SMTP id u184so6434764qkd.4 for ; Sun, 29 Sep 2019 18:51:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=BCXLfX08GEjmLMIhE6ljGeSzKkdsW1qzo97krZsV1tE=; b=olhFkpZ9Dpy8H9ckaxqRoiaird5Nbu97mAdS9cNW95yg0p2LPsfkSr0tgnsfabPoc+ wQSyl0sZWBIgTrBtoewJFodBEDiOxPS6KWJvIxZlqTy/O8dAsQyUQA4xIo9DgQtBsJWw IW3pRBAc1jWCS9DrmFefzz0AnLA1vq7dxQtBzZy5AHGDnyiEASMak2w/D51c4z59Y6hI C+Kx4tBk+HShlnnD1y9R/1jTEzBpWnlH1XBhqBV5Czv8Khc6eJ5UHAXaPo/LyGhUGkWT 91a9eXjmwgVPByxmqb2WkO7uJ9bmdDhg39fxI43/XBUMi1ftzUpZL8uZkBJI8n8jwRyE S04g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=BCXLfX08GEjmLMIhE6ljGeSzKkdsW1qzo97krZsV1tE=; b=qKM2jhlf7Lo6R9GPFiOtjvpyfKVWMcEDfupT4+TLnoYiRDi+pPzBC8CGM1CA/XuAiX YI0m+eR1MVvO4/iVSzNJDXQPnDWtAFi5WeHUppZYm3qQLphKhpYcDnp6XJzGGK9g0kEp IH+91+CVtOix0RAyWKtftIivgK6v6KiYRIDHKexZ5ZHldgs8OJvBBhZM8fmyld58A/zs K88+rzvc3zTuXUNaZxMZKQk5kFsYq4QoeDFulzM7xJed2rdrnLdKV6napeJOR8jSGUvx sMaAfQD2W0rBK+MxkmkmXEb9xcbggLadbNfm/AbQKh+lc7vUbkhTfNmWUFK4GUKJRNzU pCVQ== X-Gm-Message-State: APjAAAUjheqfI9uUebI5RLgP6hLLhwWBYOZ4B7Xt9BihAtCFfFTkoto0 e0xEwFRRUyMBfmI5BS8zBKdNniV1gcI= X-Google-Smtp-Source: APXvYqyFENJtt8UMEpu0IiLhhrnv/N8EcYgJIeyDJmsbcWTDbMCylBUSmP5HTeoLtp7oLgUnmEb7iQ== X-Received: by 2002:a37:af87:: with SMTP id y129mr15915368qke.98.1569808292530; Sun, 29 Sep 2019 18:51:32 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:32 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Stefan Beller , "brian m. carlson" , Brandon Williams Subject: [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Date: Sun, 29 Sep 2019 22:50:52 -0300 Message-Id: <48b632d7a0278f4abb4f0b0390f316a631a9d0ef.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org git-grep uses 'grep_read_mutex' to protect its calls to object reading operations. But these have their own internal lock now, which ensures a better performance (allowing parallel access to more regions). So, let's remove the former and, instead, activate the latter with enable_obj_read_lock(). Sections that are currently protected by 'grep_read_mutex' but are not internally protected by the object reading lock should be surrounded by obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion with object reading operations, keeping the current behavior and avoiding race conditions. Namely, these places are: In grep.c: - fill_textconv() at fill_textconv_grep(). - userdiff_get_textconv() at grep_source_1(). In builtin/grep.c: - parse_object_or_die() and the submodule functions at grep_submodule(). - deref_tag() and gitmodules_config_oid() at grep_objects(). If these functions become thread-safe, in the future, we might remove the locking and probably get some speedup. Note that some of the submodule functions will already be thread-safe (or close to being thread-safe) with the internal object reading lock. However, as some of them will require additional modifications to be removed from the critical section, this will be done in its own patch. Signed-off-by: Matheus Tavares --- builtin/grep.c | 46 ++++++++++++++++------------------------------ grep.c | 39 +++++++++++++++++++-------------------- grep.h | 13 ------------- 3 files changed, 35 insertions(+), 63 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index fa8b9996d1..5a404ee1db 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -200,12 +200,12 @@ static void start_threads(struct grep_opt *opt) int i; pthread_mutex_init(&grep_mutex, NULL); - pthread_mutex_init(&grep_read_mutex, NULL); pthread_mutex_init(&grep_attr_mutex, NULL); pthread_cond_init(&cond_add, NULL); pthread_cond_init(&cond_write, NULL); pthread_cond_init(&cond_result, NULL); grep_use_locks = 1; + enable_obj_read_lock(); for (i = 0; i < ARRAY_SIZE(todo); i++) { strbuf_init(&todo[i].out, 0); @@ -257,12 +257,12 @@ static int wait_all(void) free(threads); pthread_mutex_destroy(&grep_mutex); - pthread_mutex_destroy(&grep_read_mutex); pthread_mutex_destroy(&grep_attr_mutex); pthread_cond_destroy(&cond_add); pthread_cond_destroy(&cond_write); pthread_cond_destroy(&cond_result); grep_use_locks = 0; + disable_obj_read_lock(); return hit; } @@ -295,16 +295,6 @@ static int grep_cmd_config(const char *var, const char *value, void *cb) return st; } -static void *lock_and_read_oid_file(const struct object_id *oid, enum object_type *type, unsigned long *size) -{ - void *data; - - grep_read_lock(); - data = read_object_file(oid, type, size); - grep_read_unlock(); - return data; -} - static int grep_oid(struct grep_opt *opt, const struct object_id *oid, const char *filename, int tree_name_len, const char *path) @@ -413,20 +403,20 @@ static int grep_submodule(struct grep_opt *opt, /* * NEEDSWORK: submodules functions need to be protected because they - * access the object store via config_from_gitmodules(): the latter - * uses get_oid() which, for now, relies on the global the_repository - * object. + * call config_from_gitmodules(): the latter contains in its call stack + * many thread-unsafe operations that are racy with object reading, such + * as parse_object() and is_promisor_object(). */ - grep_read_lock(); + obj_read_lock(); sub = submodule_from_path(superproject, &null_oid, path); if (!is_submodule_active(superproject, path)) { - grep_read_unlock(); + obj_read_unlock(); return 0; } if (repo_submodule_init(&subrepo, superproject, sub)) { - grep_read_unlock(); + obj_read_unlock(); return 0; } @@ -443,7 +433,7 @@ static int grep_submodule(struct grep_opt *opt, * object. */ add_to_alternates_memory(subrepo.objects->odb->path); - grep_read_unlock(); + obj_read_unlock(); memcpy(&subopt, opt, sizeof(subopt)); subopt.repo = &subrepo; @@ -455,13 +445,12 @@ static int grep_submodule(struct grep_opt *opt, unsigned long size; struct strbuf base = STRBUF_INIT; - grep_read_lock(); + obj_read_lock(); object = parse_object_or_die(oid, oid_to_hex(oid)); + obj_read_unlock(); data = read_object_with_reference(&subrepo, &object->oid, tree_type, &size, NULL); - grep_read_unlock(); - if (!data) die(_("unable to read tree (%s)"), oid_to_hex(&object->oid)); @@ -586,7 +575,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec, void *data; unsigned long size; - data = lock_and_read_oid_file(&entry.oid, &type, &size); + data = read_object_file(&entry.oid, &type, &size); if (!data) die(_("unable to read tree (%s)"), oid_to_hex(&entry.oid)); @@ -624,12 +613,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec, struct strbuf base; int hit, len; - grep_read_lock(); data = read_object_with_reference(opt->repo, &obj->oid, tree_type, &size, NULL); - grep_read_unlock(); - if (!data) die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid)); @@ -659,17 +645,17 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, for (i = 0; i < nr; i++) { struct object *real_obj; - grep_read_lock(); + obj_read_lock(); real_obj = deref_tag(opt->repo, list->objects[i].item, NULL, 0); - grep_read_unlock(); + obj_read_unlock(); /* load the gitmodules file for this rev */ if (recurse_submodules) { submodule_free(opt->repo); - grep_read_lock(); + obj_read_lock(); gitmodules_config_oid(&real_obj->oid); - grep_read_unlock(); + obj_read_unlock(); } if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) { diff --git a/grep.c b/grep.c index b29946def2..0ca400f7b6 100644 --- a/grep.c +++ b/grep.c @@ -1533,11 +1533,6 @@ static inline void grep_attr_unlock(void) pthread_mutex_unlock(&grep_attr_mutex); } -/* - * Same as git_attr_mutex, but protecting the thread-unsafe object db access. - */ -pthread_mutex_t grep_read_mutex; - static int match_funcname(struct grep_opt *opt, struct grep_source *gs, char *bol, char *eol) { xdemitconf_t *xecfg = opt->priv; @@ -1734,13 +1729,20 @@ static int fill_textconv_grep(struct repository *r, } /* - * fill_textconv is not remotely thread-safe; it may load objects - * behind the scenes, and it modifies the global diff tempfile - * structure. + * fill_textconv is not remotely thread-safe; it modifies the global + * diff tempfile structure, writes to the_repo's odb and might + * internally call thread-unsafe functions such as the + * prepare_packed_git() lazy-initializator. Because of the last two, we + * must ensure mutual exclusion between this call and the object reading + * API, thus we use obj_read_lock() here. + * + * TODO: allowing text conversion to run in parallel with object + * reading operations might increase performance in the multithreaded + * non-worktreee git-grep with --textconv. */ - grep_read_lock(); + obj_read_lock(); size = fill_textconv(r, driver, df, &buf); - grep_read_unlock(); + obj_read_unlock(); free_filespec(df); /* @@ -1806,13 +1808,16 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle grep_source_load_driver(gs, opt->repo->index); /* * We might set up the shared textconv cache data here, which - * is not thread-safe. + * is not thread-safe. Also, get_oid_with_context() and + * parse_object() might be internally called. As they are not + * currenty thread-safe and might be racy with object reading, + * obj_read_lock() must be called. */ + obj_read_lock(); grep_attr_lock(); - grep_read_lock(); textconv = userdiff_get_textconv(opt->repo, gs->driver); - grep_read_unlock(); grep_attr_unlock(); + obj_read_unlock(); } /* @@ -2111,10 +2116,7 @@ static int grep_source_load_oid(struct grep_source *gs) { enum object_type type; - grep_read_lock(); gs->buf = read_object_file(gs->identifier, &type, &gs->size); - grep_read_unlock(); - if (!gs->buf) return error(_("'%s': unable to read %s"), gs->name, @@ -2179,11 +2181,8 @@ void grep_source_load_driver(struct grep_source *gs, return; grep_attr_lock(); - if (gs->path) { - grep_read_lock(); + if (gs->path) gs->driver = userdiff_find_by_path(istate, gs->path); - grep_read_unlock(); - } if (!gs->driver) gs->driver = userdiff_find_by_name("default"); grep_attr_unlock(); diff --git a/grep.h b/grep.h index 1875880f37..54bf3a1ed4 100644 --- a/grep.h +++ b/grep.h @@ -235,18 +235,5 @@ int grep_threads_ok(const struct grep_opt *opt); */ extern int grep_use_locks; extern pthread_mutex_t grep_attr_mutex; -extern pthread_mutex_t grep_read_mutex; - -static inline void grep_read_lock(void) -{ - if (grep_use_locks) - pthread_mutex_lock(&grep_read_mutex); -} - -static inline void grep_read_unlock(void) -{ - if (grep_use_locks) - pthread_mutex_unlock(&grep_read_mutex); -} #endif From patchwork Mon Sep 30 01:50:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165919 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 82E3176 for ; Mon, 30 Sep 2019 01:51:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5752421835 for ; Mon, 30 Sep 2019 01:51:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="rKb04N8R" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729357AbfI3Bvi (ORCPT ); Sun, 29 Sep 2019 21:51:38 -0400 Received: from mail-qk1-f196.google.com ([209.85.222.196]:42553 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729332AbfI3Bvh (ORCPT ); Sun, 29 Sep 2019 21:51:37 -0400 Received: by mail-qk1-f196.google.com with SMTP id f16so6408983qkl.9 for ; Sun, 29 Sep 2019 18:51:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RxfiWCZo0YKl0QO8A60X9gdAN0bHMQeS4a3WgubKrxI=; b=rKb04N8RoZxum1e1BBA5fkndiGdam/Len6OBLeaxJkOKDE6LHd0KEiMcrF+6egeCVF lWnqeGMS6pdES/El82K8aczti0nDCBzzVlM3fwVhsmTtuD+Snao+4OdHlkDFL6P1AMXn ZeoVr/M8DAAY3xQVbS78eRk47UdMCHmRy+jVMcL44XSLqWNZ+QFcD+cCiB9W//Wl26MZ j/5FsqDleaWZYkjxOrq9NM32JUFJyumKL3bpxaL96iD/4cY562fD26aEo5eeVKnoh3jm EyuVv2cEBC8HxfISbkpTzumRNX/oyXVwK3lamPK5vIHF5oUAi1NIfbsLJ1buF/zg3qMD vzhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RxfiWCZo0YKl0QO8A60X9gdAN0bHMQeS4a3WgubKrxI=; b=a1DLnD2n85kf+Vj03/8mL6ndiP9R5dSZ7DfoYFYfDqyHbPoziVnmZcT2ZlmOzg6Aw9 Pvu1GyZz0xlIRUxLcRScFlINW1InksjYJ+Pb4xdzQ7lgOb7GFCD19xVt+a3twZDilJGT NY4PQZjb1HiSkbTN69c1EWA00Wd85sjNoze1gIstX2WAXhcQu+RA7qWPmcwF5UKycmrX blOuk/KJGUMKNUMx1zK5Gp2/YfMMvRj3KCFqOt7J+yOv/xNQkKbX1gUrfBCbi9XnkB2A 1iy3+SCRmQ0zTy92jT8T6qJIyxklSRdtjuTKW6c2jmf58uiUrgyrAoeVk7NjcKnfGXpL 2dPg== X-Gm-Message-State: APjAAAWVfYhheM48OrhGQy6xKoMHUWmacoy7qDPqj2PWmAeeW6AXoV5u 1diV07P8yY0m2vPI9fEDJTIk9cQul8Y= X-Google-Smtp-Source: APXvYqw4VfcOg4qptuoRkpRfqGiPGi+o8FlbzCXRTEYQP85putAEIjqMBVSYT+FaN/d67qE+ed/JWg== X-Received: by 2002:a37:6005:: with SMTP id u5mr15242355qkb.79.1569808296340; Sun, 29 Sep 2019 18:51:36 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:35 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Stefan Beller , Brandon Williams , Jonathan Tan Subject: [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Date: Sun, 29 Sep 2019 22:50:53 -0300 Message-Id: <38940b38af5337646b0ddd4b06d3efb1b97aec81.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Currently, submodule-config.c doesn't have an externally acessible function to read gitmodules only if it wasn't already read. But this exactly behavior is internally implemented by gitmodules_read_check(), to perform a lazy load. Let's merge this function with repo_read_gitmodules() adding an 'skip_if_read' which allow both internal and external callers to access this functionality. This simplifies a little the code. The added option will also be used in the following patch. Signed-off-by: Matheus Tavares --- builtin/grep.c | 2 +- submodule-config.c | 18 ++++++------------ submodule-config.h | 2 +- unpack-trees.c | 4 ++-- 4 files changed, 10 insertions(+), 16 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 5a404ee1db..1c4ff4a75f 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -420,7 +420,7 @@ static int grep_submodule(struct grep_opt *opt, return 0; } - repo_read_gitmodules(&subrepo); + repo_read_gitmodules(&subrepo, 0); /* * NEEDSWORK: This adds the submodule's object directory to the list of diff --git a/submodule-config.c b/submodule-config.c index 4264ee216f..8c4333120a 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -660,10 +660,13 @@ static int gitmodules_cb(const char *var, const char *value, void *data) return parse_config(var, value, ¶meter); } -void repo_read_gitmodules(struct repository *repo) +void repo_read_gitmodules(struct repository *repo, int skip_if_read) { submodule_cache_check_init(repo); + if (repo->submodule_cache->gitmodules_read && skip_if_read) + return; + if (repo_read_index(repo) < 0) return; @@ -689,20 +692,11 @@ void gitmodules_config_oid(const struct object_id *commit_oid) the_repository->submodule_cache->gitmodules_read = 1; } -static void gitmodules_read_check(struct repository *repo) -{ - submodule_cache_check_init(repo); - - /* read the repo's .gitmodules file if it hasn't been already */ - if (!repo->submodule_cache->gitmodules_read) - repo_read_gitmodules(repo); -} - const struct submodule *submodule_from_name(struct repository *r, const struct object_id *treeish_name, const char *name) { - gitmodules_read_check(r); + repo_read_gitmodules(r, 1); return config_from(r->submodule_cache, treeish_name, name, lookup_name); } @@ -710,7 +704,7 @@ const struct submodule *submodule_from_path(struct repository *r, const struct object_id *treeish_name, const char *path) { - gitmodules_read_check(r); + repo_read_gitmodules(r, 1); return config_from(r->submodule_cache, treeish_name, path, lookup_path); } diff --git a/submodule-config.h b/submodule-config.h index 1b4e2da658..7a76ef8cd8 100644 --- a/submodule-config.h +++ b/submodule-config.h @@ -39,7 +39,7 @@ int option_fetch_parse_recurse_submodules(const struct option *opt, const char *arg, int unset); int parse_update_recurse_submodules_arg(const char *opt, const char *arg); int parse_push_recurse_submodules_arg(const char *opt, const char *arg); -void repo_read_gitmodules(struct repository *repo); +void repo_read_gitmodules(struct repository *repo, int skip_if_read); void gitmodules_config_oid(const struct object_id *commit_oid); const struct submodule *submodule_from_name(struct repository *r, const struct object_id *commit_or_tree, diff --git a/unpack-trees.c b/unpack-trees.c index 9c25126aec..689575944c 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -292,11 +292,11 @@ static void load_gitmodules_file(struct index_state *index, if (pos >= 0) { struct cache_entry *ce = index->cache[pos]; if (!state && ce->ce_flags & CE_WT_REMOVE) { - repo_read_gitmodules(the_repository); + repo_read_gitmodules(the_repository, 0); } else if (state && (ce->ce_flags & CE_UPDATE)) { submodule_free(the_repository); checkout_entry(ce, state, NULL, NULL); - repo_read_gitmodules(the_repository); + repo_read_gitmodules(the_repository, 0); } } } From patchwork Mon Sep 30 01:50:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165921 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0D70C76 for ; Mon, 30 Sep 2019 01:51:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E0AE621835 for ; Mon, 30 Sep 2019 01:51:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="cw8F0ogW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729362AbfI3Bvm (ORCPT ); Sun, 29 Sep 2019 21:51:42 -0400 Received: from mail-qt1-f196.google.com ([209.85.160.196]:35476 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729332AbfI3Bvm (ORCPT ); Sun, 29 Sep 2019 21:51:42 -0400 Received: by mail-qt1-f196.google.com with SMTP id m15so15034765qtq.2 for ; Sun, 29 Sep 2019 18:51:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fqBGrCUWa9QM3lyZChHVyrO7vjAONfPnBWLMDPdXlmE=; b=cw8F0ogWxLE1mNHB8crnvmdLKkdpBhA3x8PW9zVbwODp1bsyk1jzZMtEA+yt8X7zSA Vl2w21joa2gL+G8N1uAUNd5z2JC6aebLAsrbxFl0KQEdMTLWWM1MwJ4ala9bgbN8xpS2 dlUzrnSUpKeAx5r5BAQXcXK/MDTDoZjlshwBCsjYkhsV+o7F4CR/rtTKmlXYZkmY1lyY HYTDEZZYCNIYtL92UN64fmgrWkEljAH1TfVSurzikwT58bj4Ux7nPjyLoy8exqugOQ6t pVXtJEtLQVkIbuhGOArG28m9cZnFUtnMDyHKpwjkTck387aH5U8/0/sFAguupqEYx7ud mm5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fqBGrCUWa9QM3lyZChHVyrO7vjAONfPnBWLMDPdXlmE=; b=n4P7p+M9/Ygy25fIHMr2qqJnv6UkcNlw0q5QG0wYrIHIhCJulo7L8ZCutTwVxxArfT f8W9i8g+FDbnn1jTRFpDQNZrNmavqN6VW1ymmIt7KRo/RqwHXEj9uaKE1HWv4I/IToki MlmNWZlDW0i+RwSU+J4gKk5/kKmT9bUhCLJfAaS42DmjRjLn0JiJUx/IaGIYfQ1nrf9j nZQjyeqt8Ii/pxzo2SJ60+hPITvsRYyb0ICp/Gt3C2uUX8EOMENlqnfzxot6han+vgPc AtxK9foOwKk+26+t608kCFMa4k/o2WGTSmCbHRD8lUQ5RJ5zIBSnwZ8BODgokq2UGWx4 QqsQ== X-Gm-Message-State: APjAAAWqi0jMHleCjLSq76/kRVEIUq6keGx9UW4rt5gnYlEtvb36nVEn hqQQfE+WbozxNwyrKG6Xd5Q5GajbfIg= X-Google-Smtp-Source: APXvYqwec1ngAUgkpw1ZWzhn06D8TlqFehUu8sNjJD+cxgspfH4a6dipaG0x9IdF7ErU9wpwkqh9Lw== X-Received: by 2002:a0c:9369:: with SMTP id e38mr18566952qve.25.1569808299593; Sun, 29 Sep 2019 18:51:39 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:39 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Brandon Williams , Stefan Beller , Jeff King Subject: [PATCH v2 08/11] grep: allow submodule functions to run in parallel Date: Sun, 29 Sep 2019 22:50:54 -0300 Message-Id: <8070e154070e6eaef6f31077d15349e74147ff84.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Now that object reading operations are internally protected, the submodule initialization functions at builtin/grep.c:grep_submodule() are very close to being thread-safe. Let's take a look at each call and remove from the critical section what we can, for better performance: - submodule_from_path() and is_submodule_active() cannot be called in parallel yet only because they call repo_read_gitmodules() which contains, in its call stack, operations that would otherwise be in race condition with object reading (for example parse_object() and is_promisor_remote()). However, they only call repo_read_gitmodules() if it wasn't read before. So let's pre-read it before firing the threads and allow these two functions to safelly be called in parallel. - repo_submodule_init() is already thread-safe, so remove it from the critical section without other necessary changes. - The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as no other thread is performing object reading operations in the subrepo yet. However, threads might be working in the superproject, and this function calls add_to_alternates_memory() internally, which is racy with object readings in the superproject. So it must be kept protected for now. Let's add a "NEEDSWORK" to it, informing why it cannot be removed from the critical section yet. - Finally, add_to_alternates_memory() must be kept protected by the same reason of the above item. Signed-off-by: Matheus Tavares --- builtin/grep.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 1c4ff4a75f..c973ac46a7 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -401,25 +401,23 @@ static int grep_submodule(struct grep_opt *opt, struct grep_opt subopt; int hit; - /* - * NEEDSWORK: submodules functions need to be protected because they - * call config_from_gitmodules(): the latter contains in its call stack - * many thread-unsafe operations that are racy with object reading, such - * as parse_object() and is_promisor_object(). - */ - obj_read_lock(); sub = submodule_from_path(superproject, &null_oid, path); - if (!is_submodule_active(superproject, path)) { - obj_read_unlock(); + if (!is_submodule_active(superproject, path)) return 0; - } - if (repo_submodule_init(&subrepo, superproject, sub)) { - obj_read_unlock(); + if (repo_submodule_init(&subrepo, superproject, sub)) return 0; - } + /* + * NEEDSWORK: repo_read_gitmodules() might call + * add_to_alternates_memory() via config_from_gitmodules(). This + * operation causes a race condition with concurrent object readings + * performed by the worker threads. That's why we need obj_read_lock() + * here. It should be removed once it's no longer necessary to add the + * subrepo's odbs to the in-memory alternates list. + */ + obj_read_lock(); repo_read_gitmodules(&subrepo, 0); /* @@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) pathspec.recursive = 1; pathspec.recurse_submodules = !!recurse_submodules; + if (recurse_submodules && (!use_index || untracked)) + die(_("option not supported with --recurse-submodules")); + if (list.nr || cached || show_in_pager) { if (num_threads > 1) warning(_("invalid option combination, ignoring --threads")); @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix) && (opt.pre_context || opt.post_context || opt.file_break || opt.funcbody)) skip_first_line = 1; + + /* + * Pre-read gitmodules (if not read already) to prevent racy + * lazy reading in worker threads. + */ + if (recurse_submodules) + repo_read_gitmodules(the_repository, 1); + start_threads(&opt); } else { /* @@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix) } } - if (recurse_submodules && (!use_index || untracked)) - die(_("option not supported with --recurse-submodules")); - if (!show_in_pager && !opt.status_only) setup_pager(); From patchwork Mon Sep 30 01:50:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165923 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1BC711599 for ; Mon, 30 Sep 2019 01:51:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EEE0121835 for ; Mon, 30 Sep 2019 01:51:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="q0yYY3Cf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729371AbfI3Bvp (ORCPT ); Sun, 29 Sep 2019 21:51:45 -0400 Received: from mail-qk1-f193.google.com ([209.85.222.193]:45471 "EHLO mail-qk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729332AbfI3Bvo (ORCPT ); Sun, 29 Sep 2019 21:51:44 -0400 Received: by mail-qk1-f193.google.com with SMTP id z67so6386757qkb.12 for ; Sun, 29 Sep 2019 18:51:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=eZfDqgFYsaqy0pboOGfXHX5w1U2DUJ/RUk17H0TOAIU=; b=q0yYY3CfNrEEkSGYlZSClpXp8xK3H8w3fk1Q5aSnoJ/BVngvxYI8lc4CKbf6vJllL7 P4zVARGUBlXlhGvf9tnOjUEBUsG7jwEcZesGGxfv7iqj216r9Nc9Qel/I7wha3e7D/+6 XMWMc5HVWFH9UZx2tHXZikhVvVfx391oyAV4dq6YomX5fVwlIcCOOG0i9bidUn0gMuOY wDWvNtVPSNu6Twvkz+k4rgB7bL0I3GPDaaRxRcti7aWAFdhi7w6oBZ8PWAGd0vilkCuk U8E9TOhFChzeqnqpQNK/Zy+h8E9oAc40PSWxUzyeWHEmLh994c802/rCuNLXN8sohfpg kITg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=eZfDqgFYsaqy0pboOGfXHX5w1U2DUJ/RUk17H0TOAIU=; b=LYUGjLiX1VA0cjMKWj/WKhUR5oQ/WriEEj3lvrrgMKLjojJuyobF/+wv2HmPWQ6Uvx lTCMStfzvDhCsw83im+ysWS9cQx35lRjsvVc2a5p18ybHaFvyCfkFQjwUaitJi4gE5AO +ePAHPRkKmL7eChkikxRpb1ZpUgOGHiw+8e5c5tFu6j2cJE11J6ZEFGubgK+dch4X20q 0VcOOGB1UrH9uzXgcDGQKV2aHU8+LZU7GeyLs0kYrwKejydnZ12iKj3QaDvgfBo6m4I/ TTwjX3UxI8qprRoeKBV6zioJPNBQqFZzukn2+gHd/n+g9ogdvx9fSBKSm45JRaGnnlOI f+oQ== X-Gm-Message-State: APjAAAWTUMjeCCthLmJyyO0DcmaSwgOVSuPe4vsqoqwfNV6JeQGlA3WN E2RzOKO06pRLYR2YBSsTFsEtQh4SO+o= X-Google-Smtp-Source: APXvYqxsMN7rHwobtV5e+WRVuYlEvbdGXux8lQ8X2KmSVPkxqBhlrxX0RkOswycGz+eU52+d0xdtgA== X-Received: by 2002:a37:a410:: with SMTP id n16mr16804090qke.429.1569808303498; Sun, 29 Sep 2019 18:51:43 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:43 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Brandon Williams , =?utf-8?q?Ren=C3=A9_Scharfe?= , Jonathan Tan , Jeff King , Stefan Beller Subject: [PATCH v2 09/11] grep: protect packed_git [re-]initialization Date: Sun, 29 Sep 2019 22:50:55 -0300 Message-Id: X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Some fields in struct raw_object_store are lazy initialized by the thread-unsafe packfile.c:prepare_packed_git(). Although this function is present in the call stack of git-grep threads, all paths to it are currently protected by obj_read_lock() (and the main thread usually indirectly calls it before firing the worker threads, anyway). However, it's possible that future modifications add new unprotected paths to it, introducing a race condition. Because errors derived from it wouldn't happen often, it could be hard to detect. So to prevent future headaches, let's force eager initialization of packed_git when setting git-grep up. There'll be a small overhead in the cases where we didn't really needed to prepare packed_git during execution but this shouldn't be very noticeable. Also, packed_git may be re-initialized by packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep are already protected by obj_read_lock() but it may suffer from the same problem in the future. So let's also internally protect it with obj_read_lock() (which is a recursive mutex). Signed-off-by: Matheus Tavares --- builtin/grep.c | 8 ++++++-- packfile.c | 2 ++ 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index c973ac46a7..0947596bcd 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -24,6 +24,7 @@ #include "submodule.h" #include "submodule-config.h" #include "object-store.h" +#include "packfile.h" static char const * const grep_usage[] = { N_("git grep [] [-e] [...] [[--] ...]"), @@ -1074,11 +1075,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix) skip_first_line = 1; /* - * Pre-read gitmodules (if not read already) to prevent racy - * lazy reading in worker threads. + * Pre-read gitmodules (if not read already) and force eager + * initialization of packed_git to prevent racy lazy + * reading/initialization once worker threads are started. */ if (recurse_submodules) repo_read_gitmodules(the_repository, 1); + if (startup_info->have_repository) + (void)get_packed_git(the_repository); start_threads(&opt); } else { diff --git a/packfile.c b/packfile.c index a336972174..5b32dac4ce 100644 --- a/packfile.c +++ b/packfile.c @@ -1016,12 +1016,14 @@ void reprepare_packed_git(struct repository *r) { struct object_directory *odb; + obj_read_lock(); for (odb = r->objects->odb; odb; odb = odb->next) odb_clear_loose_cache(odb); r->objects->approximate_object_count_valid = 0; r->objects->packed_git_initialized = 0; prepare_packed_git(r); + obj_read_unlock(); } struct packed_git *get_packed_git(struct repository *r) From patchwork Mon Sep 30 01:50:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165925 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8C0D21599 for ; Mon, 30 Sep 2019 01:51:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6A98621835 for ; Mon, 30 Sep 2019 01:51:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="YMGzzhU0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729332AbfI3Bvt (ORCPT ); Sun, 29 Sep 2019 21:51:49 -0400 Received: from mail-qk1-f193.google.com ([209.85.222.193]:39552 "EHLO mail-qk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729032AbfI3Bvt (ORCPT ); Sun, 29 Sep 2019 21:51:49 -0400 Received: by mail-qk1-f193.google.com with SMTP id 4so6411620qki.6 for ; Sun, 29 Sep 2019 18:51:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JS/RrePUGZe/cr8DroQGq0RI4LOSZ3t+Db1OrFTxxGA=; b=YMGzzhU0XpWWIqLMrNYBNRPG/cQjz2rtbZ0hcuvsBUZFrdecqFP9oLnmfXPxAe0j24 E8mBbJgMh4hORW29HXqiISIyhKYk/8D+PXMby7IZf+dVpyq0xjBQyMpS0I8WG5HCq0tG r5nP5LFZYVDNGHpyAOd+ky9VqNV5dsIvkHDl2h+w9dPxY4sscgCm37c3iSLYY9vdRBk5 4WUS7YvhPu/91lSj02B7CZiOKRgbLz19BOgPZeykfyO0EzyNZP1eAWaZdH4EMMDjdUIt 23wT6IS1OGbHEx+iFsOcDE8v5wou+7lOU+2vd1QxsXCTk9wWOVHp0bf9PNBw/cPK1LZW DBxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JS/RrePUGZe/cr8DroQGq0RI4LOSZ3t+Db1OrFTxxGA=; b=gl6lEPmh5DIdahomFaxu7WJIHgkIq+ESu+lBE3JAkyK0ijtVT4OKMXaL0oL45yXJcq l67gs3UiKZ+F+GjxvdDnsvGMSvHgBypjDrW5J20B76ozHzHfYjqhmr2W9Kty6eG5zmWq +MOG7Kf3hmlkNttwljoeMcoSHm+5a0Gix+Z9uJLKyQp2JLTrtaee7+zgHE8/qalUw0Nt bCsV45RXFpVwSlQZFr+kJMFF9aYV6dJ+9TL9MUP+YD8dsXGbiBQflQt2vX+WMVhfyXSQ wr3WMbvhiQmJW5Wv6Yd/VLS6awaWxYKVX11l173QNV0tOocLR3t0qEnkqXgLt+Mb68B2 gwQg== X-Gm-Message-State: APjAAAVoQUcr397m/fwpXdV9POR2OnYEdGUkpf1bhCo6OAwSSDcsRUq+ t1ovnCc2yLR7cFVMjRe8tqnqx3GAlTc= X-Google-Smtp-Source: APXvYqz+hOhs6LJDdLys59mlfIGx+RSWBNx7wQrFDQQdFoJl1Jg1B02I8oUxcPy++icxkTIH9aoB0A== X-Received: by 2002:a37:a2c3:: with SMTP id l186mr16024219qke.461.1569808306764; Sun, 29 Sep 2019 18:51:46 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:46 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Brandon Williams , Manav Rathi Subject: [PATCH v2 10/11] grep: re-enable threads in non-worktree case Date: Sun, 29 Sep 2019 22:50:56 -0300 Message-Id: X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org They were disabled at 53b8d93 ("grep: disable threading in non-worktree case", 12-12-2011), due to observable performance drops (to the point that using a single thread would be faster than multiple threads). But now that zlib inflation can be performed in parallel we can regain the speedup, so let's re-enable threads in non-worktree grep. Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*' ("Regex 2") at chromium's repository[1] I got: Threads | Regex 1 | Regex 2 ---------|------------|----------- 1 | 17.2920s | 20.9624s 2 | 9.6512s | 11.3184s 4 | 6.7723s | 7.6268s 8** | 6.2886s | 6.9843s These are all means of 30 executions after 2 warmup runs. All tests were executed on an i7-7700HQ (quad core w/ hyper-threading), 16GB of RAM and SSD, running Manjaro Linux. But to make sure the optimization also performs well on HDD, the tests were repeated on another machine with an i5-4210U (dual core w/ hyper-threading), 8GB of RAM and HDD (SATA III, 5400 rpm), also running Manjaro Linux: Threads | Regex 1 | Regex 2 ---------|------------|----------- 1 | 18.4035s | 22.5368s 2 | 12.5063s | 14.6409s 4** | 10.9136s | 12.7106s ** Note that in these cases we relied on hyper-threading, and that's probably why we don't see a big difference in time. Unfortunately, multithreaded git-grep might be slow in the non-worktree case when --textconv is used and there're too many text conversions. Probably the reason for this is that the object read lock is used to protect fill_textconv() and therefore there is a mutual exclusion between textconv execution and object reading. Because both are time consuming operations, not being able to perform them in parallel can cause performance drops. To inform the users about this (and other threading detais), let's also add a "NOTES ON THREADS" section to Documentation/git-grep.txt. [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares --- Documentation/git-grep.txt | 11 +++++++++++ builtin/grep.c | 2 +- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index 2d27969057..00fc59d565 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -330,6 +330,17 @@ EXAMPLES `git grep solution -- :^Documentation`:: Looks for `solution`, excluding files in `Documentation`. +NOTES ON THREADS +---------------- + +The `--threads` option (and the grep.threads configuration) will be ignored when +`--open-files-in-pager` is used, forcing a single-threaded execution. + +When grepping the object store (with `--cached` or giving tree objects), running +with multiple threads might perform slower than single threaded if `--textconv` +is given and there're too many text conversions. So if you experience low +performance in this case, it might be desirable to use `--threads=1`. + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/grep.c b/builtin/grep.c index 0947596bcd..163f14b60d 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1054,7 +1054,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix) if (recurse_submodules && (!use_index || untracked)) die(_("option not supported with --recurse-submodules")); - if (list.nr || cached || show_in_pager) { + if (show_in_pager) { if (num_threads > 1) warning(_("invalid option combination, ignoring --threads")); num_threads = 1; From patchwork Mon Sep 30 01:50:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11165927 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9B71876 for ; Mon, 30 Sep 2019 01:51:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 79CDE2082F for ; Mon, 30 Sep 2019 01:51:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp-br.20150623.gappssmtp.com header.i=@usp-br.20150623.gappssmtp.com header.b="RuMovP3w" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729374AbfI3Bvv (ORCPT ); Sun, 29 Sep 2019 21:51:51 -0400 Received: from mail-qk1-f196.google.com ([209.85.222.196]:40717 "EHLO mail-qk1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729032AbfI3Bvv (ORCPT ); Sun, 29 Sep 2019 21:51:51 -0400 Received: by mail-qk1-f196.google.com with SMTP id y144so6410379qkb.7 for ; Sun, 29 Sep 2019 18:51:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp-br.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=aAHx6dNVHtQnsN4hlXQaghh9Z6aH9U6OUj77ClLhva4=; b=RuMovP3wwuaW7SkfmoEkw7ghHq+KrA79f5pcjHxvpm5QKQQ28mVC3g+C6nT/EBKM4l w6Vg1AEPd6DIKQKQzEjbFRPNpGsF08L3xzLT5uY9k+LCJ/763szAp54IqCAd3bPFiXSp /PD+vkUwjVm5t6feNUZQlSXa274tTX+dnd/tZfwrr55nPkuLNuJLbK5UaQ8peymKgKUa 92EozDb2fSMUk0EdCHP7c5GWomu6Bd77IUeXad8c1f7P3lFB4DOsnM7rzADcd8ZT06Gx dURcNVIYmdj84dodf76kguo4ZrTJSLNz4mYeFx7sjqbatIgKL8nWT5uThFcG0UBzKQHU JtWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=aAHx6dNVHtQnsN4hlXQaghh9Z6aH9U6OUj77ClLhva4=; b=Qyy8A+gNECwiq2yI7+6/mUD1CVPUiJJkUNyRhdwTMNo21T/u1WAoBIi3irlFvz/bRT 9kxAFZ/hE9HuZsOFGWx0KQbejc2lP+XCrChC+w0RBDojCLf9eBGDq6CzFy3+5jrsugIr YjLSWy7PVw7I3FDgzbm/5VNNmF25EiN4hw9qINb4Vpf7R2cKEH/cb4c5xYE+V2IUTBhA VOYE/zps6L+62/YaxazzP8LjNO+GTgH//GTm8obR+WIn1dMMDTaI+6qb2cdFah7W9BKv b7kMs0Zz3T5h6aIw0LVbNGAVCAjq9M9duzk6HS6ZzmGR3FyxeZ+LP2kYub3bfyaTKGES R1Gw== X-Gm-Message-State: APjAAAWgpdJbi4430LteMU4UfQElBK0pRY2Ox2Y2/oisLbD3yMSaLNcQ YQ7q+ajOdjqari9VShlo6e2dGzykxJg= X-Google-Smtp-Source: APXvYqwD27toE/d/dwKGVw2u3+AtsEz3rjvH1SnvUcwOBWIWubXV83jBMtGDK/mdJLc0WzSr7Oycrg== X-Received: by 2002:a37:bd45:: with SMTP id n66mr16206201qkf.272.1569808309907; Sun, 29 Sep 2019 18:51:49 -0700 (PDT) Received: from mango.spo.virtua.com.br ([2804:14c:81:942d::3]) by smtp.gmail.com with ESMTPSA id f11sm4706954qkk.76.2019.09.29.18.51.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 29 Sep 2019 18:51:49 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: christian.couder@gmail.com, olyatelezhnaya@gmail.com, pclouds@gmail.com, gitster@pobox.com, jrnieder@gmail.com, Rasmus Villemoes , Jeff King Subject: [PATCH v2 11/11] grep: move driver pre-load out of critical section Date: Sun, 29 Sep 2019 22:50:57 -0300 Message-Id: <7b0358d1596e6673d12f9592cf05f5193247f3ec.1569808052.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.23.0 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In builtin/grep.c:add_work() we pre-load the userdiff drivers before adding the grep_source in the todo list. This operation is currently being performed after acquiring the grep_mutex, but as it's already thread-safe, we don't need to protect it here. So let's move it out of the critical section which should avoid thread contention and improve performance. Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's repository[2], I got the following mean times for 30 executions after 2 warmups: Original | 6.2886s -------------------------|----------- Out of critical section | 5.7852s [1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running Manjaro Linux. [2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”, 04-06-2019), after a 'git gc' execution. Signed-off-by: Matheus Tavares --- builtin/grep.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 163f14b60d..d275b76647 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -92,8 +92,11 @@ static pthread_cond_t cond_result; static int skip_first_line; -static void add_work(struct grep_opt *opt, const struct grep_source *gs) +static void add_work(struct grep_opt *opt, struct grep_source *gs) { + if (opt->binary != GREP_BINARY_TEXT) + grep_source_load_driver(gs, opt->repo->index); + grep_lock(); while ((todo_end+1) % ARRAY_SIZE(todo) == todo_done) { @@ -101,9 +104,6 @@ static void add_work(struct grep_opt *opt, const struct grep_source *gs) } todo[todo_end].source = *gs; - if (opt->binary != GREP_BINARY_TEXT) - grep_source_load_driver(&todo[todo_end].source, - opt->repo->index); todo[todo_end].done = 0; strbuf_reset(&todo[todo_end].out); todo_end = (todo_end + 1) % ARRAY_SIZE(todo);