mbox series

[v6,00/15] Trace2 tracing facility

Message ID pull.108.v6.git.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Trace2 tracing facility | expand

Message

Derrick Stolee via GitGitGadget Feb. 6, 2019, 5:15 p.m. UTC
V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h

There are no other outstanding comments that I'm aware of.

Thanks Jeff


----------------------------------------------------------------------------

V5 addresses: [] renames "verb" and "subverb" to "cmd_name" and "cmd_mode"
in code and documentation. [] updates clang-format config to not complain
about my for_each macros. [] update formatting around each use of my
for_each macros. [] update the platform-specific process info commit to
indicate it only handles Windows and leaves other platform for future
efforts. [] alters title of pack-objects commit to include "trace2:data:" to
match the other data-only commits.

I think this version addresses all of the feedback received do date.

Thanks Jeff


----------------------------------------------------------------------------

Sorry to spam the list, but here is V4. After building V3 on 3 platforms
without error and submitting, the compilers on platforms 4 and 5 complained
about a variable declaration. (sigh) [] fix declaration after first
statement [] add -DNO_UNIX_SOCKETS to BASIC_CFLAGS when NO_UNIX_SOCKETS is
defined in the Makefile.


----------------------------------------------------------------------------

V3 addresses: [] re-fix the trace2 tests using an inline environment
variable rather than exporting and unsetting. [] overhaul the design
document to include prototype declarations and more file format information.
[] incorporate most of the suggestions from clang-format. [] add ability to
trace to a unix domain socket. [] added forward declarations suggested by
Ramsay. [] rebased onto current master to fixup conflict with
sb/submodule-recursive-fetch-gets-the-tip that was merged yesterday.


----------------------------------------------------------------------------

V2 addresses: [] "jh/trace2" bad interaction with "js/vsts-ci" in "pu". []
coccinelle warnings in trace2/tr2_tgt_perf.c reported during CI testing.


----------------------------------------------------------------------------

This patch series contains a greatly refactored version of my original
Trace2 series [1] from August 2018.

A new design doc in Documentation/technical/api-trace2.txt (in the first
commit) explains the relationship of Trace2 to the current tracing facility.
Calls to the current tracing facility have not been changed, rather new
trace2 calls have been added so that both continue to work in parallel for
the time being.

[1] https://public-inbox.org/git/pull.29.git.gitgitgadget@gmail.com/

Cc: gitster@pobox.comCc: peff@peff.netCc: jrnieder@gmail.com

Derrick Stolee (1):
  trace2:data: pack-objects: add trace2 regions

Jeff Hostetler (14):
  trace2: Documentation/technical/api-trace2.txt
  trace2: create new combined trace facility
  trace2: collect Windows-specific process information
  trace2:data: add trace2 regions to wt-status
  trace2:data: add editor/pager child classification
  trace2:data: add trace2 sub-process classification
  trace2:data: add trace2 transport child classification
  trace2:data: add trace2 hook classification
  trace2:data: add trace2 instrumentation to index read/write
  trace2:data: add subverb to checkout command
  trace2:data: add subverb to reset command
  trace2:data: add subverb for rebase
  trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh
  trace2: add for_each macros to clang-format

 .clang-format                            |    2 +-
 Documentation/technical/api-trace2.txt   | 1347 ++++++++++++++++++++++
 Makefile                                 |   15 +-
 builtin/am.c                             |    1 +
 builtin/checkout.c                       |    7 +
 builtin/pack-objects.c                   |   16 +-
 builtin/rebase.c                         |   17 +
 builtin/receive-pack.c                   |    4 +
 builtin/reset.c                          |    6 +
 builtin/submodule--helper.c              |    9 +-
 builtin/worktree.c                       |    1 +
 cache.h                                  |    1 +
 common-main.c                            |   13 +-
 compat/mingw.c                           |   11 +-
 compat/mingw.h                           |    3 +-
 compat/win32/trace2_win32_process_info.c |  101 ++
 config.c                                 |    2 +
 config.mak.uname                         |    2 +
 connect.c                                |    3 +
 editor.c                                 |    1 +
 exec-cmd.c                               |    2 +
 git-compat-util.h                        |    7 +
 git.c                                    |   65 ++
 pager.c                                  |    1 +
 read-cache.c                             |   51 +-
 remote-curl.c                            |    7 +
 repository.c                             |    2 +
 repository.h                             |    3 +
 run-command.c                            |   59 +-
 run-command.h                            |   13 +-
 sequencer.c                              |    2 +
 sh-i18n--envsubst.c                      |    3 +
 sub-process.c                            |    1 +
 submodule.c                              |   11 +-
 t/helper/test-parse-options.c            |    3 +
 t/helper/test-tool.c                     |    4 +
 t/helper/test-tool.h                     |    1 +
 t/helper/test-trace2.c                   |  273 +++++
 t/t0001-init.sh                          |    1 +
 t/t0210-trace2-normal.sh                 |  135 +++
 t/t0210/scrub_normal.perl                |   48 +
 t/t0211-trace2-perf.sh                   |  153 +++
 t/t0211/scrub_perf.perl                  |   76 ++
 t/t0212-trace2-event.sh                  |  234 ++++
 t/t0212/parse_events.perl                |  251 ++++
 trace2.c                                 |  762 ++++++++++++
 trace2.h                                 |  385 +++++++
 trace2/tr2_cfg.c                         |   90 ++
 trace2/tr2_cfg.h                         |   19 +
 trace2/tr2_cmd_name.c                    |   30 +
 trace2/tr2_cmd_name.h                    |   24 +
 trace2/tr2_dst.c                         |  197 ++++
 trace2/tr2_dst.h                         |   36 +
 trace2/tr2_sid.c                         |   67 ++
 trace2/tr2_sid.h                         |   18 +
 trace2/tr2_tbuf.c                        |   32 +
 trace2/tr2_tbuf.h                        |   23 +
 trace2/tr2_tgt.h                         |  133 +++
 trace2/tr2_tgt_event.c                   |  589 ++++++++++
 trace2/tr2_tgt_normal.c                  |  324 ++++++
 trace2/tr2_tgt_perf.c                    |  535 +++++++++
 trace2/tr2_tls.c                         |  164 +++
 trace2/tr2_tls.h                         |   97 ++
 transport-helper.c                       |    2 +
 transport.c                              |    1 +
 usage.c                                  |   31 +
 wt-status.c                              |   24 +-
 67 files changed, 6528 insertions(+), 23 deletions(-)
 create mode 100644 Documentation/technical/api-trace2.txt
 create mode 100644 compat/win32/trace2_win32_process_info.c
 create mode 100644 t/helper/test-trace2.c
 create mode 100755 t/t0210-trace2-normal.sh
 create mode 100644 t/t0210/scrub_normal.perl
 create mode 100755 t/t0211-trace2-perf.sh
 create mode 100644 t/t0211/scrub_perf.perl
 create mode 100755 t/t0212-trace2-event.sh
 create mode 100644 t/t0212/parse_events.perl
 create mode 100644 trace2.c
 create mode 100644 trace2.h
 create mode 100644 trace2/tr2_cfg.c
 create mode 100644 trace2/tr2_cfg.h
 create mode 100644 trace2/tr2_cmd_name.c
 create mode 100644 trace2/tr2_cmd_name.h
 create mode 100644 trace2/tr2_dst.c
 create mode 100644 trace2/tr2_dst.h
 create mode 100644 trace2/tr2_sid.c
 create mode 100644 trace2/tr2_sid.h
 create mode 100644 trace2/tr2_tbuf.c
 create mode 100644 trace2/tr2_tbuf.h
 create mode 100644 trace2/tr2_tgt.h
 create mode 100644 trace2/tr2_tgt_event.c
 create mode 100644 trace2/tr2_tgt_normal.c
 create mode 100644 trace2/tr2_tgt_perf.c
 create mode 100644 trace2/tr2_tls.c
 create mode 100644 trace2/tr2_tls.h


base-commit: b5101f929789889c2e536d915698f58d5c5c6b7a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-108%2Fjeffhostetler%2Fcore-trace2-2019-v0-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-108/jeffhostetler/core-trace2-2019-v0-v6
Pull-Request: https://github.com/gitgitgadget/git/pull/108

Range-diff vs v5:

  1:  4c006f4995 =  1:  4c006f4995 trace2: Documentation/technical/api-trace2.txt
  2:  6120ce1bbe !  2:  6bad326bbd trace2: create new combined trace facility
     @@ -4208,7 +4208,7 @@
      +#ifndef TR2_TLS_H
      +#define TR2_TLS_H
      +
     -+struct strbuf;
     ++#include "strbuf.h"
      +
      +/*
      + * Arbitry limit for thread names for column alignment.
  3:  7c987cde86 =  3:  12de7e42de trace2: collect Windows-specific process information
  4:  cd860799f8 =  4:  5835edbd01 trace2:data: add trace2 regions to wt-status
  5:  30bcea9435 =  5:  99d13ef478 trace2:data: add editor/pager child classification
  6:  c869de8063 =  6:  d7ce85b702 trace2:data: add trace2 sub-process classification
  7:  3cd525b80e =  7:  b70c289903 trace2:data: add trace2 transport child classification
  8:  c2c1ea5ce3 =  8:  2a0da88579 trace2:data: add trace2 hook classification
  9:  57b23542b3 =  9:  bca6a7f0d6 trace2:data: add trace2 instrumentation to index read/write
 10:  ab74ef5f23 = 10:  68269ee060 trace2:data: pack-objects: add trace2 regions
 11:  6a82426a83 = 11:  0191283fff trace2:data: add subverb to checkout command
 12:  5c7f0de228 = 12:  24d8d8d768 trace2:data: add subverb to reset command
 13:  cc61201061 = 13:  90cec071cf trace2:data: add subverb for rebase
 14:  00b25da38b = 14:  b9f0c6fd66 trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh
 15:  5f77c9b633 = 15:  93a25d09a1 trace2: add for_each macros to clang-format

Comments

Ævar Arnfjörð Bjarmason Feb. 14, 2019, 12:33 p.m. UTC | #1
On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:

> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
>
> There are no other outstanding comments that I'm aware of.

Not a comment on this, just a follow-up question. I started looking into
whether this could be driven by config instead of getenv(). A lot easier
to set up in some cases than injecting env variables, especialy if the
log target supported a strftime() string, is any of that something
you've looked into already (so I don't do dupe work...).

There's the chicken & egg problem with wanting to do traces way before
we get to reading config, so I expect that such a facility would need to
work by always trace record at the beginning until we get far enough to
write the config, and then either stop and throw away the buffer, or
write out the existing trace to the configured target, and continue.
Jeff Hostetler Feb. 15, 2019, 5:25 p.m. UTC | #2
On 2/14/2019 7:33 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
> 
>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
>>
>> There are no other outstanding comments that I'm aware of.
> 
> Not a comment on this, just a follow-up question. I started looking into
> whether this could be driven by config instead of getenv(). A lot easier
> to set up in some cases than injecting env variables, especialy if the
> log target supported a strftime() string, is any of that something
> you've looked into already (so I don't do dupe work...).
> 
> There's the chicken & egg problem with wanting to do traces way before
> we get to reading config, so I expect that such a facility would need to
> work by always trace record at the beginning until we get far enough to
> write the config, and then either stop and throw away the buffer, or
> write out the existing trace to the configured target, and continue.
> 

Yes, I beat my head against the config settings for quite a while
before settling on using an env var.

I wanted to get the:
() full process elapsed time,
() the full original argv,
() all of the command dispatch, run-dashed, and alias expansion,
() and for my atexit() to run last.
So I hooked into main() before the config is loaded.

In most commands, the config is processed about the same time as
parse_options() is called.  And we have to insert code in
git_default_config() to load my settings.  This happens after all
of the .git dir discovery, "-c" and "-C" processing, alias expansion,
command dispatch and etc.  And the argv received in the cmd_*()
function has been modified.  So we lose some information.  (I tried
this for a while and didn't like the results.)

I also tried using read_early_config() various places, but there
were problems here too.  Too early and the "-c" settings weren't
parsed yet.  And there was an issue about when .git dir was discovered,
so local config settings weren't ready yet.

I also recall having a problem when doing an early iteration with
side effects (or rather the early iteration caused something to be
set that caused the real iteration (in cmd_*()) to short-cut), but
I don't remember the details.

So we have a custom installer that also sets the environment variable
after git is installed and haven't had any problems.


I hesitate to say we should always start allocating a bunch of data
and spinning up the TLS data and etc. before we know if tracing is
wanted.  Just seems expensive for most users.


I could see having a "~/.git_tr2_config" or something similar in
some place like "/etc" that only contained the Trace2 settings.
It would be safe to read very early inside main() and we would not
consider it to be part of the real config.  For example, "git config"
would not know about it.  Then you could enforce a system-wide
setting without any of the env var issues.


WRT the strftime() question, we could either add syntax to the
env var value (or the tr2 config setting) to have some tokens
for that.  I just stuck with absolute pathnames since I started
by copying what was done for GIT_TRACE.

Jeff
Ævar Arnfjörð Bjarmason March 17, 2019, 2:22 p.m. UTC | #3
On Fri, Feb 15 2019, Jeff Hostetler wrote:

> On 2/14/2019 7:33 AM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
>>
>>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
>>>
>>> There are no other outstanding comments that I'm aware of.
>>
>> Not a comment on this, just a follow-up question. I started looking into
>> whether this could be driven by config instead of getenv(). A lot easier
>> to set up in some cases than injecting env variables, especialy if the
>> log target supported a strftime() string, is any of that something
>> you've looked into already (so I don't do dupe work...).
>>
>> There's the chicken & egg problem with wanting to do traces way before
>> we get to reading config, so I expect that such a facility would need to
>> work by always trace record at the beginning until we get far enough to
>> write the config, and then either stop and throw away the buffer, or
>> write out the existing trace to the configured target, and continue.
>>
>
> Yes, I beat my head against the config settings for quite a while
> before settling on using an env var.
>
> I wanted to get the:
> () full process elapsed time,
> () the full original argv,
> () all of the command dispatch, run-dashed, and alias expansion,
> () and for my atexit() to run last.
> So I hooked into main() before the config is loaded.
>
> In most commands, the config is processed about the same time as
> parse_options() is called.  And we have to insert code in
> git_default_config() to load my settings.  This happens after all
> of the .git dir discovery, "-c" and "-C" processing, alias expansion,
> command dispatch and etc.  And the argv received in the cmd_*()
> function has been modified.  So we lose some information.  (I tried
> this for a while and didn't like the results.)
>
> I also tried using read_early_config() various places, but there
> were problems here too.  Too early and the "-c" settings weren't
> parsed yet.  And there was an issue about when .git dir was discovered,
> so local config settings weren't ready yet.
>
> I also recall having a problem when doing an early iteration with
> side effects (or rather the early iteration caused something to be
> set that caused the real iteration (in cmd_*()) to short-cut), but
> I don't remember the details.
>
> So we have a custom installer that also sets the environment variable
> after git is installed and haven't had any problems.
>
>
> I hesitate to say we should always start allocating a bunch of data
> and spinning up the TLS data and etc. before we know if tracing is
> wanted.  Just seems expensive for most users.
>
>
> I could see having a "~/.git_tr2_config" or something similar in
> some place like "/etc" that only contained the Trace2 settings.
> It would be safe to read very early inside main() and we would not
> consider it to be part of the real config.  For example, "git config"
> would not know about it.  Then you could enforce a system-wide
> setting without any of the env var issues.

I haven't written a patch for this, but it seems to me that we could
just start reading /etc/gitconfig via some custom config callback code
early as e.g. 58b284a2e91 ("worktree: add per-worktree config files",
2018-10-21) does for the worktree config.

It would ignore everything except trace.* or wherever namespace we'll
put this in, and "-c" etc. wouldn't work. We could just document that as
a limitation for now which could be fixed later.

It wouldn't make things worse, and would mean you could easily set
logging system-wide without needing to inject environment variables in
lots of custom (and sometimes hard-to-do) places, which I expect is the
most common use-case, and is the one I have.

> WRT the strftime() question, we could either add syntax to the
> env var value (or the tr2 config setting) to have some tokens
> for that.  I just stuck with absolute pathnames since I started
> by copying what was done for GIT_TRACE.
>
> Jeff
Johannes Schindelin March 21, 2019, 2:30 p.m. UTC | #4
Hi Ævar,

On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:

>
> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>
> > On 2/14/2019 7:33 AM, Ævar Arnfjörð Bjarmason wrote:
> >>
> >> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
> >>
> >>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
> >>>
> >>> There are no other outstanding comments that I'm aware of.
> >>
> >> Not a comment on this, just a follow-up question. I started looking into
> >> whether this could be driven by config instead of getenv(). A lot easier
> >> to set up in some cases than injecting env variables, especialy if the
> >> log target supported a strftime() string, is any of that something
> >> you've looked into already (so I don't do dupe work...).
> >>
> >> There's the chicken & egg problem with wanting to do traces way before
> >> we get to reading config, so I expect that such a facility would need to
> >> work by always trace record at the beginning until we get far enough to
> >> write the config, and then either stop and throw away the buffer, or
> >> write out the existing trace to the configured target, and continue.
> >>
> >
> > Yes, I beat my head against the config settings for quite a while
> > before settling on using an env var.
> >
> > I wanted to get the:
> > () full process elapsed time,
> > () the full original argv,
> > () all of the command dispatch, run-dashed, and alias expansion,
> > () and for my atexit() to run last.
> > So I hooked into main() before the config is loaded.
> >
> > In most commands, the config is processed about the same time as
> > parse_options() is called.  And we have to insert code in
> > git_default_config() to load my settings.  This happens after all
> > of the .git dir discovery, "-c" and "-C" processing, alias expansion,
> > command dispatch and etc.  And the argv received in the cmd_*()
> > function has been modified.  So we lose some information.  (I tried
> > this for a while and didn't like the results.)
> >
> > I also tried using read_early_config() various places, but there
> > were problems here too.  Too early and the "-c" settings weren't
> > parsed yet.  And there was an issue about when .git dir was discovered,
> > so local config settings weren't ready yet.
> >
> > I also recall having a problem when doing an early iteration with
> > side effects (or rather the early iteration caused something to be
> > set that caused the real iteration (in cmd_*()) to short-cut), but
> > I don't remember the details.
> >
> > So we have a custom installer that also sets the environment variable
> > after git is installed and haven't had any problems.
> >
> >
> > I hesitate to say we should always start allocating a bunch of data
> > and spinning up the TLS data and etc. before we know if tracing is
> > wanted.  Just seems expensive for most users.
> >
> >
> > I could see having a "~/.git_tr2_config" or something similar in
> > some place like "/etc" that only contained the Trace2 settings.
> > It would be safe to read very early inside main() and we would not
> > consider it to be part of the real config.  For example, "git config"
> > would not know about it.  Then you could enforce a system-wide
> > setting without any of the env var issues.
>
> I haven't written a patch for this, but it seems to me that we could
> just start reading /etc/gitconfig via some custom config callback code
> early as e.g. 58b284a2e91 ("worktree: add per-worktree config files",
> 2018-10-21) does for the worktree config.

Oy. Oy, oy, oy.

Maybe use `read_early_config()` instead? That would be *a lot* cleaner.
You could maybe use a9bcf6586d1a (alias: use the early config machinery to
expand aliases, 2017-06-14) as an inspiration.

> It would ignore everything except trace.* or wherever namespace we'll
> put this in, and "-c" etc. wouldn't work. We could just document that as
> a limitation for now which could be fixed later.
>
> It wouldn't make things worse, and would mean you could easily set
> logging system-wide without needing to inject environment variables in
> lots of custom (and sometimes hard-to-do) places, which I expect is the
> most common use-case, and is the one I have.

Yes, I agree, those are good goals to address.

Ciao,
Dscho

> > WRT the strftime() question, we could either add syntax to the
> > env var value (or the tr2 config setting) to have some tokens
> > for that.  I just stuck with absolute pathnames since I started
> > by copying what was done for GIT_TRACE.
> >
> > Jeff
>
Ævar Arnfjörð Bjarmason March 21, 2019, 2:48 p.m. UTC | #5
On Thu, Mar 21 2019, Johannes Schindelin wrote:

> Hi Ævar,
>
> On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>
>> > On 2/14/2019 7:33 AM, Ævar Arnfjörð Bjarmason wrote:
>> >>
>> >> On Wed, Feb 06 2019, Jeff Hostetler via GitGitGadget wrote:
>> >>
>> >>> V6 addresses: [] The remaining hdr-check warning in trace2/tr2_tls.h
>> >>>
>> >>> There are no other outstanding comments that I'm aware of.
>> >>
>> >> Not a comment on this, just a follow-up question. I started looking into
>> >> whether this could be driven by config instead of getenv(). A lot easier
>> >> to set up in some cases than injecting env variables, especialy if the
>> >> log target supported a strftime() string, is any of that something
>> >> you've looked into already (so I don't do dupe work...).
>> >>
>> >> There's the chicken & egg problem with wanting to do traces way before
>> >> we get to reading config, so I expect that such a facility would need to
>> >> work by always trace record at the beginning until we get far enough to
>> >> write the config, and then either stop and throw away the buffer, or
>> >> write out the existing trace to the configured target, and continue.
>> >>
>> >
>> > Yes, I beat my head against the config settings for quite a while
>> > before settling on using an env var.
>> >
>> > I wanted to get the:
>> > () full process elapsed time,
>> > () the full original argv,
>> > () all of the command dispatch, run-dashed, and alias expansion,
>> > () and for my atexit() to run last.
>> > So I hooked into main() before the config is loaded.
>> >
>> > In most commands, the config is processed about the same time as
>> > parse_options() is called.  And we have to insert code in
>> > git_default_config() to load my settings.  This happens after all
>> > of the .git dir discovery, "-c" and "-C" processing, alias expansion,
>> > command dispatch and etc.  And the argv received in the cmd_*()
>> > function has been modified.  So we lose some information.  (I tried
>> > this for a while and didn't like the results.)
>> >
>> > I also tried using read_early_config() various places, but there
>> > were problems here too.  Too early and the "-c" settings weren't
>> > parsed yet.  And there was an issue about when .git dir was discovered,
>> > so local config settings weren't ready yet.
>> >
>> > I also recall having a problem when doing an early iteration with
>> > side effects (or rather the early iteration caused something to be
>> > set that caused the real iteration (in cmd_*()) to short-cut), but
>> > I don't remember the details.
>> >
>> > So we have a custom installer that also sets the environment variable
>> > after git is installed and haven't had any problems.
>> >
>> >
>> > I hesitate to say we should always start allocating a bunch of data
>> > and spinning up the TLS data and etc. before we know if tracing is
>> > wanted.  Just seems expensive for most users.
>> >
>> >
>> > I could see having a "~/.git_tr2_config" or something similar in
>> > some place like "/etc" that only contained the Trace2 settings.
>> > It would be safe to read very early inside main() and we would not
>> > consider it to be part of the real config.  For example, "git config"
>> > would not know about it.  Then you could enforce a system-wide
>> > setting without any of the env var issues.
>>
>> I haven't written a patch for this, but it seems to me that we could
>> just start reading /etc/gitconfig via some custom config callback code
>> early as e.g. 58b284a2e91 ("worktree: add per-worktree config files",
>> 2018-10-21) does for the worktree config.
>
> Oy. Oy, oy, oy.
>
> Maybe use `read_early_config()` instead? That would be *a lot* cleaner.
> You could maybe use a9bcf6586d1a (alias: use the early config machinery to
> expand aliases, 2017-06-14) as an inspiration.

Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
.git/config -> ~/.gitconfig etc. sequence just in terms of saving
critical time (this is the performance trace path, after all...).

But on a second reading I see that read_early_config() can do that if
you set config_source->file, opts->respect_includes etc. I.e. it just
(depending on options) resolves to git_config_from_file() which
58b284a2e91 used directly.

>> It would ignore everything except trace.* or wherever namespace we'll
>> put this in, and "-c" etc. wouldn't work. We could just document that as
>> a limitation for now which could be fixed later.
>>
>> It wouldn't make things worse, and would mean you could easily set
>> logging system-wide without needing to inject environment variables in
>> lots of custom (and sometimes hard-to-do) places, which I expect is the
>> most common use-case, and is the one I have.
>
> Yes, I agree, those are good goals to address.
>
> Ciao,
> Dscho
>
>> > WRT the strftime() question, we could either add syntax to the
>> > env var value (or the tr2 config setting) to have some tokens
>> > for that.  I just stuck with absolute pathnames since I started
>> > by copying what was done for GIT_TRACE.
>> >
>> > Jeff
>>
Johannes Schindelin March 22, 2019, 1:17 p.m. UTC | #6
Hi Ævar,

On Thu, 21 Mar 2019, Ævar Arnfjörð Bjarmason wrote:

> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>
> > On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
> >
> >>
> >> On Fri, Feb 15 2019, Jeff Hostetler wrote:
> >>
> >> > I could see having a "~/.git_tr2_config" or something similar in
> >> > some place like "/etc" that only contained the Trace2 settings. It
> >> > would be safe to read very early inside main() and we would not
> >> > consider it to be part of the real config.  For example, "git
> >> > config" would not know about it.  Then you could enforce a
> >> > system-wide setting without any of the env var issues.
> >>
> >> I haven't written a patch for this, but it seems to me that we could
> >> just start reading /etc/gitconfig via some custom config callback
> >> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
> >> files", 2018-10-21) does for the worktree config.
> >
> > Oy. Oy, oy, oy.
> >
> > Maybe use `read_early_config()` instead? That would be *a lot*
> > cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
> > machinery to expand aliases, 2017-06-14) as an inspiration.
>
> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
> critical time (this is the performance trace path, after all...).
>
> But on a second reading I see that read_early_config() can do that if
> you set config_source->file, opts->respect_includes etc. I.e. it just
> (depending on options) resolves to git_config_from_file() which
> 58b284a2e91 used directly.

Sure, it can exclude the repo and user config, but would that not be
rather confusing?

Ciao,
Dscho
Jeff Hostetler March 22, 2019, 2:01 p.m. UTC | #7
On 3/22/2019 9:17 AM, Johannes Schindelin wrote:
> Hi Ævar,
> 
> On Thu, 21 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
> 
>> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>>
>>> On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>>
>>>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>>>
>>>>> I could see having a "~/.git_tr2_config" or something similar in
>>>>> some place like "/etc" that only contained the Trace2 settings. It
>>>>> would be safe to read very early inside main() and we would not
>>>>> consider it to be part of the real config.  For example, "git
>>>>> config" would not know about it.  Then you could enforce a
>>>>> system-wide setting without any of the env var issues.
>>>>
>>>> I haven't written a patch for this, but it seems to me that we could
>>>> just start reading /etc/gitconfig via some custom config callback
>>>> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
>>>> files", 2018-10-21) does for the worktree config.
>>>
>>> Oy. Oy, oy, oy.
>>>
>>> Maybe use `read_early_config()` instead? That would be *a lot*
>>> cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
>>> machinery to expand aliases, 2017-06-14) as an inspiration.
>>
>> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
>> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
>> critical time (this is the performance trace path, after all...).
>>
>> But on a second reading I see that read_early_config() can do that if
>> you set config_source->file, opts->respect_includes etc. I.e. it just
>> (depending on options) resolves to git_config_from_file() which
>> 58b284a2e91 used directly.
> 
> Sure, it can exclude the repo and user config, but would that not be
> rather confusing?

This was hidden in my earlier message.

There's a lot a config machinery here with lots of chicken-n-egg
problems.  I want to know at the top of main() as quickly as possible
whether trace2 should be enabled.

I don't want to slow down git by spinning up a bunch of trace2 state
and wait until the git-dir is discovered, the "-c" args are processed,
and we dispatch into the builtin layer and the config is enumerated
to know if it should really be on or not.

I also didn't want to introduce another full iteration of the full
config system for startup performance reasons.

I played with read_early_config() at one point and it always seemed
to introduce side-effects.  Perhaps I was calling it earlier than it
was expecting and that triggered some of the git-dir discovery or
something. I don't remember all the details, I just remember that it
changed some behaviors in subtle ways.

Perhaps I could call something like git_config_from_file() with the
right set of magic bits to get it to parse exactly 1 system config
file that would contain my trace2 settings.  Hopefully, this will
not have any side-effects.

But if we lump them in with the main /etc/gitconfig settings, we
would have to explain that these trace2 config settings are
system-only and ARE NOT overridden by "-c", global, local, ...
config settings.  This would get confusing if the user tried to
set local values and did:
	git config --list --show-origin
and it showed system and local values but yet "stubbornly" refused
to use the local values over the system values. (I think this was
Johannes' point.)

That's why I was suggesting a system trace2 config file that is a
peer of /etc/gitconfig (maybe /etc/gittrace2) that would have these
values and not be expected to interact with the main config system.
That is, we just use the git_config_ routines to parse the file format,
rather than inventing another file format, but not change the
expectation of the established config value inheritance system.

If there's no objections, I'll take a look at doing this.

Thanks
Jeff
Ævar Arnfjörð Bjarmason March 22, 2019, 2:53 p.m. UTC | #8
On Fri, Mar 22 2019, Jeff Hostetler wrote:

> On 3/22/2019 9:17 AM, Johannes Schindelin wrote:
>> Hi Ævar,
>>
>> On Thu, 21 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>
>>> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>>>
>>>> On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>>
>>>>>
>>>>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>>>>
>>>>>> I could see having a "~/.git_tr2_config" or something similar in
>>>>>> some place like "/etc" that only contained the Trace2 settings. It
>>>>>> would be safe to read very early inside main() and we would not
>>>>>> consider it to be part of the real config.  For example, "git
>>>>>> config" would not know about it.  Then you could enforce a
>>>>>> system-wide setting without any of the env var issues.
>>>>>
>>>>> I haven't written a patch for this, but it seems to me that we could
>>>>> just start reading /etc/gitconfig via some custom config callback
>>>>> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
>>>>> files", 2018-10-21) does for the worktree config.
>>>>
>>>> Oy. Oy, oy, oy.
>>>>
>>>> Maybe use `read_early_config()` instead? That would be *a lot*
>>>> cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
>>>> machinery to expand aliases, 2017-06-14) as an inspiration.
>>>
>>> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
>>> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
>>> critical time (this is the performance trace path, after all...).
>>>
>>> But on a second reading I see that read_early_config() can do that if
>>> you set config_source->file, opts->respect_includes etc. I.e. it just
>>> (depending on options) resolves to git_config_from_file() which
>>> 58b284a2e91 used directly.
>>
>> Sure, it can exclude the repo and user config, but would that not be
>> rather confusing?
>
> This was hidden in my earlier message.
>
> There's a lot a config machinery here with lots of chicken-n-egg
> problems.  I want to know at the top of main() as quickly as possible
> whether trace2 should be enabled.
>
> I don't want to slow down git by spinning up a bunch of trace2 state
> and wait until the git-dir is discovered, the "-c" args are processed,
> and we dispatch into the builtin layer and the config is enumerated
> to know if it should really be on or not.
>
> I also didn't want to introduce another full iteration of the full
> config system for startup performance reasons.
>
> I played with read_early_config() at one point and it always seemed
> to introduce side-effects.  Perhaps I was calling it earlier than it
> was expecting and that triggered some of the git-dir discovery or
> something. I don't remember all the details, I just remember that it
> changed some behaviors in subtle ways.
>
> Perhaps I could call something like git_config_from_file() with the
> right set of magic bits to get it to parse exactly 1 system config
> file that would contain my trace2 settings.  Hopefully, this will
> not have any side-effects.

Right, it also occurred to me that e.g. /home tends to be on NFS on some
systems, but very rarely /etc. I'd hate for trace reporting for git to
stall because NFS slows to a halt, so aside from temporary
implementation details (e.g. -c on the CLI not working) there's a good
case to be made for "read this from /etc/gitconfig only".

> But if we lump them in with the main /etc/gitconfig settings, we
> would have to explain that these trace2 config settings are
> system-only and ARE NOT overridden by "-c", global, local, ...
> config settings.  This would get confusing if the user tried to
> set local values and did:
> 	git config --list --show-origin
> and it showed system and local values but yet "stubbornly" refused
> to use the local values over the system values. (I think this was
> Johannes' point.)
>
> That's why I was suggesting a system trace2 config file that is a
> peer of /etc/gitconfig (maybe /etc/gittrace2) that would have these
> values and not be expected to interact with the main config system.
> That is, we just use the git_config_ routines to parse the file format,
> rather than inventing another file format, but not change the
> expectation of the established config value inheritance system.
>
> If there's no objections, I'll take a look at doing this.

I'd much rather just drop it in /etc/gitconfig with documented caveats
than introduce a new and permanent thing like /etc/gittrace2 due to a
current implementation detail.

Unlike something in core.* or whatever this facility is specialized
enough that I think it's fine to make it a bit of a special snowflake
given what it does and the target audience.

But even with those caveats it's still useful to see it in 'git config
-l --show-origin' for inspecting, and e.g. have it just work out of the
box with say the default puppetry for the likes of GitLab that now knows
how to set stuff in its /etc/gitconfig, but would need a special case
just for this.
Jeff Hostetler March 22, 2019, 3:21 p.m. UTC | #9
On 3/22/2019 10:53 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Mar 22 2019, Jeff Hostetler wrote:
> 
>> On 3/22/2019 9:17 AM, Johannes Schindelin wrote:
>>> Hi Ævar,
>>>
>>> On Thu, 21 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>> On Thu, Mar 21 2019, Johannes Schindelin wrote:
>>>>
>>>>> On Sun, 17 Mar 2019, Ævar Arnfjörð Bjarmason wrote:
>>>>>
>>>>>>
>>>>>> On Fri, Feb 15 2019, Jeff Hostetler wrote:
>>>>>>
>>>>>>> I could see having a "~/.git_tr2_config" or something similar in
>>>>>>> some place like "/etc" that only contained the Trace2 settings. It
>>>>>>> would be safe to read very early inside main() and we would not
>>>>>>> consider it to be part of the real config.  For example, "git
>>>>>>> config" would not know about it.  Then you could enforce a
>>>>>>> system-wide setting without any of the env var issues.
>>>>>>
>>>>>> I haven't written a patch for this, but it seems to me that we could
>>>>>> just start reading /etc/gitconfig via some custom config callback
>>>>>> code early as e.g. 58b284a2e91 ("worktree: add per-worktree config
>>>>>> files", 2018-10-21) does for the worktree config.
>>>>>
>>>>> Oy. Oy, oy, oy.
>>>>>
>>>>> Maybe use `read_early_config()` instead? That would be *a lot*
>>>>> cleaner. You could maybe use a9bcf6586d1a (alias: use the early config
>>>>> machinery to expand aliases, 2017-06-14) as an inspiration.
>>>>
>>>> Thanks. I was thinking *only* to do /etc/gitconfig and not the whole
>>>> .git/config -> ~/.gitconfig etc. sequence just in terms of saving
>>>> critical time (this is the performance trace path, after all...).
>>>>
>>>> But on a second reading I see that read_early_config() can do that if
>>>> you set config_source->file, opts->respect_includes etc. I.e. it just
>>>> (depending on options) resolves to git_config_from_file() which
>>>> 58b284a2e91 used directly.
>>>
>>> Sure, it can exclude the repo and user config, but would that not be
>>> rather confusing?
>>
>> This was hidden in my earlier message.
>>
>> There's a lot a config machinery here with lots of chicken-n-egg
>> problems.  I want to know at the top of main() as quickly as possible
>> whether trace2 should be enabled.
>>
>> I don't want to slow down git by spinning up a bunch of trace2 state
>> and wait until the git-dir is discovered, the "-c" args are processed,
>> and we dispatch into the builtin layer and the config is enumerated
>> to know if it should really be on or not.
>>
>> I also didn't want to introduce another full iteration of the full
>> config system for startup performance reasons.
>>
>> I played with read_early_config() at one point and it always seemed
>> to introduce side-effects.  Perhaps I was calling it earlier than it
>> was expecting and that triggered some of the git-dir discovery or
>> something. I don't remember all the details, I just remember that it
>> changed some behaviors in subtle ways.
>>
>> Perhaps I could call something like git_config_from_file() with the
>> right set of magic bits to get it to parse exactly 1 system config
>> file that would contain my trace2 settings.  Hopefully, this will
>> not have any side-effects.
> 
> Right, it also occurred to me that e.g. /home tends to be on NFS on some
> systems, but very rarely /etc. I'd hate for trace reporting for git to
> stall because NFS slows to a halt, so aside from temporary
> implementation details (e.g. -c on the CLI not working) there's a good
> case to be made for "read this from /etc/gitconfig only".
> 
>> But if we lump them in with the main /etc/gitconfig settings, we
>> would have to explain that these trace2 config settings are
>> system-only and ARE NOT overridden by "-c", global, local, ...
>> config settings.  This would get confusing if the user tried to
>> set local values and did:
>> 	git config --list --show-origin
>> and it showed system and local values but yet "stubbornly" refused
>> to use the local values over the system values. (I think this was
>> Johannes' point.)
>>
>> That's why I was suggesting a system trace2 config file that is a
>> peer of /etc/gitconfig (maybe /etc/gittrace2) that would have these
>> values and not be expected to interact with the main config system.
>> That is, we just use the git_config_ routines to parse the file format,
>> rather than inventing another file format, but not change the
>> expectation of the established config value inheritance system.
>>
>> If there's no objections, I'll take a look at doing this.
> 
> I'd much rather just drop it in /etc/gitconfig with documented caveats
> than introduce a new and permanent thing like /etc/gittrace2 due to a
> current implementation detail.
> 
> Unlike something in core.* or whatever this facility is specialized
> enough that I think it's fine to make it a bit of a special snowflake
> given what it does and the target audience.
> 
> But even with those caveats it's still useful to see it in 'git config
> -l --show-origin' for inspecting, and e.g. have it just work out of the
> box with say the default puppetry for the likes of GitLab that now knows
> how to set stuff in its /etc/gitconfig, but would need a special case
> just for this.
> 

ok, if we're comfortable with those caveats, i'll proceed as
you suggested. thanks for the quick response.

Jeff