From patchwork Fri Feb 1 17:58:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 10793471 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2410813B4 for ; Fri, 1 Feb 2019 17:59:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1116A31F97 for ; Fri, 1 Feb 2019 17:59:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0284731FA6; Fri, 1 Feb 2019 17:59:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 74A0831FA5 for ; Fri, 1 Feb 2019 17:59:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731510AbfBAR7G (ORCPT ); Fri, 1 Feb 2019 12:59:06 -0500 Received: from mail-ed1-f45.google.com ([209.85.208.45]:33976 "EHLO mail-ed1-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730783AbfBAR7F (ORCPT ); Fri, 1 Feb 2019 12:59:05 -0500 Received: by mail-ed1-f45.google.com with SMTP id b3so6197712ede.1 for ; Fri, 01 Feb 2019 09:59:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:message-id:in-reply-to:references:from:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=pi4I86NQWs+7DGJqxKq0GWY/ZR+d2hGku/ohrkceSDM=; b=CBbWkySQ1MFZjn54mX5xLfrm6JgXKxnQmXBWDytYZm5QadhmOPpAOrkqO4d13PEKC/ 0qN/G3fkDk5NJx+w9kYRVItZLOk42XgCKDokCr4wl1JAZm9QzsmjmVxCOa8nP58nrvLM 6sbkr+aJQuZ1LXUz8yeUtSdaUd3uWAbAI5ZJFedAHS6lY7gDzH+rq+8Xkk0JslG4cdU9 voX2uPUNjAiFCM56/UzNiThuqP24umv67VKLVIWcqZUkdsK0+EiCT1hwgUzFZmLl0aBq lZogbxyi/UAFkhbRNnzV1fUrL0bdTRkiFq4s7GHV01LJQglgh4dCrPGjYRIIXhUr6wwq KR7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:in-reply-to:references:from :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=pi4I86NQWs+7DGJqxKq0GWY/ZR+d2hGku/ohrkceSDM=; b=GyfCkonvjsu/udII+k1OP0QH+qN9cIQW1zUqrh8N4xsR1AsruYftG/UsOgSLSwltLN 8s5PGH4YFSOiwKWUSQs59cRgC0K2xXnFvlsuKw+qKwLJZFU0grpwjGWSP4w4gCmNRa6q ziRP1/AfxyDVnSBON91t8Ys0PPOlYxkPCR5Yctyli0c8AHFUZ3H/WefRzF/j35rEacK0 +6TGDssgUGYupXBZCA9facC0IlbD2uJ6xm127ZJASaiGq9TSp5zbszA+6mfLSIuRb5k2 k1YlwqaS7opGhvM0TriTXk6oreFV7hL+RAD+I/vZlBC1irLsRCRkeQvBhGHfsDLjiWbI UmYg== X-Gm-Message-State: AJcUukfnZFdNPFqqdt35auVR+rWLhJlSwmtNAlb69OU9aW6uVOCRwEY4 tO8Aba9gqjn/17F5Y2Pb9d8VvGwS X-Google-Smtp-Source: ALg8bN4SUeeaJ3UFe9Dfu6i5eOVR4l+007UgllmYS6OFQk1R7tugBOh2tq9ZUiYfbKgKQ01qUVGyYw== X-Received: by 2002:a50:c017:: with SMTP id r23mr40835696edb.278.1549043939753; Fri, 01 Feb 2019 09:58:59 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f19-v6sm1281884eje.28.2019.02.01.09.58.59 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 01 Feb 2019 09:58:59 -0800 (PST) Date: Fri, 01 Feb 2019 09:58:59 -0800 (PST) X-Google-Original-Date: Fri, 01 Feb 2019 17:58:42 GMT Message-Id: <4c006f4995e8c4c1356d9d93b6352455069e4b6d.1549043936.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Jeff Hostetler via GitGitGadget" Subject: [PATCH v5 01/15] trace2: Documentation/technical/api-trace2.txt Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jeffhost@microsoft.com, Junio C Hamano , Jeff Hostetler Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Jeff Hostetler Created design document for Trace2 feature. Signed-off-by: Jeff Hostetler --- Documentation/technical/api-trace2.txt | 1347 ++++++++++++++++++++++++ 1 file changed, 1347 insertions(+) create mode 100644 Documentation/technical/api-trace2.txt diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt new file mode 100644 index 0000000000..6ee574a86e --- /dev/null +++ b/Documentation/technical/api-trace2.txt @@ -0,0 +1,1347 @@ += Trace2 API + +The Trace2 API can be used to print debug, performance, and telemetry +information to stderr or a file. The Trace2 feature is inactive unless +explicitly enabled by enabling one or more Trace2 Targets. + +The Trace2 API is intended to replace the existing (Trace1) +printf-style tracing provided by the existing `GIT_TRACE` and +`GIT_TRACE_PERFORMANCE` facilities. During initial implementation, +Trace2 and Trace1 may operate in parallel. + +The Trace2 API defines a set of high-level messages with known fields, +such as (`start`: `argv`) and (`exit`: {`exit-code`, `elapsed-time`}). + +Trace2 instrumentation throughout the Git code base sends Trace2 +messages to the enabled Trace2 Targets. Targets transform these +messages content into purpose-specific formats and write events to +their data streams. In this manner, the Trace2 API can drive +many different types of analysis. + +Targets are defined using a VTable allowing easy extension to other +formats in the future. This might be used to define a binary format, +for example. + +== Trace2 Targets + +Trace2 defines the following set of Trace2 Targets. +Format details are given in a later section. + +`GIT_TR2` (NORMAL):: + + a simple printf format like GIT_TRACE. ++ +------------ +$ export GIT_TR2=~/log.normal +$ git version +git version 2.20.1.155.g426c96fcdb +------------ ++ +------------ +$ cat ~/log.normal +12:28:42.620009 common-main.c:38 version 2.20.1.155.g426c96fcdb +12:28:42.620989 common-main.c:39 start git version +12:28:42.621101 git.c:432 cmd_name version (version) +12:28:42.621215 git.c:662 exit elapsed:0.001227 code:0 +12:28:42.621250 trace2/tr2_tgt_normal.c:124 atexit elapsed:0.001265 code:0 +------------ + +`GIT_TR2_PERF` (PERF):: + + a column-based format to replace GIT_TRACE_PERFORMANCE suitable for + development and testing, possibly to complement tools like gprof. ++ +------------ +$ export GIT_TR2_PERF=~/log.perf +$ git version +git version 2.20.1.155.g426c96fcdb +------------ ++ +------------ +$ cat ~/log.perf +12:28:42.620675 common-main.c:38 | d0 | main | version | | | | | 2.20.1.155.g426c96fcdb +12:28:42.621001 common-main.c:39 | d0 | main | start | | | | | git version +12:28:42.621111 git.c:432 | d0 | main | cmd_name | | | | | version (version) +12:28:42.621225 git.c:662 | d0 | main | exit | | 0.001227 | | | code:0 +12:28:42.621259 trace2/tr2_tgt_perf.c:211 | d0 | main | atexit | | 0.001265 | | | code:0 +------------ + +`GIT_TR2_EVENT` (EVENT):: + + a JSON-based format of event data suitable for telemetry analysis. ++ +------------ +$ export GIT_TR2_EVENT=~/log.event +$ git version +git version 2.20.1.155.g426c96fcdb +------------ ++ +------------ +$ cat ~/log.event +{"event":"version","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.620713","file":"common-main.c","line":38,"evt":"1","exe":"2.20.1.155.g426c96fcdb"} +{"event":"start","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621027","file":"common-main.c","line":39,"argv":["git","version"]} +{"event":"cmd_name","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621122","file":"git.c","line":432,"name":"version","hierarchy":"version"} +{"event":"exit","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621236","file":"git.c","line":662,"t_abs":0.001227,"code":0} +{"event":"atexit","sid":"1547659722619736-11614","thread":"main","time":"2019-01-16 17:28:42.621268","file":"trace2/tr2_tgt_event.c","line":163,"t_abs":0.001265,"code":0} +------------ + +== Enabling a Target + +A Trace2 Target is enabled when the corresponding environment variable +(`GIT_TR2`, `GIT_TR2_PERF`, or `GIT_TR2_EVENT`) is set. The following +values are recognized. + +`0`:: +`false`:: + + Disables the target. + +`1`:: +`true`:: + + Enables the target and writes stream to `STDERR`. + +`[2-9]`:: + + Enables the target and writes to the already opened file descriptor. + +``:: + + Enables the target, opens and writes to the file in append mode. + +`af_unix:`:: + + Enables the target, opens and writes to a Unix Domain Socket + (on platforms that support them). + +== Trace2 API + +All public Trace2 functions and macros are defined in `trace2.h` and +`trace2.c`. All public symbols are prefixed with `trace2_`. + +There are no public Trace2 data structures. + +The Trace2 code also defines a set of private functions and data types +in the `trace2/` directory. These symbols are prefixed with `tr2_` +and should only be used by functions in `trace2.c`. + +== Conventions for Public Functions and Macros + +The functions defined by the Trace2 API are declared and documented +in `trace2.h`. It defines the API functions and wrapper macros for +Trace2. + +Some functions have a `_fl()` suffix to indicate that they take `file` +and `line-number` arguments. + +Some functions have a `_va_fl()` suffix to indicate that they also +take a `va_list` argument. + +Some functions have a `_printf_fl()` suffix to indicate that they also +take a varargs argument. + +There are CPP wrapper macros and ifdefs to hide most of these details. +See `trace2.h` for more details. The following discussion will only +describe the simplified forms. + +== Public API + +All Trace2 API functions send a messsage to all of the active +Trace2 Targets. This section describes the set of available +messages. + +It helps to divide these functions into groups for discussion +purposes. + +=== Basic Command Messages + +These are concerned with the lifetime of the overall git process. + +`void trace2_initialize()`:: + + Determines if any Trace2 Targets should be enabled and + initializes the Trace2 facility. This includes starting the + elapsed time clocks and thread local storage (TLS). ++ +This function emits a "version" message containing the version of git +and the Trace2 protocol. ++ +This function should be called from `main()` as early as possible in +the life of the process. + +`int trace2_is_enabled()`:: + + Returns 1 if Trace2 is enabled (at least one target is + active). + +`void trace2_cmd_start(int argc, const char **argv)`:: + + Emits a "start" message containing the process command line + arguments. + +`int trace2_cmd_exit(int exit_code)`:: + + Emits an "exit" message containing the process exit-code and + elapsed time. ++ +Returns the exit-code. + +`void trace2_cmd_error(const char *fmt, va_list ap)`:: + + Emits an "error" message containing a formatted error message. + +`void trace2_cmd_path(const char *pathname)`:: + + Emits a "cmd_path" message with the full pathname of the + current process. + +=== Command Detail Messages + +These are concerned with describing the specific Git command +after the command line, config, and environment are inspected. + +`void trace2_cmd_name(const char *name)`:: + + Emits a "cmd_name" message with the canonical name of the + command, for example "status" or "checkout". + +`void trace2_cmd_mode(const char *mode)`:: + + Emits a "cmd_mode" message with a qualifier name to further + describe the current git command. ++ +This message is intended to be used with git commands having multiple +major modes. For example, a "checkout" command can checkout a new +branch or it can checkout a single file, so the checkout code could +emit a cmd_mode message of "branch" or "file". + +`void trace2_cmd_alias(const char *alias, const char **argv_expansion)`:: + + Emits an "alias" message containing the alias used and the + argument expansion. + +`void trace2_def_param(const char *parameter, const char *value)`:: + + Emits a "def_param" message containing a key/value pair. ++ +This message is intended to report some global aspect of the current +command, such as a configuration setting or command line switch that +significantly affects program performance or behavior, such as +`core.abbrev`, `status.showUntrackedFiles`, or `--no-ahead-behind`. + +`void trace2_cmd_list_config()`:: + + Emits a "def_param" messages for "important" configuration + settings. ++ +The environment variable `GIT_TR2_CONFIG_PARAMS` can be set to a +list of patterns of important configuration settings, for example: +`core.*,remote.*.url`. This function will iterate over all config +settings and emit a "def_param" message for each match. + +`void trace2_cmd_set_config(const char *key, const char *value)`:: + + Emits a "def_param" message for a specific configuration + setting IFF it matches the `GIT_TR2_CONFIG_PARAMS` pattern. ++ +This is used to hook into `git_config_set()` and catch any +configuration changes and update a value previously reported by +`trace2_cmd_list_config()`. + +`void trace2_def_repo(struct repository *repo)`:: + + Registers a repository with the Trace2 layer. Assigns a + unique "repo-id" to `repo->trace2_repo_id`. ++ +Emits a "worktree" messages containing the repo-id and the worktree +pathname. ++ +Region and data messages (described later) may refer to this repo-id. ++ +The main/top-level repository will have repo-id value 1 (aka "r1"). ++ +The repo-id field is in anticipation of future in-proc submodule +repositories. + +=== Child Process Messages + +These are concerned with the various spawned child processes, +including shell scripts, git commands, editors, pagers, and hooks. + +`void trace2_child_start(struct child_process *cmd)`:: + + Emits a "child_start" message containing the "child-id", + "child-argv", and "child-classification". ++ +Before calling this, set `cmd->trace2_child_class` to a name +describing the type of child process, for example "editor". ++ +This function assigns a unique "child-id" to `cmd->trace2_child_id`. +This field is used later during the "child_exit" message to associate +it with the "child_start" message. ++ +This function should be called before spawning the child process. + +`void trace2_child_exit(struct child_proess *cmd, int child_exit_code)`:: + + Emits a "child_exit" message containing the "child-id", + the child's elapsed time and exit-code. ++ +The reported elapsed time includes the process creation overhead and +time spend waiting for it to exit, so it may be slightly longer than +the time reported by the child itself. ++ +This function should be called after reaping the child process. + +`int trace2_exec(const char *exe, const char **argv)`:: + + Emits a "exec" message containing the "exec-id" and the + argv of the new process. ++ +This function should be called before calling one of the `exec()` +variants, such as `execvp()`. ++ +This function returns a unique "exec-id". This value is used later +if the exec() fails and a "exec-result" message is necessary. + +`void trace2_exec_result(int exec_id, int error_code)`:: + + Emits a "exec_result" message containing the "exec-id" + and the error code. ++ +On Unix-based systems, `exec()` does not return if successful. +This message is used to indicate that the `exec()` failed and +that the current program is continuing. + +=== Git Thread Messages + +These messages are concerned with Git thread usage. + +`void trace2_thread_start(const char *thread_name)`:: + + Emits a "thread_start" message. ++ +The `thread_name` field should be a descriptive name, such as the +unique name of the thread-proc. A unique "thread-id" will be added +to the name to uniquely identify thread instances. ++ +Region and data messages (described later) may refer to this thread +name. ++ +This function must be called by the thread-proc of the new thread +(so that TLS data is properly initialized) and not by the caller +of `pthread_create()`. + +`void trace2_thread_exit()`:: + + Emits a "thread_exit" message containing the thread name + and the thread elapsed time. ++ +This function must be called by the thread-proc before it returns +(so that the coorect TLS data is used and cleaned up. It should +not be called by the caller of `pthread_join()`. + +=== Region and Data Messages + +These are concerned with recording performance data +over regions or spans of code. + +`void trace2_region_enter(const char *category, const char *label, const struct repository *repo)`:: + +`void trace2_region_enter_printf(const char *category, const char *label, const struct repository *repo, const char *fmt, ...)`:: + +`void trace2_region_enter_printf_va(const char *category, const char *label, const struct repository *repo, const char *fmt, va_list ap)`:: + + Emits a thread-relative "region_enter" message with optional + printf string. ++ +This function pushes a new region nesting stack level on the current +thread and starts a clock for the new stack frame. ++ +The `category` field is an arbitrary category name used to classify +regions by feature area, such as "status" or "index". At this time +it is only just printed along with the rest of the message. It may +be used in the future to filter messages. ++ +The `label` field is an arbitrary label used to describe the activity +being started, such as "read_recursive" or "do_read_index". ++ +The `repo` field, if set, will be used to get the "repo-id", so that +recursive oerations can be attributed to the correct repository. + +`void trace2_region_leave(const char *category, const char *label, const struct repository *repo)`:: + +`void trace2_region_leave_printf(const char *category, const char *label, const struct repository *repo, const char *fmt, ...)`:: + +`void trace2_region_leave_printf_va(const char *category, const char *label, const struct repository *repo, const char *fmt, va_list ap)`:: + + Emits a thread-relative "region_leave" message with optional + printf string. ++ +This function pops the region nesting stack on the current thread +and reports the elapsed time of the stack frame. ++ +The `category`, `label`, and `repo` fields are the same as above. +The `category` and `label` do not need to match the correpsonding +"region_enter" message, but it makes the data stream easier to +understand. + +`void trace2_data_string(const char *category, const struct repository *repo, const char *key, const char * value)`:: + +`void trace2_data_intmax(const char *category, const struct repository *repo, const char *key, intmax value)`:: + +`void trace2_data_json(const char *category, const struct repository *repo, const char *key, const struct json_writer *jw)`:: + + Emits a region- and thread-relative "data" or "data_json" message. ++ +This is a key/value pair message containing information about the +current thread, region stack, and repository. This could be used +to print the number of files in a directory during a multi-threaded +recursive tree walk. + +`void trace2_printf(const char *fmt, ...)`:: + +`void trace2_printf_va(const char *fmt, va_list ap)`:: + + Emits a region- and thread-relative "printf" message. + +== Trace2 Target Formats + +=== NORMAL Format + +NORMAL format is enabled when the `GIT_TR2` environment variable is +set. + +Events are written as lines of the form: + +------------ +[