From patchwork Fri Sep 1 20:43:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Beau Belgrave X-Patchwork-Id: 13372892 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8C69FCA0FF6 for ; Fri, 1 Sep 2023 20:43:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350742AbjIAUnm (ORCPT ); Fri, 1 Sep 2023 16:43:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55966 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230018AbjIAUnm (ORCPT ); Fri, 1 Sep 2023 16:43:42 -0400 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id E8418E7E; Fri, 1 Sep 2023 13:43:38 -0700 (PDT) Received: from localhost.localdomain (unknown [4.155.48.116]) by linux.microsoft.com (Postfix) with ESMTPSA id 7A212212A783; Fri, 1 Sep 2023 13:43:38 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 7A212212A783 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1693601018; bh=3z+yFv8sRqi+Qau3BLbrmhn68JlC082f944I45jGU4k=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qHBSLOsH0T/AAiuxjaVpXMsM9UJQEwOCrRE0rojGwAxuYYvhOlKgGEgAp4xushbr+ L9ObQx1T6MVjbes+3USoRq4CXscIYKD+hOFH4DxqAN/ulkq5FHzNlTDsKR38vTicQ4 WPyC4J0DiKbdiaSlRVJbSWgpdwi3Yh3HQCCbLvPo= From: Beau Belgrave To: rostedt@goodmis.org, mhiramat@kernel.org Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, ast@kernel.org, dcook@linux.microsoft.com Subject: [PATCH 1/3] tracing/user_events: Allow events to persist for perfmon_capable users Date: Fri, 1 Sep 2023 20:43:30 +0000 Message-Id: <20230901204332.159-2-beaub@linux.microsoft.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com> References: <20230901204332.159-1-beaub@linux.microsoft.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-trace-kernel@vger.kernel.org There are several scenarios that have come up where having a user_event persist even if the process that registered it exits. The main one is having a daemon create events on bootup that shouldn't get deleted if the daemon has to exit or reload. Another is within OpenTelemetry exporters, they wish to potentially check if a user_event exists on the system to determine if exporting the data out should occur. The user_event in this case must exist even in the absence of the owning process running (such as the above daemon case). Expose the previously internal flag USER_EVENT_REG_PERSIST to user processes. Upon register or delete of events with this flag, ensure the user is perfmon_capable to prevent random user processes with access to tracefs from creating events that persist after exit. Signed-off-by: Beau Belgrave --- include/uapi/linux/user_events.h | 11 ++++++++++- kernel/trace/trace_events_user.c | 28 ++++++++++++++-------------- 2 files changed, 24 insertions(+), 15 deletions(-) diff --git a/include/uapi/linux/user_events.h b/include/uapi/linux/user_events.h index 2984aae4a2b4..f74f3aedd49c 100644 --- a/include/uapi/linux/user_events.h +++ b/include/uapi/linux/user_events.h @@ -17,6 +17,15 @@ /* Create dynamic location entry within a 32-bit value */ #define DYN_LOC(offset, size) ((size) << 16 | (offset)) +/* List of supported registration flags */ +enum user_reg_flag { + /* Event will not delete upon last reference closing */ + USER_EVENT_REG_PERSIST = 1U << 0, + + /* This value or above is currently non-ABI */ + USER_EVENT_REG_MAX = 1U << 1, +}; + /* * Describes an event registration and stores the results of the registration. * This structure is passed to the DIAG_IOCSREG ioctl, callers at a minimum @@ -33,7 +42,7 @@ struct user_reg { /* Input: Enable size in bytes at address */ __u8 enable_size; - /* Input: Flags for future use, set to 0 */ + /* Input: Flags to use, if any */ __u16 flags; /* Input: Address to update when enabled */ diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c index 6f046650e527..af663de93492 100644 --- a/kernel/trace/trace_events_user.c +++ b/kernel/trace/trace_events_user.c @@ -49,18 +49,6 @@ #define EVENT_STATUS_PERF BIT(1) #define EVENT_STATUS_OTHER BIT(7) -/* - * User register flags are not allowed yet, keep them here until we are - * ready to expose them out to the user ABI. - */ -enum user_reg_flag { - /* Event will not delete upon last reference closing */ - USER_EVENT_REG_PERSIST = 1U << 0, - - /* This value or above is currently non-ABI */ - USER_EVENT_REG_MAX = 1U << 1, -}; - /* * Stores the system name, tables, and locks for a group of events. This * allows isolation for events by various means. @@ -1888,10 +1876,16 @@ static int user_event_parse(struct user_event_group *group, char *name, int argc = 0; char **argv; - /* User register flags are not ready yet */ - if (reg_flags != 0 || flags != NULL) + /* Currently don't support any text based flags */ + if (flags != NULL) return -EINVAL; + /* Persistent events require CAP_PERFMON / CAP_SYS_ADMIN */ + if (reg_flags & USER_EVENT_REG_PERSIST) { + if (!perfmon_capable()) + return -EPERM; + } + /* Prevent dyn_event from racing */ mutex_lock(&event_mutex); user = find_user_event(group, name, &key); @@ -2024,6 +2018,12 @@ static int delete_user_event(struct user_event_group *group, char *name) if (!user_event_last_ref(user)) return -EBUSY; + /* Persistent events require CAP_PERFMON / CAP_SYS_ADMIN */ + if (user->reg_flags & USER_EVENT_REG_PERSIST) { + if (!perfmon_capable()) + return -EPERM; + } + return destroy_user_event(user); } From patchwork Fri Sep 1 20:43:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Beau Belgrave X-Patchwork-Id: 13372895 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A69C3CA0FFD for ; Fri, 1 Sep 2023 20:43:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350314AbjIAUnn (ORCPT ); Fri, 1 Sep 2023 16:43:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230018AbjIAUnn (ORCPT ); Fri, 1 Sep 2023 16:43:43 -0400 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 5020010CF; Fri, 1 Sep 2023 13:43:39 -0700 (PDT) Received: from localhost.localdomain (unknown [4.155.48.116]) by linux.microsoft.com (Postfix) with ESMTPSA id 93E49212A784; Fri, 1 Sep 2023 13:43:38 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 93E49212A784 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1693601018; bh=9WF2llb3VqzKQ7G8ifFcjGtwVjENCP3yBl4NYTE80qk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cElSQ/68kkCBEHkiuelP17kyEinYxMw9AOowVZ/m+JIWRHlUT8Z/v8TvqEVgL3nFo f5uLJinr1DR+XsJI+VhNjFfPfHmEZYsOoNi7w8zmlgmLcHGbbLO/ZGztzj0XgpP1Pm qWQ1YAzrG/p2P5TI7L0w6cZIEwGb+KIVXTcJJRjM= From: Beau Belgrave To: rostedt@goodmis.org, mhiramat@kernel.org Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, ast@kernel.org, dcook@linux.microsoft.com Subject: [PATCH 2/3] selftests/user_events: Test persist flag cases Date: Fri, 1 Sep 2023 20:43:31 +0000 Message-Id: <20230901204332.159-3-beaub@linux.microsoft.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com> References: <20230901204332.159-1-beaub@linux.microsoft.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-trace-kernel@vger.kernel.org Now that we have exposed USER_EVENT_REG_PERSIST events can persist both via the ABI and in the /sys/kernel/tracing/dynamic_events file. Ensure both the ABI and DYN cases work by calling both during the parse tests. Add new flags test that ensures only USER_EVENT_REG_PERSIST is honored and any other flag is invalid. Signed-off-by: Beau Belgrave --- .../testing/selftests/user_events/abi_test.c | 55 ++++++++++++++++++- .../testing/selftests/user_events/dyn_test.c | 54 +++++++++++++++++- 2 files changed, 107 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/user_events/abi_test.c b/tools/testing/selftests/user_events/abi_test.c index 5125c42efe65..b95fc15496ba 100644 --- a/tools/testing/selftests/user_events/abi_test.c +++ b/tools/testing/selftests/user_events/abi_test.c @@ -23,6 +23,18 @@ const char *data_file = "/sys/kernel/tracing/user_events_data"; const char *enable_file = "/sys/kernel/tracing/events/user_events/__abi_event/enable"; +static bool event_exists(void) +{ + int fd = open(enable_file, O_RDWR); + + if (fd < 0) + return false; + + close(fd); + + return true; +} + static int change_event(bool enable) { int fd = open(enable_file, O_RDWR); @@ -46,7 +58,22 @@ static int change_event(bool enable) return ret; } -static int reg_enable(long *enable, int size, int bit) +static int event_delete(void) +{ + int fd = open(data_file, O_RDWR); + int ret; + + if (fd < 0) + return -1; + + ret = ioctl(fd, DIAG_IOCSDEL, "__abi_event"); + + close(fd); + + return ret; +} + +static int reg_enable_flags(long *enable, int size, int bit, int flags) { struct user_reg reg = {0}; int fd = open(data_file, O_RDWR); @@ -57,6 +84,7 @@ static int reg_enable(long *enable, int size, int bit) reg.size = sizeof(reg); reg.name_args = (__u64)"__abi_event"; + reg.flags = flags; reg.enable_bit = bit; reg.enable_addr = (__u64)enable; reg.enable_size = size; @@ -68,6 +96,11 @@ static int reg_enable(long *enable, int size, int bit) return ret; } +static int reg_enable(long *enable, int size, int bit) +{ + return reg_enable_flags(enable, size, bit, 0); +} + static int reg_disable(long *enable, int bit) { struct user_unreg reg = {0}; @@ -121,6 +154,26 @@ TEST_F(user, enablement) { ASSERT_EQ(0, change_event(false)); } +TEST_F(user, flags) { + /* USER_EVENT_REG_PERSIST is allowed */ + ASSERT_EQ(0, reg_enable_flags(&self->check, sizeof(int), 0, + USER_EVENT_REG_PERSIST)); + ASSERT_EQ(0, reg_disable(&self->check, 0)); + + /* Ensure it exists after close and disable */ + ASSERT_TRUE(event_exists()); + + /* Ensure we can delete it */ + ASSERT_EQ(0, event_delete()); + + /* USER_EVENT_REG_MAX or above is not allowed */ + ASSERT_EQ(-1, reg_enable_flags(&self->check, sizeof(int), 0, + USER_EVENT_REG_MAX)); + + /* Ensure it does not exist after invalid flags */ + ASSERT_FALSE(event_exists()); +} + TEST_F(user, bit_sizes) { /* Allow 0-31 bits for 32-bit */ ASSERT_EQ(0, reg_enable(&self->check, sizeof(int), 0)); diff --git a/tools/testing/selftests/user_events/dyn_test.c b/tools/testing/selftests/user_events/dyn_test.c index 91a4444ad42b..f2a41bcb5ad8 100644 --- a/tools/testing/selftests/user_events/dyn_test.c +++ b/tools/testing/selftests/user_events/dyn_test.c @@ -16,9 +16,25 @@ #include "../kselftest_harness.h" +const char *dyn_file = "/sys/kernel/tracing/dynamic_events"; const char *abi_file = "/sys/kernel/tracing/user_events_data"; const char *enable_file = "/sys/kernel/tracing/events/user_events/__test_event/enable"; +static int event_delete(void) +{ + int fd = open(abi_file, O_RDWR); + int ret; + + if (fd < 0) + return -1; + + ret = ioctl(fd, DIAG_IOCSDEL, "__test_event"); + + close(fd); + + return ret; +} + static bool wait_for_delete(void) { int i; @@ -63,7 +79,31 @@ static int unreg_event(int fd, int *check, int bit) return ioctl(fd, DIAG_IOCSUNREG, &unreg); } -static int parse(int *check, const char *value) +static int parse_dyn(const char *value) +{ + int fd = open(dyn_file, O_RDWR | O_APPEND); + int len = strlen(value); + int ret; + + if (fd == -1) + return -1; + + ret = write(fd, value, len); + + if (ret == len) + ret = 0; + else + ret = -1; + + close(fd); + + if (ret == 0) + event_delete(); + + return ret; +} + +static int parse_abi(int *check, const char *value) { int fd = open(abi_file, O_RDWR); int ret; @@ -89,6 +129,18 @@ static int parse(int *check, const char *value) return ret; } +static int parse(int *check, const char *value) +{ + int abi_ret = parse_abi(check, value); + int dyn_ret = parse_dyn(value); + + /* Ensure both ABI and DYN parse the same way */ + if (dyn_ret != abi_ret) + return -1; + + return dyn_ret; +} + static int check_match(int *check, const char *first, const char *second, bool *match) { int fd = open(abi_file, O_RDWR); From patchwork Fri Sep 1 20:43:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Beau Belgrave X-Patchwork-Id: 13372894 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF4E7CA0FFA for ; Fri, 1 Sep 2023 20:43:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1350750AbjIAUnn (ORCPT ); Fri, 1 Sep 2023 16:43:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55978 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1350747AbjIAUnm (ORCPT ); Fri, 1 Sep 2023 16:43:42 -0400 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 4E15C1B1; Fri, 1 Sep 2023 13:43:39 -0700 (PDT) Received: from localhost.localdomain (unknown [4.155.48.116]) by linux.microsoft.com (Postfix) with ESMTPSA id AFE95212A79D; Fri, 1 Sep 2023 13:43:38 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com AFE95212A79D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1693601018; bh=1YG8rYRy90yBzfjk0YrTjAqtyvOMIk+tMPk2iMh/bSY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JNY55OwkyzJOC2gs8+rHWkVG1rEQ53WNbb3f+VEN8Bv/rye7/loCDwPiaFYByYWWu TsHItk5wcBV3QWhFM0+DkRvTDAbiCQrj5AlcyLbhlwDnCEJWT7k2WTAb3qTmBot6KK K5xyrYyODluD4LarEXgs6Dr5faB0Ja2aj+wCNsBw= From: Beau Belgrave To: rostedt@goodmis.org, mhiramat@kernel.org Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, ast@kernel.org, dcook@linux.microsoft.com Subject: [PATCH 3/3] tracing/user_events: Document persist event flags Date: Fri, 1 Sep 2023 20:43:32 +0000 Message-Id: <20230901204332.159-4-beaub@linux.microsoft.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230901204332.159-1-beaub@linux.microsoft.com> References: <20230901204332.159-1-beaub@linux.microsoft.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-trace-kernel@vger.kernel.org Users need to know how to make events persist, now that we allow for that. We also now allow the dynamic_events file to create events by utilizing the persist flag during event register. Add back in to documentation how /sys/kernel/tracing/dynamic_events can be used to create peristent user_events. Add a section under registering for the currently supported flags (USER_EVENT_REG_PERSIST) and the required permissions. Add a note under deleting that deleting a persistent event also requires sufficient permission. Signed-off-by: Beau Belgrave --- Documentation/trace/user_events.rst | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/Documentation/trace/user_events.rst b/Documentation/trace/user_events.rst index e7b07313550a..c5388a47376f 100644 --- a/Documentation/trace/user_events.rst +++ b/Documentation/trace/user_events.rst @@ -14,6 +14,11 @@ Programs can view status of the events via /sys/kernel/tracing/user_events_status and can both register and write data out via /sys/kernel/tracing/user_events_data. +Programs can also use /sys/kernel/tracing/dynamic_events to register and +delete user based events via the u: prefix. The format of the command to +dynamic_events is the same as the ioctl with the u: prefix applied. This +requires CAP_PERFMON due to the event persisting, otherwise -EPERM is returned. + Typically programs will register a set of events that they wish to expose to tools that can read trace_events (such as ftrace and perf). The registration process tells the kernel which address and bit to reflect if any tool has @@ -45,7 +50,7 @@ This command takes a packed struct user_reg as an argument:: /* Input: Enable size in bytes at address */ __u8 enable_size; - /* Input: Flags for future use, set to 0 */ + /* Input: Flags to be used, if any */ __u16 flags; /* Input: Address to update when enabled */ @@ -69,7 +74,7 @@ The struct user_reg requires all the above inputs to be set appropriately. This must be 4 (32-bit) or 8 (64-bit). 64-bit values are only allowed to be used on 64-bit kernels, however, 32-bit can be used on all kernels. -+ flags: The flags to use, if any. For the initial version this must be 0. ++ flags: The flags to use, if any. Callers should first attempt to use flags and retry without flags to ensure support for lower versions of the kernel. If a flag is not supported -EINVAL is returned. @@ -80,6 +85,13 @@ The struct user_reg requires all the above inputs to be set appropriately. + name_args: The name and arguments to describe the event, see command format for details. +The following flags are currently supported. + ++ USER_EVENT_REG_PERSIST: The event will not delete upon the last reference + closing. Callers may use this if an event should exist even after the + process closes or unregisters the event. Requires CAP_PERFMON otherwise + -EPERM is returned. + Upon successful registration the following is set. + write_index: The index to use for this file descriptor that represents this @@ -141,7 +153,10 @@ event (in both user and kernel space). User programs should use a separate file to request deletes than the one used for registration due to this. **NOTE:** By default events will auto-delete when there are no references left -to the event. Flags in the future may change this logic. +to the event. If programs do not want auto-delete, they must use the +USER_EVENT_REG_PERSIST flag when registering the event. Once that flag is used +the event exists until DIAG_IOCSDEL is invoked. Both register and delete of an +event that persists requires CAP_PERFMON, otherwise -EPERM is returned. Unregistering -------------