From patchwork Thu May 2 12:09:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tzvetomir Stoyanov X-Patchwork-Id: 10926627 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8E70F14C0 for ; Thu, 2 May 2019 12:09:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E83528BF0 for ; Thu, 2 May 2019 12:09:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 72A8528E54; Thu, 2 May 2019 12:09:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C7DB628BF0 for ; Thu, 2 May 2019 12:09:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726267AbfEBMJ5 (ORCPT ); Thu, 2 May 2019 08:09:57 -0400 Received: from mail-wm1-f66.google.com ([209.85.128.66]:34683 "EHLO mail-wm1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726189AbfEBMJ5 (ORCPT ); Thu, 2 May 2019 08:09:57 -0400 Received: by mail-wm1-f66.google.com with SMTP id b67so6487493wmg.1 for ; Thu, 02 May 2019 05:09:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=6XFTI+DO3Sbv3ECf89iqSkHY+C784SDGc3vukLXttxc=; b=aFT48mLPQHUrlsGMmDyAAaAyCb93+79/bzNlRQGBAeT4frHzXMoc8L9vVACZ44yag6 uCsh/y4XWa9xKwjz7iQwW7JGrwGumMW1XqAo0l/i/cDsJfX1KwhtxjdYDaC6B6qthb/1 KdvQQ6H4tA3hM4UNdwiaj2+TFPbGKQLAwPFE48mmZ4Hy9wN0Cx7/eo1nmPthepXv8oOi cMFeXaR8p1HEhBXrHnEGRgb5mom31msZZzc6KggxXh8jyKX7PgyhsbRrupc9R6fH7S1i afwAso0fFE91D1Lc++H9IhXfpnJhdWZpAkUUlJlp/WKAKRshIBdwIU7+DMSQUSkl0CSd kskQ== X-Gm-Message-State: APjAAAWtDEYZVTDGp+y3qQCE6ha5oHXszNgI+X+lVBWApXFzvPeDjyVg TBzY+dnHEkTxLRWqXfDCrus= X-Google-Smtp-Source: APXvYqzLYf+p2JwnxtubCInhYRuDlF5IkAVy69iStxWvx5P0R1LF4G6C0wt3wAcyNfTzo8/LZRIl3w== X-Received: by 2002:a7b:cb11:: with SMTP id u17mr2276545wmj.144.1556798994645; Thu, 02 May 2019 05:09:54 -0700 (PDT) Received: from oberon.eng.vmware.com ([146.247.46.5]) by smtp.gmail.com with ESMTPSA id u2sm42029219wru.36.2019.05.02.05.09.53 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 02 May 2019 05:09:54 -0700 (PDT) From: Tzvetomir Stoyanov To: rostedt@goodmis.org Cc: linux-trace-devel@vger.kernel.org Subject: [PATCH 1/2] trace-cmd: Fix crash when trace-cmd is executed with args "profile -F sleep 1" Date: Thu, 2 May 2019 15:09:51 +0300 Message-Id: <20190502120952.20449-1-tstoyanov@vmware.com> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Sender: linux-trace-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-trace-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP A fix for https://bugzilla.kernel.org/show_bug.cgi?id=203411 When trace-cmd is running in "profile" mode, trace files are not generated. Instead, pipes are used to collect trace data from recorder threads. Some internal functions, originally designed for working with files, are reused in pipes use case: init_cpu() allocate_page() get_next_page() There was an undesired behaviour in those functions, when working with pipes, which causes the segmentation fault, described in the bug report. Signed-off-by: Tzvetomir Stoyanov --- lib/trace-cmd/trace-input.c | 69 ++++++++++++++++++++++++------------- 1 file changed, 45 insertions(+), 24 deletions(-) diff --git a/lib/trace-cmd/trace-input.c b/lib/trace-cmd/trace-input.c index ba20ef1..8d1001a 100644 --- a/lib/trace-cmd/trace-input.c +++ b/lib/trace-cmd/trace-input.c @@ -62,6 +62,7 @@ struct cpu_data { struct list_head page_maps; struct page_map *page_map; struct page **pages; + int num_pages; struct tep_record *next; struct page *page; struct kbuffer *kbuf; @@ -921,34 +922,50 @@ static struct page *allocate_page(struct tracecmd_input *handle, struct cpu_data *cpu_data = &handle->cpu_data[cpu]; struct page *page; int index; + int ret; index = (offset - cpu_data->file_offset) / handle->page_size; - if (cpu_data->pages[index]) { - cpu_data->pages[index]->ref_count++; - return cpu_data->pages[index]; - } - - page = malloc(sizeof(*page)); - if (!page) + if (index >= cpu_data->num_pages) return NULL; + page = cpu_data->pages[index]; + if (page && !handle->read_page) { + page->ref_count++; + return page; + } - memset(page, 0, sizeof(*page)); - page->offset = offset; - page->handle = handle; - page->cpu = cpu; + if (!page) { + page = malloc(sizeof(*page)); + if (!page) + return NULL; - page->map = allocate_page_map(handle, page, cpu, offset); + memset(page, 0, sizeof(*page)); + page->offset = offset; + page->handle = handle; + page->cpu = cpu; + } if (!page->map) { - free(page); - return NULL; + page->map = allocate_page_map(handle, page, cpu, offset); + if (!page->map) { + free(page); + return NULL; + } + + cpu_data->pages[index] = page; + cpu_data->page_cnt++; + page->ref_count = 1; + + return page; } - cpu_data->pages[index] = page; - cpu_data->page_cnt++; - page->ref_count = 1; + if (handle->read_page) { + ret = read_page(handle, offset, cpu, page->map); + if (ret < 0) + return NULL; + return page; + } - return page; + return NULL; } static void __free_page(struct tracecmd_input *handle, struct page *page) @@ -960,6 +977,8 @@ static void __free_page(struct tracecmd_input *handle, struct page *page) die("Page ref count is zero!\n"); page->ref_count--; + if (cpu_data->page == page) + cpu_data->page = NULL; if (page->ref_count) return; @@ -1125,7 +1144,7 @@ static int get_page(struct tracecmd_input *handle, int cpu, static int get_next_page(struct tracecmd_input *handle, int cpu) { - off64_t offset; + off64_t offset = 0; if (!handle->cpu_data[cpu].page && !handle->use_pipe) return 0; @@ -1137,7 +1156,8 @@ static int get_next_page(struct tracecmd_input *handle, int cpu) return 0; } - offset = handle->cpu_data[cpu].offset + handle->page_size; + if (!handle->use_pipe) + offset = handle->cpu_data[cpu].offset + handle->page_size; return get_page(handle, cpu, offset); } @@ -2026,7 +2046,6 @@ tracecmd_read_prev(struct tracecmd_input *handle, struct tep_record *record) static int init_cpu(struct tracecmd_input *handle, int cpu) { struct cpu_data *cpu_data = &handle->cpu_data[cpu]; - int num_pages; int i; cpu_data->offset = cpu_data->file_offset; @@ -2040,13 +2059,13 @@ static int init_cpu(struct tracecmd_input *handle, int cpu) return 0; } - num_pages = (cpu_data->size + handle->page_size - 1) / handle->page_size; - cpu_data->pages = calloc(num_pages + 1, sizeof(*cpu_data->pages)); + cpu_data->num_pages = (cpu_data->size + handle->page_size - 1) / handle->page_size; + cpu_data->pages = calloc(cpu_data->num_pages + 1, sizeof(*cpu_data->pages)); if (!cpu_data->pages) return -1; /* Add stopper */ - cpu_data->pages[num_pages] = PAGE_STOPPER; + cpu_data->pages[cpu_data->num_pages] = PAGE_STOPPER; if (handle->use_pipe) { /* Just make a page, it will be nuked later */ @@ -2056,8 +2075,10 @@ static int init_cpu(struct tracecmd_input *handle, int cpu) memset(cpu_data->page, 0, sizeof(*cpu_data->page)); cpu_data->pages[0] = cpu_data->page; + cpu_data->num_pages = 1; cpu_data->page_cnt = 1; cpu_data->page->ref_count = 1; + cpu_data->page->cpu = cpu; return 0; } From patchwork Thu May 2 12:09:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tzvetomir Stoyanov X-Patchwork-Id: 10926625 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7EEBC1395 for ; Thu, 2 May 2019 12:09:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 632FD28D48 for ; Thu, 2 May 2019 12:09:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 51FA128E90; Thu, 2 May 2019 12:09:58 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F32FD28D48 for ; Thu, 2 May 2019 12:09:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726189AbfEBMJ5 (ORCPT ); Thu, 2 May 2019 08:09:57 -0400 Received: from mail-wm1-f68.google.com ([209.85.128.68]:53141 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726197AbfEBMJ5 (ORCPT ); Thu, 2 May 2019 08:09:57 -0400 Received: by mail-wm1-f68.google.com with SMTP id j13so2482736wmh.2 for ; Thu, 02 May 2019 05:09:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=72uUKgi2Igx5O5RllnfA4/C0lnOX05XfwkmUDwNSm/w=; b=pE1asid0iW3qZqG2l3HxKb9Ar2ScmwROVJEH9Hvu5XSMYXOM2R7KC1+XHpe47EHPXr 83HdgiEkupTNGLGYvsPRYBJSrZaP4hTJQDOeFX+O283ATBRgl8WfSAQ/Tft58IzkZDur PSH6kwLo6Fxb0R6y55cs9wXyl50Bxo09lwUNMizhLIao6xJ1IGkBBlS9HMmNRUZXghpI ant0hy8/1MYTXWgVj7nlgFLTaDHSm2NaGguaMYdC0p5R1ePnNJAzJYJcHwMtR0U33w5/ an9cxXGtRdO3AMAUIX17YbZUK7wX5L/Uqq0IwyPMKnEGYJZlJ4OmT3fukbYUVDbwkO1w +SGQ== X-Gm-Message-State: APjAAAXsBkoUhWkdCzrU/kEoIu6b1yWx3YfCxRt9dYFFFLXVMoCKx+ZV wbez87WdSrQDkviPd0q7/Nd5di5M X-Google-Smtp-Source: APXvYqyca5bLZammyAvJv3nXEOelLWSA3fjYvfhCmUueiPVT4Z89b8vFiw72/3pOhM/0VqHXMXnY8g== X-Received: by 2002:a1c:f70e:: with SMTP id v14mr2238161wmh.74.1556798995528; Thu, 02 May 2019 05:09:55 -0700 (PDT) Received: from oberon.eng.vmware.com ([146.247.46.5]) by smtp.gmail.com with ESMTPSA id u2sm42029219wru.36.2019.05.02.05.09.54 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 02 May 2019 05:09:54 -0700 (PDT) From: Tzvetomir Stoyanov To: rostedt@goodmis.org Cc: linux-trace-devel@vger.kernel.org Subject: [PATCH 2/2] trace-cmd: Fix a possible race condition and deadlock in trace-cmd Date: Thu, 2 May 2019 15:09:52 +0300 Message-Id: <20190502120952.20449-2-tstoyanov@vmware.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190502120952.20449-1-tstoyanov@vmware.com> References: <20190502120952.20449-1-tstoyanov@vmware.com> MIME-Version: 1.0 Sender: linux-trace-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-trace-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When pipes are used for communication between trace-cmd main thread and per-cpu recorder threads, there is a possible race condition in stop_threads(), which can cause a deadlock between the main thread and cpu recorder thread: In trace_stream_read(), the select() call can return 0 if threads have no data to send. This will force stop_threads() to stop reading the thread's pipes and enter a waitpid() loop, to wait for all threads to be terminated. However, there is a case when some threads are still flushing its data - tracecmd_flush_recording() tries a blocking write() to the pipe. A dead lock appears - the cpu thread is blocked in write(), as its buffer is full and no one is reading it. The main thread is blocked in waitpid(), to wait the same thread to exit. The deadlock can be (randomly) observed with the command "trace-cmd profile -p function -F sleep 10" The proposed fix increases select timeout from 0 to 1 second, to ensure the threads are flushed its data before going in waitpid() loop. Signed-off-by: Tzvetomir Stoyanov --- tracecmd/trace-record.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tracecmd/trace-record.c b/tracecmd/trace-record.c index 4523128..9aef5c3 100644 --- a/tracecmd/trace-record.c +++ b/tracecmd/trace-record.c @@ -626,7 +626,7 @@ static void delete_thread_data(void) static void stop_threads(enum trace_type type) { - struct timeval tv = { 0, 0 }; + struct timeval tv = { 1, 0 }; int ret; int i;