From patchwork Thu Apr 11 01:25:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Rostedt X-Patchwork-Id: 13625294 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3AF1E64CC0; Thu, 11 Apr 2024 01:26:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712798787; cv=none; b=i5EPhwDX737BncE5cPxLtbvpMJZ8aMuhr/A8a0FezEjy21nXXNRihUyHtpLHJcvmjNxQuzsi76+uJILhpheNPIzKhlSLLIiDXYdTyVD2OIm93ZswD0TTTBR8vIDXc5wSfPs9wNQkuT5g92C2RIwuzj49xU9piOxMskmwt0mYEwE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712798787; c=relaxed/simple; bh=JSQojgpzWXRR0Az6t0VzaaXRtYk8WJbAdEauGbfshYc=; h=Message-ID:Date:From:To:Cc:Subject:References:MIME-Version: Content-Type; b=dJLGmL0xT5y1C7EeA6Ob7Rj5TlrsRXtOG+kanEnoSem8Vte2vGv43OuqIYLOmv+Bj83iVxelIwoeJl+jbIbazDG4ntszrt7M5V9fzFb66BYcvv3pBHCcLAJdfNnmHvNlosIm1qX6YPuaZ038h76t8fogdaTpspnDMzJlTANH0t4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id ECD02C32787; Thu, 11 Apr 2024 01:26:26 +0000 (UTC) Received: from rostedt by gandalf with local (Exim 4.97) (envelope-from ) id 1rujFR-00000000nXO-1Oke; Wed, 10 Apr 2024 21:29:05 -0400 Message-ID: <20240411012905.194026306@goodmis.org> User-Agent: quilt/0.67 Date: Wed, 10 Apr 2024 21:25:48 -0400 From: Steven Rostedt To: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Cc: Masami Hiramatsu , Mark Rutland , Mathieu Desnoyers , Andrew Morton , Vincent Donnefort , Joel Fernandes , Daniel Bristot de Oliveira , Ingo Molnar , Peter Zijlstra , suleiman@google.com, Thomas Gleixner , Vineeth Pillai , Youssef Esmat , Beau Belgrave , Alexander Graf , Baoquan He , Borislav Petkov , "Paul E. McKenney" , David Howells Subject: [PATCH v2 07/11] ring-buffer: Validate boot range memory events References: <20240411012541.285904543@goodmis.org> Precedence: bulk X-Mailing-List: linux-trace-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: "Steven Rostedt (Google)" Make sure all the events in each of the sub-buffers that were mapped in a memory region are valid. This moves the code that walks the buffers for time-stamp validation out of the CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS ifdef block and is used to validate the content. Only the ring buffer event meta data is checked and not the data load. This also has a second purpose. The buffer_page structure that points to the data sub-buffers has accounting that keeps track of the number of events that are on the sub-buffer. This updates that counter as well. That counter is used in reading the buffer and knowing if the ring buffer is empty or not. Signed-off-by: Steven Rostedt (Google) --- kernel/trace/ring_buffer.c | 190 ++++++++++++++++++++++++++++++++----- 1 file changed, 166 insertions(+), 24 deletions(-) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index d05b60fd5695..6f19e7612472 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -1663,10 +1663,170 @@ static bool rb_meta_valid(struct ring_buffer_meta *meta, int cpu, subbuf = (void *)subbuf + subbuf_size; } - pr_info("Ring buffer meta is from previous boot!\n"); return true; } +static int rb_meta_subbuf_idx(struct ring_buffer_meta *meta, void *subbuf); + +#ifdef CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS +static DEFINE_PER_CPU(atomic_t, checking); +static atomic_t ts_dump; + +#define buffer_warn_return(fmt, ...) \ + do { \ + /* If another report is happening, ignore this one */ \ + if (atomic_inc_return(&ts_dump) != 1) { \ + atomic_dec(&ts_dump); \ + goto out; \ + } \ + atomic_inc(&cpu_buffer->record_disabled); \ + pr_warn(fmt, ##__VA_ARGS__); \ + dump_buffer_page(bpage, info, tail); \ + atomic_dec(&ts_dump); \ + /* There's some cases in boot up that this can happen */ \ + if (WARN_ON_ONCE(system_state != SYSTEM_BOOTING)) \ + /* Do not re-enable checking */ \ + return; \ + } while (0) +#else +#define buffer_warn_return(fmt, ...) do { } while (0) +#endif + +static int rb_read_data_buffer(struct buffer_data_page *dpage, int tail, int cpu, + unsigned long long *timestamp, bool warn) +{ + struct ring_buffer_event *event; + u64 ts, delta; + int events = 0; + int e; + + ts = dpage->time_stamp; + + for (e = 0; e < tail; e += rb_event_length(event)) { + + event = (struct ring_buffer_event *)(dpage->data + e); + + switch (event->type_len) { + + case RINGBUF_TYPE_TIME_EXTEND: + delta = rb_event_time_stamp(event); + ts += delta; + break; + + case RINGBUF_TYPE_TIME_STAMP: + delta = rb_event_time_stamp(event); + if (warn && delta < ts) { + buffer_warn_return("[CPU: %d]ABSOLUTE TIME WENT BACKWARDS: last ts: %lld absolute ts: %lld\n", + cpu, ts, delta); + } + ts = delta; + break; + + case RINGBUF_TYPE_PADDING: + if (event->time_delta == 1) + break; + fallthrough; + case RINGBUF_TYPE_DATA: + events++; + ts += event->time_delta; + break; + + default: + return -1; + } + } + *timestamp = ts; + return events; +} + +static int rb_validate_buffer(struct buffer_data_page *dpage, int cpu) +{ + unsigned long long ts; + int tail; + + tail = local_read(&dpage->commit); + return rb_read_data_buffer(dpage, tail, cpu, &ts, false); +} + +/* If the meta data has been validated, now validate the events */ +static void rb_meta_validate_events(struct ring_buffer_per_cpu *cpu_buffer) +{ + struct ring_buffer_meta *meta = cpu_buffer->ring_meta; + struct buffer_page *head_page; + unsigned long entry_bytes = 0; + unsigned long entries = 0; + int ret; + int i; + + if (!meta || !meta->head_buffer) + return; + + /* Do the reader page first */ + ret = rb_validate_buffer(cpu_buffer->reader_page->page, cpu_buffer->cpu); + if (ret < 0) { + printk("INVALID READER PAGE\n"); + goto invalid; + } + entries += ret; + entry_bytes += local_read(&cpu_buffer->reader_page->page->commit); + local_set(&cpu_buffer->reader_page->entries, ret); + + head_page = cpu_buffer->head_page; + + /* If both the head and commit are on the reader_page then we are done. */ + if (head_page == cpu_buffer->reader_page && + head_page == cpu_buffer->commit_page) + goto done; + + /* Iterate until finding the commit page */ + for (i = 0; i < meta->nr_subbufs + 1; i++, rb_inc_page(&head_page)) { + + /* Reader page has already been done */ + if (head_page == cpu_buffer->reader_page) + continue; + + ret = rb_validate_buffer(head_page->page, cpu_buffer->cpu); + if (ret < 0) { + pr_info("Ring buffer meta [%d] invalid buffer page\n", + cpu_buffer->cpu); + goto invalid; + } + entries += ret; + entry_bytes += local_read(&head_page->page->commit); + local_set(&cpu_buffer->head_page->entries, ret); + + if (head_page == cpu_buffer->commit_page) + break; + } + + if (head_page != cpu_buffer->commit_page) { + pr_info("Ring buffer meta [%d] commit page not found\n", + cpu_buffer->cpu); + goto invalid; + } + done: + local_set(&cpu_buffer->entries, entries); + local_set(&cpu_buffer->entries_bytes, entry_bytes); + + pr_info("Ring buffer meta [%d] is from previous boot!\n", cpu_buffer->cpu); + return; + + invalid: + /* The content of the buffers are invalid, reset the meta data */ + meta->head_buffer = 0; + meta->commit_buffer = 0; + + /* Reset the reader page */ + local_set(&cpu_buffer->reader_page->entries, 0); + local_set(&cpu_buffer->reader_page->page->commit, 0); + + /* Reset all the subbuffers */ + for (i = 0; i < meta->nr_subbufs - 1; i++, rb_inc_page(&head_page)) { + local_set(&head_page->entries, 0); + local_set(&head_page->page->commit, 0); + } +} + static void rb_range_meta_init(struct trace_buffer *buffer, int nr_pages) { struct ring_buffer_meta *meta; @@ -1745,8 +1905,6 @@ static void *rbm_next(struct seq_file *m, void *v, loff_t *pos) return rbm_start(m, pos); } -static int rb_meta_subbuf_idx(struct ring_buffer_meta *meta, void *subbuf); - static int rbm_show(struct seq_file *m, void *v) { struct ring_buffer_per_cpu *cpu_buffer = m->private; @@ -1995,6 +2153,8 @@ rb_allocate_cpu_buffer(struct trace_buffer *buffer, long nr_pages, int cpu) if (ret < 0) goto fail_free_reader; + rb_meta_validate_events(cpu_buffer); + /* If the boot meta was valid then this has already been updated */ meta = cpu_buffer->ring_meta; if (!meta || !meta->head_buffer || @@ -3899,26 +4059,6 @@ static void dump_buffer_page(struct buffer_data_page *bpage, pr_warn("expected end:0x%lx last event actually ended at:0x%x\n", tail, e); } -static DEFINE_PER_CPU(atomic_t, checking); -static atomic_t ts_dump; - -#define buffer_warn_return(fmt, ...) \ - do { \ - /* If another report is happening, ignore this one */ \ - if (atomic_inc_return(&ts_dump) != 1) { \ - atomic_dec(&ts_dump); \ - goto out; \ - } \ - atomic_inc(&cpu_buffer->record_disabled); \ - pr_warn(fmt, ##__VA_ARGS__); \ - dump_buffer_page(bpage, info, tail); \ - atomic_dec(&ts_dump); \ - /* There's some cases in boot up that this can happen */ \ - if (WARN_ON_ONCE(system_state != SYSTEM_BOOTING)) \ - /* Do not re-enable checking */ \ - return; \ - } while (0) - /* * Check if the current event time stamp matches the deltas on * the buffer page. @@ -3957,8 +4097,9 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer, if (atomic_inc_return(this_cpu_ptr(&checking)) != 1) goto out; - ts = bpage->time_stamp; + ret = rb_read_data_buffer(bpage, tail, &ts, cpu_buffer->cpu, true); +#if 0 for (e = 0; e < tail; e += rb_event_length(event)) { event = (struct ring_buffer_event *)(bpage->data + e); @@ -3992,6 +4133,7 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer, RB_WARN_ON(cpu_buffer, 1); } } +#endif if ((full && ts > info->ts) || (!full && ts + info->delta != info->ts)) { buffer_warn_return("[CPU: %d]TIME DOES NOT MATCH expected:%lld actual:%lld delta:%lld before:%lld after:%lld%s context:%s\n",