trace-cmd: Do not free pages from the lookup table in struct cpu_data in case trace file is loaded.

Message ID 20190619114922.3169-1-tz.stoyanov@gmail.com
State Accepted
Commit 0bdae7b3fbcd1d6d06096584915b9770fd07ccb2

Commit Message

Tzvetomir Stoyanov (VMware) June 19, 2019, 11:49 a.m. UTC
From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>

A major speed regression in trace-cmd v2.8 is reported by Johannes Berg
when parsing a huge trace.dat file:

"I have a ~1.7G file with just under 620k events (not exactly big by our standards),
and parsing speed (with -N to disable plugins) goes from ~4.5 seconds on commit
1ad32c24746 to >>4.5 minutes (I aborted there) on master.
I was talking to Steven about another issue, and he pointed me to
commit c2fc2bc296f7. Reverting that on master makes it take ~2 seconds,
so that'd actually be an improvement."

Proposed solution: do not free pages from "struct page **pages" lookup table
in struct cpu_data, in case a trace file is loaded. This reverts the behavior
for this use case, as it was before commit c2fc2bc296f7.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203411
Fixes: c2fc2bc296f7 ("trace-cmd: Fix crash when trace-cmd is executed with args profile -F sleep 1")

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
---
 lib/trace-cmd/trace-input.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)
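The gist of the fix can be modeled in a few lines (a simplified standalone sketch, not the actual trace-cmd code: the structures are pared down and `free_page_slot` is a hypothetical stand-in for `__free_page`, which in the real code takes a `struct page *` rather than an index):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for trace-cmd's internal structures. */
struct page { int refs; };

struct cpu_data {
	struct page **pages;	/* lookup table, indexed by page number */
	int nr_pages;
};

struct tracecmd_input {
	bool use_pipe;		/* true when reading a live pipe */
	struct cpu_data cpu;
};

/*
 * Model of the patched behavior: the lookup table is only trimmed
 * when reading from a pipe.  For a loaded trace.dat file the table
 * is kept intact, so freed slots can be repopulated cheaply instead
 * of repeatedly realloc'ing the table, which caused the slowdown
 * Johannes reported.
 */
static void free_page_slot(struct tracecmd_input *handle, int index)
{
	struct cpu_data *cpu_data = &handle->cpu;
	struct page **pages;

	free(cpu_data->pages[index]);
	cpu_data->pages[index] = NULL;

	if (!handle->use_pipe)
		return;		/* file case: keep the table as-is */

	/* pipe case: shrink the table past the last live page */
	for (index = cpu_data->nr_pages - 1; index > 0; index--)
		if (cpu_data->pages[index])
			break;
	if (index < cpu_data->nr_pages - 1) {
		pages = realloc(cpu_data->pages,
				(index + 1) * sizeof(*cpu_data->pages));
		if (!pages)
			return;
		cpu_data->pages = pages;
		cpu_data->nr_pages = index + 1;
	}
}
```

In the file case the table can only grow, trading a few pointers of memory for O(1) page lookup; in the pipe case the old trimming behavior is preserved, since pipe data is consumed once and the table would otherwise grow without bound.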

Comments

Johannes Berg June 19, 2019, 11:52 a.m. UTC | #1
On Wed, 2019-06-19 at 14:49 +0300, tz.stoyanov@gmail.com wrote:
> From: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
> 
> A major speed regression in trace-cmd v2.8 is reported by Johannes Berg
> when parsing a huge trace.dat file:
> 
> "I have a ~1.7G file with just under 620k events (not exactly big by our standards),
> and parsing speed (with -N to disable plugins) goes from ~4.5 seconds on commit
> 1ad32c24746 to >>4.5 minutes (I aborted there) on master.
> I was talking to Steven about another issue, and he pointed me to
> commit c2fc2bc296f7. Reverting that on master makes it take ~2 seconds,
> so that'd actually be an improvement."
> 
> Proposed solution: do not free pages from "struct page **pages" lookup table
> in struct cpu_data, in case a trace file is loaded. This reverts the behavior
> for this use case, as it was before commit c2fc2bc296f7.

Seems to work, more or less. I now see ~6.5 seconds which is slower than
it was, and in particular comparing to just a revert (which was ~2
seconds), but it's usable :-)

johannes
Steven Rostedt June 25, 2019, 10:32 p.m. UTC | #2
On Wed, 19 Jun 2019 13:52:14 +0200
Johannes Berg <johannes@sipsolutions.net> wrote:

> Seems to work, more or less. I now see ~6.5 seconds which is slower than
> it was, and in particular comparing to just a revert (which was ~2
> seconds), but it's usable :-)

I'm totally confused. Unless it is a cache miss thing, the only thing
it did was add two branches that should basically follow the old path
with the revert.

Can you run perf comparing this patch against the revert and see where
the difference lies?

Thanks!

-- Steve
Johannes Berg June 26, 2019, 6:51 a.m. UTC | #3
On Tue, 2019-06-25 at 18:32 -0400, Steven Rostedt wrote:
> On Wed, 19 Jun 2019 13:52:14 +0200
> Johannes Berg <johannes@sipsolutions.net> wrote:
> 
> > Seems to work, more or less. I now see ~6.5 seconds which is slower than
> > it was, and in particular comparing to just a revert (which was ~2
> > seconds), but it's usable :-)
> 
> I'm totally confused. Unless it is a cache miss thing, the only thing
> it did was add two branches that should basically follow the old path
> with the revert.
> 
> Can you run perf comparing this patch against the revert and see where
> the difference lies?

I tried, but it looks the same. Then I tried to reproduce and now I see
it also fairly consistently just under 2 seconds with the patch, so
perhaps CPU scaling or something was throwing me a curve ball before.

Sorry for the noise, patch works great.

Tested-by: Johannes Berg <johannes@sipsolutions.net>

johannes

Patch

diff --git a/lib/trace-cmd/trace-input.c b/lib/trace-cmd/trace-input.c
index 7acecf3..32af20c 100644
--- a/lib/trace-cmd/trace-input.c
+++ b/lib/trace-cmd/trace-input.c
@@ -987,15 +987,17 @@  static void __free_page(struct tracecmd_input *handle, struct page *page)
 
 	free(page);
 
-	for (index = cpu_data->nr_pages - 1; index > 0; index--)
-		if (cpu_data->pages[index])
-			break;
-	if (index < (cpu_data->nr_pages - 1)) {
-		pages = realloc(cpu_data->pages, (index + 1) * sizeof(*cpu_data->pages));
-		if (!pages)
-			return;
-		cpu_data->pages = pages;
-		cpu_data->nr_pages = index + 1;
+	if (handle->use_pipe) {
+		for (index = cpu_data->nr_pages - 1; index > 0; index--)
+			if (cpu_data->pages[index])
+				break;
+		if (index < (cpu_data->nr_pages - 1)) {
+			pages = realloc(cpu_data->pages, (index + 1) * sizeof(*cpu_data->pages));
+			if (!pages)
+				return;
+			cpu_data->pages = pages;
+			cpu_data->nr_pages = index + 1;
+		}
 	}
 }