diff mbox

Add Dmesg Triage Feature: further triage i-g-t kmsg log to reduce result noise resulted from piglit dmesg defect

Message ID 1426732611-24404-1-git-send-email-ethan.gao@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

ethan gao March 19, 2015, 2:36 a.m. UTC
tests/igt.py: add igt env to enable or disable dmesg triage
framework/test/base.py: trigger dmesg triage depending on dmesg log occurrence
framework/dmesg.py: employ dmesg triage simply for Linux dmesg
dmesg_triage/*: deal with kmsg log with pre-defined dmesg oops pattern

In general, if dmesg triage is enabled and there is new dmesg along with i-g-t testcases
running, the new dmesg will be captured and worked out a tag and head to rollback the
detail when necessary, in addition, on the basis of different dmesg tag or head, the final
result noise of a testcase can be reduced with defined rules or strategy.

Signed-off-by: ethan gao <ethan.gao@intel.com>
---
 dmesg_triage/debug.sh             |  67 ++++++++++++++
 dmesg_triage/dmesg.rb             | 117 ++++++++++++++++++++++++
 dmesg_triage/kmsg_triage          | 182 ++++++++++++++++++++++++++++++++++++++
 dmesg_triage/libdmesg.sh          | 156 ++++++++++++++++++++++++++++++++
 dmesg_triage/oops-context-pattern |  37 ++++++++
 dmesg_triage/oops-pattern         |  59 ++++++++++++
 framework/dmesg.py                |  60 +++++++++++++
 framework/test/base.py            |   4 +
 tests/igt.py                      |   7 ++
 9 files changed, 689 insertions(+)
 create mode 100644 dmesg_triage/debug.sh
 create mode 100644 dmesg_triage/dmesg.rb
 create mode 100755 dmesg_triage/kmsg_triage
 create mode 100644 dmesg_triage/libdmesg.sh
 create mode 100644 dmesg_triage/oops-context-pattern
 create mode 100644 dmesg_triage/oops-pattern

Comments

Lespiau, Damien March 19, 2015, 5:16 p.m. UTC | #1
On Thu, Mar 19, 2015 at 10:36:51AM +0800, ethan gao wrote:
> tests/igt.py: add igt env to enable or disable dmesg triage
> framework/test/base.py: trigger dmesg triage depending on dmesg log occurrence
> framework/dmesg.py: employ dmesg triage simply for Linux dmesg
> dmesg_triage/*: deal with kmsg log with pre-defined dmesg oops pattern
> 
> In general, if dmesg triage is enabled and there is new dmesg along with i-g-t testcases
> running, the new dmesg will be captured and worked out a tag and head to rollback the
> detail when necessary, in addition, on the basis of different dmesg tag or head, the final
> result noise of a testcase can be reduced with defined rules or strategy.
> 
> Signed-off-by: ethan gao <ethan.gao@intel.com>

That's a patch to piglit, shouldn't be sent to the piglit ml as well to
get their opinion?
ethan gao March 20, 2015, 1:58 a.m. UTC | #2
Actually, I sent to piglit ML before intel-gfx ML.
Why It's here is because many internal guys don't subscribe to piglit ML.

Best Regards
Ethan, Gao

-----Original Message-----
From: Lespiau, Damien 
Sent: Friday, March 20, 2015 1:16 AM
To: Gao, Ethan
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] Add Dmesg Triage Feature: further triage i-g-t kmsg log to reduce result noise resulted from piglit dmesg defect

On Thu, Mar 19, 2015 at 10:36:51AM +0800, ethan gao wrote:
> tests/igt.py: add igt env to enable or disable dmesg triage
> framework/test/base.py: trigger dmesg triage depending on dmesg log 
> occurrence
> framework/dmesg.py: employ dmesg triage simply for Linux dmesg
> dmesg_triage/*: deal with kmsg log with pre-defined dmesg oops pattern
> 
> In general, if dmesg triage is enabled and there is new dmesg along 
> with i-g-t testcases running, the new dmesg will be captured and 
> worked out a tag and head to rollback the detail when necessary, in 
> addition, on the basis of different dmesg tag or head, the final result noise of a testcase can be reduced with defined rules or strategy.
> 
> Signed-off-by: ethan gao <ethan.gao@intel.com>

That's a patch to piglit, shouldn't be sent to the piglit ml as well to get their opinion?

--
Damien
Daniel Vetter March 20, 2015, 10:30 a.m. UTC | #3
On Fri, Mar 20, 2015 at 01:58:02AM +0000, Gao, Ethan wrote:
> Actually, I sent to piglit ML before intel-gfx ML.
> Why It's here is because many internal guys don't subscribe to piglit ML.

Please send it to both mailing lists in one command next time around, so
that the discussion doesn't split between piglit and intel-gfx mailing
lists.

Thanks, Daniel

> 
> Best Regards
> Ethan, Gao
> 
> -----Original Message-----
> From: Lespiau, Damien 
> Sent: Friday, March 20, 2015 1:16 AM
> To: Gao, Ethan
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH] Add Dmesg Triage Feature: further triage i-g-t kmsg log to reduce result noise resulted from piglit dmesg defect
> 
> On Thu, Mar 19, 2015 at 10:36:51AM +0800, ethan gao wrote:
> > tests/igt.py: add igt env to enable or disable dmesg triage
> > framework/test/base.py: trigger dmesg triage depending on dmesg log 
> > occurrence
> > framework/dmesg.py: employ dmesg triage simply for Linux dmesg
> > dmesg_triage/*: deal with kmsg log with pre-defined dmesg oops pattern
> > 
> > In general, if dmesg triage is enabled and there is new dmesg along 
> > with i-g-t testcases running, the new dmesg will be captured and 
> > worked out a tag and head to rollback the detail when necessary, in 
> > addition, on the basis of different dmesg tag or head, the final result noise of a testcase can be reduced with defined rules or strategy.
> > 
> > Signed-off-by: ethan gao <ethan.gao@intel.com>
> 
> That's a patch to piglit, shouldn't be sent to the piglit ml as well to get their opinion?
> 
> --
> Damien
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
ethan gao March 23, 2015, 1:34 a.m. UTC | #4
Sure. Thanks !

Best Regards
Ethan, Gao

-----Original Message-----
From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
Sent: Friday, March 20, 2015 6:30 PM
To: Gao, Ethan
Cc: Lespiau, Damien; intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] Add Dmesg Triage Feature: further triage i-g-t kmsg log to reduce result noise resulted from piglit dmesg defect

On Fri, Mar 20, 2015 at 01:58:02AM +0000, Gao, Ethan wrote:
> Actually, I sent to piglit ML before intel-gfx ML.
> Why It's here is because many internal guys don't subscribe to piglit ML.

Please send it to both mailing lists in one command next time around, so that the discussion doesn't split between piglit and intel-gfx mailing lists.

Thanks, Daniel

> 
> Best Regards
> Ethan, Gao
> 
> -----Original Message-----
> From: Lespiau, Damien
> Sent: Friday, March 20, 2015 1:16 AM
> To: Gao, Ethan
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH] Add Dmesg Triage Feature: further 
> triage i-g-t kmsg log to reduce result noise resulted from piglit 
> dmesg defect
> 
> On Thu, Mar 19, 2015 at 10:36:51AM +0800, ethan gao wrote:
> > tests/igt.py: add igt env to enable or disable dmesg triage
> > framework/test/base.py: trigger dmesg triage depending on dmesg log 
> > occurrence
> > framework/dmesg.py: employ dmesg triage simply for Linux dmesg
> > dmesg_triage/*: deal with kmsg log with pre-defined dmesg oops 
> > pattern
> > 
> > In general, if dmesg triage is enabled and there is new dmesg along 
> > with i-g-t testcases running, the new dmesg will be captured and 
> > worked out a tag and head to rollback the detail when necessary, in 
> > addition, on the basis of different dmesg tag or head, the final result noise of a testcase can be reduced with defined rules or strategy.
> > 
> > Signed-off-by: ethan gao <ethan.gao@intel.com>
> 
> That's a patch to piglit, shouldn't be sent to the piglit ml as well to get their opinion?
> 
> --
> Damien
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
diff mbox

Patch

diff --git a/dmesg_triage/debug.sh b/dmesg_triage/debug.sh
new file mode 100644
index 0000000..ec8280e
--- /dev/null
+++ b/dmesg_triage/debug.sh
@@ -0,0 +1,67 @@ 
+#!/bin/bash
+
+# Copyright (c) 2015 Intel Corporation
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+[[ "$BASHPID" ]] || echo "======= Run me with bash ======="
+
+dump_shell_vars()
+{
+	(
+		set -o posix
+		local branch_commit_files=()
+		local sparse_lines=()
+		local remote_url=()
+		set >&2
+	)
+}
+
+dump_call_stack()
+{
+	local stack_depth=${#FUNCNAME[@]}
+	local i
+	for ((i = 0; i < $((stack_depth-1)); i++)); do
+		echo "  ${BASH_SOURCE[i+1]}:${BASH_LINENO[i]}: ${FUNCNAME[i+1]}" >&2
+	done
+}
+
+# nr_stack_dumps=0
+dump_stack()
+{
+	# echo "Stack dump: $BASH_COMMAND"
+	dump_call_stack
+	echo
+	dump_shell_vars
+}
+
+notice()
+{
+	local time_str="$(date +'%F %H:%M:%S') "
+	echo -e ${color[MAGENTA]}"${time_str}$*"$reset_color
+}
+
+die()
+{
+	notice "$*"
+
+	dump_stack
+	email "$*"
+	exit
+}
diff --git a/dmesg_triage/dmesg.rb b/dmesg_triage/dmesg.rb
new file mode 100644
index 0000000..d7aecae
--- /dev/null
+++ b/dmesg_triage/dmesg.rb
@@ -0,0 +1,117 @@ 
+#!/usr/bin/ruby
+
+# Copyright (c) 2015 Intel Corporation
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+# [Description]: Find out all crash dmesg clauses with tailored pattern
+# [Author]: ethan.gao@intel.com
+
+
+
+require "fileutils"
+require "tempfile"
+
+def fixup_dmesg(line)
+	line.chomp!
+
+	# remove absolute path names
+	line.sub!(%r{/kbuild/src/[^/]+/}, '')
+
+	line.sub!(/\.(isra|constprop|part)\.[0-9]+\+0x/, '+0x')
+
+	# break up mixed messages
+	case line
+	when /^<[0-9]>/
+	when /(.+)(\[ *[0-9]{1,6}\.[0-9]{6}\] .*)/
+		line = $1 + "\n" + $2
+	end
+
+	return line
+end
+
+def fixup_dmesg_file(dmesg_file)
+	tmpfile = Tempfile.new '.fixup-dmesg-', File.dirname(dmesg_file)
+	dmesg_lines = []
+	File.open(dmesg_file, 'rb') do |f|
+		f.each_line { |line|
+			line = fixup_dmesg(line)
+			dmesg_lines << line
+			tmpfile.puts line
+		}
+	end
+	tmpfile.chmod 0664
+	tmpfile.close
+	FileUtils.mv tmpfile.path, dmesg_file, :force => true
+	return dmesg_lines
+end
+
+def grep_crash_head(dmesg, grep_options = '')
+	oops = %x[ grep -a -f #{DMESG_ROOT}/oops-pattern #{grep_options} #{dmesg} |
+		   grep -v -e 'INFO: NMI handler .* took too long to run' |
+		   awk '{line = $0; sub(/^(<[0-9]>)?\[[ 0-9.]+\] /, "", line); if (!x[line]++) print;}'
+	]
+	return oops unless oops.empty?
+
+
+	if system "grep -q -F 'EXT4-fs ('	#{dmesg}"
+		oops = `grep -a -f #{DMESG_ROOT}/ext4-crit-pattern	#{grep_options} #{dmesg}`
+		return oops unless oops.empty?
+	end
+
+	if system "grep -q -F 'XFS ('	#{dmesg}"
+		oops = `grep -a -f #{DMESG_ROOT}/xfs-alert-pattern	#{grep_options} #{dmesg}`
+		return oops unless oops.empty?
+	end
+
+	if system "grep -q -F 'btrfs: '	#{dmesg}"
+		oops = `grep -a -f #{DMESG_ROOT}/btrfs-crit-pattern	#{grep_options} #{dmesg}`
+		return oops unless oops.empty?
+	end
+
+	return ''
+end
+
+def grep_printk_errors(dmesg_file, dmesg_lines)
+	oops = `grep -a -f #{DMESG_ROOT}/oops-pattern #{dmesg_file}`
+	dmesg = dmesg_lines.join "\n"
+	oops += `grep -a -f #{DMESG_ROOT}/ext4-crit-pattern	#{dmesg_file}` if dmesg.index 'EXT4-fs ('
+	oops += `grep -a -f #{DMESG_ROOT}/xfs-alert-pattern	#{dmesg_file}` if dmesg.index 'XFS ('
+	oops += `grep -a -f #{DMESG_ROOT}/btrfs-crit-pattern	#{dmesg_file}` if dmesg.index 'btrfs: '
+	return oops
+end
+
+def common_error_id(line)
+	line = line.chomp
+	line.gsub! /\b3\.[0-9]+[-a-z0-9.]+/, '#'			# linux version: 3.17.0-next-20141008-g099669ed
+	line.gsub! /\b[1-9][0-9]-[A-Z][a-z]+-[0-9]{4}\b/, '#'		# Date: 28-Dec-2013
+	line.gsub! /\b0x[0-9a-f]+\b/, '#'				# hex number
+	line.gsub! /\b[a-f0-9]{40}\b/, '#'				# SHA-1
+	line.gsub! /\b[0-9][0-9.]*/, '#'				# number
+	line.gsub! /#x\b/, '0x'
+	line.gsub! /[ \t]/, ' '
+	line.gsub! /\ \ +/, ' '
+	line.gsub! /([^a-zA-Z0-9])\ /, '\1'
+	line.gsub! /\ ([^a-zA-Z])/, '\1'
+	line.gsub! /^\ /, ''
+	line.gsub! /\  _/, '_'
+	line.gsub! /\ /, '_'
+	line.gsub! /[-_.,;:#!\[\(]+$/, ''
+	line
+end
diff --git a/dmesg_triage/kmsg_triage b/dmesg_triage/kmsg_triage
new file mode 100755
index 0000000..5832871
--- /dev/null
+++ b/dmesg_triage/kmsg_triage
@@ -0,0 +1,182 @@ 
+#!/usr/bin/ruby
+
+# Copyright (c) 2015 Intel Corporation
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+# [Description]: Triage various dmesg trace log for an unique tag 
+# [Author]: ethan.gao@intel.com
+
+
+DMESG_ROOT = ENV['DMESG_TRIAGE_ROOT'] || File.dirname(File.dirname(File.realpath($0)))
+require "#{DMESG_ROOT}/dmesg.rb"
+
+if ENV['KMSG_TRIAGE_LOG']
+	KMSG = ENV['KMSG_TRIAGE_LOG']
+	kmsg_file = "#{KMSG}/.dmesg"
+	
+	if File.exist? kmsg_file
+		dmesg_file = kmsg_file
+	end
+
+elsif ARGV[0]
+	dmesg_file = ARGV[0]
+else
+	exit
+end
+
+if not File.exist?(dmesg_file) or File.size(dmesg_file) == 0
+	puts "No required dmesg file found !"
+	exit
+end
+
+def oops_to_bisect_pattern(line)
+		words = line.split
+		return '' if words.empty?
+		patterns = []
+		words.each { |w|
+			case w
+			when /([a-zA-Z0-9_]+)\.(isra|constprop|part)\.[0-9]+\+0x/
+				patterns << $1
+				break
+			when /([a-zA-Z0-9_]+\+0x)/, /([a-zA-Z0-9_]+=)/
+				patterns << $1
+				break
+			when /[^a-zA-Z\/:._-]/
+				patterns << '.*' if patterns[-1] != '.*'
+			else
+				patterns << w
+			end
+		}
+		patterns.shift while patterns[0] == '.*'
+		patterns.pop   if patterns[-1] == '.*'
+		patterns.join(' ')
+end
+
+error_ids = {}
+if $PROGRAM_NAME =~ /kmsg_triage/
+	output = grep_crash_head dmesg_file, '-o'
+end
+exit if output.empty?
+
+output.each_line { |line|
+	line.chomp!
+
+	next if line =~ /INFO: Stall ended before state dump start/
+	next if line =~ /INFO: NMI handler .* took too long to run:/
+	next if line =~ /Killed process \d+ \(/					# duplicated with "Out of memory: Kill process "
+
+	# print_hex_dump
+	next if line =~ /(\s[0-9a-f]{2}){16}/
+	next if line =~ /(\s[0-9a-f]{4}){8}/
+	next if line =~ /(\s[0-9a-f]{8}){4}/
+
+	next if line =~ /[^\t\n\0[:print:]]/
+	line.tr! "\0", ''
+
+	case line
+	when /(INFO: rcu[_a-z]* self-detected stall on CPU)/,
+	     /(INFO: rcu[_a-z]* detected stalls on CPUs\/tasks:)/
+		line = $1
+		bug_to_bisect = $1
+	when /(BUG: unable to handle kernel)/,
+	     /(BUG: unable to handle kernel) NULL pointer dereference/,
+	     /(BUG: unable to handle kernel) paging request/
+		line = $1
+		bug_to_bisect = $1
+	when /(BUG: scheduling while atomic:)/,
+	     /(BUG: Bad page map in process)/,
+	     /(BUG: Bad page state in process)/,
+	     /(BUG: soft lockup - CPU#\d+ stuck for \d+s!)/,
+	     /(BUG: spinlock .* on CPU#\d+)/
+		line = $1
+		bug_to_bisect = $1
+	when /(BUG: ).* (still has locks held)/,
+	     /(INFO: task ).* (blocked for more than \d+ seconds)/
+		line = $1 + $2
+		bug_to_bisect = $2
+	when /WARNING:.* at .* ([a-zA-Z.0-9_]+\+0x)/
+		bug_to_bisect = 'WARNING:.* at .* ' + $1.sub(/\.(isra|constprop|part)\.[0-9]+\+0x/, '')
+		line =~ /(at .*)/
+		line = "WARNING: " + $1
+	when /(Kernel panic - not syncing: No working init found.)  Try passing init= option to kernel. /,
+	     /(Kernel panic - not syncing: No init found.)  Try passing init= option to kernel. /
+		line = $1
+		bug_to_bisect = line
+	when /(Out of memory: Kill process) \d+ \(/
+		line = $1
+		bug_to_bisect = $1
+	when /(Writer stall state) \d+ g\d+ c\d+ f/
+		line = $1
+		bug_to_bisect = $1
+	when /(used greatest stack depth:)/
+		line = $1
+		bug_to_bisect = $1
+	# printk(KERN_ERR "BUG: Dentry %p{i=%lx,n=%pd} still in use (%d) [unmount of %s %s]\n"
+	when  /(BUG: Dentry ).* (still in use) .* \[unmount of /
+		line = $1 + $2
+		bug_to_bisect = $1 + '.* ' + $2
+	when /([a-zA-Z]+[ a-zA-Z]*: [a-f0-9]{4}) \[#[0-9]+\] [A-Z_ ]*$/
+		line = $1
+		bug_to_bisect = $1
+	when /^backtrace:([a-zA-Z0-9_]+)/
+		bug_to_bisect = $1 + '+0x'
+	else
+		bug_to_bisect = oops_to_bisect_pattern line
+	end
+
+	error_id = line.sub(/^[^a-zA-Z]+/, "")
+
+	error_id.gsub! /\ \]$/, ""					# [ INFO: possible recursive locking detected ]
+	#error_id.gsub! /\/c\/kernel-tests\/src\/[^\/]+\//, ''
+	#error_id.gsub! /\/c\/(wfg|yliu)\/[^\/]+\//, ''
+	#error_id.gsub! /\/lkp\/[^\/]+\/linux[0-9]*\//, ''
+	#error_id.gsub! /\/kernel-tests\/linux[0-9]*\//, ''
+	error_id.gsub! /\.(isra|constprop|part)\.[0-9]+/, ''
+
+	error_id.gsub! /\b[0-9a-f]{8}\b/, "#"
+	error_id.gsub! /\b[0-9a-f]{16}\b/, "#"
+	error_id.gsub! /(=)[0-9a-f]+\b/, '\1#'
+	error_id.gsub! /[+\/]0x[0-9a-f]+\b/, ''
+	error_id.gsub! /[+\/][0-9a-f]+\b/, ''
+
+	#error_id = common_error_id(error_id) + ': 1'
+	error_id = common_error_id(error_id)
+
+	error_id.gsub! /([a-z]:)[0-9]+\b/, '\1'				# WARNING: at arch/x86/kernel/cpu/perf_event.c:1077 x86_pmu_start+0xaa/0x110()
+	error_id.gsub! /#:\[<#>\]\[<#>\]/, ''				# RIP: 0010:[<ffffffff91906d8d>]  [<ffffffff91906d8d>] validate_chain+0xed/0xe80
+
+	next if error_id.size <= 3
+
+	error_ids[error_id] ||= bug_to_bisect
+}
+
+exit if error_ids.empty?
+
+puts "Found necessary kmsg error ids" if $PROGRAM_NAME =~ /kmsg_triage/
+
+KMSG_ERROR_ID = ENV['KMSG_TRIAGE_LOG'] || "#{DMESG_ROOT}"
+# This stores each error id
+f = File.new(File.join("#{KMSG_ERROR_ID}", ".dmesg_error_ids"), "w")
+error_ids.each { |error_id, head|
+	f.write("#{error_id}@#{head}\n")
+}
+
+f.close
+
diff --git a/dmesg_triage/libdmesg.sh b/dmesg_triage/libdmesg.sh
new file mode 100644
index 0000000..7a53fcf
--- /dev/null
+++ b/dmesg_triage/libdmesg.sh
@@ -0,0 +1,156 @@ 
+# !/bin/bash
+
+# Copyright (c) 2015 Intel Corporation
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#
+# [Description]: Defined utilities to check out specified dmesg segment 
+# [Author]: ethan.gao@intel.com
+
+
+if [[ -z "${DMESG_ROOT}" ]]; then
+	DMESG_ROOT=`pwd`
+fi
+
+source $DMESG_ROOT/debug.sh
+
+ERR=1
+
+# desc: capture all the heads of various kmsg exceptional log
+# param:
+#   [dmesg]: the dmesg log which you'll employ
+grep_crash_head()
+{
+	[[ ! "$*" || "$*" =~ ^-[a-z]+$ ]] && {
+		echo "input of grep_crash_head is empty" >&2
+		dump_stack
+		return 1
+	}
+	grep -a -f $DMESG_ROOT/oops-pattern "$@" && return
+
+	grep -B8 'Call Trace:$' "$@" |
+	grep -B1 -e 'Pid: [0-9]+, comm: ' \
+		 -e 'CPU: [0-9]+ PID: [0-9]+ Comm: ' |
+	grep -v -e ' [cC]omm: ' \
+		-e '^--$' \
+		-e '^$' \
+		-e '^\[[ 0-9.]*\] $' \
+		-e 'The following trace is a kernel self test and not a bug'
+}
+
+# desc: filter the kmsg exceptional log messages
+# param:
+#   [dmesg]: the dmesg log which you'll engage
+grep_crash_dmesg()
+{
+	[[ "$@" ]] || {
+		echo "input of grep_crash_dmesg is empty" >&2
+		dump_stack
+		return 1
+	}
+
+	grep -a -f $DMESG_ROOT/oops-pattern \
+		-f $DMESG_ROOT/oops-context-pattern "$@"
+}
+
+# desc: filter the first kmsg exceptional log 
+# param:
+#   [dmesg]: the dmesg source you'll use
+first_crash_dmesg()
+{
+	grep_crash_dmesg -C3 "$@" | awk 'BEGIN { nr_first_head=0; };
+					/^--$/ { exit };
+					/---\[ end trace .*\]---/ { print; getline; print; exit };
+					/(kernel BUG at |Kernel panic -|\<BUG: |WARNING: |INFO: )/ { if (nr_first_head > NR + 1) exit; nr_first_head = NR; };
+					{ print };
+					NR > 200 { exit };'
+}
+
+# desc: retrieve dmesg trace info based on dmesg tag
+# param:
+#   [tag]: a dmesg tag or head
+#   [dmesg]: the file which contains your full dmesg log
+head_to_dmesg()
+{
+	[[ $# -eq 2 ]] || {
+		echo "ERROR:input of head_to_dmesg is empty !" >&2
+		echo "Usage: head_to_dmesg <dmesg_head> <dmesg_log_file>"
+		return $ERR
+	}
+	
+	dmesg_head=$1
+	dmesg_file=$2
+
+	awk -v pattern="${dmesg_head}" '
+	BEGIN { nr_dmesg_target = 0; }
+	{
+		if ( $0 !~ pattern){
+			next;
+		} 
+		else { 
+			nr_dmesg_target = FNR; print;
+		}
+
+		do
+		{
+			if (getline > 0) 
+				print; 
+			else
+				break;
+
+			if ($0 ~ /---\[ end trace .*\]/){
+				break;
+			}
+		} while (NR - nr_dmesg_target < 60)
+	}	
+	END { 
+		if (nr_dmesg_target > 0) {
+			print "---[kernel msg end]---";
+		}
+		else {
+			print "Not Found";
+		}
+	}' "${dmesg_file}"
+}
+
+
+# desc: sort out the call trace funcs from the given dmesg log
+# param:
+#   [dmesg]: the dmesg source you'll use, if a file 
+CALL_TRACE_FUNCS='[a-zA-Z0-9._]+\+0x[0-9a-f]+\/0x[0-9a-f]+'
+call_trace_funcs()
+{
+	local dmesg=$1
+	grep -Eo "$CALL_TRACE_FUNCS" $1 | cut -f1 -d+ | uniq
+}
+
+# desc: sort out funcs from the first call trace stack of dmesg
+# param:
+#   [dmesg]: the dmesg call trace log
+first_call_trace()
+{
+	local dmesg=$1
+	awk 'BEGIN { in_trace=0; };
+		/(Call Trace:|state was registered at:)/ { in_trace++; nr=0; next; };
+		/'"$CALL_TRACE_FUNCS"'/{ if (match($0, />\] (?\s)?(_)?[a-zA-Z_][a-zA-Z0-9._]+\+0x/))
+						{ print substr($0, RSTART+3, RLENGTH-6); nr++; }; next; };
+		// {if (in_trace > 0 && nr > 1) exit;};' $dmesg |
+		sed -r -e 's/\.(isra|constprop|part)\.[0-9]+//g' \
+			| uniq
+}
diff --git a/dmesg_triage/oops-context-pattern b/dmesg_triage/oops-context-pattern
new file mode 100644
index 0000000..f7dc209
--- /dev/null
+++ b/dmesg_triage/oops-context-pattern
@@ -0,0 +1,37 @@ 
+invoked oom-killer:
+INFO: possible recursive locking detected
+ is trying to acquire lock:
+ state was registered at:
+Possible unsafe locking scenario:
+       CPU0$
+       ----$
+  lock(.*);
+\*\*\* DEADLOCK \*\*\*
+May be due to missing lock nesting notation
+  <Interrupt>
+but task is already holding lock:
+which lock already depends on the new lock.
+other info that might help us debug this:
+stack backtrace:
+ EFLAGS: 
+RBP: 
+ knlGS:
+ CR3: 
+Stack:
+Code: 
+sending NMI to all CPUs:
+Modules linked in:
+Hardware name: 
+Call Trace:
+\[<[a-f0-9]\+>\] [a-z0-9._]\++0x[0-9a-f]\+/0x[0-9a-f]\+
+\[<[a-f0-9]\+>\] ? [a-z0-9._]\++0x[0-9a-f]\+/0x[0-9a-f]\+
+\( [0-9a-f]\{16\}\)\{7\}
+slab_unreclaimable:
+Swap cache stats:
+pages RAM
+ locks held by 
+ lock held by 
+RCU used illegally from 
+ =========================$
+Object [0-9a-f]\+: [0-9a-f ]\+  \.\.\.
+Redzone [0-9a-f]\+: [0-9a-f ]\+ 
diff --git a/dmesg_triage/oops-pattern b/dmesg_triage/oops-pattern
new file mode 100644
index 0000000..9ce4148
--- /dev/null
+++ b/dmesg_triage/oops-pattern
@@ -0,0 +1,59 @@ 
+[a-zA-Z ]\+: [a-f0-9]\{4\} \[#[0-9]\+\] [A-Z_ ]*$
+kernel BUG at .*
+Kernel panic - not syncing: .*
+^BUG: .*
+\[ *[0-9.]\+\] BUG: .*
+BUG .* (.*): .*
+Oops: .*
+WARNING: at .*
+WARNING: CPU: [0-9]\+ PID: [0-9]\+ at .*
+^INFO: .*detected stall.*
+^INFO: .* cpu=.*
+^INFO: Slab 0x.*
+^INFO: Object 0x.*
+^INFO: 0x.*. First byte 0x.*
+\[ *[0-9.]\+\] INFO: .*
+\[ INFO: [^i].* \]
+RCU used illegally from offline CPU!
+RCU used illegally from idle CPU!
+RCU used illegally from extended quiescent state!
+inconsistent {.*} -> {.*} usage\.
+initcall .* returned with .*
+\[ BUG: [^b].*! \]
+ is trying to release lock (.*) at
+ is trying to contend lock (.*) at
+ is trying to lock: .*
+ but this task is not holding: .*
+ is leaving the kernel with locks still held!
+ is exiting with locks still held!
+ is freeing memory .*, with a lock still held there!
+genirq: Flags mismatch irq .*
+Out of memory: Kill process 
+page allocation failure: order:[0-9]\+, mode:0x.*
+pagealloc: single bit error
+pagealloc: memory corruption
+audit: (.*) error: .*
+getblk(): invalid block size [0-9]\+ requested
+KGDB: re-enter exception: ALL breakpoints killed
+IP-Config: Auto-configuration of network failed
+EIP is at [a-zA-Z0-9._]\++0x.*/0x.*
+RIP: [0-9a-f]\{4\}:\[.*\] [a-zA-Z0-9._]\++0x.*/0x.*
+PANIC: early exception 
+PANIC: double fault, 
+Unknown interrupt or fault at:
+End of test: FAILURE
+End of test: RCU_HOTPLUG
+S0Box: more than 5 loops in s0box_interrupt
+Initramfs unpacking failed: 
+assertion failed
+Assertion failed
+assertion failure
+Assertion failure
+32-bit relocation outside of kernel!
+64-bit relocation outside of kernel!
+Kernel is not a valid ELF file
+Failed to allocate space for phdrs
+Destination address inappropriately aligned
+Destination address too large
+Wrong destination address
+ -- System halted
diff --git a/framework/dmesg.py b/framework/dmesg.py
index 1a5f629..d23ba23 100644
--- a/framework/dmesg.py
+++ b/framework/dmesg.py
@@ -41,6 +41,7 @@  import sys
 import subprocess
 import warnings
 import abc
+import os, errno
 
 __all__ = [
     'BaseDmesg',
@@ -201,6 +202,65 @@  class LinuxDmesg(BaseDmesg):
         # Attempt to store the last element of dmesg, unless there was no dmesg
         self._last_message = dmesg[-1] if dmesg else None
 
+    def dmesg_triage(self, result, config):
+        """ Triage dmesg log to sort out different tag and head
+
+        Retrieve the contents of dmesg log, then work out a tag and a head, which
+        can be used to rollback the original dmesg trace details.
+
+        """
+        try:
+            os.stat(config["DMESG_TRIAGE_SRC"])
+        except OSError as e:
+            if e.errno == errno.ENOENT or e.errno == errno.ENODIR:
+                sys.stderr.write("[Dmesg Triage Err]: %s" % str(e))
+                return result
+
+        if not os.path.exists(config["DMESG_TRIAGE_TMP"]):
+            try:
+                os.makedirs(config["DMESG_TRIAGE_TMP"])
+            except OSError as e:
+                sys.stderr.write("[Dmesg Triage Warn]: %s" % str(e))
+                config["DMESG_TRIAGE_TMP"] = "/tmp" 	
+
+        os.environ["DMESG_TRIAGE_ROOT"] = config["DMESG_TRIAGE_SRC"]
+        os.environ["KMSG_TRIAGE_LOG"] = config["DMESG_TRIAGE_TMP"]
+
+        fd = open("%s" % os.path.join(config["DMESG_TRIAGE_TMP"], ".dmesg"), 'wb')
+        try:
+            fd.write(result["dmesg"])
+            fd.flush()
+        finally:
+            fd.close()
+
+        try:
+            os.chdir(config["DMESG_TRIAGE_SRC"])
+            ret = subprocess.call("which ruby > /dev/null", shell=True)
+            if ret != 0:
+                sys.stderr.write("[Dmesg Triage Err]: No ruby binary found, please install and try again !\n")
+                return result
+            subprocess.check_call("ruby kmsg_triage", shell=True)
+        except subprocess.CalledProcessError as e:
+            sys.stderr.write("[Dmesg Triage Err]: %s" % str(e))
+            return result
+
+        if not os.path.exists("%s" % os.path.join(config["DMESG_TRIAGE_TMP"], ".dmesg_error_ids")):
+            return result
+
+        error_ids = dict()
+        with open(os.path.join(config["DMESG_TRIAGE_TMP"], ".dmesg_error_ids")) as f:
+            for err in iter(f.readline, ''):
+                if re.search(r'@', err) is None:
+                    f.close()
+                    return result
+                err_id = re.split(r'@', err)
+                error_ids[err_id[0]] = err_id[1].strip("\n").replace("0x", "*")
+             
+            f.close()
+
+        result["dmesg_triage_info"] = error_ids
+        return result
+    
 
 class DummyDmesg(BaseDmesg):
     """ An dummy class for dmesg on non unix-like systems
diff --git a/framework/test/base.py b/framework/test/base.py
index efc20cb..a57c7ea 100644
--- a/framework/test/base.py
+++ b/framework/test/base.py
@@ -157,6 +157,10 @@  class Test(object):
                 self.run()
                 self.result['time'] = time.time() - time_start
                 self.result = dmesg.update_result(self.result)
+                if self.result.has_key("dmesg") and self.env["ENABLE_DMESG_TRIAGE"]:
+                    self.env["DMESG_TRIAGE_SRC"] = os.path.join(self.OPTS.env["PIGLIT_SOURCE_DIR"], self.env["DMESG_TRIAGE_SRC"])
+                    self.result = dmesg.dmesg_triage(self.result, self.env)
+
             # This is a rare case where a bare exception is okay, since we're
             # using it to log exceptions
             except:
diff --git a/tests/igt.py b/tests/igt.py
index 2047781..704ee75 100644
--- a/tests/igt.py
+++ b/tests/igt.py
@@ -104,6 +104,13 @@  class IGTTest(Test):
             [os.path.join(IGT_TEST_ROOT, binary)] + arguments)
         self.timeout = 600
 
+        # enable or disable dmesg triage feature here
+        self.env["ENABLE_DMESG_TRIAGE"] = True
+        if self.env["ENABLE_DMESG_TRIAGE"]:
+            # Initialize dmesg triage resources
+            self.env["DMESG_TRIAGE_TMP"] = "/tmp/dmesg"
+            self.env["DMESG_TRIAGE_SRC"] = "dmesg_triage"
+
     def interpret_result(self):
         if self.result['returncode'] == 0:
             self.result['result'] = 'pass'