[v4] scripts: add leaking_addresses.pl

Message ID	1510050731-32446-1-git-send-email-me@tobin.cc (mailing list archive)
State	New, archived
Headers	Return-Path: <kernel-hardening-return-10404-patchwork-kernel-hardening=patchwork.kernel.org@lists.openwall.com> Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk From: "Tobin C. Harding" <me@tobin.cc> To: kernel-hardening@lists.openwall.com Cc: "Tobin C. Harding" <me@tobin.cc>, "Jason A. Donenfeld" <Jason@zx2c4.com>, Theodore Ts'o <tytso@mit.edu>, Linus Torvalds <torvalds@linux-foundation.org>, Kees Cook <keescook@chromium.org>, Paolo Bonzini <pbonzini@redhat.com>, Tycho Andersen <tycho@docker.com>, "Roberts, William C" <william.c.roberts@intel.com>, Tejun Heo <tj@kernel.org>, Jordan Glover <Golden_Miller83@protonmail.ch>, Greg KH <gregkh@linuxfoundation.org>, Petr Mladek <pmladek@suse.com>, Joe Perches <joe@perches.com>, Ian Campbell <ijc@hellion.org.uk>, Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will.deacon@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Chris Fries <cfries@google.com>, Dave Weinstein <olorin@google.com>, Daniel Micay <danielmicay@gmail.com>, Djalal Harouni <tixxdz@gmail.com>, linux-kernel@vger.kernel.org, Network Development <netdev@vger.kernel.org>, David Miller <davem@davemloft.net> Date: Tue, 7 Nov 2017 21:32:11 +1100 Message-Id: <1510050731-32446-1-git-send-email-me@tobin.cc> Subject: [kernel-hardening] [PATCH v4] scripts: add leaking_addresses.pl

Tobin Harding Nov. 7, 2017, 10:32 a.m. UTC

Currently we are leaking addresses from the kernel to user space. This
script is an attempt to find some of those leakages. Script parses
`dmesg` output and /proc and /sys files for hex strings that look like
kernel addresses.

Only works for 64 bit kernels, the reason being that kernel addresses
on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
possible. On 32 bit kernels we don't have this luxury.

Script is _slightly_ smarter than a straight grep, we check for false
positives (all 0's or all 1's, and vsyscall start/finish addresses).

Output is saved to file to expedite repeated formatting/viewing of
output.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
---

This version outputs a report instead of the raw results by default. Designing
this proved to be non-trivial, the reason being that it is not immediately clear
what constitutes a duplicate entry (similar message, address range, same
file?). Also, the aim of the report is to help users _not_ miss correct
results; limiting the output is inherently a trade off between noise and
correct, clear results.

Without testing on various real kernels it's not clear that this reporting is any
good, my test cases were a bit contrived. Your usage may vary.

It would be super helpful to get some comments from people running this with
different set ups.

Please feel free to say 'try harder Tobin, this reporting is shit'.

Thanks, appreciate your time,
Tobin.

v4:
 - Add `scan` and `format` sub-commands.
 - Output report by default.
 - Add command line option to send scan results (to me).

v3:
 - Iterate matches to check for results instead of matching input line against
   false positives i.e catch lines that contain results as well as false
   positives.

v2:
 - Add regex's to prevent false positives.
 - Clean up white space.

 MAINTAINERS                  |   5 +
 scripts/leaking_addresses.pl | 437 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 442 insertions(+)
 create mode 100755 scripts/leaking_addresses.pl
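
The matching described in the commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the posted script: the helper names `is_known_false_positive` and `find_candidates` are hypothetical, and the patterns simply restate the description (a 16 digit hex string starting with 'ffff', minus all 0's, all 1's and the two vsyscall addresses).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: true if $match is one of the false positives
# named in the commit message.
sub is_known_false_positive {
    my ($match) = @_;
    (my $hex = lc $match) =~ s/^0x//;           # normalise case, drop optional 0x
    return 1 if $hex =~ /^(?:f{16}|0{16})$/;    # all 1's or all 0's
    return 1 if $hex =~ /^f{10}60[01]000$/;     # vsyscall start/finish addresses
    return 0;
}

# Hypothetical helper: return every candidate kernel address in $line.
sub find_candidates {
    my ($line) = @_;
    my @found;
    while ($line =~ /\b((?:0x)?ffff[[:xdigit:]]{12})\b/g) {
        push @found, $1 unless is_known_false_positive($1);
    }
    return @found;
}

# The sample line is made up for illustration.
print join(', ', find_candidates('ptr ffff880012345678 mask ffffffffffffffff')), "\n";
```

Iterating over every match, rather than testing the whole line once, keeps a line that holds both a false positive and a real candidate from being dropped.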

Greg KH Nov. 7, 2017, 10:50 a.m. UTC | #1

On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
> possible. On 32 bit kernels we don't have this luxury.
> 
> Script is _slightly_ smarter than a straight grep, we check for false
> positives (all 0's or all 1's, and vsyscall start/finish addresses).
> 
> Output is saved to file to expedite repeated formatting/viewing of
> output.
> 
> Signed-off-by: Tobin C. Harding <me@tobin.cc>
> ---
> 
> This version outputs a report instead of the raw results by default. Designing
> this proved to be non-trivial, the reason being that it is not immediately clear
> what constitutes a duplicate entry (similar message, address range, same
> file?). Also, the aim of the report is to help users _not_ miss correct
> results; limiting the output is inherently a trade off between noise and
> correct, clear results.
> 
> Without testing on various real kernels it's not clear that this reporting is any
> good, my test cases were a bit contrived. Your usage may vary.
> 
> It would be super helpful to get some comments from people running this with
> different set ups.
> 
> Please feel free to say 'try harder Tobin, this reporting is shit'.
> 
> Thanks, appreciate your time,
> Tobin.
> 
> v4:
>  - Add `scan` and `format` sub-commands.
>  - Output report by default.
>  - Add command line option to send scan results (to me).

As the script is already in Linus's tree, you might need to send a patch
on top of that, instead of this one, as this one will not apply anymore.

thanks,

greg k-h

David Laight Nov. 7, 2017, 1:56 p.m. UTC | #2

From: Tobin C. Harding
> Sent: 07 November 2017 10:32
>
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
...

Maybe the %p that end up in dmesg (via the kernel message buffer) should
be converted to text in a form that allows the code that reads them to
substitute alternate text for non-root users?

Then the actual addresses will be available to root (who can probably
get most by other means) but not to the casual observer.

	David

Petr Mladek Nov. 7, 2017, 3:51 p.m. UTC | #3

On Tue 2017-11-07 21:32:11, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
> possible. On 32 bit kernels we don't have this luxury.
> 
> Script is _slightly_ smarter than a straight grep, we check for false
> positives (all 0's or all 1's, and vsyscall start/finish addresses).
> 
> Output is saved to file to expedite repeated formatting/viewing of
> output.
> 
> diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
> new file mode 100755
> index 000000000000..282c0cc2bdea
> --- /dev/null
> +++ b/scripts/leaking_addresses.pl
> +sub help
> +{
> +	my ($exitcode) = @_;
> +
> +	print << "EOM";
> +Usage: $P COMMAND [OPTIONS]
> +Version: $V
> +
> +Commands:
> +
> +	scan	Scan the kernel (saves raw results to file and runs `format`).
> +	format	Parse results file and format output.
> +
> +Options:
> +	-o, --output=<path>	 Accepts absolute or relative filename or directory name.

IMHO, this is pretty non-standard. I would support only -o file. Then you do
not need to solve problems with replacing an existing file. The user
would know exactly what file will be generated.


> +	    --suppress-dmesg	 Don't show dmesg results.

The apostrophe breaks highlighting of the rest of the code ;-)


> +	    --squash-by-path	 Show one result per unique path.
> +	    --raw	 	 Show raw results.
> +	    --send-report	 Submit raw results for someone else to worry about.
> +	-d, --debug              Display debugging output.
> +	-h, --help, --version    Display this help and exit.
> +
> +Scans the running (64 bit) kernel for potential leaking addresses.
> +}

This bracket should not be here. The help text is limited
by "EOM" below.


> +
> +EOM
> +	exit($exitcode);
> +}

[...]

> +sub cache_path
> +{
> +        my ($paths, $line) = @_;
> +
> +        my $index = index($line, ':');

There are paths that contain colons, for example:
/sys/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.0/input/input4/uevent
Then the file name is wrongly detected, in my example as "pci0000"

It seems that searching for ": " sub-string works rather well.
I mean using:

	my $index = index($line, ': ');

> +        my $path = substr($line, 0, $index);
> +
> +        if (!$paths->{$path}) {
> +                $paths->{$path} = ();
> +        }
> +        push @{$paths->{$path}}, $line;

It would make sense to use the same trick from cache_filename
and remove path from the cached text. I mean:

	$index += 2;            # skip ': '
	push @{$paths->{$path}}, substr($line, $index);

> +}
> +
> +sub cache_filename
> +{
> +        my ($files, $line) = @_;
> +
> +        my $index = index($line, ':');

Same problem with colons in the path name.
The following helped me:

	my $index = index($line, ': ');

> +        my $path = substr($line, 0, $index);
> +        my $filename = basename($path);
> +        if (!$files->{$filename}) {
> +                $files->{$filename} = ();
> +        }
> +        $index += 2;            # skip ': '
> +        push @{$files->{$filename}}, substr($line, $index);
> +}

This is what caught my eye when trying the script.

Best Regards,
Petr
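
The separator problem raised above can be reproduced with the example path from this message. The sketch below is illustrative only: `split_result` is a hypothetical helper, the text after the path is made up, and it assumes raw result lines have the form "<path>: <text>". A path that itself contains ': ' would still be split in the wrong place.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: split a raw result line on the first ': '
# (colon followed by a space) rather than on the first ':'.
sub split_result {
    my ($line) = @_;
    my $index = index($line, ': ');
    return if $index < 0;                   # no separator, nothing to split
    my $path = substr($line, 0, $index);
    my $text = substr($line, $index + 2);   # skip ': '
    return ($path, $text);
}

# Path taken from the review comment; the trailing text is a placeholder.
my $line = '/sys/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.0'
         . '/input/input4/uevent: example text';

my ($path, $text) = split_result($line);
print "path: $path\n";
print "text: $text\n";

# For comparison, splitting on the first bare ':' cuts the path short.
print "naive: ", substr($line, 0, index($line, ':')), "\n";
```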

Tobin Harding Nov. 7, 2017, 8:39 p.m. UTC | #4

On Tue, Nov 07, 2017 at 04:51:29PM +0100, Petr Mladek wrote:
> On Tue 2017-11-07 21:32:11, Tobin C. Harding wrote:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> > 
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
> > possible. On 32 bit kernels we don't have this luxury.
> > 
> > Script is _slightly_ smarter than a straight grep, we check for false
> > positives (all 0's or all 1's, and vsyscall start/finish addresses).
> > 
> > Output is saved to file to expedite repeated formatting/viewing of
> > output.
> > 
> > diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
> > new file mode 100755
> > index 000000000000..282c0cc2bdea
> > --- /dev/null
> > +++ b/scripts/leaking_addresses.pl
> > +sub help
> > +{
> > +	my ($exitcode) = @_;
> > +
> > +	print << "EOM";
> > +Usage: $P COMMAND [OPTIONS]
> > +Version: $V
> > +
> > +Commands:
> > +
> > +	scan	Scan the kernel (saves raw results to file and runs `format`).
> > +	format	Parse results file and format output.
> > +
> > +Options:
> > +	-o, --output=<path>	 Accepts absolute or relative filename or directory name.
> 
> IMHO, this is pretty non-standard. I would support only -o file. Then you do
> not need to solve problems with replacing an existing file. The user
> would know exactly what file will be generated.
> 
> 
> > +	    --suppress-dmesg	 Don't show dmesg results.
> 
> The apostrophe breaks highlighting of the rest of the code ;-)
> 
> 
> > +	    --squash-by-path	 Show one result per unique path.
> > +	    --raw	 	 Show raw results.
> > +	    --send-report	 Submit raw results for someone else to worry about.
> > +	-d, --debug              Display debugging output.
> > +	-h, --help, --version    Display this help and exit.
> > +
> > +Scans the running (64 bit) kernel for potential leaking addresses.
> > +}
> 
> This bracket should not be here. The help text is limited
> by "EOM" below.
> 
> 
> > +
> > +EOM
> > +	exit($exitcode);
> > +}
> 
> [...]
> 
> > +sub cache_path
> > +{
> > +        my ($paths, $line) = @_;
> > +
> > +        my $index = index($line, ':');
> 
> There are paths that contain colons, for example:
> /sys/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.0/input/input4/uevent
> Then the file name is wrongly detected, in my example as "pci0000"
> 
> It seems that searching for ": " sub-string works rather well.
> I mean using:
> 
> 	my $index = index($line, ': ');
> 
> > +        my $path = substr($line, 0, $index);
> > +
> > +        if (!$paths->{$path}) {
> > +                $paths->{$path} = ();
> > +        }
> > +        push @{$paths->{$path}}, $line;
> 
> It would make sense to use the same trick from cache_filename
> and remove path from the cached text. I mean:
> 
> 	$index += 2;            # skip ': '
> 	push @{$paths->{$path}}, substr($line, $index);
> 
> > +}
> > +
> > +sub cache_filename
> > +{
> > +        my ($files, $line) = @_;
> > +
> > +        my $index = index($line, ':');
> 
> Same problem with colons in the path name.
> The following helped me:
> 
> 	my $index = index($line, ': ');
> 
> > +        my $path = substr($line, 0, $index);
> > +        my $filename = basename($path);
> > +        if (!$files->{$filename}) {
> > +                $files->{$filename} = ();
> > +        }
> > +        $index += 2;            # skip ': '
> > +        push @{$files->{$filename}}, substr($line, $index);
> > +}
> 
> This is what caught my eye when trying the script.

Awesome. Thank you very much. All comments will be addressed for the
next spin.

thanks,
Tobin.

Tobin Harding Nov. 7, 2017, 8:51 p.m. UTC | #5

On Tue, Nov 07, 2017 at 11:50:27AM +0100, Greg KH wrote:
> On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> > 
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
> > possible. On 32 bit kernels we don't have this luxury.
> > 
> > Script is _slightly_ smarter than a straight grep, we check for false
> > positives (all 0's or all 1's, and vsyscall start/finish addresses).
> > 
> > Output is saved to file to expedite repeated formatting/viewing of
> > output.
> > 
> > Signed-off-by: Tobin C. Harding <me@tobin.cc>
> > ---
> > 
> > This version outputs a report instead of the raw results by default. Designing
> > this proved to be non-trivial, the reason being that it is not immediately clear
> > what constitutes a duplicate entry (similar message, address range, same
> > file?). Also, the aim of the report is to help users _not_ miss correct
> > results; limiting the output is inherently a trade off between noise and
> > correct, clear results.
> > 
> > Without testing on various real kernels it's not clear that this reporting is any
> > good, my test cases were a bit contrived. Your usage may vary.
> > 
> > It would be super helpful to get some comments from people running this with
> > different set ups.
> > 
> > Please feel free to say 'try harder Tobin, this reporting is shit'.
> > 
> > Thanks, appreciate your time,
> > Tobin.
> > 
> > v4:
> >  - Add `scan` and `format` sub-commands.
> >  - Output report by default.
> >  - Add command line option to send scan results (to me).
> 
> As the script is already in Linus's tree, you might need to send a patch
> on top of that, instead of this one, as this one will not apply anymore.

Your awareness of what is going on never ceases to amaze me Greg, you're
the man.

thanks,
Tobin.

Tobin Harding Nov. 7, 2017, 8:58 p.m. UTC | #6

On Tue, Nov 07, 2017 at 01:56:05PM +0000, David Laight wrote:
> From: Tobin C. Harding
> > Sent: 07 November 2017 10:32
> >
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> ...
> 
> Maybe the %p that end up in dmesg (via the kernel message buffer) should
> be converted to text in a form that allows the code that reads them to
> substitute alternate text for non-root users?
>
> Then the actual addresses will be available to root (who can probably
> get most by other means) but not to the casual observer.

Interesting idea. Isn't the same outcome already achieved with
dmesg_restrict? I appreciate that this does beg the question 'why are we
scanning dmesg then?'

There has not been much discussion on dmesg_restrict. Is dmesg_restrict
good enough that we needn't bother scanning it?

thanks for your input,
Tobin.

Linus Torvalds Nov. 7, 2017, 9:11 p.m. UTC | #7

On Tue, Nov 7, 2017 at 12:58 PM, Tobin C. Harding <me@tobin.cc> wrote:
>
> Interesting idea. Isn't the same outcome already achieved with
> dmesg_restrict? I appreciate that this does beg the question 'why are we
> scanning dmesg then?'

dmesg_restrict is even more asinine than kptr_restrict.

It's a completely idiotic flag, only useful for distributions that
then also refuse to show system journals to regular users.

And such distributions are garbage, since that also effectively means
that users can't sanely make bug reports etc.

In other words, the whole 'dmesg_restrict' is the _classic_ case of
so-called "security" people who make bad decisions, and play security
theater.

This is exactly the kind of crap that the grsecurity people came up
with, and I'm sorry it was ever back-ported into the mainline kernel,
because it's f*cking retarded.

I often wish that security people used their brains more than they
actually seem to do.

Because a lot of them don't actually seem to ever look at the big
picture, and they do these kinds of security theater garbage patches
that don't actually help anything what-so-ever, but make people say
"that's good security".

And yes, the same would _very_ much be true of anything that just
hides the pointers from users when they read dmesg. It wouldn't be
sufficient to change the kernel, you also would have to change every
single program that implements system logging, and once you did that,
you'd have screwed up system debuggability.

So really, people - start thinking critically about security. That
VERY MUCH also means starting to think critically about things that
people _claim_ are a security feature.

               Linus

Laura Abbott Nov. 7, 2017, 11:36 p.m. UTC | #8

On 11/07/2017 02:32 AM, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
> possible. On 32 bit kernels we don't have this luxury.
> 
> Script is _slightly_ smarter than a straight grep, we check for false
> positives (all 0's or all 1's, and vsyscall start/finish addresses).
> 
> Output is saved to file to expedite repeated formatting/viewing of
> output.
> 
> Signed-off-by: Tobin C. Harding <me@tobin.cc>
> ---
> 
> This version outputs a report instead of the raw results by default. Designing
> this proved to be non-trivial, the reason being that it is not immediately clear
> what constitutes a duplicate entry (similar message, address range, same
> file?). Also, the aim of the report is to help users _not_ miss correct
> results; limiting the output is inherently a trade off between noise and
> correct, clear results.
> 
> Without testing on various real kernels it's not clear that this reporting is any
> good, my test cases were a bit contrived. Your usage may vary.
> 
> It would be super helpful to get some comments from people running this with
> different set ups.
> 

Running on a stock Fedora kernel with gnome generates a 139M file.
I'll admit that Fedora is pretty generous in what it enables.
Trimmed down to omit some redundancies in various processes
by printing only the last file in the path:

/proc/kallsyms
/proc/modules
/proc/timer_list
/proc/1244/stack
/proc/4041/status
/proc/bus/input/devices <--- Probably a false positive
/proc/1/net/hci
/proc/1/net/tcp
/proc/1/net/udp
/proc/1/net/bnep
/proc/1/net/raw6
/proc/1/net/tcp6
/proc/1/net/udp6
/proc/1/net/unix
/proc/1/net/l2cap
/proc/1/net/packet
/proc/1/net/rfcomm
/proc/1/net/netlink
/sys/module/snd_compress/sections/.note.gnu.build-id
/sys/module/snd_compress/sections/.exit.text
/sys/module/snd_compress/sections/__mcount_loc
/sys/module/snd_compress/sections/__ksymtab_strings
/sys/module/snd_compress/sections/__ksymtab_gpl
/sys/module/snd_compress/sections/.init.text
/sys/module/snd_compress/sections/.gnu.linkonce.this_module
/sys/module/snd_compress/sections/__jump_table
/sys/module/snd_compress/sections/.strtab
/sys/module/snd_compress/sections/.bss
/sys/module/snd_compress/sections/.rodata.str1.1
/sys/module/snd_compress/sections/__bug_table
/sys/module/snd_compress/sections/__verbose
/sys/module/snd_compress/sections/.rodata.str1.8
/sys/module/snd_compress/sections/.text
/sys/module/snd_compress/sections/.data
/sys/module/snd_compress/sections/.symtab
/sys/module/snd_compress/sections/.rodata
/sys/module/iwlmvm/sections/.altinstr_replacement
/sys/module/iwlmvm/sections/.altinstructions
/sys/module/iwlmvm/sections/.data.unlikely
/sys/module/iwlmvm/sections/__param
/sys/module/iwlmvm/sections/.smp_locks
/sys/module/snd_hda_intel/sections/__tracepoints_ptrs
/sys/module/snd_hda_intel/sections/__tracepoints
/sys/module/snd_hda_intel/sections/__tracepoints_strings
/sys/module/snd_hda_intel/sections/_ftrace_events
/sys/module/snd_hda_intel/sections/.ref.data
/sys/module/iwlwifi/sections/.parainstructions
/sys/module/iwlwifi/sections/__ksymtab
/sys/module/uvcvideo/sections/.fixup
/sys/module/uvcvideo/sections/.text.unlikely
/sys/module/uvcvideo/sections/__ex_table
/sys/module/intel_powerclamp/sections/.init.rodata
/sys/module/mac80211/sections/.data..read_mostly
/sys/module/nfnetlink/sections/.init.data
/sys/module/ghash_clmulni_intel/sections/.rodata.cst16.bswap_mask
/sys/module/videodev/sections/_ftrace_eval_map
/sys/module/kvm_intel/sections/.data..ro_after_init
/sys/module/kvm_intel/sections/.altinstr_aux
/sys/module/crct10dif_pclmul/sections/.rodata.cst16.SHUF_MASK
/sys/module/crct10dif_pclmul/sections/.rodata.cst16.mask1
/sys/module/crct10dif_pclmul/sections/.rodata.cst32.pshufb_shf_table
/sys/module/crct10dif_pclmul/sections/.rodata.cst16.mask2
/sys/module/nf_conntrack/sections/.data..cacheline_aligned
/sys/firmware/efi/runtime-map/5/virt_addr
/sys/devices/platform/i8042/serio0/input/input3/uevent
/sys/devices/platform/i8042/serio0/input/input3/capabilities/key

I'd probably put /proc/kallsyms and /proc/modules on the omit list
since those are designed to leak addresses to userspace. The
modules in sysfs might be harder to lockdown.
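
An omit list along these lines could be sketched as below. This is a hedged illustration rather than the posted `skip()` routine: `should_skip`, `%omit_abs` and `%omit_any` are hypothetical names, and the sketch assumes that exact string comparison (via hash lookup) is the intended behaviour rather than a regex match. The absolute entries come from the suggestion above; 'pagemap' is borrowed from the skip list in the patch.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Basename qw(basename);

# Assumed omit lists: absolute paths, and file names skipped in any directory.
my %omit_abs = map { $_ => 1 } ('/proc/kallsyms', '/proc/modules');
my %omit_any = map { $_ => 1 } ('pagemap');

# Hypothetical helper: true if $path should not be parsed.
sub should_skip {
    my ($path) = @_;
    return 1 if $omit_abs{$path};               # exact absolute path
    return 1 if $omit_any{ basename($path) };   # file name under any directory
    return 0;
}

for my $path ('/proc/kallsyms', '/proc/timer_list', '/proc/1/pagemap') {
    printf "%-20s %s\n", $path, should_skip($path) ? 'skip' : 'parse';
}
```

Hash lookups avoid treating characters such as '.' in section names as regex metacharacters, which matters for entries like the sysfs module sections listed above.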

Thanks,
Laura

> Please feel free to say 'try harder Tobin, this reporting is shit'.
> 
> Thanks, appreciate your time,
> Tobin.
> 
> v4:
>   - Add `scan` and `format` sub-commands.
>   - Output report by default.
>   - Add command line option to send scan results (to me).
> 
> v3:
>   - Iterate matches to check for results instead of matching input line against
>     false positives i.e catch lines that contain results as well as false
>     positives.
> 
> v2:
>   - Add regex's to prevent false positives.
>   - Clean up white space.
> 
>   MAINTAINERS                  |   5 +
>   scripts/leaking_addresses.pl | 437 +++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 442 insertions(+)
>   create mode 100755 scripts/leaking_addresses.pl
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2f4e462aa4a2..a7995c737728 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7745,6 +7745,11 @@ S:	Maintained
>   F:	Documentation/scsi/53c700.txt
>   F:	drivers/scsi/53c700*
>   
> +LEAKING_ADDRESSES
> +M:	Tobin C. Harding <me@tobin.cc>
> +S:	Maintained
> +F:	scripts/leaking_addresses.pl
> +
>   LED SUBSYSTEM
>   M:	Richard Purdie <rpurdie@rpsys.net>
>   M:	Jacek Anaszewski <jacek.anaszewski@gmail.com>
> diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
> new file mode 100755
> index 000000000000..282c0cc2bdea
> --- /dev/null
> +++ b/scripts/leaking_addresses.pl
> @@ -0,0 +1,437 @@
> +#!/usr/bin/env perl
> +#
> +# (c) 2017 Tobin C. Harding <me@tobin.cc>
> +# Licensed under the terms of the GNU GPL License version 2
> +#
> +# leaking_addresses.pl: Scan 64 bit kernel for potential leaking addresses.
> +#  - Scans dmesg output.
> +#  - Walks directory tree and parses each file (for each directory in @DIRS).
> +#
> +# Use --debug to output path before parsing, this is useful to find files that
> +# cause the script to choke.
> +#
> +# You may like to set kptr_restrict=2 before running script
> +# (see Documentation/sysctl/kernel.txt).
> +
> +use warnings;
> +use strict;
> +use POSIX;
> +use File::Basename;
> +use File::Spec;
> +use Cwd 'abs_path';
> +use Term::ANSIColor qw(:constants);
> +use Getopt::Long qw(:config no_auto_abbrev);
> +use File::Spec::Functions 'catfile';
> +
> +my $P = $0;
> +my $V = '0.01';
> +
> +# Directories to scan (we scan `dmesg` also).
> +my @DIRS = ('/proc', '/sys');
> +
> +# Output path for raw scan data, set by set_output_path().
> +my $OUTPUT = "";
> +
> +# Command line options.
> +my $output = "";
> +my $suppress_dmesg = 0;
> +my $squash_by_path = 0;
> +my $raw = 0;
> +my $send_report = 0;
> +my $help = 0;
> +my $debug = 0;
> +
> +# Do not parse these files (absolute path).
> +my @skip_parse_files_abs = ('/proc/kmsg',
> +			    '/proc/kcore',
> +			    '/proc/fs/ext4/sdb1/mb_groups',
> +			    '/proc/1/fd/3',
> +			    '/sys/kernel/debug/tracing/trace_pipe',
> +			    '/sys/kernel/security/apparmor/revision');
> +
> +# Do not parse these files under any subdirectory.
> +my @skip_parse_files_any = ('0',
> +			    '1',
> +			    '2',
> +			    'pagemap',
> +			    'events',
> +			    'access',
> +			    'registers',
> +			    'snapshot_raw',
> +			    'trace_pipe_raw',
> +			    'ptmx',
> +			    'trace_pipe');
> +
> +# Do not walk these directories (absolute path).
> +my @skip_walk_dirs_abs = ();
> +
> +# Do not walk these directories under any subdirectory.
> +my @skip_walk_dirs_any = ('self',
> +			  'thread-self',
> +			  'cwd',
> +			  'fd',
> +			  'stderr',
> +			  'stdin',
> +			  'stdout');
> +
> +sub help
> +{
> +	my ($exitcode) = @_;
> +
> +	print << "EOM";
> +Usage: $P COMMAND [OPTIONS]
> +Version: $V
> +
> +Commands:
> +
> +	scan	Scan the kernel (saves raw results to file and runs `format`).
> +	format	Parse results file and format output.
> +
> +Options:
> +	-o, --output=<path>	 Accepts absolute or relative filename or directory name.
> +	    --suppress-dmesg	 Don't show dmesg results.
> +	    --squash-by-path	 Show one result per unique path.
> +	    --raw	 	 Show raw results.
> +	    --send-report	 Submit raw results for someone else to worry about.
> +	-d, --debug              Display debugging output.
> +	-h, --help, --version    Display this help and exit.
> +
> +Scans the running (64 bit) kernel for potential leaking addresses.
> +}
> +
> +EOM
> +	exit($exitcode);
> +}
> +
> +GetOptions(
> +        'o|output=s'		=> \$output,
> +        'suppress-dmesg'	=> \$suppress_dmesg,
> +        'squash-by-path'	=> \$squash_by_path,
> +        'raw'			=> \$raw,
> +        'send-report'		=> \$send_report,
> +        'd|debug'		=> \$debug,
> +        'h|help'		=> \$help,
> +        'version'		=> \$help
> +) or help(1);
> +
> +help(0) if ($help);
> +
> +my ($command) = @ARGV;
> +if (not defined $command) {
> +        help(128);
> +}
> +
> +set_output_path($output);
> +
> +if ($command ne 'scan' and $command ne 'format') {
> +        printf "\nUnknown command: %s\n\n", $command;
> +        help(128);
> +}
> +
> +if ($command eq 'scan') {
> +        scan();
> +}
> +
> +if ($send_report) {
> +        send_report();
> +        print "Raw scan results sent, thank you.\n";
> +        exit(0);
> +}
> +
> +format_output();
> +
> +exit 0;
> +
> +sub dprint
> +{
> +	printf(STDERR @_) if $debug;
> +}
> +
> +# Sets global $OUTPUT, defaults to "./scan.out"
> +# Accepts relative or absolute path (directory name or filename).
> +sub set_output_path
> +{
> +        my ($path) = @_;
> +        my $def_filename = "scan.out";
> +        my $def_dirname = getcwd();
> +
> +        if ($path eq "") {
> +                $OUTPUT = catfile($def_dirname, $def_filename);
> +                return;
> +        }
> +
> +        my($filename, $dirs, $suffix) = fileparse($path);
> +
> +        if ($filename eq "") {
> +                $OUTPUT = catfile($dirs, $def_filename);
> +        } elsif ($filename) {
> +                $OUTPUT = catfile($dirs, $filename);
> +        }
> +}
> +
> +sub scan
> +{
> +        open (my $fh, '>', "$OUTPUT") or die "Cannot open $OUTPUT\n";
> +        select $fh;
> +
> +        parse_dmesg();
> +        walk(@DIRS);
> +
> +        select STDOUT;
> +}
> +
> +sub send_report
> +{
> +        system("mail -s 'LEAK REPORT' leaks\@tobin.cc < $OUTPUT");
> +}
> +
> +sub parse_dmesg
> +{
> +	open my $cmd, '-|', 'dmesg';
> +	while (<$cmd>) {
> +		if (may_leak_address($_)) {
> +			print 'dmesg: ' . $_;
> +		}
> +	}
> +	close $cmd;
> +}
> +
> +# Recursively walk directory tree.
> +sub walk
> +{
> +	my @dirs = @_;
> +	my %seen;
> +
> +	while (my $pwd = shift @dirs) {
> +		next if (skip_walk($pwd));
> +		next if (!opendir(DIR, $pwd));
> +		my @files = readdir(DIR);
> +		closedir(DIR);
> +
> +		foreach my $file (@files) {
> +			next if ($file eq '.' or $file eq '..');
> +
> +			my $path = "$pwd/$file";
> +			next if (-l $path);
> +
> +			if (-d $path) {
> +				push @dirs, $path;
> +			} else {
> +				parse_file($path);
> +			}
> +		}
> +	}
> +}
> +
> +# True if argument potentially contains a kernel address.
> +sub may_leak_address
> +{
> +        my ($line) = @_;
> +
> +        my @addresses = extract_addresses($line);
> +        return @addresses > 0;
> +}
> +
> +# Return _all_ non false positive addresses from $line.
> +sub extract_addresses
> +{
> +        my ($line) = @_;
> +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';
> +        my (@addresses, @empty);
> +
> +        # Signal masks.
> +        if ($line =~ '^SigBlk:' or
> +            $line =~ '^SigCgt:') {
> +                return @empty;
> +        }
> +
> +        if ($line =~ '\bKEY=[[:xdigit:]]{14} [[:xdigit:]]{16} [[:xdigit:]]{16}\b' or
> +            $line =~ '\b[[:xdigit:]]{14} [[:xdigit:]]{16} [[:xdigit:]]{16}\b') {
> +                return @empty;
> +        }
> +
> +        while ($line =~ /($address)/g) {
> +                if (!is_false_positive($1)) {
> +                        push @addresses, $1;
> +                }
> +        }
> +
> +        return @addresses;
> +}
> +
> +# True if we should skip walking this directory.
> +sub skip_walk
> +{
> +	my ($path) = @_;
> +	return skip($path, \@skip_walk_dirs_abs, \@skip_walk_dirs_any)
> +}
> +
> +sub parse_file
> +{
> +	my ($file) = @_;
> +
> +	if (! -R $file) {
> +		return;
> +	}
> +
> +	if (skip_parse($file)) {
> +		dprint "skipping file: $file\n";
> +		return;
> +	}
> +	dprint "parsing: $file\n";
> +
> +	open my $fh, "<", $file or return;
> +	while ( <$fh> ) {
> +		if (may_leak_address($_)) {
> +			print $file . ': ' . $_;
> +		}
> +	}
> +	close $fh;
> +}
> +
> +sub is_false_positive
> +{
> +        my ($match) = @_;
> +
> +        if ($match =~ '\b(0x)?(f|F){16}\b' or
> +            $match =~ '\b(0x)?0{16}\b') {
> +                return 1;
> +        }
> +
> +        # vsyscall memory region, we should probably check against a range here.
> +        if ($match =~ '\bf{10}600000\b' or
> +            $match =~ '\bf{10}601000\b') {
> +                return 1;
> +        }
> +
> +        return 0;
> +}
> +
> +# True if we should skip this path.
> +sub skip
> +{
> +	my ($path, $paths_abs, $paths_any) = @_;
> +
> +	foreach (@$paths_abs) {
> +		return 1 if (/^\Q$path\E$/);
> +	}
> +
> +	my($filename, $dirs, $suffix) = fileparse($path);
> +	foreach (@$paths_any) {
> +		return 1 if (/^\Q$filename\E$/);
> +	}
> +
> +	return 0;
> +}
> +
> +sub skip_parse
> +{
> +	my ($path) = @_;
> +	return skip($path, \@skip_parse_files_abs, \@skip_parse_files_any);
> +}
> +
> +sub format_output
> +{
> +        if ($raw) {
> +                dump_raw_output();
> +                return;
> +        }
> +
> +        my ($total, $dmesg, $paths, $files) = parse_raw_file();
> +
> +        printf "\nTotal number of results from scan (incl dmesg): %d\n", $total;
> +
> +        if (!$suppress_dmesg) {
> +                print_dmesg($dmesg);
> +        }
> +        squash_by($files, 'filename');
> +
> +        if ($squash_by_path) {
> +                squash_by($paths, 'path');
> +        }
> +}
> +
> +sub dump_raw_output
> +{
> +        open (my $fh, '<', $OUTPUT) or die "Cannot open $OUTPUT\n";
> +        while (<$fh>) {
> +                print $_;
> +        }
> +        close $fh;
> +}
> +
> +sub print_dmesg
> +{
> +        my ($dmesg) = @_;
> +
> +        print "\ndmesg output:\n";
> +        foreach(@$dmesg) {
> +                my $index = index($_, ':');
> +                $index += 2;    # skip ': '
> +                print substr($_, $index);
> +        }
> +}
> +
> +sub squash_by
> +{
> +        my ($ref, $desc) = @_;
> +
> +        print "\nResults squashed by $desc (excl dmesg). ";
> +        print "Displaying <number of results>, <$desc>, <example result>\n";
> +        foreach(keys %$ref) {
> +                my $lines = $ref->{$_};
> +                my $length = @$lines;
> +                printf "[%d %s] %s", $length, $_, $lines->[0];
> +        }
> +}
> +
> +sub parse_raw_file
> +{
> +        my $total = 0;          # Total number of lines parsed.
> +        my @dmesg;              # dmesg output.
> +        my %files;              # Unique filenames containing leaks.
> +        my %paths;              # Unique paths containing leaks.
> +
> +        open (my $fh, '<', $OUTPUT) or die "Cannot open $OUTPUT\n";
> +
> +        while (my $line = <$fh>) {
> +                $total++;
> +
> +                if ("dmesg:" eq substr($line, 0, 6)) {
> +                        push @dmesg, $line;
> +                        next;
> +                }
> +
> +                cache_path(\%paths, $line);
> +                cache_filename(\%files, $line);
> +        }
> +
> +        return $total, \@dmesg, \%paths, \%files;
> +}
> +
> +sub cache_path
> +{
> +        my ($paths, $line) = @_;
> +
> +        my $index = index($line, ':');
> +        my $path = substr($line, 0, $index);
> +
> +        if (!$paths->{$path}) {
> +                $paths->{$path} = ();
> +        }
> +        push @{$paths->{$path}}, $line;
> +}
> +
> +sub cache_filename
> +{
> +        my ($files, $line) = @_;
> +
> +        my $index = index($line, ':');
> +        my $path = substr($line, 0, $index);
> +        my $filename = basename($path);
> +        if (!$files->{$filename}) {
> +                $files->{$filename} = ();
> +        }
> +        $index += 2;            # skip ': '
> +        push @{$files->{$filename}}, substr($line, $index);
> +}
>

Tobin Harding Nov. 8, 2017, 1:13 a.m. UTC | #9

On Tue, Nov 07, 2017 at 03:36:06PM -0800, Laura Abbott wrote:
> On 11/07/2017 02:32 AM, Tobin C. Harding wrote:
> >Currently we are leaking addresses from the kernel to user space. This
> >script is an attempt to find some of those leakages. Script parses
> >`dmesg` output and /proc and /sys files for hex strings that look like
> >kernel addresses.
> >
> >Only works for 64 bit kernels, the reason being that kernel addresses
> >on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> >possible. On 32 kernels we don't have this luxury.
> >
> >Script is _slightly_ smarter than a straight grep, we check for false
> >positives (all 0's or all 1's, and vsyscall start/finish addresses).
> >
> >Output is saved to file to expedite repeated formatting/viewing of
> >output.
> >
> >Signed-off-by: Tobin C. Harding <me@tobin.cc>
> >---
> >
> >This version outputs a report instead of the raw results by default. Designing
> >this proved to be non-trivial, the reason being that it is not immediately clear
> >what constitutes a duplicate entry (similar message, address range, same
> >file?). Also, the aim of the report is to assist users _not_ missing correct
> >results; limiting the output is inherently a trade off between noise and
> >correct, clear results.
> >
> >Without testing on various real kernels it's not clear that this reporting is any
> >good, my test cases were a bit contrived. Your usage may vary.
> >
> >It would be super helpful to get some comments from people running this with
> >different set ups.
> >
> 
> Running on a stock Fedora kernel with gnome generates a 139M file.
> I'll admit that Fedora is pretty generous in what it enables.
> Trimmed down to omit some redundancies in various processes
> by only printing off of the last file in the path
> 
> /proc/kallsyms
> /proc/modules
> /proc/timer_list
> /proc/1244/stack
> /proc/4041/status
> /proc/bus/input/devices <--- Probably a false positive
> /proc/1/net/hci
> /proc/1/net/tcp
> /proc/1/net/udp
> /proc/1/net/bnep
> /proc/1/net/raw6
> /proc/1/net/tcp6
> /proc/1/net/udp6
> /proc/1/net/unix
> /proc/1/net/l2cap
> /proc/1/net/packet
> /proc/1/net/rfcomm
> /proc/1/net/netlink
> /sys/module/snd_compress/sections/.note.gnu.build-id
> /sys/module/snd_compress/sections/.exit.text
> /sys/module/snd_compress/sections/__mcount_loc
> /sys/module/snd_compress/sections/__ksymtab_strings
> /sys/module/snd_compress/sections/__ksymtab_gpl
> /sys/module/snd_compress/sections/.init.text
> /sys/module/snd_compress/sections/.gnu.linkonce.this_module
> /sys/module/snd_compress/sections/__jump_table
> /sys/module/snd_compress/sections/.strtab
> /sys/module/snd_compress/sections/.bss
> /sys/module/snd_compress/sections/.rodata.str1.1
> /sys/module/snd_compress/sections/__bug_table
> /sys/module/snd_compress/sections/__verbose
> /sys/module/snd_compress/sections/.rodata.str1.8
> /sys/module/snd_compress/sections/.text
> /sys/module/snd_compress/sections/.data
> /sys/module/snd_compress/sections/.symtab
> /sys/module/snd_compress/sections/.rodata
> /sys/module/iwlmvm/sections/.altinstr_replacement
> /sys/module/iwlmvm/sections/.altinstructions
> /sys/module/iwlmvm/sections/.data.unlikely
> /sys/module/iwlmvm/sections/__param
> /sys/module/iwlmvm/sections/.smp_locks
> /sys/module/snd_hda_intel/sections/__tracepoints_ptrs
> /sys/module/snd_hda_intel/sections/__tracepoints
> /sys/module/snd_hda_intel/sections/__tracepoints_strings
> /sys/module/snd_hda_intel/sections/_ftrace_events
> /sys/module/snd_hda_intel/sections/.ref.data
> /sys/module/iwlwifi/sections/.parainstructions
> /sys/module/iwlwifi/sections/__ksymtab
> /sys/module/uvcvideo/sections/.fixup
> /sys/module/uvcvideo/sections/.text.unlikely
> /sys/module/uvcvideo/sections/__ex_table
> /sys/module/intel_powerclamp/sections/.init.rodata
> /sys/module/mac80211/sections/.data..read_mostly
> /sys/module/nfnetlink/sections/.init.data
> /sys/module/ghash_clmulni_intel/sections/.rodata.cst16.bswap_mask
> /sys/module/videodev/sections/_ftrace_eval_map
> /sys/module/kvm_intel/sections/.data..ro_after_init
> /sys/module/kvm_intel/sections/.altinstr_aux
> /sys/module/crct10dif_pclmul/sections/.rodata.cst16.SHUF_MASK
> /sys/module/crct10dif_pclmul/sections/.rodata.cst16.mask1
> /sys/module/crct10dif_pclmul/sections/.rodata.cst32.pshufb_shf_table
> /sys/module/crct10dif_pclmul/sections/.rodata.cst16.mask2
> /sys/module/nf_conntrack/sections/.data..cacheline_aligned
> /sys/firmware/efi/runtime-map/5/virt_addr
> /sys/devices/platform/i8042/serio0/input/input3/uevent
> /sys/devices/platform/i8042/serio0/input/input3/capabilities/key

Thanks for running the script. Is there any chance you could email me
the complete output please? The next patch includes a flag to do
this. You can wait until that lands if it is easier for you.

thanks,
Tobin.

Michael Ellerman Nov. 8, 2017, 12:10 p.m. UTC | #10

"Tobin C. Harding" <me@tobin.cc> writes:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
>
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> possible.

That doesn't work super well on other architectures :D

I don't speak perl but presumably you can check the arch somehow and
customise the regex?

...
> +# Return _all_ non false positive addresses from $line.
> +sub extract_addresses
> +{
> +        my ($line) = @_;
> +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';

On 64-bit powerpc (ppc64/ppc64le) we'd want:

+        my $address = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';
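
One way the per-architecture choice could be sketched in Perl, assuming the
architecture string is the machine field that `uname -m` reports (read here
through POSIX::uname). The helper name address_regex_for() and the
%address_re table are hypothetical and not part of the posted patch:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical table: machine name (as reported by uname -m) => address regex.
# The x86_64 entry is the pattern from the patch; the ppc64 entries are the
# pattern suggested in this reply.
my %address_re = (
	'x86_64'  => '\b(0x)?ffff[[:xdigit:]]{12}\b',
	'ppc64'   => '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b',
	'ppc64le' => '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b',
);

# Return the regex string for $arch, or undef if the arch is not in the table.
sub address_regex_for
{
	my ($arch) = @_;
	return $address_re{$arch};
}

my $arch = (POSIX::uname())[4];		# machine field, e.g. 'x86_64'
my $re = address_regex_for($arch);
if (defined $re) {
	print "using address regex for $arch: $re\n";
} else {
	print "no address regex known for $arch\n";
}
```

Returning undef for an unknown architecture lets the caller decide whether
to bail out or fall back to a default pattern.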


> +# Do not parse these files (absolute path).
> +my @skip_parse_files_abs = ('/proc/kmsg',
> +			    '/proc/kcore',
> +			    '/proc/fs/ext4/sdb1/mb_groups',
> +			    '/proc/1/fd/3',
> +			    '/sys/kernel/debug/tracing/trace_pipe',
> +			    '/sys/kernel/security/apparmor/revision');

Can you add:

  /sys/firmware/devicetree

and/or /proc/device-tree (which is a symlink to the above).

We should also start restricting access to that because it may have
potentially interesting physical addresses in it, but that will break
existing tools, so it will need to be opt-in and done over time.

cheers

Tobin Harding Nov. 8, 2017, 9:16 p.m. UTC | #11

On Wed, Nov 08, 2017 at 11:10:56PM +1100, Michael Ellerman wrote:
> "Tobin C. Harding" <me@tobin.cc> writes:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> >
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> > possible.
> 
> That doesn't work super well on other architectures :D
> 
> I don't speak perl but presumably you can check the arch somehow and
> customise the regex?

I'm on it.

> ...
> > +# Return _all_ non false positive addresses from $line.
> > +sub extract_addresses
> > +{
> > +        my ($line) = @_;
> > +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';
> 
> On 64-bit powerpc (ppc64/ppc64le) we'd want:
> 
> +        my $address = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';

This is great! Thanks a million. This gives me the idea of getting in
contact with people who have access to other [64 bit] architectures and
getting the address format. I guess a dump of kallsyms from each
architecture would do the job nicely. 
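
A rough sketch of how such a kallsyms dump could be summarised, assuming
lines of the form "<address> <type> <symbol>" with 16 hex digit addresses.
The helper prefix_histogram() and the sample lines are made up for
illustration; on a real machine the addresses may read as all zeros unless
the dump is taken with sufficient privileges:

```perl
use strict;
use warnings;

# Count how often each leading 4-hex-digit prefix occurs among the addresses
# in kallsyms-style lines ("<address> <type> <symbol> [module]").
sub prefix_histogram
{
	my (@lines) = @_;
	my %count;

	foreach my $line (@lines) {
		# Only accept a full 16 digit (64 bit) address at line start.
		next unless $line =~ /^([[:xdigit:]]{16})\s/;
		$count{lc substr($1, 0, 4)}++;
	}
	return \%count;
}

# Made-up sample lines, not taken from a real machine.
my @sample = (
	"ffffffff81000000 T _text\n",
	"ffffffff81000110 T do_one_initcall\n",
	"ffffffffc0000000 t init_module\t[example]\n",
	"c000000000001234 T start_here\n",
);

my $hist = prefix_histogram(@sample);
foreach my $prefix (sort keys %$hist) {
	printf "%s: %d\n", $prefix, $hist->{$prefix};
}
```

The set of prefixes seen on each architecture would then suggest how much of
the address can be fixed in the regular expression.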

> > +# Do not parse these files (absolute path).
> > +my @skip_parse_files_abs = ('/proc/kmsg',
> > +			    '/proc/kcore',
> > +			    '/proc/fs/ext4/sdb1/mb_groups',
> > +			    '/proc/1/fd/3',
> > +			    '/sys/kernel/debug/tracing/trace_pipe',
> > +			    '/sys/kernel/security/apparmor/revision');
> 
> Can you add:
> 
>   /sys/firmware/devicetree
> 
> and/or /proc/device-tree (which is a symlink to the above).

Can do, thanks.

> We should also start restricting access to that because it may have
> potentially interesting physical addresses in it, but that will break
> existing tools, so it will need to be opt-in and done over time.

Seems like this is going to be a recurring theme if we try to stop leaks
using file permissions. I'm interested in how we would do this, assuming
it has to be a case by case fix but done many times.

thanks,
Tobin.

Tobin Harding Nov. 8, 2017, 10:48 p.m. UTC | #12

On Wed, Nov 08, 2017 at 11:10:56PM +1100, Michael Ellerman wrote:
> "Tobin C. Harding" <me@tobin.cc> writes:
[snip]

Hi Michael,

I'm working on adding support for ppc64 to leaking_addresses.pl, I've
added the kernel address regular expression that you suggested. I'd like
to add the false positive for vsyscall addresses. Excuse my ignorance
but does PowerPC use a constant address range for vsyscall like x86_64
does? The ppc64 machine I have access to does not output anything for

	$ cat /proc/PID/tasks/PID/smaps		or
	$ cat /proc/PID/tasks/PID/maps

thanks,
Tobin.

Michael Ellerman Nov. 9, 2017, 12:49 a.m. UTC | #13

"Tobin C. Harding" <me@tobin.cc> writes:

> On Wed, Nov 08, 2017 at 11:10:56PM +1100, Michael Ellerman wrote:
>> "Tobin C. Harding" <me@tobin.cc> writes:
> [snip]
>
> Hi Michael,
>
> I'm working on adding support for ppc64 to leaking_addresses.pl, I've
> added the kernel address regular expression that you suggested.

Thanks!

> I'd like to add the false positive for vsyscall addresses. Excuse my
> ignorance but does PowerPC use a constant address range for vsyscall like x86_64
> does? The ppc64 machine I have access to does not output anything for
>
> 	$ cat /proc/PID/tasks/PID/smaps		or
> 	$ cat /proc/PID/tasks/PID/maps

No we only have the vdso style vsyscall, which is mapped at user
addresses and is subject to ASLR, so you shouldn't need to worry about
it.

cheers

Tobin Harding Nov. 9, 2017, 2:08 a.m. UTC | #14

On Thu, Nov 09, 2017 at 11:49:52AM +1100, Michael Ellerman wrote:
> "Tobin C. Harding" <me@tobin.cc> writes:
> 
> > On Wed, Nov 08, 2017 at 11:10:56PM +1100, Michael Ellerman wrote:
> >> "Tobin C. Harding" <me@tobin.cc> writes:
> > [snip]
> >
> > Hi Michael,
> >
> > > I'm working on adding support for ppc64 to leaking_addresses.pl, I've
> > added the kernel address regular expression that you suggested.
> 
> Thanks!
> 
> > I'd like to add the false positive for vsyscall addresses. Excuse my
> > ignorance but does PowerPC use a constant address range for vsyscall like x86_64
> > does? The ppc64 machine I have access to does not output anything for
> >
> > 	$ cat /proc/PID/tasks/PID/smaps		or
> > 	$ cat /proc/PID/tasks/PID/maps
> 
> No we only have the vdso style vsyscall, which is mapped at user
> addresses and is subject to ASLR, so you shouldn't need to worry about
> it.

Great. I'll add you to the CC list for the next spin. In line with my
aim of having the most confusing patches to follow, the next version will
likely be

[PATCH 0/X v2] scripts/leaking_addresses: add summary report

thanks,
Tobin.

Frank Rowand Nov. 10, 2017, 10:12 p.m. UTC | #15

Hi Michael, Tobin,

On 11/08/17 04:10, Michael Ellerman wrote:
> "Tobin C. Harding" <me@tobin.cc> writes:
>> Currently we are leaking addresses from the kernel to user space. This
>> script is an attempt to find some of those leakages. Script parses
>> `dmesg` output and /proc and /sys files for hex strings that look like
>> kernel addresses.
>>
>> Only works for 64 bit kernels, the reason being that kernel addresses
>> on 64 bit kernels have 'ffff' as the leading bit pattern making greping
>> possible.
> 
> That doesn't work super well on other architectures :D
> 
> I don't speak perl but presumably you can check the arch somehow and
> customise the regex?
> 
> ...
>> +# Return _all_ non false positive addresses from $line.
>> +sub extract_addresses
>> +{
>> +        my ($line) = @_;
>> +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';
> 
> On 64-bit powerpc (ppc64/ppc64le) we'd want:
> 
> +        my $address = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';
> 
> 
>> +# Do not parse these files (absolute path).
>> +my @skip_parse_files_abs = ('/proc/kmsg',
>> +			    '/proc/kcore',
>> +			    '/proc/fs/ext4/sdb1/mb_groups',
>> +			    '/proc/1/fd/3',
>> +			    '/sys/kernel/debug/tracing/trace_pipe',
>> +			    '/sys/kernel/security/apparmor/revision');
> 
> Can you add:
> 
>   /sys/firmware/devicetree
> 
> and/or /proc/device-tree (which is a symlink to the above).

/proc/device-tree is a symlink to /sys/firmware/devicetree/base

/sys/firmware contains
   fdt              -- the flattened device tree that was passed to the
                       kernel on boot
   devicetree/base/ -- the data that is currently in the live device tree.
                       This live device tree is represented as directories
                       and files beneath base/

The information in fdt is directly available in the kernel source tree
(possible exception: the bootloader may have modified the fdt, possibly
to add/modify the boot command line, add memory size).

The information in devicetree/base/ is directly available in the kernel
source tree for _most_ architectures, with the same possible exception
for the bootloader.  ppc64 may also modify this information dynamically
after the system is booted.  When overlay support is working, overlay
device trees will also be able to modify this information dynamically
(and again, this information will be directly available in the kernel
source tree).

Not having read the code in leaking_addresses.pl, trusting that the
comments are correct, it seems that /sys/firmware should be in
@skip_walk_dirs_abs instead of putting /sys/firmware/devicetree
in @skip_parse_files_abs.
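
To illustrate the difference, here is a minimal sketch of pruning at walk
time. It assumes, as the comments in the script suggest, that a directory
listed in @skip_walk_dirs_abs is never descended into. The list contents,
the %children table and the walk() helper are made up for the example and
do not reproduce the script:

```perl
use strict;
use warnings;

# Hypothetical contents; the real list in the patch is not reproduced here.
my @skip_walk_dirs_abs = ('/sys/firmware');

# True if $path is one of the directories we should not descend into.
sub skip_walk
{
	my ($path) = @_;
	foreach my $dir (@skip_walk_dirs_abs) {
		return 1 if ($dir eq $path);
	}
	return 0;
}

# Made-up directory tree standing in for the real filesystem.
my %children = (
	'/sys'                     => ['/sys/firmware', '/sys/module'],
	'/sys/firmware'            => ['/sys/firmware/fdt', '/sys/firmware/devicetree'],
	'/sys/firmware/devicetree' => ['/sys/firmware/devicetree/base'],
);

# Breadth-first walk that prunes skipped directories, so nothing below
# /sys/firmware (fdt, devicetree/base/) is ever visited.
sub walk
{
	my ($root) = @_;
	my @dirs = ($root);
	my @visited;

	while (my $dir = shift @dirs) {
		next if skip_walk($dir);
		push @visited, $dir;
		push @dirs, @{$children{$dir} // []};
	}
	return @visited;
}

print join("\n", walk('/sys')), "\n";
```

A single entry in the walk list covers both fdt and devicetree/base/, which
an entry in the parse list for /sys/firmware/devicetree would not.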

> We should also start restricting access to that because it may have
> potentially interesting physical addresses in it, but that will break
> existing tools, so it will need to be opt-in and done over time.
> 
> cheers
>

Kirill A. Shutemov Nov. 11, 2017, 11:10 p.m. UTC | #16

On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> Currently we are leaking addresses from the kernel to user space. This
> script is an attempt to find some of those leakages. Script parses
> `dmesg` output and /proc and /sys files for hex strings that look like
> kernel addresses.
> 
> Only works for 64 bit kernels, the reason being that kernel addresses
> on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> possible. On 32 kernels we don't have this luxury.

Well, it's not going to work as well as intended on an x86 machine with
5-level paging. Kernel address space there starts at 0xff10000000000000.
It will still catch pointers to kernel/modules text, but the rest is
outside of 0xffff... space. See Documentation/x86/x86_64/mm.txt.

Not sure if we care. It also won't work for other 64-bit architectures that
have more than 256TB of virtual address space.

Just wanted to point to the limitation.

Michael Ellerman Nov. 12, 2017, 11:49 a.m. UTC | #17

Hi Frank,

Frank Rowand <frowand.list@gmail.com> writes:
> Hi Michael, Tobin,
>
> On 11/08/17 04:10, Michael Ellerman wrote:
>> "Tobin C. Harding" <me@tobin.cc> writes:
>>> Currently we are leaking addresses from the kernel to user space. This
>>> script is an attempt to find some of those leakages. Script parses
>>> `dmesg` output and /proc and /sys files for hex strings that look like
>>> kernel addresses.
>>>
>>> Only works for 64 bit kernels, the reason being that kernel addresses
>>> on 64 bit kernels have 'ffff' as the leading bit pattern making greping
>>> possible.
>> 
>> That doesn't work super well on other architectures :D
>> 
>> I don't speak perl but presumably you can check the arch somehow and
>> customise the regex?
>> 
>> ...
>>> +# Return _all_ non false positive addresses from $line.
>>> +sub extract_addresses
>>> +{
>>> +        my ($line) = @_;
>>> +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';
>> 
>> On 64-bit powerpc (ppc64/ppc64le) we'd want:
>> 
>> +        my $address = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';
>> 
>> 
>>> +# Do not parse these files (absolute path).
>>> +my @skip_parse_files_abs = ('/proc/kmsg',
>>> +			    '/proc/kcore',
>>> +			    '/proc/fs/ext4/sdb1/mb_groups',
>>> +			    '/proc/1/fd/3',
>>> +			    '/sys/kernel/debug/tracing/trace_pipe',
>>> +			    '/sys/kernel/security/apparmor/revision');
>> 
>> Can you add:
>> 
>>   /sys/firmware/devicetree
>> 
>> and/or /proc/device-tree (which is a symlink to the above).
>
> /proc/device-tree is a symlink to /sys/firmware/devicetree/base

Oh yep, forgot about the base part.

> /sys/firmware contains
>    fdt              -- the flattened device tree that was passed to the
>                        kernel on boot
>    devicetree/base/ -- the data that is currently in the live device tree.
>                        This live device tree is represented as directories
>                        and files beneath base/
>
> The information in fdt is directly available in the kernel source tree

On ARM that might be true, but not on powerpc.

Remember FDT comes from DT which comes from OF - in which case the
information is definitely not in the kernel source! :)

On our bare metal machines the device tree comes from skiboot
(firmware), with some of the content provided by hostboot (other
firmware), both of which are open source, so in theory most of the
information is available in *some* source tree. But there's still
information about runtime allocations etc. that is not available in the
source anywhere.

cheers

Frank Rowand Nov. 12, 2017, 6:02 p.m. UTC | #18

Hi Michael,

On 11/12/17 03:49, Michael Ellerman wrote:
> Hi Frank,
> 
> Frank Rowand <frowand.list@gmail.com> writes:
>> Hi Michael, Tobin,
>>
>> On 11/08/17 04:10, Michael Ellerman wrote:
>>> "Tobin C. Harding" <me@tobin.cc> writes:
>>>> Currently we are leaking addresses from the kernel to user space. This
>>>> script is an attempt to find some of those leakages. Script parses
>>>> `dmesg` output and /proc and /sys files for hex strings that look like
>>>> kernel addresses.
>>>>
>>>> Only works for 64 bit kernels, the reason being that kernel addresses
>>>> on 64 bit kernels have 'ffff' as the leading bit pattern making greping
>>>> possible.
>>>
>>> That doesn't work super well on other architectures :D
>>>
>>> I don't speak perl but presumably you can check the arch somehow and
>>> customise the regex?
>>>
>>> ...
>>>> +# Return _all_ non false positive addresses from $line.
>>>> +sub extract_addresses
>>>> +{
>>>> +        my ($line) = @_;
>>>> +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';
>>>
>>> On 64-bit powerpc (ppc64/ppc64le) we'd want:
>>>
>>> +        my $address = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';
>>>
>>>
>>>> +# Do not parse these files (absolute path).
>>>> +my @skip_parse_files_abs = ('/proc/kmsg',
>>>> +			    '/proc/kcore',
>>>> +			    '/proc/fs/ext4/sdb1/mb_groups',
>>>> +			    '/proc/1/fd/3',
>>>> +			    '/sys/kernel/debug/tracing/trace_pipe',
>>>> +			    '/sys/kernel/security/apparmor/revision');
>>>
>>> Can you add:
>>>
>>>   /sys/firmware/devicetree
>>>
>>> and/or /proc/device-tree (which is a symlink to the above).
>>
>> /proc/device-tree is a symlink to /sys/firmware/devicetree/base
> 
> Oh yep, forgot about the base part.
> 
>> /sys/firmware contains
>>    fdt              -- the flattened device tree that was passed to the
>>                        kernel on boot
>>    devicetree/base/ -- the data that is currently in the live device tree.
>>                        This live device tree is represented as directories
>>                        and files beneath base/
>>
>> The information in fdt is directly available in the kernel source tree
> 
> On ARM that might be true, but not on powerpc.
> 
> Remember FDT comes from DT which comes from OF - in which case the
> information is definitely not in the kernel source! :)
> 
> On our bare metal machines the device tree comes from skiboot
> (firmware), with some of the content provided by hostboot (other
> firmware), both of which are open source, so in theory most of the
> information is available in *some* source tree. But there's still
> information about runtime allocations etc. that is not available in the
> source anywhere.

Thanks for the additional information. 

Can you explain a little bit what "runtime allocations" are?  Are you
referring to the memory reservation block, the memory node(s) and the
chosen node?  Or other information?

Tobin Harding Nov. 12, 2017, 9:18 p.m. UTC | #19

On Sun, Nov 12, 2017 at 10:02:55AM -0800, Frank Rowand wrote:
> Hi Michael,
> 
> On 11/12/17 03:49, Michael Ellerman wrote:
> > Hi Frank,
> > 
> > Frank Rowand <frowand.list@gmail.com> writes:
> >> Hi Michael, Tobin,
> >>
> >> On 11/08/17 04:10, Michael Ellerman wrote:
> >>> "Tobin C. Harding" <me@tobin.cc> writes:
> >>>> Currently we are leaking addresses from the kernel to user space. This
> >>>> script is an attempt to find some of those leakages. Script parses
> >>>> `dmesg` output and /proc and /sys files for hex strings that look like
> >>>> kernel addresses.
> >>>>
> >>>> Only works for 64 bit kernels, the reason being that kernel addresses
> >>>> on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> >>>> possible.
> >>>
> >>> That doesn't work super well on other architectures :D
> >>>
> >>> I don't speak perl but presumably you can check the arch somehow and
> >>> customise the regex?
> >>>
> >>> ...
> >>>> +# Return _all_ non false positive addresses from $line.
> >>>> +sub extract_addresses
> >>>> +{
> >>>> +        my ($line) = @_;
> >>>> +        my $address = '\b(0x)?ffff[[:xdigit:]]{12}\b';
> >>>
> >>> On 64-bit powerpc (ppc64/ppc64le) we'd want:
> >>>
> >>> +        my $address = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';
> >>>
> >>>
> >>>> +# Do not parse these files (absolute path).
> >>>> +my @skip_parse_files_abs = ('/proc/kmsg',
> >>>> +			    '/proc/kcore',
> >>>> +			    '/proc/fs/ext4/sdb1/mb_groups',
> >>>> +			    '/proc/1/fd/3',
> >>>> +			    '/sys/kernel/debug/tracing/trace_pipe',
> >>>> +			    '/sys/kernel/security/apparmor/revision');
> >>>
> >>> Can you add:
> >>>
> >>>   /sys/firmware/devicetree
> >>>
> >>> and/or /proc/device-tree (which is a symlink to the above).
> >>
> >> /proc/device-tree is a symlink to /sys/firmware/devicetree/base
> > 
> > Oh yep, forgot about the base part.
> > 
> >> /sys/firmware contains
> >>    fdt              -- the flattened device tree that was passed to the
> >>                        kernel on boot
> >>    devicetree/base/ -- the data that is currently in the live device tree.
> >>                        This live device tree is represented as directories
> >>                        and files beneath base/
> >>
> >> The information in fdt is directly available in the kernel source tree
> > 
> > On ARM that might be true, but not on powerpc.

Looks like we should be considering architecture specific lists for
files/directories to skip.

thanks,
Tobin.

Tobin Harding Nov. 12, 2017, 11:06 p.m. UTC | #20

On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
> On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > Currently we are leaking addresses from the kernel to user space. This
> > script is an attempt to find some of those leakages. Script parses
> > `dmesg` output and /proc and /sys files for hex strings that look like
> > kernel addresses.
> > 
> > Only works for 64 bit kernels, the reason being that kernel addresses
> > on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> > possible. On 32 kernels we don't have this luxury.
> 
> Well, it's not going to work as well as intended on an x86 machine with
> 5-level paging. Kernel address space there starts at 0xff10000000000000.
> It will still catch pointers to kernel/modules text, but the rest is
> outside of 0xffff... space. See Documentation/x86/x86_64/mm.txt.

Thanks for the link. So it looks like we need to refactor the kernel
address regular expression into a function that takes into account the
machine architecture and the number of page table levels. We will need
to add this to the false positive checks also.
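
A minimal sketch of what such a function might look like. The name
kernel_address_re() is hypothetical, and how the script would detect the
number of page table levels at run time is left open. The 5-level pattern
is an assumption based only on the start address quoted above:

```perl
use strict;
use warnings;

# Hypothetical helper: pick an address regex from the architecture and the
# number of page table levels. Returns nothing for an unknown architecture.
sub kernel_address_re
{
	my ($arch, $levels) = @_;

	if ($arch eq 'x86_64') {
		# 5-level paging: kernel space starts at 0xff10000000000000,
		# so only the leading 'ff' is fixed. This over-matches a
		# little (addresses below 0xff10000000000000).
		return '\b(0x)?ff[[:xdigit:]]{14}\b' if ($levels == 5);
		# 4-level paging: the pattern used by the patch.
		return '\b(0x)?ffff[[:xdigit:]]{12}\b';
	}
	return;
}

foreach my $levels (4, 5) {
	printf "x86_64, %d levels: %s\n", $levels,
	       kernel_address_re('x86_64', $levels);
}
```

The wider 5-level pattern will produce more noise, so the false positive
checks would need the same treatment.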

> Not sure if we care. It also won't work for other 64-bit architectures that
> have more than 256TB of virtual address space.

Is this because of the virtual memory map? Did you mean 512TB?

from mm.txt:
ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)

Perhaps an option (--terse) that only checks the virtual memory map
range (above for 5-level paging) and

ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)

for 4-level paging?

> Just wanted to point to the limitation.

Appreciate it, thanks.

Tobin.

Michael Ellerman Nov. 13, 2017, 1:06 a.m. UTC | #21

Frank Rowand <frowand.list@gmail.com> writes:
> Hi Michael,
>
> On 11/12/17 03:49, Michael Ellerman wrote:
...
>> 
>> On our bare metal machines the device tree comes from skiboot
>> (firmware), with some of the content provided by hostboot (other
>> firmware), both of which are open source, so in theory most of the
>> information is available in *some* source tree. But there's still
>> information about runtime allocations etc. that is not available in the
>> source anywhere.
>
> Thanks for the additional information. 
>
> Can you explain a little bit what "runtime allocations" are?  Are you
> referring to the memory reservation block, the memory node(s) and the
> chosen node?  Or other information?

Yeah I was thinking of memory reservations. They're under the
reserved-memory node as well as the reservation block, eg:

$ ls -1 /proc/device-tree/reserved-memory/
ibm,firmware-allocs-memory@1000000000
ibm,firmware-allocs-memory@1800000000
ibm,firmware-allocs-memory@39c00000
ibm,firmware-allocs-memory@800000000
ibm,firmware-code@30000000
ibm,firmware-data@31000000
ibm,firmware-heap@30300000
ibm,firmware-stacks@31c00000
ibm,hbrt-code-image@1ffd510000
ibm,hbrt-target-image@1ffd6a0000
ibm,hbrt-vpd-image@1ffd700000
ibm,slw-image@1ffda00000
ibm,slw-image@1ffde00000
ibm,slw-image@1ffe200000
ibm,slw-image@1ffe600000


There's also some new systems where a catalog of PMU events is stored in
flash as a DTB and then stitched into the device tree by skiboot before
booting Linux.

Anyway my point was mainly just that the device tree is not simply a
copy of something in the kernel source.

cheers

Kirill A. Shutemov Nov. 13, 2017, 3:37 a.m. UTC | #22

On Mon, Nov 13, 2017 at 10:06:46AM +1100, Tobin C. Harding wrote:
> On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
> > On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > > Currently we are leaking addresses from the kernel to user space. This
> > > script is an attempt to find some of those leakages. Script parses
> > > `dmesg` output and /proc and /sys files for hex strings that look like
> > > kernel addresses.
> > > 
> > > Only works for 64 bit kernels, the reason being that kernel addresses
> > > on 64 bit kernels have 'ffff' as the leading bit pattern making greping
> > > possible. On 32 kernels we don't have this luxury.
> > 
> > > Well, it's not going to work as well as intended on an x86 machine with
> > 5-level paging. Kernel address space there starts at 0xff10000000000000.
> > It will still catch pointers to kernel/modules text, but the rest is
> > outside of 0xffff... space. See Documentation/x86/x86_64/mm.txt.
> 
> Thanks for the link. So it looks like we need to refactor the kernel
> address regular expression into a function that takes into account the
> machine architecture and the number of page table levels. We will need
> to add this to the false positive checks also.
> 
> > > Not sure if we care. It also won't work for other 64-bit architectures that
> > have more than 256TB of virtual address space.
> 
> Is this because of the virtual memory map?

On x86 direct mapping is the nearest thing we have to userspace.

> Did you mean 512TB?

No, I mean 256TB.

You have all kernel memory in the range from 0xffff000000000000 to
0xffffffffffffffff if you have 256 TB of virtual address space. If you
have more, some things might be outside the range.
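
The arithmetic behind this can be sketched as follows, assuming x86-64
style sign-extended addresses (48 virtual address bits with 4-level paging,
57 with 5-level) and a perl built with 64 bit integers. Both helper names
are made up for the example:

```perl
use strict;
use warnings;

# Lowest address of the upper (kernel) half for a given number of virtual
# address bits, assuming sign-extended ("canonical") addresses and a perl
# built with 64 bit integers.
sub kernel_half_start
{
	my ($va_bits) = @_;
	return sprintf("%016x", ~0 << ($va_bits - 1));
}

# Size of the whole virtual address space in TB (2**40 bytes).
sub va_size_tb
{
	my ($va_bits) = @_;
	return 2 ** ($va_bits - 40);
}

foreach my $bits (48, 57) {
	printf "%d bits: %d TB, kernel half starts at 0x%s\n",
	       $bits, va_size_tb($bits), kernel_half_start($bits);
}
```

With 48 bits the upper half begins at 0xffff800000000000, so every kernel
address carries the 'ffff' prefix. With more bits fewer leading digits are
fixed and the prefix test no longer covers everything.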

Tobin Harding Nov. 13, 2017, 4:35 a.m. UTC | #23

On Mon, Nov 13, 2017 at 06:37:28AM +0300, Kirill A. Shutemov wrote:
> On Mon, Nov 13, 2017 at 10:06:46AM +1100, Tobin C. Harding wrote:
> > On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Nov 07, 2017 at 09:32:11PM +1100, Tobin C. Harding wrote:
> > > > Currently we are leaking addresses from the kernel to user space. This
> > > > script is an attempt to find some of those leakages. Script parses
> > > > `dmesg` output and /proc and /sys files for hex strings that look like
> > > > kernel addresses.
> > > > 
> > > > Only works for 64 bit kernels, the reason being that kernel addresses
> > > > on 64 bit kernels have 'ffff' as the leading bit pattern making grepping
> > > > possible. On 32 bit kernels we don't have this luxury.
> > > 
> > > Well, it's not going to work as well as intended on an x86 machine with
> > > 5-level paging. Kernel address space there starts at 0xff10000000000000.
> > > It will still catch pointers to kernel/modules text, but the rest is
> > > outside of 0xffff... space. See Documentation/x86/x86_64/mm.txt.
> > 
> > Thanks for the link. So it looks like we need to refactor the kernel
> > address regular expression into a function that takes into account the
> > machine architecture and the number of page table levels. We will need
> > to add this to the false positive checks also.
> > 
> > > Not sure if we care. It won't work either for other 64-bit architectures that
> > > have more than 256TB of virtual address space.
> > 
> > Is this because of the virtual memory map?
> 
> On x86 direct mapping is the nearest thing we have to userspace.
> 
> > Did you mean 512TB?
> 
> No, I mean 256TB.
> 
> You have all kernel memory in the range from 0xffff000000000000 to
> 0xffffffffffffffff if you have 256 TB of virtual address space. If you
> have more, some things might be outside the range.

Doesn't 4-level paging already limit a system to 64TB of memory? So any
system better equipped than this will use 5-level paging, right? If I am
talking total rubbish please ignore; I appreciate that you have already
pointed out the limitation. Perhaps we can add a comment to the script:

# Script may miss some addresses on machines with more than 256TB of
# memory.

thanks,
Tobin.

Kaiwan N Billimoria Nov. 13, 2017, 5:27 a.m. UTC | #24

On Mon, Nov 13, 2017 at 10:05 AM, Tobin C. Harding <me@tobin.cc> wrote:
> On Mon, Nov 13, 2017 at 06:37:28AM +0300, Kirill A. Shutemov wrote:
>> On Mon, Nov 13, 2017 at 10:06:46AM +1100, Tobin C. Harding wrote:
>> > On Sun, Nov 12, 2017 at 02:10:07AM +0300, Kirill A. Shutemov wrote:
...
>> >
>> > Thanks for the link. So it looks like we need to refactor the kernel
>> > address regular expression into a function that takes into account the
>> > machine architecture and the number of page table levels. We will need
>> > to add this to the false positive checks also.
>> >
>> > > Not sure if we care. It won't work either for other 64-bit architectures that
>> > > have more than 256TB of virtual address space.
>> >
>> > Is this because of the virtual memory map?
>>
>> On x86 direct mapping is the nearest thing we have to userspace.
>>
>> > Did you mean 512TB?
>>
>> No, I mean 256TB.
>>
>> You have all kernel memory in the range from 0xffff000000000000 to
>> 0xffffffffffffffff if you have 256 TB of virtual address space. If you
>> have more, some things might be outside the range.
>
> Doesn't 4-level paging already limit a system to 64TB of memory? So any
> system better equipped than this will use 5-level paging, right? If I am
> talking total rubbish please ignore; I appreciate that you have already
> pointed out the limitation. Perhaps we can add a comment to the script:
>
> # Script may miss some addresses on machines with more than 256TB of
> # memory.

I think the 256TB is wrt *virtual* address space not physical RAM.

Also, IMHO, the script should 'transparently' take into account the number of
paging levels (instead of the user needing to pass a parameter). IOW it should
be able to detect this (say, from the .config file) and act accordingly, in the
sense that the regexes and associated logic would differ to match.
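
Something along these lines, as a very rough sketch (Python here just to show the idea; the real script is Perl). Everything in it is an assumption for illustration: the function names are hypothetical, the patterns are guesses based on the 'ffff' prefix and the 0xff10000000000000 start mentioned above, and it assumes the config option is named CONFIG_X86_5LEVEL. A kernel built with that option may still boot with 4-level paging, so a real check would need more than this:

```python
import re

# Assumed patterns, for illustration only; matching is lowercase hex only.
# 4-level: 16 hex digits starting with 'ffff'.
PATTERN_4LEVEL = re.compile(r"\b(?:0x)?ffff[0-9a-f]{12}\b")
# 5-level: anything from ff10000000000000 up to ffffffffffffffff.
PATTERN_5LEVEL = re.compile(r"\b(?:0x)?ff[1-9a-f][0-9a-f]{13}\b")


def uses_5level_paging(config_text):
    """Guess 5-level paging from kernel config text (assumed option name)."""
    return any(line.strip() == "CONFIG_X86_5LEVEL=y"
               for line in config_text.splitlines())


def kernel_address_pattern(config_text):
    """Pick the address pattern to use based on the config text."""
    if uses_5level_paging(config_text):
        return PATTERN_5LEVEL
    return PATTERN_4LEVEL


def find_addresses(text, config_text):
    """Return all substrings of text that look like kernel addresses."""
    pattern = kernel_address_pattern(config_text)
    return [m.group(0) for m in pattern.finditer(text)]


sample = "ptr ff11000000000000 and ffffffff81000000"
print(find_addresses(sample, "CONFIG_X86_5LEVEL=y\n"))
print(find_addresses(sample, "# CONFIG_X86_5LEVEL is not set\n"))
```

The user passes nothing; the only input besides the text being scanned is the config contents, which the script would have to locate itself.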
