diff mbox

[1/5] scripts: Add mkstrerror.sh

Message ID 1379459317-13046-2-git-send-email-daniel.santos@pobox.com (mailing list archive)
State New, archived
Headers show

Commit Message

Daniel Santos Sept. 17, 2013, 11:08 p.m. UTC
This is a simple bash script that parses our errno*.h files and formats
them into the error_strings.h header that our strerror and strerror_name
functions will use later.

First it looks at $ARCH and examines the errno.h files and figures out
which to use. Then, it parses their error definitions into a
pipe-character delimited table with the fields name, number and
descrption from the comments. Finally, it does some consistency checks
and output them as usable C code.

On my Phenom it takes between 1.2 and 2 seconds to run depending upon
the arch. There are a few arch-specific conditions that the script has
to manage however:
* alpha: EAGAIN is redefined as 35 while EDEADLK is defined as 11
* mips: EDQUOT is 1133, which overlaps the internal error range
  (512-529), so I'm just removing it.

This is done in a little case $ARCH statement in parse_errors().

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
---
 scripts/mkstrerror.sh | 209 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 209 insertions(+)
 create mode 100755 scripts/mkstrerror.sh

Comments

David Howells Sept. 18, 2013, 11:38 a.m. UTC | #1
danielfsantos@att.net wrote:

> This is a simple bash script that parses our errno*.h files and formats
> them into the error_strings.h header that our strerror and strerror_name
> functions will use later.

I presume you haven't tried building with a "make O=foo" build directory?  I
see:

      /bin/sh: /data/fs/linux-2.6-fscache/include/generated/error_strings.h: No such file or directory

when I try it.

Looking in your generated error_strings.h file, I see:

	static const struct error_strings {
		const unsigned first;
		const unsigned last;
		const unsigned count;
		const char * const desc[2];
	} error_strings[2] = {
		{
			.first = 0,
			.last  = 133,
			.count = 134,
			.desc  = {
	#ifdef CONFIG_STRERROR_NAME
		"\0"                            /* 0 */
		"EPERM\0"                       /* 1 */
		"ENOENT\0"                      /* 2 */
		"ESRCH\0"                       /* 3 */
		"EINTR\0"                       /* 4 */
	...
		"ERFKILL\0"                     /* 132 */
		"EHWPOISON\0"                   /* 133 */,
	#else
		NULL,
	#endif /* CONFIG_STRERROR_NAME */

	#ifdef CONFIG_STRERROR
		"\0"                                                    /* 0 */
		"Operation not permitted\0"                             /* 1 */
		"No such file or directory\0"                           /* 2 */
		"No such process\0"                                     /* 3 */
	...
	        "State not recoverable\0"                               /* 131 */
		"Operation not possible due to RF-kill\0"               /* 132 */
		"Memory page has hardware error\0"                      /* 133 */,
	#else
		NULL,
	#endif /* CONFIG_STRERROR */
			},
		}, {
			.first = 512,
			.last  = 529,
			.count = 18,
			.desc  = {
	#ifdef CONFIG_STRERROR_NAME
		"ERESTARTSYS\0"                 /* 512 */
		"ERESTARTNOINTR\0"              /* 513 */
	...
		"Request initiated, but will not complete before timeout\0"/* 528 */
		"iocb queued, will get completion event\0"              /* 529 */,
	#else
		NULL,
	#endif /* CONFIG_STRERROR */
			},
		}
	};


Some thoughts for you:

 (1) Why are you double-NUL'ing all your strings?  (see the \0 in the strings)

 (2) Storing the leading 'E' for each symbol is redundant as you can add that
     later so you might want to discard it.

 (3) You are storing a pointer to the symbolic name for each error.  On a
     64-bit machine, that's 8 bytes.  If you drop the leading 'E' and the
     trailing NUL, most symbols will fit into an 8 character slot saving you
     the cost of a pointer.

     Okay, you'll have to truncate some of the names (ERESTARTNOINTR ->
     RESNOINT for example) but probably not that many.  Run this:

	grep '["]E[A-Z0-9]' include/generated/error_strings.h |
           cut '-d\' -f1 | cut -dE -f2- | cut -c1-8

     and you'll see what I mean.  Most of them are still recognisable,
     particularly once you stick the 'E' back on the front.  Piping the output
     of that through "wc -l", I get:

	147

     which means the entire string table could be squeezed into 8 * 147 or
     1176 bytes with only one pointer and one count required for each segment
     of the table (you generate two segments).

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Howells Sept. 18, 2013, 11:55 a.m. UTC | #2
David Howells <dhowells@redhat.com> wrote:

>  (1) Why are you double-NUL'ing all your strings?  (see the \0 in the strings)

Ah...  I see what you're doing.  I missed the fact that you don't have a comma
after each string.

>  (3) You are storing a pointer to the symbolic name for each error.  On a
>      64-bit machine, that's 8 bytes.  If you drop the leading 'E' and the
>      trailing NUL, most symbols will fit into an 8 character slot saving you
>      the cost of a pointer.
>      ...

In which case, you can ignore this too.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Santos Sept. 18, 2013, 10:27 p.m. UTC | #3
On 09/18/2013 06:55 AM, David Howells wrote:
> David Howells <dhowells@redhat.com> wrote:
>
>>   (1) Why are you double-NUL'ing all your strings?  (see the \0 in the strings)
> Ah...  I see what you're doing.  I missed the fact that you don't have a comma
> after each string.

Yeah, I was trying to format the code so that it's efficient and still 
easy to read, but then there's no comma where we normally expect one.  
It's the best middle ground solution I could come up with.

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Santos Sept. 18, 2013, 10:43 p.m. UTC | #4
On 09/18/2013 06:38 AM, David Howells wrote:
> danielfsantos@att.net wrote:
>
>> This is a simple bash script that parses our errno*.h files and formats
>> them into the error_strings.h header that our strerror and strerror_name
>> functions will use later.
> I presume you haven't tried building with a "make O=foo" build directory?  I
> see:
>
>        /bin/sh: /data/fs/linux-2.6-fscache/include/generated/error_strings.h: No such file or directory
>
> when I try it.

No, indeed I haven't, thanks. I'm going to need help figuring out the 
correct way to put this in the build because the current definitely 
isn't correct.  From what I could tell from digging into the build last 
night (which I've never carefully analyzed), this should be added 
somewhere in one of the scripts/Makefile*s and not in the root makefile 
(as I have done), or maybe even in lib/Makefile.

>
>   (2) Storing the leading 'E' for each symbol is redundant as you can add that
>       later so you might want to discard it.

This is a good thought, that's 150-ish bytes, but it will introduce some 
new challenges.  Currently, strerror() functions exactly like "man 3p 
strerror", except that it returns a pointer to a const string like the 
POSIX specification should have done. :)  So I directly return a pointer 
to the string within the .rodata section of the object file (same for 
strerror_name).  Omitting the 'E' would require I work up another 
solution requiring a return buffer or some such, thus increasing 
complexity and possibly loosing that savings to code.

However, if we wanted to squeze that much more out of it, then we could 
7-th bit terminate the strings and save another 150-ish bytes on null 
terminators or go even further and use some encoding scheme (maybe 32 
bits per character)?  At 2k for both the error names and the code, I 
figured it was already pretty low, but if there are some existing 
libraries in the kernel we could use to do this and not further bloat 
the text size for decompression then I wouldn't be opposed, so long as 
it makes sense.  And since we're dealing with error conditions which 
don't happen often, speed shouldn't be a concern.

Thanks!
Daniel

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Santos Sept. 19, 2013, 10:35 p.m. UTC | #5
On 09/18/2013 06:38 AM, David Howells wrote:
> danielfsantos@att.net wrote:
>
>> This is a simple bash script that parses our errno*.h files and formats
>> them into the error_strings.h header that our strerror and strerror_name
>> functions will use later.
> I presume you haven't tried building with a "make O=foo" build directory?  I
> see:
>
>        /bin/sh: /data/fs/linux-2.6-fscache/include/generated/error_strings.h: No such file or directory
>
> when I try it.

Hmm, I cannot reproduce the error. :( I'm using next-20130919 currently 
(x86_64), and if I try to just "make O=lib" it fails w/o my patches.  
The only file that should depend upon error_strings.h is lib/string.c.

/home/daniel/proj/kernel/git
(daniel@loudmouth)$ make mrproper -j4
/home/daniel/proj/kernel/git
(daniel@loudmouth)$ make defconfig -j4
   HOSTCC  scripts/basic/fixdep
   HOSTCC  scripts/kconfig/conf.o
   SHIPPED scripts/kconfig/zconf.tab.c
   SHIPPED scripts/kconfig/zconf.lex.c
   SHIPPED scripts/kconfig/zconf.hash.c
   HOSTCC  scripts/kconfig/zconf.tab.o
   HOSTLD  scripts/kconfig/conf
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#
/home/daniel/proj/kernel/git
(daniel@loudmouth)$ make O=lib -j4
   HOSTCC  scripts/basic/fixdep
   HOSTCC  scripts/kconfig/conf.o
   HOSTCC  scripts/kconfig/zconf.tab.o
   HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf --silentoldconfig Kconfig
***
*** Configuration file ".config" not found!
***
*** Please run some configurator (e.g. "make oldconfig" or
*** "make menuconfig" or "make xconfig").
***
make[3]: *** [silentoldconfig] Error 1
make[2]: *** [silentoldconfig] Error 2
make[2]: Nothing to be done for `all'.
make[1]: *** No rule to make target `include/config/auto.conf', needed 
by `include/config/kernel.release'.  Stop.
make[1]: *** Waiting for unfinished jobs....
make: *** [sub-make] Error 2

Is this some other subtle bug currently in -next or are you just 
supposed to run "make prepare" first? I injected the generation of 
error_names.h as a dependency of prepare1 (rightly or wrongly).  I'm 
still studying the kbuild process to try to find a better place for it 
or to at least clean up the way it's generated.

Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Santos Sept. 19, 2013, 10:53 p.m. UTC | #6
On 09/19/2013 05:35 PM, Daniel Santos wrote:
>
> Hmm, I cannot reproduce the error. :( I'm using next-20130919 
> currently (x86_64), and if I try to just "make O=lib" it fails w/o my 
> patches.  The only file that should depend upon error_strings.h is 
> lib/string.c.

Ahh! I've never seen the "make O=foo" before and just presumed it was an 
alternate directive to specify a directory of the source tree to build.  
I'm certainly glad to learn that the kernel supports out of tree builds! :)

So I am getting the error now and this is because the script expects to 
run in the root of the source tree. This will be an easy fix and I think 
I've figured out the correct place to put the make target for it as well.

Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/scripts/mkstrerror.sh b/scripts/mkstrerror.sh
new file mode 100755
index 0000000..e7842fc
--- /dev/null
+++ b/scripts/mkstrerror.sh
@@ -0,0 +1,209 @@ 
+#!/bin/bash
+
+# Generate lib/error_names.h by parsing errno*.h files
+
+typeset -i first=0
+typeset -i last=0
+typeset -i count=0
+
+die() {
+	echo "ERROR: $*" >&2
+	exit 1
+}
+
+parse_errors() {
+	egrep -h '^#define\s+\w+\s+[0-9]+' "$@" | # Extract error definitions
+	case "${ARCH}" in			  # Apply per-arch fixups
+		alpha)	egrep -v 'EAGAIN\s+11';;
+		mips)	egrep -v 'EDQUOT\s+1133';;
+		*)	cat;;
+	esac |
+	perl -pe '
+		# Replace missing comments with "/* \0 */"
+		s:(\d+)\s*$:$1 /\* \\0 \*/\n:g;
+
+		# Parse into pipe-delimited table
+		s:^#define\s+(\w+)\s+(\w+)\s+/\*\s+(.*?)\s+\*/:$1|$2|$3:g;
+
+		# Since we will be feeding this to printf later, double any
+		# occurrence of %
+		s:%:%%:g;
+	' |
+	sort '-t|' -nk2				  # Sort by error number
+}
+
+tab_align() {
+	typeset -i tabs=$1
+	typeset -i next=0
+	typeset -i i
+	typeset -i len
+	local s
+	shift
+
+	while (($#)); do
+		for (( i = 0; i < next; ++i)); do
+			printf "\t"
+		done
+
+		printf "$1"
+
+		# Get un-escaped string size
+		s=$(printf "$1")
+		(( next = tabs - (${#s} ) / 8 ))
+		shift
+	done
+}
+
+print_errors() {
+	typeset -i next_err=$1
+	typeset -i tabs=$2
+	typeset -i errnum
+	local errname
+	while read; do
+		errnum=${REPLY/*|/}
+		errname=${REPLY/|*/}
+
+		(( next_err <= errnum )) || die "errors out of order :("
+
+		# Fill in any gaps with empty names
+		while (( next_err < errnum )); do
+			printf "\t%s\n" "$(tab_align ${tabs} '"\\0"' "/* ${next_err} */\n")"
+			(( ++next_err ))
+		done
+
+		printf "\t%s\n" "$(tab_align ${tabs} "\"${errname}"'\\0"' "/* ${errnum} */\n")"
+		(( ++next_err ))
+	done
+
+}
+
+count_and_validate() {
+	local names="$1"
+	typeset -i expected_count
+
+	first=$(echo "${names}" | head -1 | awk '{print $3}')
+	last=$(echo "${names}" | tail -1 | awk '{print $3}')
+	count=$(echo "${names}" | wc -l)
+	expected_count=$((last - first + 1))
+
+	if (( count != expected_count )); then
+		echo "ERROR: count = ${count}, expected ${expected_count}"
+		return 1;
+	fi
+	return 0;
+}
+
+find_arch_errno() {
+	for d in $(find arch/${ARCH}/include -name errno.h); do
+		# If it just includes asm-generic/errno.h then skip it
+		if ! grep '#include <asm-generic/errno.h>' $d > /dev/null; then
+			arch_errno="$d"
+			return;
+		fi
+	done
+
+	# Otherwise, no arch-specific errno
+	arch_errno=include/uapi/asm-generic/errno.h
+}
+
+find_arch_errno
+
+base_err_table=$(
+	parse_errors include/uapi/asm-generic/errno-base.h ${arch_errno}
+) || die
+
+internal_err_table=$(
+	parse_errors include/linux/errno.h
+) || die
+
+base_err_names=$(
+	echo "${base_err_table}" |
+	perl -pe 's:(\d+)\|.*:$1:g;'|
+	print_errors 0 4
+) || die
+
+count_and_validate "${base_err_names}" || die
+typeset -i base_err_first=${first}
+typeset -i base_err_last=${last}
+typeset -i base_err_count=${count}
+
+int_err_names=$(
+	echo "${internal_err_table}" |
+	perl -pe 's:(\d+)\|.*:$1:g;'|
+	print_errors 512 4
+) || die
+
+count_and_validate "${int_err_names}" || die
+typeset -i int_err_first=${first}
+typeset -i int_err_last=${last}
+typeset -i int_err_count=${count}
+
+
+cat << asdf
+/* DO NOT EDIT!
+ *
+ * This file is automatically generated by scripts/mkstrerror.sh
+ */
+
+#ifndef _KERNEL_STRERROR_H
+#define _KERNEL_STRERROR_H
+
+#if defined(CONFIG_STRERROR) || defined(CONFIG_STRERROR_NAME)
+
+static const struct error_strings {
+	const unsigned first;
+	const unsigned last;
+	const unsigned count;
+	const char * const desc[2];
+} error_strings[2] = {
+	{
+		.first = ${base_err_first},
+		.last  = ${base_err_last},
+		.count = ${base_err_count},
+		.desc  = {
+#ifdef CONFIG_STRERROR_NAME
+${base_err_names},
+#else
+	NULL,
+#endif /* CONFIG_STRERROR_NAME */
+
+#ifdef CONFIG_STRERROR
+$(
+	echo "${base_err_table}" |
+	perl -pe 's:.*?(\d+)\|(.*):$2|$1:g;'|
+	print_errors 0 7
+),
+#else
+	NULL,
+#endif /* CONFIG_STRERROR */
+		},
+	}, {
+		.first = ${int_err_first},
+		.last  = ${int_err_last},
+		.count = ${int_err_count},
+		.desc  = {
+#ifdef CONFIG_STRERROR_NAME
+${int_err_names},
+#else
+	NULL,
+#endif /* CONFIG_STRERROR_NAME */
+
+
+#ifdef CONFIG_STRERROR
+$(
+	echo "${internal_err_table}" |
+	perl -pe 's:.*?(\d+)\|(.*):$2|$1:g;'|
+	print_errors 512 7
+),
+#else
+	NULL,
+#endif /* CONFIG_STRERROR */
+		},
+	}
+};
+
+#endif /* defined(CONFIG_STRERROR) || defined(CONFIG_STRERROR_NAME) */
+
+#endif /* _KERNEL_STRERROR_H */
+asdf
+