From patchwork Thu Dec 27 06:33:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Yang Zhong X-Patchwork-Id: 10743471 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A917B13BF for ; Thu, 27 Dec 2018 06:37:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 96D89284B9 for ; Thu, 27 Dec 2018 06:37:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8A90228823; Thu, 27 Dec 2018 06:37:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5E548284B9 for ; Thu, 27 Dec 2018 06:37:04 +0000 (UTC) Received: from localhost ([127.0.0.1]:49709 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gcPHz-0004pX-Dc for patchwork-qemu-devel@patchwork.kernel.org; Thu, 27 Dec 2018 01:37:03 -0500 Received: from eggs.gnu.org ([208.118.235.92]:54679) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gcPGM-0002sX-3e for qemu-devel@nongnu.org; Thu, 27 Dec 2018 01:35:27 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gcPG3-00043f-UP for qemu-devel@nongnu.org; Thu, 27 Dec 2018 01:35:14 -0500 Received: from mga11.intel.com ([192.55.52.93]:57851) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gcPG3-00041J-Dy for qemu-devel@nongnu.org; Thu, 27 Dec 2018 01:35:03 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 26 Dec 2018 22:35:02 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,404,1539673200"; d="scan'208";a="103617244" Received: from he.bj.intel.com ([10.238.157.85]) by orsmga006.jf.intel.com with ESMTP; 26 Dec 2018 22:35:00 -0800 From: Yang Zhong To: qemu-devel@nongnu.org Date: Thu, 27 Dec 2018 14:33:57 +0800 Message-Id: <20181227063419.12981-4-yang.zhong@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181227063419.12981-1-yang.zhong@intel.com> References: <20181227063419.12981-1-yang.zhong@intel.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 192.55.52.93 Subject: [Qemu-devel] [RFC PATCH 03/25] minikconfig: add parser skeleton X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: yang.zhong@intel.com, peter.maydell@linaro.org, thuth@redhat.com, sameo@linux.intel.com, pbonzini@redhat.com, ehabkost@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP From: Paolo Bonzini This implements a scanner and recursive descent parser for Kconfig-like configuration files. The only "action" of the parser is for now to detect undefined variables and process include files. The main differences between Kconfig and this are: * only the "bool" type is supported * variables can only be defined once * choices are not supported (but they could be added as syntactic sugar for multiple Boolean values) * menus and other graphical concepts (prompts, help text) are not supported * assignments ("CONFIG_FOO=y", "CONFIG_FOO=n") are parsed as part of the Kconfig language, not as a separate file. The idea was originally by Ákos Kovács, but I could not find his implementation so I had to redo it. Signed-off-by: Paolo Bonzini --- scripts/minikconf.py | 425 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 425 insertions(+) create mode 100644 scripts/minikconf.py diff --git a/scripts/minikconf.py b/scripts/minikconf.py new file mode 100644 index 0000000000..9e33d80514 --- /dev/null +++ b/scripts/minikconf.py @@ -0,0 +1,425 @@ +# +# Mini-Kconfig parser +# +# Copyright (c) 2015 Red Hat Inc. +# +# Authors: +# Paolo Bonzini +# +# This work is licensed under the terms of the GNU GPL, version 2 +# or, at your option, any later version. See the COPYING file in +# the top-level directory. + +import os +import sys + +__all__ = [ 'KconfigParserError', 'KconfigData', 'KconfigParser' ] + +# ------------------------------------------- +# KconfigData implements the Kconfig semantics. For now it can only +# detect undefined symbols, i.e. symbols that were referenced in +# assignments or dependencies but were not declared with "config FOO". +# +# Semantic actions are represented by methods called do_*. The do_var +# method return the semantic value of a variable (which right now is +# just its name). +# ------------------------------------------- + +class KconfigData: + def __init__(self): + self.previously_included = [] + self.incl_info = None + self.defined_vars = set() + self.referenced_vars = set() + + # semantic analysis ------------- + + def check_undefined(self): + undef = False + for i in self.referenced_vars: + if not (i in self.defined_vars): + print "undefined symbol %s" % (i) + undef = True + return undef + + # semantic actions ------------- + + def do_declaration(self, var): + if (var in self.defined_vars): + raise Exception('variable "' + var + '" defined twice') + + self.defined_vars.add(var) + + # var is a string with the variable's name. + # + # For now this just returns the variable's name itself. + def do_var(self, var): + self.referenced_vars.add(var) + return var + + def do_assignment(self, var, val): + pass + + def do_default(self, var, val, cond=None): + pass + + def do_depends_on(self, var, expr): + pass + + def do_select(self, var, symbol, cond=None): + pass + +# ------------------------------------------- +# KconfigParser implements a recursive descent parser for (simplified) +# Kconfig syntax. The static parse() method returns a KconfigData +# object, from which it is possible to build a BDD and compute a +# configuration +# ------------------------------------------- + +# tokens table +TOKENS = {} +TOK_NONE = -1 +TOK_LPAREN = 0; TOKENS[TOK_LPAREN] = '"("'; +TOK_RPAREN = 1; TOKENS[TOK_RPAREN] = '")"'; +TOK_EQUAL = 2; TOKENS[TOK_EQUAL] = '"="'; +TOK_AND = 3; TOKENS[TOK_AND] = '"&&"'; +TOK_OR = 4; TOKENS[TOK_OR] = '"||"'; +TOK_NOT = 5; TOKENS[TOK_NOT] = '"!"'; +TOK_DEPENDS = 6; TOKENS[TOK_DEPENDS] = '"depends"'; +TOK_ON = 7; TOKENS[TOK_ON] = '"on"'; +TOK_SELECT = 8; TOKENS[TOK_SELECT] = '"select"'; +TOK_CONFIG = 9; TOKENS[TOK_CONFIG] = '"config"'; +TOK_DEFAULT = 10; TOKENS[TOK_DEFAULT] = '"default"'; +TOK_Y = 11; TOKENS[TOK_Y] = '"y"'; +TOK_N = 12; TOKENS[TOK_N] = '"n"'; +TOK_SOURCE = 13; TOKENS[TOK_SOURCE] = '"source"'; +TOK_BOOL = 14; TOKENS[TOK_BOOL] = '"bool"'; +TOK_IF = 15; TOKENS[TOK_IF] = '"if"'; +TOK_ID = 16; TOKENS[TOK_ID] = 'identifier'; +TOK_EOF = 17; TOKENS[TOK_EOF] = 'end of file'; + +class KconfigParserError(Exception): + def __init__(self, parser, msg, tok=None): + self.loc = parser.location() + tok = tok or parser.tok + if tok != TOK_NONE: + msg = '%s before %s' %(msg, TOKENS[tok]) + self.msg = msg + + def __str__(self): + return "%s: %s" % (self.loc, self.msg) + +class KconfigParser: + @classmethod + def parse(self, fp): + data = KconfigData() + parser = KconfigParser(data) + parser.parse_file(fp) + if data.check_undefined(): + raise KconfigParserError(parser, "there were undefined symbols") + + return data + + def __init__(self, data): + self.data = data + + def parse_file(self, fp): + self.abs_fname = os.path.abspath(fp.name) + self.fname = fp.name + self.data.previously_included.append(self.abs_fname) + self.src = fp.read() + if self.src == '' or self.src[-1] != '\n': + self.src += '\n' + self.cursor = 0 + self.line = 1 + self.line_pos = 0 + self.get_token() + self.parse_config() + + # file management ----- + + def error_path(self): + inf = self.data.incl_info + res = "" + while inf: + res = ("In file included from %s:%d:\n" % (inf['file'], + inf['line'])) + res + inf = inf['parent'] + return res + + def location(self): + col = 1 + for ch in self.src[self.line_pos:self.pos]: + if ch == '\t': + col += 8 - ((col - 1) % 8) + else: + col += 1 + return '%s%s:%d:%d' %(self.error_path(), self.fname, self.line, col) + + def do_include(self, include): + incl_abs_fname = os.path.join(os.path.dirname(self.abs_fname), + include) + # catch inclusion cycle + inf = self.data.incl_info + while inf: + if incl_abs_fname == os.path.abspath(inf['file']): + raise KconfigParserError(self, "Inclusion loop for %s" + % include) + inf = inf['parent'] + + # skip multiple include of the same file + if incl_abs_fname in self.data.previously_included: + return + try: + fp = open(incl_abs_fname, 'r') + except IOError, e: + raise KconfigParserError(self, + '%s: %s' % (e.strerror, include)) + + inf = self.data.incl_info + self.data.incl_info = { 'file': self.fname, 'line': self.line, + 'parent': inf } + KconfigParser(self.data).parse_file(fp) + self.data.incl_info = inf + + # recursive descent parser ----- + + # y_or_n: Y | N + def parse_y_or_n(self): + if self.tok == TOK_Y: + self.get_token() + return True + if self.tok == TOK_N: + self.get_token() + return False + raise KconfigParserError(self, 'Expected "y" or "n"') + + # var: ID + def parse_var(self): + if self.tok == TOK_ID: + val = self.val + self.get_token() + return self.data.do_var(val) + else: + raise KconfigParserError(self, 'Expected identifier') + + # assignment_var: ID (starting with "CONFIG_") + def parse_assignment_var(self): + if self.tok == TOK_ID: + val = self.val + if not val.startswith("CONFIG_"): + raise KconfigParserError(self, + 'Expected identifier starting with "CONFIG_"', TOK_NONE) + self.get_token() + return self.data.do_var(val[7:]) + else: + raise KconfigParserError(self, 'Expected identifier') + + # assignment: var EQUAL y_or_n + def parse_assignment(self): + var = self.parse_assignment_var() + if self.tok != TOK_EQUAL: + raise KconfigParserError(self, 'Expected "="') + self.get_token() + self.data.do_assignment(var, self.parse_y_or_n()) + + # primary: NOT primary + # | LPAREN expr RPAREN + # | var + def parse_primary(self): + if self.tok == TOK_NOT: + self.get_token() + self.parse_primary() + elif self.tok == TOK_LPAREN: + self.get_token() + self.parse_expr() + if self.tok != TOK_RPAREN: + raise KconfigParserError(self, 'Expected ")"') + self.get_token() + elif self.tok == TOK_ID: + self.parse_var() + else: + raise KconfigParserError(self, 'Expected "!" or "(" or identifier') + + # disj: primary (OR primary)* + def parse_disj(self): + self.parse_primary() + while self.tok == TOK_OR: + self.get_token() + self.parse_primary() + + # expr: disj (AND disj)* + def parse_expr(self): + self.parse_disj() + while self.tok == TOK_AND: + self.get_token() + self.parse_disj() + + # condition: IF expr + # | empty + def parse_condition(self): + if self.tok == TOK_IF: + self.get_token() + return self.parse_expr() + else: + return None + + # property: DEFAULT y_or_n condition + # | DEPENDS ON expr + # | SELECT var condition + # | BOOL + def parse_property(self, var): + if self.tok == TOK_DEFAULT: + self.get_token() + val = self.parse_y_or_n() + cond = self.parse_condition() + self.data.do_default(var, val, cond) + elif self.tok == TOK_DEPENDS: + self.get_token() + if self.tok != TOK_ON: + raise KconfigParserError(self, 'Expected "on"') + self.get_token() + self.data.do_depends_on(var, self.parse_expr()) + elif self.tok == TOK_SELECT: + self.get_token() + symbol = self.parse_var() + cond = self.parse_condition() + self.data.do_select(var, symbol, cond) + elif self.tok == TOK_BOOL: + self.get_token() + else: + raise KconfigParserError(self, 'Error in recursive descent?') + + # properties: properties property + # | /* empty */ + def parse_properties(self, var): + had_default = False + while self.tok == TOK_DEFAULT or self.tok == TOK_DEPENDS or \ + self.tok == TOK_SELECT or self.tok == TOK_BOOL: + self.parse_property(var) + self.data.do_default(var, False) + + # for nicer error message + if self.tok != TOK_SOURCE and self.tok != TOK_CONFIG and \ + self.tok != TOK_ID and self.tok != TOK_EOF: + raise KconfigParserError(self, 'expected "source", "config", identifier, ' + + '"default", "depends on" or "select"') + + # declaration: config var properties + def parse_declaration(self): + if self.tok == TOK_CONFIG: + self.get_token() + var = self.parse_var() + self.data.do_declaration(var) + self.parse_properties(var) + else: + raise KconfigParserError(self, 'Error in recursive descent?') + + # clause: SOURCE + # | declaration + # | assignment + def parse_clause(self): + if self.tok == TOK_SOURCE: + val = self.val + self.get_token() + self.do_include(val) + elif self.tok == TOK_CONFIG: + self.parse_declaration() + elif self.tok == TOK_ID: + self.parse_assignment() + else: + raise KconfigParserError(self, 'expected "source", "config" or identifier') + + # config: clause+ EOF + def parse_config(self): + while self.tok != TOK_EOF: + self.parse_clause() + return self.data + + # scanner ----- + + def get_token(self): + while True: + self.tok = self.src[self.cursor] + self.pos = self.cursor + self.cursor += 1 + + self.val = None + self.tok = self.scan_token() + if self.tok is not None: + return + + def check_keyword(self, rest): + if not self.src.startswith(rest, self.cursor): + return False + length = len(rest) + if self.src[self.cursor + length].isalnum() or self.src[self.cursor + length] == '|': + return False + self.cursor += length + return True + + def scan_token(self): + if self.tok == '#': + self.cursor = self.src.find('\n', self.cursor) + return None + elif self.tok == '=': + return TOK_EQUAL + elif self.tok == '(': + return TOK_LPAREN + elif self.tok == ')': + return TOK_RPAREN + elif self.tok == '&' and self.src[self.pos+1] == '&': + self.cursor += 1 + return TOK_AND + elif self.tok == '|' and self.src[self.pos+1] == '|': + self.cursor += 1 + return TOK_OR + elif self.tok == '!': + return TOK_NOT + elif self.tok == 'd' and self.check_keyword("epends"): + return TOK_DEPENDS + elif self.tok == 'o' and self.check_keyword("n"): + return TOK_ON + elif self.tok == 's' and self.check_keyword("elect"): + return TOK_SELECT + elif self.tok == 'c' and self.check_keyword("onfig"): + return TOK_CONFIG + elif self.tok == 'd' and self.check_keyword("efault"): + return TOK_DEFAULT + elif self.tok == 'b' and self.check_keyword("ool"): + return TOK_BOOL + elif self.tok == 'i' and self.check_keyword("f"): + return TOK_IF + elif self.tok == 'y' and self.check_keyword(""): + return TOK_Y + elif self.tok == 'n' and self.check_keyword(""): + return TOK_N + elif (self.tok == 's' and self.check_keyword("ource")) or \ + self.tok == 'i' and self.check_keyword("nclude"): + # source FILENAME + # include FILENAME + while self.src[self.cursor].isspace(): + self.cursor += 1 + start = self.cursor + self.cursor = self.src.find('\n', self.cursor) + self.val = self.src[start:self.cursor] + return TOK_SOURCE + elif self.tok.isalpha(): + # identifier + while self.src[self.cursor].isalnum() or self.src[self.cursor] == '_': + self.cursor += 1 + self.val = self.src[self.pos:self.cursor] + return TOK_ID + elif self.tok == '\n': + if self.cursor == len(self.src): + return TOK_EOF + self.line += 1 + self.line_pos = self.cursor + elif not self.tok.isspace(): + raise KconfigParserError(self, 'Stray "%s"' % self.tok) + + return None + +if __name__ == '__main__': + fname = len(sys.argv) > 1 and sys.argv[1] or 'Kconfig.test' + KconfigParser.parse(open(fname, 'r'))