allow show_token() on TOKEN_ZERO_IDENT

Message ID 20220607125441.36757-1-lucvoo@kernel.org (mailing list archive)
State Mainlined, archived

Commit Message

Luc Van Oostenryck June 7, 2022, 12:54 p.m. UTC
From: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>

TOKEN_ZERO_IDENTs are created during the evaluation of pre-processor
expressions, but they are otherwise normal idents and were first tokenized
as TOKEN_IDENTs.

As such, they could perfectly well be displayed by show_token(), but are not.
So, in error messages they are displayed as "unhandled token type '4'",
which is not at all informative.

Fix this by letting show_token() process them like usual TOKEN_IDENTs.
Do the same for quote_token().

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 tokenize.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Linus Torvalds June 7, 2022, 6:26 p.m. UTC | #1
On Tue, Jun 7, 2022 at 5:55 AM Luc Van Oostenryck <lucvoo@kernel.org> wrote:
>
> TOKEN_ZERO_IDENTs are created during the evaluation of pre-processor
> expressions but which otherwise are normal idents and  were first tokenized
> as TOKEN_IDENTs.
>
> As such, they could perfectly be displayed by show_token() but are not.
> So, in error messages they are displayed as "unhandled token type '4'",
> which is not at all informative.
>
> Fix this by letting show_token() process them like usual TOKEN_IDENTs.
> Idem for quote_token().

Ack.

I do wonder if it should be marked somehow as being that special case.
The main reason for 'show_token()' is debugging, after all, and
TOKEN_ZERO_IDENT does have magical properties in how it either
silently expands to the constant '0', or it generates a warning about
undefined preprocessor symbol.

But considering that we've apparently reported it as "unhandled token
type '4'" since 2005, I guess it's not exactly a big deal.

           Linus
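
Illustrative note (not part of the original message): the "silently
expands to the constant '0'" behaviour described above can be observed
with any conforming C preprocessor. The sketch below is a minimal
demonstration, not sparse code; SKETCH_NEVER_DEFINED, SKETCH_BRANCH and
sketch_branch_taken() are hypothetical names chosen for the example.
GCC reports the undefined symbol when given -Wundef; whether and how a
given tool warns is assumed to depend on its options.

```c
#include <assert.h>	/* only needed for the usage check in the note below */

/* SKETCH_NEVER_DEFINED is a hypothetical name, deliberately left undefined. */
#if SKETCH_NEVER_DEFINED
#define SKETCH_BRANCH 1
#else
#define SKETCH_BRANCH 0
#endif

/*
 * Returns 0: the undefined identifier was evaluated as 0 in the #if
 * above, so the #else branch was selected.
 */
int sketch_branch_taken(void)
{
	return SKETCH_BRANCH;
}
```

A caller could check this with assert(sketch_branch_taken() == 0).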

Luc Van Oostenryck June 7, 2022, 8:48 p.m. UTC | #2
On Tue, Jun 07, 2022 at 11:26:36AM -0700, Linus Torvalds wrote:
> On Tue, Jun 7, 2022 at 5:55 AM Luc Van Oostenryck <lucvoo@kernel.org> wrote:
> >
> > TOKEN_ZERO_IDENTs are created during the evaluation of pre-processor
> > expressions but which otherwise are normal idents and  were first tokenized
> > as TOKEN_IDENTs.
> >
> > As such, they could perfectly be displayed by show_token() but are not.
> > So, in error messages they are displayed as "unhandled token type '4'",
> > which is not at all informative.
> >
> > Fix this by letting show_token() process them like usual TOKEN_IDENTs.
> > Idem for quote_token().
> 
> Ack.
> 
> I do wonder if it should be marked somehow as being that special case.
> The main reason for 'show_token()' is debugging, after all, and
> TOKEN_ZERO_IDENT does have magical properties in how it either
> silently expands to the constant '0', or it generates a warning about
> undefined preprocessor symbol.
> 
> But considering that we've apparently reported it as "unhandled token
> type '4'" since 2005, I guess it's not exactly a big deal.

Yes, I first thought of doing so, but then chose not to because I could not
convince myself that its special property was relevant in warning/error
messages. It looks to me more like an internal thing, more semantic than
lexical, and a non-faithful representation would be confusing in messages.

For context, the input text I had (from GCC's testsuite) was:
	#define empty
	#if empty#cpu(m68k)
	#endif
and the error message sparse issued was:
	error: garbage at end: #unhandled token type '4' (unhandled token type '4' )
with this patch it's:
	error: garbage at end: #cpu(m68k)
 
-- Luc

Patch

diff --git a/tokenize.c b/tokenize.c
index ea7105438270..fdaea370cc48 100644
--- a/tokenize.c
+++ b/tokenize.c
@@ -201,6 +201,7 @@  const char *show_token(const struct token *token)
 		return "end-of-input";
 
 	case TOKEN_IDENT:
+	case TOKEN_ZERO_IDENT:
 		return show_ident(token->ident);
 
 	case TOKEN_NUMBER:
@@ -259,6 +260,7 @@  const char *quote_token(const struct token *token)
 		return "syntax error";
 
 	case TOKEN_IDENT:
+	case TOKEN_ZERO_IDENT:
 		return show_ident(token->ident);
 
 	case TOKEN_NUMBER:
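
The effect of the two added case labels can be sketched in isolation.
The code below is a simplified stand-in, not sparse's actual
tokenize.c: the sketch_* and SK_* names, the reduced enum (whose values
do not match sparse's), and the struct layout are all assumptions made
for the example. It only shows how stacking a second case label on the
identifier path keeps such tokens out of the "unhandled token type"
fallback.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified token kinds; sparse's real enum differs. */
enum sketch_token_type {
	SK_TOKEN_EOF,
	SK_TOKEN_IDENT,
	SK_TOKEN_ZERO_IDENT,
	SK_TOKEN_OTHER,
};

/* Hypothetical, simplified token; sparse's struct token differs. */
struct sketch_token {
	enum sketch_token_type type;
	const char *name;	/* identifier text, if any */
};

const char *sketch_show_token(const struct sketch_token *token)
{
	static char buffer[64];

	assert(token != NULL);
	switch (token->type) {
	case SK_TOKEN_EOF:
		return "end-of-input";

	case SK_TOKEN_IDENT:
	case SK_TOKEN_ZERO_IDENT:	/* shares the identifier path, as in the patch */
		return token->name;

	default:
		/* Fallback for kinds the switch does not know how to display. */
		snprintf(buffer, sizeof(buffer), "unhandled token type '%d' ",
			 (int)token->type);
		return buffer;
	}
}
```

With the extra label, a token of kind SK_TOKEN_ZERO_IDENT named "cpu"
is shown as "cpu"; without it, the same token would reach the default
branch and be reported by its numeric kind.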