diff mbox

Designated initializers for fields in anonymous structs and unions

Message ID CA+55aFwfN1L6U=6+_a-Ue_Z6tPNfphKkpidtdD06zs+tHLPyJQ@mail.gmail.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Linus Torvalds Aug. 2, 2014, 1:16 a.m. UTC
On Thu, Jul 31, 2014 at 7:41 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Hmm. It exposes other problems.
>
> For example, the fact that we look up the identifier now exposes the
> fact that "first_subobject()" creates this dummy "EXPR_IDENTIFIER"
> expression that has no identifier. That will then cause
> "find_identifier()" to fail, which in turn causes convert_ident() to
> fail. Resulting in the parse tree having remaining EXPR_IDENTIFIER
> nodes, which will later result in "unknown expression (25,0)" errors.
>
> That's easy enough to fix with just adding the proper initializer, ie
>
>     new->expr_ident = field->ident;
>
> to first_subobject().

Actually, sadly, while that worked for my test-cases (and for Josh's
case), it turns out to really *not* work in general: the dummy
EXPR_IDENTIFIER that we create may end up referring to a unnamed
union, which doesn't *have* an ident associated with it.

This only really triggers when you mix named initializers with then
relying on the whole "initialize the next one without a name", but
that does happen. And it fundamentally breaks that whole "use
find_identifier()" approach.

So convert_ident() really cannot use find_indent(), and really does
have to use "sym->offset". But that doesn't work for anonymous union
members, because that offset was the offset within the anonymous
union, not the parent structure.

However, the fix is "simple". We can just say that anonymous
union/structure member offsets have to be the offsets within the
parent union instead, and fix things up. So how about a patch like
this *instead* of the previous patch..

                 Linus
evaluate.c |  2 +-
 symbol.c   | 16 ++++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

Comments

Christopher Li Aug. 2, 2014, 5:16 a.m. UTC | #1
On Fri, Aug 1, 2014 at 6:16 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Actually, sadly, while that worked for my test-cases (and for Josh's
> case), it turns out to really *not* work in general: the dummy
> EXPR_IDENTIFIER that we create may end up referring to a unnamed
> union, which doesn't *have* an ident associated with it.
>
> This only really triggers when you mix named initializers with then
> relying on the whole "initialize the next one without a name", but
> that does happen. And it fundamentally breaks that whole "use
> find_identifier()" approach.
>
> So convert_ident() really cannot use find_indent(), and really does
> have to use "sym->offset". But that doesn't work for anonymous union
> members, because that offset was the offset within the anonymous
> union, not the parent structure.
>
> However, the fix is "simple". We can just say that anonymous
> union/structure member offsets have to be the offsets within the
> parent union instead, and fix things up. So how about a patch like
> this *instead* of the previous patch..

Sorry I am late for the party. This approach assume the anonymous
structure is only referenced inside the structure. There is not other
external reference. However, that assumption is not true if "-fplan9-extensions"
or "-fms-extensions" was enabled. Sparse current does not support these
two extensions. But it will be a problem if we support them down the road.

According to this gcc document:

https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html

One of the example is this:
     struct s1 { int a; };
     struct s2 { struct s1; };

You see, "struct s1" does not have a name, however "struct s1" can be
referenced from other symbol. So update the symbol offset in "struct s1"
will be wrong for the other reference of "struct s1".

Actually, the "find_identifier" call inside function "check_designators" already
calculate the correct offset. Only if we can pass that offset to
"convert_ident()".

Another idea is that, instead of using ident in EXPR_IDENTIFIER, we can
do one time find_identifier(). Then convert the EXPR_IDENTIFIER to an index
to the struct. It might take more than one index level for the anonymous
struct or union. BTW, that is very similar to the LLVM  performing the
"Get Element Pointer" instruction. Using the index, we can still reference
the anonymous struct.  This is not a trivial change. However it might be
the right thing to do if we want to support "-fplan9-extensions"

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds Aug. 2, 2014, 6:10 p.m. UTC | #2
On Fri, Aug 1, 2014 at 10:16 PM, Christopher Li <sparse@chrisli.org> wrote:
>
> Sorry I am late for the party. This approach assume the anonymous
> structure is only referenced inside the structure.

It does indeed. I thought that was ok.

> There is not other
> external reference. However, that assumption is not true if "-fplan9-extensions"
> or "-fms-extensions" was enabled. Sparse current does not support these
> two extensions. But it will be a problem if we support them down the road.

.. and in fact I think even without those things, you can just make
the unnamed union have a type name, ie

   struct S {
      union T {
          int a;
      }
   };

the union T has a typename, but is a unnamed member of struct S. We
could use "union T" later.

> Actually, the "find_identifier" call inside function "check_designators" already
> calculate the correct offset. Only if we can pass that offset to
> "convert_ident()".

I actually tried that initially, and wanted to get rid of the whole
"check_designators()" vs "convert_designators()", but couldn't really
find a sane way to make that work. I'll look at it again, maybe I
missed something.

             LInus
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Derek M Jones Aug. 2, 2014, 6:31 p.m. UTC | #3
Linus,

> .. and in fact I think even without those things, you can just make
> the unnamed union have a type name, ie
>
>     struct S {
>        union T {
>            int a;
>        }
>     };
>
> the union T has a typename, but is a unnamed member of struct S. We
> could use "union T" later.

No such luck, not allowed by the C Standard.

6.7.2.1p13
"An unnamed member whose type specifier is a structure specifier with
no tag is called an anonymous structure; an unnamed member whose type
specifier is a union specifier with no tag is called an anonymous union"

But you can do this sort of thing when the type specifier is not anonymous.
Christopher Li Aug. 2, 2014, 6:40 p.m. UTC | #4
On Sat, Aug 2, 2014 at 11:31 AM, Derek M Jones <derek@knosof.co.uk> wrote:
>> .. and in fact I think even without those things, you can just make
>> the unnamed union have a type name, ie
>>
>>     struct S {
>>        union T {
>>            int a;
>>        }
>>     };
>>
>> the union T has a typename, but is a unnamed member of struct S. We
>> could use "union T" later.
>
>
> No such luck, not allowed by the C Standard.
>
> 6.7.2.1p13
> "An unnamed member whose type specifier is a structure specifier with
> no tag is called an anonymous structure; an unnamed member whose type
> specifier is a union specifier with no tag is called an anonymous union"
>
> But you can do this sort of thing when the type specifier is not anonymous.

That is what I recall as well, not part of stander C. If you enable
the "-fplan9-extensions",
Linus' example can pass.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/evaluate.c b/evaluate.c
index 03992d03ae1d..78150f83a19f 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -1898,7 +1898,7 @@  static struct symbol *find_identifier(struct ident *ident, struct symbol_list *_
 				sub = find_identifier(ident, ctype->symbol_list, offset);
 				if (!sub)
 					continue;
-				*offset += sym->offset;
+				*offset = sym->offset;
 				return sub;
 			}	
 		}
diff --git a/symbol.c b/symbol.c
index a2731348a011..32af64050a64 100644
--- a/symbol.c
+++ b/symbol.c
@@ -116,6 +116,18 @@  static int bitfield_base_size(struct symbol *sym)
 	return sym->bit_size;
 }
 
+static void update_symbol_offsets(struct symbol *sym, unsigned int offset)
+{
+	struct symbol *field;
+
+	if (sym->type != SYM_STRUCT && sym->type != SYM_UNION)
+		return;
+	FOR_EACH_PTR(sym->symbol_list, field) {
+		field->offset += offset;
+		update_symbol_offsets(field, offset);
+	} END_FOR_EACH_PTR(field);
+}
+
 /*
  * Structures are a bit more interesting to lay out
  */
@@ -174,6 +186,10 @@  static void lay_out_struct(struct symbol *sym, struct struct_union_info *info)
 	bit_size = (bit_size + align_bit_mask) & ~align_bit_mask;
 	sym->offset = bits_to_bytes(bit_size);
 
+	/* If it's an unnamed struct or union, update the offsets of the sub-members */
+	if (sym->offset && !sym->ident)
+		update_symbol_offsets(sym, sym->offset);
+
 	info->bit_size = bit_size + base_size;
 	// warning (sym->pos, "regular: offset=%d", sym->offset);
 }