Message ID | 20191004213029.145027-1-sboyd@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | userdiff: Fix some corner cases in dts regex | expand |
Am 04.10.19 um 23:30 schrieb Stephen Boyd: > While reviewing some dts diffs recently I noticed that the hunk header > logic was failing to find the containing node. This is because the regex > doesn't consider properties that may span multiple lines, i.e. > > property = <something>, > <something_else>; What if the property spans more than two lines? property = <something>, more, <something_else>; Can the second line "more," begin with a word, or are the angle brackets mandatory? I understand that the continuation lines can begin with a word when the property is an expression that is distributed over a number of lines. Such continuation lines could be picked up as hunk headers. But I don't want to complicate things: The hunk header patterns do not have to be perfect; it is sufficient when they are helpful in a good majority of cases that occur in practice. > and it got hung up on comments inside nodes that look like the root node > because they start with '/*'. Add tests for these cases and update the > regex to find them. Maybe detecting the root node is too complicated but > forcing it to be a backslash with any amount of whitespace up to an open > bracket seemed OK. I tried to detect that a comment is in-between the > two parts but I wasn't happy so I just dropped it. > > Cc: Rob Herring <robh+dt@kernel.org> > Cc: Frank Rowand <frowand.list@gmail.com> > Signed-off-by: Stephen Boyd <sboyd@kernel.org> > --- > t/t4018/dts-nodes-multiline-prop | 12 ++++++++++++ > t/t4018/dts-root | 2 +- > t/t4018/dts-root-comment | 8 ++++++++ > userdiff.c | 3 ++- > 4 files changed, 23 insertions(+), 2 deletions(-) > create mode 100644 t/t4018/dts-nodes-multiline-prop > create mode 100644 t/t4018/dts-root-comment > > diff --git a/t/t4018/dts-nodes-multiline-prop b/t/t4018/dts-nodes-multiline-prop > new file mode 100644 > index 000000000000..f7b655935429 > --- /dev/null > +++ b/t/t4018/dts-nodes-multiline-prop > @@ -0,0 +1,12 @@ > +/ { > + label_1: node1@ff00 { > + RIGHT@deadf00,4000 { > + multilineprop = <3>, > + <4>; You could insert more lines to demonstrate that "<x>," on a line by itself is not picked up. > + > + > +> + ChangeMe = <0xffeedd00>; Sufficient distance to the incorrect candidates above. Good. > + }; > + }; > +}; > diff --git a/t/t4018/dts-root b/t/t4018/dts-root > index 2ef9e6ffaa2c..4353b8220c91 100644 > --- a/t/t4018/dts-root > +++ b/t/t4018/dts-root > @@ -1,4 +1,4 @@ > -/RIGHT { /* Technically just supposed to be a slash */ > +/ { RIGHT /* Technically just supposed to be a slash and brace */ Do I understand correctly that the updated form, "/ {", is the common way to spell a root node, but "/" or "/word" are not? > #size-cells = <1>; > > ChangeMe = <0xffeedd00>; > diff --git a/t/t4018/dts-root-comment b/t/t4018/dts-root-comment > new file mode 100644 > index 000000000000..333a625c7007 > --- /dev/null > +++ b/t/t4018/dts-root-comment > @@ -0,0 +1,8 @@ > +/ { RIGHT /* Technically just supposed to be a slash and brace */ Devil's advocate here: insert ';' or '=' in the comment, and the line would not be picked up. Does that hurt in practice? > + #size-cells = <1>; > + > + /* This comment should be ignored */ > + > + some-property = <40+2>; > + ChangeMe = <0xffeedd00>; > +}; > diff --git a/userdiff.c b/userdiff.c > index 86e3244e15dd..651b56caec56 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -25,8 +25,9 @@ IPATTERN("ada", > "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"), > PATTERNS("dts", > "!;\n" > + "!.*=.*\n" This behaves the same way as just "!=\n" no? > /* lines beginning with a word optionally preceded by '&' or the root */ > - "^[ \t]*((/|&?[a-zA-Z_]).*)", > + "^[ \t]*((/[ \t]*\\{|&?[a-zA-Z_]).*)", > /* -- */ > /* Property names and math operators */ > "[a-zA-Z0-9,._+?#-]+" > -- Hannes
Quoting Johannes Sixt (2019-10-05 07:09:11) > Am 04.10.19 um 23:30 schrieb Stephen Boyd: > > While reviewing some dts diffs recently I noticed that the hunk header > > logic was failing to find the containing node. This is because the regex > > doesn't consider properties that may span multiple lines, i.e. > > > > property = <something>, > > <something_else>; > > What if the property spans more than two lines? > > property = <something>, > more, > <something_else>; > > Can the second line "more," begin with a word, or are the angle brackets > mandatory? Angle brackets aren't mandatory, but it is very odd to have a property with mixed numbers and strings because parsing becomes difficult. > > I understand that the continuation lines can begin with a word when the > property is an expression that is distributed over a number of lines. > Such continuation lines could be picked up as hunk headers. > > But I don't want to complicate things: The hunk header patterns do not > have to be perfect; it is sufficient when they are helpful in a good > majority of cases that occur in practice. > > > and it got hung up on comments inside nodes that look like the root node > > because they start with '/*'. Add tests for these cases and update the > > regex to find them. Maybe detecting the root node is too complicated but > > forcing it to be a backslash with any amount of whitespace up to an open > > bracket seemed OK. I tried to detect that a comment is in-between the > > two parts but I wasn't happy so I just dropped it. > > > > Cc: Rob Herring <robh+dt@kernel.org> > > Cc: Frank Rowand <frowand.list@gmail.com> > > Signed-off-by: Stephen Boyd <sboyd@kernel.org> > > --- > > t/t4018/dts-nodes-multiline-prop | 12 ++++++++++++ > > t/t4018/dts-root | 2 +- > > t/t4018/dts-root-comment | 8 ++++++++ > > userdiff.c | 3 ++- > > 4 files changed, 23 insertions(+), 2 deletions(-) > > create mode 100644 t/t4018/dts-nodes-multiline-prop > > create mode 100644 t/t4018/dts-root-comment > > > > diff --git a/t/t4018/dts-nodes-multiline-prop b/t/t4018/dts-nodes-multiline-prop > > new file mode 100644 > > index 000000000000..f7b655935429 > > --- /dev/null > > +++ b/t/t4018/dts-nodes-multiline-prop > > @@ -0,0 +1,12 @@ > > +/ { > > + label_1: node1@ff00 { > > + RIGHT@deadf00,4000 { > > + multilineprop = <3>, > > + <4>; > > You could insert more lines to demonstrate that "<x>," on a line by > itself is not picked up. Maybe I should add another test? > > > + > > + > > +> + ChangeMe = <0xffeedd00>; > > Sufficient distance to the incorrect candidates above. Good. > > > + }; > > + }; > > +}; > > diff --git a/t/t4018/dts-root b/t/t4018/dts-root > > index 2ef9e6ffaa2c..4353b8220c91 100644 > > --- a/t/t4018/dts-root > > +++ b/t/t4018/dts-root > > @@ -1,4 +1,4 @@ > > -/RIGHT { /* Technically just supposed to be a slash */ > > +/ { RIGHT /* Technically just supposed to be a slash and brace */ > > Do I understand correctly that the updated form, "/ {", is the common > way to spell a root node, but "/" or "/word" are not? Correct. A root node is '/' and the '{' opens the node. There is the possibility of something like '/delete-node nodename;' or '/delete-property property;', where the latter exists inside some node. The regex would need to avoid all of those. > > > #size-cells = <1>; > > > > ChangeMe = <0xffeedd00>; > > diff --git a/t/t4018/dts-root-comment b/t/t4018/dts-root-comment > > new file mode 100644 > > index 000000000000..333a625c7007 > > --- /dev/null > > +++ b/t/t4018/dts-root-comment > > @@ -0,0 +1,8 @@ > > +/ { RIGHT /* Technically just supposed to be a slash and brace */ > > Devil's advocate here: insert ';' or '=' in the comment, and the line > would not be picked up. Does that hurt in practice? I don't think it hurts in practice so I'd like to ignore it. > > > + #size-cells = <1>; > > + > > + /* This comment should be ignored */ > > + > > + some-property = <40+2>; > > + ChangeMe = <0xffeedd00>; > > +}; > > diff --git a/userdiff.c b/userdiff.c > > index 86e3244e15dd..651b56caec56 100644 > > --- a/userdiff.c > > +++ b/userdiff.c > > @@ -25,8 +25,9 @@ IPATTERN("ada", > > "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"), > > PATTERNS("dts", > > "!;\n" > > + "!.*=.*\n" > > This behaves the same way as just > > "!=\n" > > no? > Not exactly. Properties don't always get assigned. There are boolean properties that can be tested for by the presence of some string with an ending semi-colon, like 'this-is-true;'. If we just check for not equal to a line with a semicolon and newline then we'll see boolean properties. Should I add that as another test?
Am 08.10.19 um 16:43 schrieb Stephen Boyd: > Quoting Johannes Sixt (2019-10-05 07:09:11) >> Am 04.10.19 um 23:30 schrieb Stephen Boyd: >>> --- /dev/null >>> +++ b/t/t4018/dts-nodes-multiline-prop >>> @@ -0,0 +1,12 @@ >>> +/ { >>> + label_1: node1@ff00 { >>> + RIGHT@deadf00,4000 { >>> + multilineprop = <3>, >>> + <4>; >> >> You could insert more lines to demonstrate that "<x>," on a line by >> itself is not picked up. > > Maybe I should add another test? This is is the _multi_line test case, right? ;) Just add one or two lines between the <3> and the <4> that look like common real-world cases to show that those lines won't be picked up. I don't think that another test file is required. >>> +/ { RIGHT /* Technically just supposed to be a slash and brace */ >> >> Devil's advocate here: insert ';' or '=' in the comment, and the line >> would not be picked up. Does that hurt in practice? > > I don't think it hurts in practice so I'd like to ignore it. Sure, no problem. >>> PATTERNS("dts", >>> "!;\n" >>> + "!.*=.*\n" >> >> This behaves the same way as just >> >> "!=\n" >> >> no? >> > > Not exactly. Properties don't always get assigned. I was just refering to the added line, not the combination of the two lines. But while you are speaking of it: > There are boolean > properties that can be tested for by the presence of some string with an > ending semi-colon, like 'this-is-true;'. If we just check for not equal > to a line with a semicolon and newline then we'll see boolean > properties. Should I add that as another test? I agree that a test case with a Boolean property would be great. -- Hannes
Quoting Johannes Sixt (2019-10-12 05:54:00) > Am 08.10.19 um 16:43 schrieb Stephen Boyd: > > Quoting Johannes Sixt (2019-10-05 07:09:11) > >> Am 04.10.19 um 23:30 schrieb Stephen Boyd: > >>> --- /dev/null > >>> +++ b/t/t4018/dts-nodes-multiline-prop > >>> @@ -0,0 +1,12 @@ > >>> +/ { > >>> + label_1: node1@ff00 { > >>> + RIGHT@deadf00,4000 { > >>> + multilineprop = <3>, > >>> + <4>; > >> > >> You could insert more lines to demonstrate that "<x>," on a line by > >> itself is not picked up. > > > > Maybe I should add another test? > > This is is the _multi_line test case, right? ;) Just add one or two > lines between the <3> and the <4> that look like common real-world cases > to show that those lines won't be picked up. I don't think that another > test file is required. Ok got it! > > >>> +/ { RIGHT /* Technically just supposed to be a slash and brace */ > >> > >> Devil's advocate here: insert ';' or '=' in the comment, and the line > >> would not be picked up. Does that hurt in practice? > > > > I don't think it hurts in practice so I'd like to ignore it. > > Sure, no problem. > > >>> PATTERNS("dts", > >>> "!;\n" > >>> + "!.*=.*\n" > >> > >> This behaves the same way as just > >> > >> "!=\n" > >> > >> no? > >> > > > > Not exactly. Properties don't always get assigned. > > I was just refering to the added line, not the combination of the two lines. Ah ok. I'll reduce the line as you suggest then. Thanks. > > But while you are speaking of it: > > > There are boolean > > properties that can be tested for by the presence of some string with an > > ending semi-colon, like 'this-is-true;'. If we just check for not equal > > to a line with a semicolon and newline then we'll see boolean > > properties. Should I add that as another test? > > I agree that a test case with a Boolean property would be great. > Alright I'll work on that and resend.
diff --git a/t/t4018/dts-nodes-multiline-prop b/t/t4018/dts-nodes-multiline-prop new file mode 100644 index 000000000000..f7b655935429 --- /dev/null +++ b/t/t4018/dts-nodes-multiline-prop @@ -0,0 +1,12 @@ +/ { + label_1: node1@ff00 { + RIGHT@deadf00,4000 { + multilineprop = <3>, + <4>; + + + + ChangeMe = <0xffeedd00>; + }; + }; +}; diff --git a/t/t4018/dts-root b/t/t4018/dts-root index 2ef9e6ffaa2c..4353b8220c91 100644 --- a/t/t4018/dts-root +++ b/t/t4018/dts-root @@ -1,4 +1,4 @@ -/RIGHT { /* Technically just supposed to be a slash */ +/ { RIGHT /* Technically just supposed to be a slash and brace */ #size-cells = <1>; ChangeMe = <0xffeedd00>; diff --git a/t/t4018/dts-root-comment b/t/t4018/dts-root-comment new file mode 100644 index 000000000000..333a625c7007 --- /dev/null +++ b/t/t4018/dts-root-comment @@ -0,0 +1,8 @@ +/ { RIGHT /* Technically just supposed to be a slash and brace */ + #size-cells = <1>; + + /* This comment should be ignored */ + + some-property = <40+2>; + ChangeMe = <0xffeedd00>; +}; diff --git a/userdiff.c b/userdiff.c index 86e3244e15dd..651b56caec56 100644 --- a/userdiff.c +++ b/userdiff.c @@ -25,8 +25,9 @@ IPATTERN("ada", "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"), PATTERNS("dts", "!;\n" + "!.*=.*\n" /* lines beginning with a word optionally preceded by '&' or the root */ - "^[ \t]*((/|&?[a-zA-Z_]).*)", + "^[ \t]*((/[ \t]*\\{|&?[a-zA-Z_]).*)", /* -- */ /* Property names and math operators */ "[a-zA-Z0-9,._+?#-]+"
While reviewing some dts diffs recently I noticed that the hunk header logic was failing to find the containing node. This is because the regex doesn't consider properties that may span multiple lines, i.e. property = <something>, <something_else>; and it got hung up on comments inside nodes that look like the root node because they start with '/*'. Add tests for these cases and update the regex to find them. Maybe detecting the root node is too complicated but forcing it to be a backslash with any amount of whitespace up to an open bracket seemed OK. I tried to detect that a comment is in-between the two parts but I wasn't happy so I just dropped it. Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> --- t/t4018/dts-nodes-multiline-prop | 12 ++++++++++++ t/t4018/dts-root | 2 +- t/t4018/dts-root-comment | 8 ++++++++ userdiff.c | 3 ++- 4 files changed, 23 insertions(+), 2 deletions(-) create mode 100644 t/t4018/dts-nodes-multiline-prop create mode 100644 t/t4018/dts-root-comment