diff mbox

getopts doesn't properly update OPTIND when called from function

Message ID 5567990E.3090902@gigawatt.nl (mailing list archive)
State Rejected
Delegated to: Herbert Xu
Headers show

Commit Message

Harald van Dijk May 28, 2015, 10:39 p.m. UTC
On 28/05/2015 20:54, Martijn Dekker wrote:
> I'm writing a shell function that extends the functionality of the
> 'getopts' builtin. For that to work, it is necessary to call the
> 'getopts' builtin from the shell function.
>
> The POSIX standard specifies that OPTIND and OPTARG are global
> variables, even though the positional parameters are local to the
> function.[*] This makes it possible to call 'getopts' from a function by
> simply passing the global positional parameters along by adding "$@".
>
> My problem is that dash does not properly update the global OPTIND
> variable when getopts is called from a function, which defeats my
> function on dash. It updates the global OPTIND for the first option but
> not for subsequent options, so OPTIND gets stuck on 3. (It does
> accurately update the global OPTARG variable, though.)

That isn't the problem, not exactly anyway. The problem is that getopts 
is required to keep internal state separately from the OPTIND variable 
(a single integer is insufficient to track the progress when multiple 
options are combined in a single word), and that internal state is 
stored along with the positional parameters. The positional parameters 
are saved just before a function call, and restored when the function 
returns. The internal state of getopts should not be saved the same way. 
It should probably just be global to dash.

A quick patch to make sure it is global, and isn't reset when it 
shouldn't or doesn't need to be, is attached. You can experiment with 
it, if you like. Your script runs as expected with this patch. However, 
I should warn that I have done far too little investigation to actually 
submit this patch for inclusion at this point. I'll do more extensive 
checking, and testing, later. If someone else can check whether the 
patch is okay, and if not, in what cases it fails, that would be very 
welcome too.

Note that any changes could break existing scripts, that (incorrectly) 
rely on dash's current behaviour of implicitly resetting the state, and 
don't bother setting OPTIND to explicitly reset it when they want to 
parse a new set of arguments.

Cheers,
Harald van Dijk

Comments

Herbert Xu May 29, 2015, 2:58 a.m. UTC | #1
Harald van Dijk <harald@gigawatt.nl> wrote:
> That isn't the problem, not exactly anyway. The problem is that getopts 
> is required to keep internal state separately from the OPTIND variable 
> (a single integer is insufficient to track the progress when multiple 
> options are combined in a single word), and that internal state is 
> stored along with the positional parameters. The positional parameters 
> are saved just before a function call, and restored when the function 
> returns. The internal state of getopts should not be saved the same way. 
> It should probably just be global to dash.

I think the current behaviour is fine as far as POSIX is concerned.
It says:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html

: APPLICATION USAGE

...

: Note that shell functions share OPTIND with the calling shell
: even though the positional parameters are changed. If the calling
: shell and any of its functions uses getopts to parse arguments,
: the results are unspecified.

Cheers,
Harald van Dijk May 29, 2015, 5:50 a.m. UTC | #2
On 29/05/2015 04:58, Herbert Xu wrote:
> Harald van Dijk <harald@gigawatt.nl> wrote:
>> That isn't the problem, not exactly anyway. The problem is that getopts
>> is required to keep internal state separately from the OPTIND variable
>> (a single integer is insufficient to track the progress when multiple
>> options are combined in a single word), and that internal state is
>> stored along with the positional parameters. The positional parameters
>> are saved just before a function call, and restored when the function
>> returns. The internal state of getopts should not be saved the same way.
>> It should probably just be global to dash.
>
> I think the current behaviour is fine as far as POSIX is concerned.
> It says:
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html
>
> : APPLICATION USAGE
>
> ...
>
> : Note that shell functions share OPTIND with the calling shell
> : even though the positional parameters are changed. If the calling
> : shell and any of its functions uses getopts to parse arguments,
> : the results are unspecified.

The Application usage sections are informative and aren't worded as 
precisely as the other sections. If a script uses getopts at the global 
level, and it calls a shell function that too uses getopts, then it is 
very easy to be covered by

 > Any other attempt to invoke getopts multiple times in a single shell 
execution environment with parameters (positional parameters or arg 
operands) that are not the same in all invocations, or with an OPTIND 
value modified to be a value other than 1, produces unspecified results.

But the test script in this thread does invoke getopts with parameters 
that are the same in all invocations, and without modifying OPTIND. I 
don't see anything else in the normative sections that would make the 
result undefined or unspecified either. I do think the script is valid, 
and the results in dash should match those of other shells.

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Herbert Xu June 1, 2015, 6:29 a.m. UTC | #3
On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:
> 
> But the test script in this thread does invoke getopts with
> parameters that are the same in all invocations, and without
> modifying OPTIND. I don't see anything else in the normative
> sections that would make the result undefined or unspecified either.
> I do think the script is valid, and the results in dash should match
> those of other shells.

The bash behaviour makes it impossible to call shell functions
that invoke getopts while in the middle of an getopts loop.

IMHO the dash behaviour makes a lot more sense since a function
always brings with it a new set of parameters.  That plus the
fact that this behaviour has been there since day one makes me
reluctant to change it since the POSIX wording is not all that
clear.

Cheers,
Harald van Dijk June 1, 2015, 5:30 p.m. UTC | #4
On 01/06/2015 08:29, Herbert Xu wrote:
> On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:
>>
>> But the test script in this thread does invoke getopts with
>> parameters that are the same in all invocations, and without
>> modifying OPTIND. I don't see anything else in the normative
>> sections that would make the result undefined or unspecified either.
>> I do think the script is valid, and the results in dash should match
>> those of other shells.
>
> The bash behaviour makes it impossible to call shell functions
> that invoke getopts while in the middle of an getopts loop.
 >
> IMHO the dash behaviour makes a lot more sense since a function
> always brings with it a new set of parameters.  That plus the
> fact that this behaviour has been there since day one makes me
> reluctant to change it since the POSIX wording is not all that
> clear.

True. Given that almost no shell supports that anyway, there can't be 
too many scripts that rely on it, but I did warn about the risk of 
breaking another type of existing scripts as well, I agree that's a real 
concern.

One thing that doesn't really make sense, though: if the getopts 
internal state is local to a function, then OPTIND and OPTARG really 
should be too. Because they aren't, nested getopts loops already don't 
really work in a useful way in dash, because the inner loop overwrites 
the OPTIND and OPTARG variables. While OPTARG will typically be checked 
right at the start of the loop, before any inner loop executes, OPTIND 
is typically used at the end of the loop, in combination with the shift 
command. The current behaviour makes the OPTIND value in that case 
unreliable.

So either way, I think something should change. But if you prefer to get 
clarification first about the intended meaning of the POSIX wording, 
that certainly seems reasonable to me.

> Cheers,
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jilles Tjoelker June 1, 2015, 10:10 p.m. UTC | #5
On Mon, Jun 01, 2015 at 07:30:46PM +0200, Harald van Dijk wrote:
> On 01/06/2015 08:29, Herbert Xu wrote:
> > On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:

> >> But the test script in this thread does invoke getopts with
> >> parameters that are the same in all invocations, and without
> >> modifying OPTIND. I don't see anything else in the normative
> >> sections that would make the result undefined or unspecified either.
> >> I do think the script is valid, and the results in dash should match
> >> those of other shells.

> > The bash behaviour makes it impossible to call shell functions
> > that invoke getopts while in the middle of an getopts loop.

> > IMHO the dash behaviour makes a lot more sense since a function
> > always brings with it a new set of parameters.  That plus the
> > fact that this behaviour has been there since day one makes me
> > reluctant to change it since the POSIX wording is not all that
> > clear.

> True. Given that almost no shell supports that anyway, there can't be 
> too many scripts that rely on it, but I did warn about the risk of 
> breaking another type of existing scripts as well, I agree that's a real 
> concern.

FreeBSD sh inherits similar code from ash and likewise has per-function
getopts state. Various shell scripts in the FreeBSD base system use
getopts in functions without setting OPTIND=1.

> One thing that doesn't really make sense, though: if the getopts 
> internal state is local to a function, then OPTIND and OPTARG really 
> should be too. Because they aren't, nested getopts loops already don't 
> really work in a useful way in dash, because the inner loop overwrites 
> the OPTIND and OPTARG variables. While OPTARG will typically be checked 
> right at the start of the loop, before any inner loop executes, OPTIND 
> is typically used at the end of the loop, in combination with the shift 
> command. The current behaviour makes the OPTIND value in that case 
> unreliable.

First, note that the OPTARG and OPTIND shell variables are not an input
to getopts, except for an assignment OPTIND=1 (restoring an OPTIND local
at function return does not reset getopts), and that getopts writes
OPTIND no matter whether getopts's internal optind changed in this
invocation.

With that, the value of OPTIND generally used in scripts is not
unreliable. OPTIND is generally only checked after getopts returned
false (end of options), in the sequence
  while getopts ...; do
    ...
  done
  shift "$((OPTIND - 1))"

> So either way, I think something should change. But if you prefer to get 
> clarification first about the intended meaning of the POSIX wording, 
> that certainly seems reasonable to me.

I think the POSIX wording is clear enough, but it may not be desirable
to change getopts to work that way.
Harald van Dijk June 2, 2015, 12:21 a.m. UTC | #6
On 02/06/2015 00:10, Jilles Tjoelker wrote:
> On Mon, Jun 01, 2015 at 07:30:46PM +0200, Harald van Dijk wrote:
>> On 01/06/2015 08:29, Herbert Xu wrote:
>>> On Fri, May 29, 2015 at 07:50:09AM +0200, Harald van Dijk wrote:
>
>>>> But the test script in this thread does invoke getopts with
>>>> parameters that are the same in all invocations, and without
>>>> modifying OPTIND. I don't see anything else in the normative
>>>> sections that would make the result undefined or unspecified either.
>>>> I do think the script is valid, and the results in dash should match
>>>> those of other shells.
>
>>> The bash behaviour makes it impossible to call shell functions
>>> that invoke getopts while in the middle of an getopts loop.
>
>>> IMHO the dash behaviour makes a lot more sense since a function
>>> always brings with it a new set of parameters.  That plus the
>>> fact that this behaviour has been there since day one makes me
>>> reluctant to change it since the POSIX wording is not all that
>>> clear.
>
>> True. Given that almost no shell supports that anyway, there can't be
>> too many scripts that rely on it, but I did warn about the risk of
>> breaking another type of existing scripts as well, I agree that's a real
>> concern.
>
> FreeBSD sh inherits similar code from ash and likewise has per-function
> getopts state. Various shell scripts in the FreeBSD base system use
> getopts in functions without setting OPTIND=1.

Yikes. That's an unfortunate effect of writing scripts that only get run 
on a single shell: things like that don't even show up as a possible 
problem. It's similar to how many bashisms sneak into supposedly 
portable shell scripts.

>> One thing that doesn't really make sense, though: if the getopts
>> internal state is local to a function, then OPTIND and OPTARG really
>> should be too. Because they aren't, nested getopts loops already don't
>> really work in a useful way in dash, because the inner loop overwrites
>> the OPTIND and OPTARG variables. While OPTARG will typically be checked
>> right at the start of the loop, before any inner loop executes, OPTIND
>> is typically used at the end of the loop, in combination with the shift
>> command. The current behaviour makes the OPTIND value in that case
>> unreliable.
>
> First, note that the OPTARG and OPTIND shell variables are not an input
> to getopts, except for an assignment OPTIND=1 (restoring an OPTIND local
> at function return does not reset getopts), and that getopts writes
> OPTIND no matter whether getopts's internal optind changed in this
> invocation.
>
> With that, the value of OPTIND generally used in scripts is not
> unreliable. OPTIND is generally only checked after getopts returned
> false (end of options), in the sequence
>    while getopts ...; do
>      ...
>    done
>    shift "$((OPTIND - 1))"

Ah, you're right, I missed that there will usually be another execution 
of getopts before OPTIND is used. Thanks for clearing that up. In that 
case, I agree, the situations in which the values of OPTIND and OPTARG 
are unreliable are only situations in which scripts usually don't bother 
checking their values.

>> So either way, I think something should change. But if you prefer to get
>> clarification first about the intended meaning of the POSIX wording,
>> that certainly seems reasonable to me.
>
> I think the POSIX wording is clear enough, but it may not be desirable
> to change getopts to work that way.

It was Herbert Xu who felt the POSIX wording was unclear, and he is the 
dash maintainer, so his opinion on whether the wording is clear is the 
one that matters.

If it is clear or clarified what POSIX requires, and that POSIX allows 
the current implementation, then I see no need either to change the dash 
behaviour. It could still be useful to make OPTIND and OPTARG local, but 
you've convinced at least me that it's only a minor problem.

If it is clear or clarified what POSIX requires, and that POSIX 
disallows the current implementation, and the current implementation is 
deemed too desirable to drop, then it might make sense to support both 
alternatives, with an option at configure time to switch between them. 
As far as I know, dash does still aim to conform to POSIX, so even if a 
conscious decision is made to deviate from POSIX by default, I think an 
option to conform to it would be nice for those who care about it. I 
would be happy to create a patch, if this approach would be more agreeable.

Cheers,
Harald van Dijk
--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martijn Dekker June 4, 2015, 7:56 p.m. UTC | #7
Harald van Dijk schreef op 29-05-15 om 00:39:
> A quick patch to make sure it is global, and isn't reset when it
> shouldn't or doesn't need to be, is attached. You can experiment with
> it, if you like.

I've been using dash with this patch since you posted it, and it works
like a charm (including my function that extends getopts'
functionality). No issues encountered. Thanks.

Further discussion in this thread shows that the patch may conflict with
existing usage of 'getopts' for parsing the options within a function (a
usage that would make the script quite shell-specific, by the way,
because it would rely on Almquist-specific behaviour).

The issue, as I understand it, is that 'getopts' keeps not just the
OPTIND variable but also an additional invisible internal variable to
maintain its state. This is necessary to keep track of combined short
options.[*]

There appear to be two possible use cases for calling 'getopts' within a
function:

1. The option parsing loop is in the function, parsing the function's
options. This requires a function-local internal state of 'getopts',
otherwise calling a function using getopts from a main getopts loop
couldn't possibly work, because there is no way to directly save or
restore the unnamed internal state variable of getopts.

2. The option parsing loop is in the main shell environment, but instead
of calling getopts directly, the option parsing loop calls a function,
passing on the main positional parameters, and that function then calls
'getopts' and does additional things (in my case, re-parse GNU-style
--long options in terms of a special short option '--' with argument;
but of course it could be anything). This requires a global internal
'getopts' state.

Use case 1 requires a global internal 'getopts' state and use case 2
requires a local one, so they are mutually incompatible.

But I'm thinking that perhaps there is a way for the shell to
distinguish between these two use cases so that they can be reconciled.

The standard says that OPTIND is a global variable in any case, so use
case 1 above could only work if, before starting the function's option
parsing loop, OPTIND is either explicitly declared a function-local
variable using the non-standard 'local' keyword or is reinitialized
using an assignment.

On the other hand, use case 2 could only work if OPTIND is completely
left alone by the function, allowing a 'getopts' with a global state to
do its thing without interference.

So I would suggest the following might reconcile both use cases: By
default, make the 'getopts' internal state global. However, whenever
OPTIND is either assigned a value within a function or declared local
within a function, automatically make the 'getopts' internal state local
to the function.

Comments?

- M.

[*] Just as a datapoint, I found that yash has a different strategy for
this: it stores both values in OPTIND, separated by a semicolon -- e.g.
an $OPTIND of 3:2 means getopts is at the second option in the third
argument.

--
To unsubscribe from this list: send the line "unsubscribe dash" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/src/eval.c b/src/eval.c
index 071fb1b..59e7506 100644
--- a/src/eval.c
+++ b/src/eval.c
@@ -953,8 +953,6 @@  evalfun(struct funcnode *func, int argc, char **argv, int flags)
 	INTON;
 	shellparam.nparam = argc - 1;
 	shellparam.p = argv + 1;
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
 	pushlocalvars();
 	evaltree(func->n.ndefun.body, flags & EV_TESTED);
 	poplocalvars(0);
diff --git a/src/options.c b/src/options.c
index 6f381e6..5b24eeb 100644
--- a/src/options.c
+++ b/src/options.c
@@ -163,8 +163,8 @@  setarg0:
 	}
 
 	shellparam.p = xargv;
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
+	shoptind = 1;
+	shoptoff = -1;
 	/* assert(shellparam.malloc == 0 && shellparam.nparam == 0); */
 	while (*xargv) {
 		shellparam.nparam++;
@@ -316,8 +316,6 @@  setparam(char **argv)
 	shellparam.malloc = 1;
 	shellparam.nparam = nparam;
 	shellparam.p = newparam;
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
 }
 
 
@@ -362,8 +360,6 @@  shiftcmd(int argc, char **argv)
 	}
 	ap2 = shellparam.p;
 	while ((*ap2++ = *ap1++) != NULL);
-	shellparam.optind = 1;
-	shellparam.optoff = -1;
 	INTON;
 	return 0;
 }
@@ -394,8 +390,8 @@  void
 getoptsreset(value)
 	const char *value;
 {
-	shellparam.optind = number(value) ?: 1;
-	shellparam.optoff = -1;
+	shoptind = number(value) ?: 1;
+	shoptoff = -1;
 }
 
 /*
@@ -412,20 +408,10 @@  getoptscmd(int argc, char **argv)
 
 	if (argc < 3)
 		sh_error("Usage: getopts optstring var [arg]");
-	else if (argc == 3) {
+	else if (argc == 3)
 		optbase = shellparam.p;
-		if ((unsigned)shellparam.optind > shellparam.nparam + 1) {
-			shellparam.optind = 1;
-			shellparam.optoff = -1;
-		}
-	}
-	else {
+	else
 		optbase = &argv[3];
-		if ((unsigned)shellparam.optind > argc - 2) {
-			shellparam.optind = 1;
-			shellparam.optoff = -1;
-		}
-	}
 
 	return getopts(argv[1], argv[2], optbase);
 }
@@ -438,10 +424,10 @@  getopts(char *optstr, char *optvar, char **optfirst)
 	int done = 0;
 	char s[2];
 	char **optnext;
-	int ind = shellparam.optind;
-	int off = shellparam.optoff;
+	int ind = shoptind;
+	int off = shoptoff;
 
-	shellparam.optind = -1;
+	shoptind = -1;
 	optnext = optfirst + ind - 1;
 
 	if (ind <= 1 || off < 0 || strlen(optnext[-1]) < off)
@@ -509,8 +495,8 @@  out:
 	s[1] = '\0';
 	setvar(optvar, s, 0);
 
-	shellparam.optoff = p ? p - *(optnext - 1) : -1;
-	shellparam.optind = ind;
+	shoptoff = p ? p - *(optnext - 1) : -1;
+	shoptind = ind;
 
 	return done;
 }
diff --git a/src/options.h b/src/options.h
index 975fe33..8295eb9 100644
--- a/src/options.h
+++ b/src/options.h
@@ -38,9 +38,9 @@  struct shparam {
 	int nparam;		/* # of positional parameters (without $0) */
 	unsigned char malloc;	/* if parameter list dynamically allocated */
 	char **p;		/* parameter list */
-	int optind;		/* next parameter to be processed by getopts */
-	int optoff;		/* used by getopts */
 };
+int shoptind;		/* next parameter to be processed by getopts */
+int shoptoff;		/* used by getopts */