Commit Graph

634 Commits

Author SHA1 Message Date
Jilles Tjoelker
4c4164f9a7 sh: Reindent evaltree(). 2010-10-31 12:08:16 +00:00
Jilles Tjoelker
dca867f1c9 sh: Use iteration instead of recursion to evaluate semicolon lists.
This reduces CPU and memory usage when executing long lists (such
as long functions).
2010-10-31 12:06:02 +00:00
Jilles Tjoelker
274110df0a sh: Tweak some string constants to reduce code size.
* Reduce some needless differences.
* Shorten some error messages that should not happen.
2010-10-29 21:44:43 +00:00
Jilles Tjoelker
a1251487f4 sh: Reject function names ending in one of !%*+-=?@}~
These do something else in ksh: name=(...) is an array or compound variable
assignment and the others are extended patterns.

This is the last patch of the ones tested in the exp run.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-29 21:20:56 +00:00
Jilles Tjoelker
e20776d503 sh: Detect various additional errors in the parser.
Apart from detecting breakage earlier or at all, this also fixes a segfault
in the testsuite. The "handling" of the breakage left an invalid internal
representation in some cases.

Examples:
  echo a; do echo b
  echo `) echo a`
  echo `date; do do do`

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-29 21:06:57 +00:00
Jilles Tjoelker
33582ce055 sh: Error out on various specials/keywords in the wrong place in backticks.
Example:
  echo `date)`

Exp-run done by:	pav (with some other sh(1) changes)
Obtained from:		NetBSD (Christos Zoulas, NetBSD PR 11317)
2010-10-29 20:23:41 +00:00
Jilles Tjoelker
60f7eec450 sh: Fix some issues with CTL* bytes and ${var#pat}.
subevalvar() incorrectly assumed that CTLESC bytes were present iff the
expansion was quoted. However, they are present iff various processing such
as word splitting is to be done later on.

Example:
  v=@$e@$e@$e@
  y="${v##*"$e"}"
  echo "$y"
failed if $e contained the magic CTLESC byte.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-29 19:34:57 +00:00
Jilles Tjoelker
048f26671a sh: Do IFS splitting on word in ${v+word} and ${v-word}.
The code is inspired by NetBSD sh somewhat, but different because we
preserve the old Almquist/Bourne/Korn ability to have an unquoted part in a
quoted ${v+word}. For example, "${v-"*"}" expands to $v as a single field if
v is set, but generates filenames otherwise.

Note that this is the only place where we split text literally from the
script (the similar ${v=word} assigns to v and then expands $v). The parser
must now add additional markers to allow the expansion code to know whether
arbitrary characters in substitutions are quoted.

Example:
  for i in ${$+a b c}; do echo $i; done

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-29 13:42:18 +00:00
Jilles Tjoelker
6c38071288 sh: Only accept a '}' inside ${v+-=?...} if double-quote state matches.
If double-quote state does not match, treat the '}' literally.

This ensures double-quote state remains the same before and after a
${v+-=?...} which helps with expand.c.

It makes things like
  ${foo+"\${bar}"}
which I have seen in the wild work as expected.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-28 22:34:49 +00:00
Jilles Tjoelker
9cec947f3f sh: Make double-quotes quote a '}' inside ${v#...} and ${v%...}.
Exp-run done by:	pav (with some other sh(1) changes)
PR:			bin/57554
2010-10-28 21:51:14 +00:00
Jilles Tjoelker
d94c867339 sh: Ignore double-quotes in arithmetic rather than treating them as quotes.
This provides similar behaviour, but allows a simpler parser.

This changes r206473.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-24 22:25:38 +00:00
Jilles Tjoelker
67e109adbe sh: Do not allow overriding a special builtin with a function.
This is a syntax error.

POSIX does not say explicitly whether defining a function with the same name
as a special builtin is allowed, but it does say that it is impossible to
call such a function.

A special builtin can still be overridden with an alias.

This commit is part of a set of changes that will ensure that when
something looks like a special builtin to the parser, it is one. (Not the
other way around, as it remains possible to call a special builtin named
by a variable or other substitution.)

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-24 22:03:21 +00:00
Jilles Tjoelker
074e83b14e sh: Make sure defined functions can actually be called.
Add some conservative checks on function names:
- Disallow expansions or quoting characters; these can only be called via
  strange control characters
- Disallow '/'; these functions cannot be called anyway, as exec.c assumes
  they are pathnames
- Make the CTL* bytes work properly in function names.

These are syntax errors.

POSIX does not require us to support more than names (letters, digits and
underscores, not starting with a digit), but I do not want to restrict it
that much at this time.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-24 20:45:13 +00:00
Jilles Tjoelker
3dec7d0c15 sh: Check whether dup2 was successful for >&FD and <&FD.
A failure (usually caused by FD not being open) is a redirection error.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-24 20:09:49 +00:00
Jilles Tjoelker
ba08f69b5c sh: Change ! within a pipeline to start a new pipeline instead.
This is how ksh93 treats ! within a pipeline and makes the ! in
  a | ! b | c
negate the exit status of the pipeline, as if it were
  a | { ! b | c; }

Side effect: something like
  f() ! a
is now a syntax error, because a function definition takes a command,
not a pipeline.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-24 17:06:49 +00:00
Jilles Tjoelker
f1ec058177 sh(1): Clarify subshells/processes for pipelines.
For multi-command pipelines,
1. all commands are direct children of the shell (unlike the original
   Bourne shell)
2. all commands are executed in a subshell (unlike the real Korn shell)

MFC after:	1 week
2010-10-16 14:37:56 +00:00
Jilles Tjoelker
9fa5f4a093 sh: Use <stddef.h> rather than <sys/stddef.h>.
<sys/stddef.h> is only for the kernel and conflicts with <stddef.h>.
2010-10-16 12:40:00 +00:00
David E. O'Brien
56d47fb9b6 We only need to look as far as '..' to find 'test/'. 2010-10-13 23:31:17 +00:00
David E. O'Brien
7cfe69417c Do not assume in growstackstr() that a "precious" character will be
immediately written into the stack after the call.  Instead let the caller
manage the "space left".

Previously, growstackstr()'s assumption causes problems with STACKSTRNUL()
where we want to be able to turn a stack into a C string, and later
pretend the NUL is not there.

This fixes a bug in STACKSTRNUL() (that grew the stack) where:
1. STADJUST() called after a STACKSTRNUL() results in an improper adjust.
   This can be seen in ${var%pattern} and ${var%%pattern} evaluation.
2. Memory leak in STPUTC() called after a STACKSTRNUL().

Reviewed by:	jilles
2010-10-13 23:29:09 +00:00
David E. O'Brien
8832864298 In the spirit of r90111, depend on c89 and remove the "STATIC" macro
and its usage.
2010-10-13 22:18:03 +00:00
David E. O'Brien
f10d20060e If one wishes to set breakpoints of static the functions here, they
cannot be inlined.

Submitted by:	jhb
2010-10-13 18:23:43 +00:00
John Baldwin
8ab2e97063 Make DEBUG traces 64-bit clean:
- Use %t to print ptrdiff_t values.
- Cast a ptrdiff_t value explicitly to int for a field width specifier.

While here, sort includes.

Submitted by:	Garrett Cooper
2010-10-13 13:22:11 +00:00
John Baldwin
f12f3dbeee Suggest that DEBUG_FLAGS be used to enable extra debugging rather than
frobbing CFLAGS directly.  DEBUG_FLAGS is something that can be specified
on the make command line without having to edit the Makefile directly.

Submitted by:	Garrett Cooper
2010-10-13 13:17:38 +00:00
David E. O'Brien
aa7b6f8259 Consistently use "STATIC" for all functions in order to be able to set
breakpoints with in a debugger.  And use naked "static" for variables.

Noticed by:	bde
2010-10-13 04:01:01 +00:00
David E. O'Brien
9d22cd9be8 If DEBUG is 3 or greater, disable STATICization of functions.
Also correct the documented location of the trace file.
2010-10-12 19:24:41 +00:00
David E. O'Brien
f3bf9b7a16 Allow one to regression test 'sh' changes without having to install
a potentially bad /bin/sh first.
2010-10-12 18:20:38 +00:00
Jilles Tjoelker
2b11dfee8e sh: Add __dead2 to two functions that do not return.
Apart from helping static analyzers, this also appears to reduce the size of
the binary slightly.
2010-09-12 22:00:31 +00:00
Jilles Tjoelker
8f2dc7de67 sh: Fix exit status if return is used within a loop condition. 2010-09-11 15:07:40 +00:00
Jilles Tjoelker
011d162dd3 sh: Apply variable assignments left-to-right in bltinlookup().
Example:
  HOME=foo HOME=bar cd
2010-09-11 14:15:50 +00:00
Jilles Tjoelker
1217a4ead5 sh(1): Remove xrefs for expr(1) and getopt(1).
expr(1) should usually not be used as various forms of parameter expansion
and arithmetic expansion replicate most of its functionality in an easier
way.

getopt(1) should not be used at all in new code. Instead, getopts(1) or
entirely manual parsing should be used.

MFC after:	1 week
2010-09-10 13:40:31 +00:00
Jilles Tjoelker
917fdfb106 sh: Fix 'read' if all chars before the first IFS char are backslash-escaped.
Backslash-escaped characters did not set the flag for a non-IFS character.

MFC after:	2 weeks
2010-09-08 20:35:43 +00:00
Jilles Tjoelker
2ca3d70fc6 sh: Improve comments in expand.c. 2010-09-05 21:12:48 +00:00
Jilles Tjoelker
27542743cd sh: Get rid of some magic numbers.
MFC after:	1 week
2010-09-04 21:23:46 +00:00
Jilles Tjoelker
fe5d61a4cf sh: Do not use locale for determining if something is a name.
This makes it impossible to use locale-specific characters in variable
names.

Names containing locale-specific characters make scripts only work with the
correct locale setting. Also, they did not even work in many practical cases
because multibyte character sets such as utf-8 are not supported.

This also avoids weirdness if LC_CTYPE is changed in the middle of a script.
2010-09-03 22:13:54 +00:00
Jilles Tjoelker
8fdbdb5d50 sh: Remove remnants of '!!' to negate pattern.
This Almquist extension was disabled long ago.

In pathname generation, components starting with '!!' were treated as
containing wildcards, causing unnecessary readdir (which could fail, causing
pathname generation to fail while it should not).
2010-08-22 21:18:21 +00:00
Jilles Tjoelker
36cf3efe41 sh(1): Add a brief summary of arithmetic expressions. 2010-08-22 13:04:00 +00:00
Jilles Tjoelker
ae7c0700dc sh: Fix break/continue/return sometimes not skipping the rest of dot script.
In our implementation and most others, a break or continue in a dot script
can break or continue a loop outside the dot script. This should cause all
further commands in the dot script to be skipped. However, cmdloop() did not
know about this and continued to parse and execute commands from the dot
script.

As described in the man page, a return in a dot script in a function returns
from the function, not only from the dot script. There was a similar issue
as with break and continue. In various other shells, the return appears to
return from the dot script, but POSIX seems not very clear about this.
2010-08-15 21:06:53 +00:00
Jilles Tjoelker
c5e4fa998d sh: Add a forgotten const. 2010-08-13 20:29:43 +00:00
Jilles Tjoelker
92378cce77 sh: Fix shadowing of sigset. 2010-08-13 13:36:18 +00:00
Jilles Tjoelker
c8a3d81f34 sh: Fix heap-based buffer overflow in pathname generation.
The buffer for generated pathnames could be too small in some cases. It
happened to be always at least PATH_MAX long, so there was never an overflow
if the resulting pathnames would be usable.

This bug may be abused if a script subjects input from an untrusted source
to pathname generation, which a bad idea anyhow. Most shell scripts do not
work on untrusted data. secteam@ says no advisory is necessary.

PR:		bin/148733
Reported by:	Changming Sun snnn119 at gmail com
MFC after:	10 days
2010-08-10 22:45:59 +00:00
Jilles Tjoelker
40969e7396 Remove unnecessary duplicate letters in mksyntax.c,
the table elements would just be overwritten twice.
2010-08-08 21:04:27 +00:00
Jilles Tjoelker
b84d7af7c6 sh: Return 0 from eval if no command was given.
This makes a difference if there is a command substitution.

To make this work, evalstring() has been changed to set exitstatus to 0 if
no command was executed (the string contained only whitespace).

Example:
  eval $(false); echo $?
should print 0.
2010-08-03 22:17:29 +00:00
Jilles Tjoelker
933803fb21 sh: Do not enter consecutive duplicates into the history.
This simply sets a flag in libedit. It has a shortcoming in that it does not
apply to multi-line commands.

Note that a configuration option for this is not going to happen, but always
having this seems better than not having it. NetBSD has done the same.

PR:		bin/54683
Obtained from:	NetBSD
MFC after:	1 month
2010-08-01 16:37:51 +00:00
Jilles Tjoelker
6c0c240366 sh: Fix crash due to uninitialized here-document.
If an ; or & token was followed by an EOF token, pending here-documents were
left uninitialized. Execution would crash, either in the main shell process
for literal here-documents or in a child process for expanded
here-documents. In the latter case the problem is hard to detect apart from
the core dumps and log messages.

Side effect: slightly different retries on inputs where EOF is not
persistent.

Note that tools/regression/bin/sh/parser/heredoc6.0 still causes a similar
crash in a child process. The text passed to eval is malformed and should be
rejected.
2010-07-25 22:25:52 +00:00
Jilles Tjoelker
cac4830ce4 sh: Allow a background command consisting solely of redirections.
Example:
  </dev/null &

MFC after:	2 weeks
2010-07-18 12:45:31 +00:00
Jilles Tjoelker
d5af15eabf sh: There cannot be a TNOT in simplecmd(), remove checks.
simplecmd() only handles simple commands and function definitions, neither
of which involves the ! keyword. The initial token on entry to simplecmd()
is one of the following: TSEMI, TAND, TOR, TNL, TEOF, TWORD, TRP.
2010-07-14 22:31:45 +00:00
Jilles Tjoelker
c9c987cd35 sh: Use $PWD instead of getcwd() for the \w and \W prompt expansions.
This ensures that the logical working directory (which may include
symlinks) is shown and is similar to the default behaviour of the pwd
builtin.
2010-07-02 22:17:13 +00:00
Jilles Tjoelker
ed4c3b5f86 sh: Forget about terminated background processes sooner.
Unless $! has been referenced for a particular job or $! still contains that
job's pid, forget about it after it has terminated. If $! has been
referenced, remember the job until the wait builtin has reported its
completion (either with the pid as parameter or without parameters).

In interactive mode, jobs are forgotten after termination has been reported,
which happens before primary prompts and through the jobs builtin. Even
then, though, remember a job if $! has been referenced.

This is similar to what is suggested by POSIX and should fix most memory
leaks (which also tend to cause sh to use more CPU time) with long running
scripts that start background jobs.

Caveats:
* Repeatedly referencing $! without ever doing 'wait', like
    while :; do foo & echo started foo: $!; sleep 60; done
  will still use a lot of memory and CPU time in the long run.
* The jobs and jobid builtins do not cause a job to be remembered for longer
  like expanding $! does.

PR:		bin/55346
2010-06-29 22:37:45 +00:00
Jilles Tjoelker
49e10f5e38 sh: Fix compilation with -DNO_HISTORY.
The LINENO code uses snprintf() and relied on "myhistedit.h" to pull in the
necessary <stdio.h>.

Compiling with -DNO_HISTORY disables all editing and history support and
allows linking without -ledit -ltermcap. This may be useful for embedded
systems.

MFC after:	2 weeks
2010-06-19 10:33:04 +00:00
Jilles Tjoelker
e46b12b732 sh: Add filename completion.
This uses the new libedit completion function with quoting support.

Unlike NetBSD, there is no 'set +o tabcomplete' option to disable
completion. I do not see any reason for such a special treatment, as
completion is rather useful and it is possible to do
  bind ^I ed-insert
to disable completion and insert a tab character instead.

Submitted by:	Guy Yur
2010-06-15 21:58:40 +00:00