Commit Graph

107 Commits

Author SHA1 Message Date
jilles
8883e8ad3e sh: Eliminate some gotos. 2014-10-05 21:51:36 +00:00
jilles
9af633c667 sh: Correctly handle positional parameters beyond INT_MAX on 64-bit systems.
Currently, there can be no more than INT_MAX positional parameters. Make
sure to treat all higher ones as unset to avoid incorrect results and
crashes.

On 64-bit systems, our atoi() takes the low 32 bits of the strtol() and
sign-extends them.

On 32-bit systems, the call to atoi() returned INT_MAX for too high values
and there is not enough address space for so many positional parameters, so
there was no issue.
2014-07-12 21:54:11 +00:00
jilles
16670f2045 sh: Consistently treat ${01} like $1.
Leading zeroes were ignored when checking whether a positional parameter is
set, but not when expanding its value. Ignore leading zeroes in any case.
2014-07-12 10:27:30 +00:00
jilles
270892ce0a sh: Fix possible memory leaks and double frees with unexpected SIGINT. 2014-03-26 20:43:40 +00:00
jilles
eeed81169f sh: Add some consts. 2014-03-14 21:45:37 +00:00
jilles
c2f01aba00 sh: Make argstr() return where it stopped and simplify expari() using this. 2014-03-04 22:30:38 +00:00
jilles
d941f4e61c sh: Simplify expari().
Redo expari() like evalvar(). This makes the logic more understandable and
avoids possible problems if arithmetic expansion occurs if CTLESC characters
are not generated (looking backwards for CTLARI is not generally possible in
that case but the old code tried anyway).

This adds an extra argstr() recursion.
2014-03-02 22:59:34 +00:00
jilles
6891107e84 sh: Do not corrupt internal representation if LINENO inner expansion fails.
Example:
  f() { : ${LINENO+$((1/0))}; }
and call this function twice.
2014-02-27 16:54:43 +00:00
jilles
cd2fb30b68 sh: Make expari() static. 2014-02-26 21:38:42 +00:00
jilles
1d244d8c45 sh: Prefer memcpy() to strcpy() in most cases. Remove the scopy macro. 2013-11-30 21:27:11 +00:00
jilles
3deb97fc0b sh: Fix various compiler warnings.
It now passes WARNS=7 with clang on i386.

GCC 4.2.1 does not understand setjmp() properly so will always trigger
-Wuninitialized. I will not add the volatile keywords to suppress this.
2013-04-01 17:18:22 +00:00
jilles
6fa139d56e sh: Expand here documents in the current process.
Expand here documents at the same point other redirections are expanded but
use a non-fork subshell environment (like simple command substitutions) for
compatibility. Substitition errors result in an empty here document like
before.

As a result, a fork is avoided for short (<4K) expanded here documents.

Unexpanded here documents (with quoted end marker after <<) are not affected
by this change. They already only forked when >4K.

Side effects:
* Order of expansion is slightly different.
* Slow expansions are not executed in parallel with the redirected command.
* A non-fork subshell environment is subtly different from a forked process.
2013-02-03 15:54:57 +00:00
jilles
8152f4c192 sh: Make various functions static. 2012-01-01 22:17:12 +00:00
jilles
109579dd42 sh: Make patmatch() non-recursive. 2012-01-01 20:50:19 +00:00
jilles
0121cc11e8 sh: Use dirent.d_type in pathname generation.
This improves performance for globs where a slash or another component
follows a component with metacharacters by eliminating unnecessary attempts
to open directories that are not.
2011-12-28 23:40:46 +00:00
jilles
f922478439 sh: Cache de->d_namlen in a local variable. 2011-12-28 23:30:17 +00:00
jilles
84b55be725 sh: Add support for named character classes in bracket expressions.
Example:
  case x in [[:alpha:]]) echo yes ;; esac
2011-06-15 21:48:10 +00:00
jilles
16addb8673 sh: Fix duplicate prototypes for builtins.
Have mkbuiltins write the prototypes for the *cmd functions to builtins.h
instead of builtins.c and include builtins.h in more .c files instead of
duplicating prototypes for *cmd functions in other headers.
2011-06-13 21:03:27 +00:00
jilles
91789615b4 sh: Save/restore changed variables in optimized command substitution.
In optimized command substitution, save and restore any variables changed by
expansions (${var=value} and $((var=assigned))), instead of trying to
determine if an expansion may cause such changes.

If $! is referenced in optimized command substitution, do not cause jobs to
be remembered longer.

This fixes $(jobs $!) again, simplifies the man page and shortens the code.
2011-06-12 23:06:04 +00:00
jilles
59713c6c29 sh: Fix locale-dependent ranges in bracket expressions.
When I added UTF-8 support in r221646, the LC_COLLATE-based ordering broke
because of sign extension of char.

Because of libc restrictions, this does not work for UTF-8. For UTF-8
locales, ranges always use character code order.
2011-06-12 12:54:52 +00:00
jilles
bd55770c94 sh: Do parameter expansion before printing PS4 (set -x).
The function name expandstr() and the general idea of doing this kind of
expansion by treating the text as a here document without end marker is from
dash.

All variants of parameter expansion and arithmetic expansion also work (the
latter is not required by POSIX but it does not take extra code and many
other shells also allow it).

Command substitution is prevented because I think it causes too much code to
be re-entered (for example creating an unbounded recursion of trace lines).

Unfortunately, our LINENO is somewhat crude, otherwise PS4='$LINENO+ ' would
be quite useful.
2011-06-09 23:12:23 +00:00
jilles
543f63b8dc sh: Fix unquoted $@/$* if IFS=''.
If IFS is null, unquoted $@/$* should still expand to separate words.
This differs from quoted $@ (which does not depend on IFS) in that pathname
generation is performed and empty words are removed.
2011-05-27 15:56:13 +00:00
jilles
8ac39aa5be sh: Add UTF-8 support to pattern matching.
?, [...] patterns match codepoints instead of bytes. They do not match
invalid sequences. [...] patterns must not contain invalid sequences
otherwise they will not match anything. This is so that ${var#?} removes the
first codepoint, not the first byte, without putting UTF-8 knowledge into
the ${var#pattern} code. However, * continues to match any string and an
invalid sequence matches an identical invalid sequence. (This differs from
fnmatch(3).)
2011-05-08 11:32:20 +00:00
jilles
8bbce85526 sh: Add UTF-8 support to ${#var}.
If the current locale uses UTF-8, ${#var} counts codepoints (more precisely,
bytes b with (b & 0xc0) != 0x80).
2011-05-07 14:32:16 +00:00
brucec
6d9b42b486 Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
jilles
e252925aeb sh: Remove comment mentioning herefd, which is gone. 2011-02-02 21:48:53 +00:00
jilles
74d9b02bb0 sh: Don't do optimized command substitution if expansions have side effects.
Before considering to execute a command substitution in the same process,
check if any of the expansions may have a side effect; if so, execute it in
a new process just like happens if it is not a single simple command.

Although the check happens at run time, it is a static check that does not
depend on current state. It is triggered by:
- expanding $! (which may cause the job to be remembered)
- ${var=value} default value assignment
- assignment operators in arithmetic
- parameter substitutions in arithmetic except ${#param}, $$, $# and $?
- command substitutions in arithmetic

This means that $((v+1)) does not prevent optimized command substitution,
whereas $(($v+1)) does, because $v might expand to something containing
assignment operators.

Scripts should not depend on these exact details for correctness. It is also
imaginable to have the shell fork if and when a side effect is encountered
or to create a new temporary namespace for variables.

Due to the $! change, the construct $(jobs $!) no longer works. The value of
$! should be stored in a variable outside command substitution first.
2010-12-28 21:27:08 +00:00
jilles
de73f385a5 sh: Allow arbitrary large numbers in CHECKSTRSPACE.
Reduce "stack string" API somewhat and simplify code.
Add a check for integer overflow of the "stack string" length (probably
incomplete).
2010-12-26 13:25:47 +00:00
uqs
bd917baec5 Remove dead code.
c is assigned 0 and *loc is pointing to NULL, so c!=0 cannot be true,
and dereferencing loc would be a bad idea anyway.

Coverity Prevent:	CID 5113
Reviewed by:		jilles
2010-12-18 22:16:15 +00:00
jilles
da5b058d1d sh: Fix corruption of command substitutions with special chars after newline
The CTLESC byte to protect a special character was output before instead of
after a newline directly preceding the special character.

The special handling of newlines is because command substitutions discard
all trailing newlines.
2010-12-16 23:28:20 +00:00
jilles
9daf74d4c8 sh: Remove the herefd hack.
The herefd hack wrote out partial here documents while expanding them. It
seems unnecessary complication given that other expansions just allocate
memory. It causes bugs because the stack is also used for intermediate
results such as arithmetic expressions. Such places should disable herefd
for the duration but not all of them do, and I prefer removing the need for
disabling herefd to disabling it everywhere needed.

Here documents larger than 1024 bytes will use a bit more CPU time and
memory.

Additionally this allows a later change to expand here documents in the
current shell environment. (This is faster for small here documents but also
changes behaviour.)

Obtained from:	dash
2010-12-12 00:07:27 +00:00
jilles
9f0c118349 sh: Replace some macros and repeated code in expand.c with functions.
No functional change is intended, but the binary is about 1K smaller on
i386.
2010-12-11 22:13:29 +00:00
jilles
7377de8f91 sh: Code size optimizations to "stack string" memory allocation:
* Prefer one CHECKSTRSPACE with multiple USTPUTC to multiple STPUTC.
* Add STPUTS macro (based on function) and use it instead of loops that add
  nul-terminated strings to the stack string.

No functional change is intended, but code size is about 1K less on i386.
2010-11-23 22:17:39 +00:00
jilles
6915411ab2 sh: Code size optimizations to buffered output.
This is mainly less use of the outc macro.

No functional change is intended, but code size is about 2K less on i386.
2010-11-20 14:14:52 +00:00
jilles
aaa3347e35 sh: Fix some issues with CTL* bytes and ${var#pat}.
subevalvar() incorrectly assumed that CTLESC bytes were present iff the
expansion was quoted. However, they are present iff various processing such
as word splitting is to be done later on.

Example:
  v=@$e@$e@$e@
  y="${v##*"$e"}"
  echo "$y"
failed if $e contained the magic CTLESC byte.

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-29 19:34:57 +00:00
jilles
28ad180ab4 sh: Do IFS splitting on word in ${v+word} and ${v-word}.
The code is inspired by NetBSD sh somewhat, but different because we
preserve the old Almquist/Bourne/Korn ability to have an unquoted part in a
quoted ${v+word}. For example, "${v-"*"}" expands to $v as a single field if
v is set, but generates filenames otherwise.

Note that this is the only place where we split text literally from the
script (the similar ${v=word} assigns to v and then expands $v). The parser
must now add additional markers to allow the expansion code to know whether
arbitrary characters in substitutions are quoted.

Example:
  for i in ${$+a b c}; do echo $i; done

Exp-run done by:	pav (with some other sh(1) changes)
2010-10-29 13:42:18 +00:00
obrien
08b8d916b5 In the spirit of r90111, depend on c89 and remove the "STATIC" macro
and its usage.
2010-10-13 22:18:03 +00:00
jhb
ec1e5e05cc Make DEBUG traces 64-bit clean:
- Use %t to print ptrdiff_t values.
- Cast a ptrdiff_t value explicitly to int for a field width specifier.

While here, sort includes.

Submitted by:	Garrett Cooper
2010-10-13 13:22:11 +00:00
obrien
f31ad1c86b Consistently use "STATIC" for all functions in order to be able to set
breakpoints with in a debugger.  And use naked "static" for variables.

Noticed by:	bde
2010-10-13 04:01:01 +00:00
jilles
c86bc993b1 sh: Improve comments in expand.c. 2010-09-05 21:12:48 +00:00
jilles
44306000bd sh: Remove remnants of '!!' to negate pattern.
This Almquist extension was disabled long ago.

In pathname generation, components starting with '!!' were treated as
containing wildcards, causing unnecessary readdir (which could fail, causing
pathname generation to fail while it should not).
2010-08-22 21:18:21 +00:00
jilles
8824c5ab76 sh: Fix heap-based buffer overflow in pathname generation.
The buffer for generated pathnames could be too small in some cases. It
happened to be always at least PATH_MAX long, so there was never an overflow
if the resulting pathnames would be usable.

This bug may be abused if a script subjects input from an untrusted source
to pathname generation, which a bad idea anyhow. Most shell scripts do not
work on untrusted data. secteam@ says no advisory is necessary.

PR:		bin/148733
Reported by:	Changming Sun snnn119 at gmail com
MFC after:	10 days
2010-08-10 22:45:59 +00:00
jilles
8fcbe1caf8 sh: Forget about terminated background processes sooner.
Unless $! has been referenced for a particular job or $! still contains that
job's pid, forget about it after it has terminated. If $! has been
referenced, remember the job until the wait builtin has reported its
completion (either with the pid as parameter or without parameters).

In interactive mode, jobs are forgotten after termination has been reported,
which happens before primary prompts and through the jobs builtin. Even
then, though, remember a job if $! has been referenced.

This is similar to what is suggested by POSIX and should fix most memory
leaks (which also tend to cause sh to use more CPU time) with long running
scripts that start background jobs.

Caveats:
* Repeatedly referencing $! without ever doing 'wait', like
    while :; do foo & echo started foo: $!; sleep 60; done
  will still use a lot of memory and CPU time in the long run.
* The jobs and jobid builtins do not cause a job to be remembered for longer
  like expanding $! does.

PR:		bin/55346
2010-06-29 22:37:45 +00:00
jilles
c49fe4933f sh: Fix pathname expansion with quoted slashes like *\/.
These are git commits 36f0fa8fcbc8c7b2b194addd29100fb40e73e4e9 and
d6d06ff5c2ea0fa44becc5ef4340e5f2f15073e4 in dash.

Because this is the first code I'm importing from dash to expand.c, add the
Herbert Xu copyright notice which is in dash's expand.c.

When pathname expanding *\/, the CTLESC representing the quoted state was
erroneously taken as part of the * pathname component. This CTLESC was then
seen by the pattern matching code as escaping the '\0' terminating the
string.

The code is slightly different because dash converts the CTLESC characters
to backslashes and removes all the other CTL* characters to allow
substituting glob(3).

The effect of the bug was also slightly different from dash (where nothing
matched at all). Because a CTLESC can escape a '\0' in some way, whether
files were included despite the bug depended on memory that should not be
read. In particular, on many machines /*\/ expanded to a strict subset of
what /*/ expanded to.

Example:
  echo /*"/null"

This should print /dev/null, not /*/null.

PR:		bin/146378
Obtained from:	dash
2010-05-11 23:19:28 +00:00
jilles
88403fad18 sh: Use stalloc for arith variable names.
This is simpler than the custom memory tracker I added earlier, and is also
needed by the dash arith code I plan to import.
2010-04-25 20:43:19 +00:00
jilles
d21d692410 sh: Do tilde expansion in substitutions.
This applies to word in ${v-word}, ${v+word}, ${v=word}, ${v?word} (which
inherits quoting from the outside) and in ${v%word}, ${v%%word}, ${v#word},
${v##word} (which does not inherit any quoting).

In all cases tilde expansion is only attempted at the start of word, even if
word contains spaces. This agrees with POSIX and other shells.

This is the last part of the patch tested in the exp-run.

Exp-run done by: erwin (with some other sh(1) changes)
2010-04-03 22:04:44 +00:00
jilles
8b7edeca44 sh: Allow quoting pattern match characters in ${v%pat} and ${v#pat}.
Note that this depends on r206145 for allowing pattern match characters to
have their special meaning inside a double-quoted expansion like "${v%pat}".

PR:		bin/117748
Exp-run done by:	erwin (with some other sh(1) changes)
2010-04-03 21:07:50 +00:00
jilles
446838eef9 sh: Fix some bugs with backquoted builtins:
- correctly handle error output in $(builtin 2>&1), clarify out1/out2 vs
  output/errout in the code
- treat all builtins as regular builtins so errors do not abort the shell
  and variable assignments do not persist
- respect the caller's INTOFF

Some bugs still exist:
- expansion errors may still abort the shell
- some side effects of expansions and builtins persist
2010-01-01 18:17:46 +00:00
jilles
5e8a2136e7 sh: Various warning fixes (from WARNS=6 NO_WERROR=1):
- const
- initializations to silence -Wuninitialized (it was safe anyway)
- remove nested extern declarations
- rename "index" locals to "idx"
2009-12-27 18:04:05 +00:00
jilles
d5d0bbbb70 sh: Do not consider a tilde-prefix with expansions in it.
That is, do not do tilde expansion if any of the CTL* bytes (\201-\210), not
only CTLESC and CTLQUOTEMARK, are encountered. Such an expansion would look
up a user name with sh's internal representation.

The parser does not currently distinguish between backslashed and
unbackslashed \201-\210, so tilde expansion of user names with these bytes
in them is not so easy to fix.
2009-12-25 15:29:18 +00:00