Because sh executes commands in subshell environments without forking in
more and more cases (particularly from 8.0 on), it makes sense to describe
subshell environments more precisely using ideas from POSIX, together with
some FreeBSD-specific items.
In particular, the hash and times builtins may not behave as if their state
is copied for a subshell environment while leaving the parent shell
environment unchanged.
* Shell patterns are also for ${var#pat} and the like.
* An '!' by itself will not trigger pathname generation so do not call it a
meta-character, even though it has a special meaning directly after an
'['.
* Character ranges are locale-dependent.
* A '^' will complement a character class like '!' but is non-standard.
MFC after: 1 week
POSIX requires a -h option to sh and set, to locate and remember utilities
invoked by functions as they are defined. Given that this
locate-and-remember process is optional elsewhere, it seems safe enough to
make this option do nothing.
POSIX does not specify a long name for this option. Follow ksh in calling it
"trackall".
Replacing ;; with the new control operator ;& will cause the next list to be
executed as well without checking its pattern, continuing until a list ends
with ;; or until the end of the case statement. This is like omitting
"break" in a C "switch" statement.
The sequence ;& was formerly invalid.
This feature is proposed for the next POSIX issue in Austin Group issue
#449.
In optimized command substitution, save and restore any variables changed by
expansions (${var=value} and $((var=assigned))), instead of trying to
determine if an expansion may cause such changes.
If $! is referenced in optimized command substitution, do not cause jobs to
be remembered longer.
This fixes $(jobs $!) again, simplifies the man page and shortens the code.
The function name expandstr() and the general idea of doing this kind of
expansion by treating the text as a here document without end marker is from
dash.
All variants of parameter expansion and arithmetic expansion also work (the
latter is not required by POSIX but it does not take extra code and many
other shells also allow it).
Command substitution is prevented because I think it causes too much code to
be re-entered (for example creating an unbounded recursion of trace lines).
Unfortunately, our LINENO is somewhat crude, otherwise PS4='$LINENO+ ' would
be quite useful.
This reflects failure to determine the pathname of the new directory in the
exit status (1). Normally, cd returns successfully if it did chdir() and the
call was successful.
In POSIX, -e only has meaning with -P; because our -L is not entirely
compliant and may fall back to -P mode, -e has some effect with -L as well.
Because we have no iconv in base, support for other charsets is not
possible.
Note that \u/\U are processed using the locale that was active when the
shell started. This is necessary to avoid behaviour that depends on the
parse/execute split (for example when placing braces around an entire
script). Therefore, UTF-8 encoding is implemented manually.
A string between $' and ' may contain backslash escape sequences similar to
the ones in a C string constant (except that a single-quote must be escaped
and a double-quote need not be). Details are in the sh(1) man page.
This construct is useful to include unprintable characters, tabs and
newlines in strings; while this can be done with a command substitution
containing a printf command, that needs ugly workarounds if the result is to
end with a newline as command substitution removes all trailing newlines.
The construct may also be useful in future to describe unprintable
characters without needing to write those characters themselves in 'set -x',
'export -p' and the like.
The implementation attempts to comply to the proposal for the next issue of
the POSIX specification. Because this construct is not in POSIX.1-2008,
using it in scripts intended to be portable is unwise.
Matching the minimal locale support in the rest of sh, the \u and \U
sequences are currently not useful.
Exp-run done by: pav (with some other sh(1) changes)
POSIX does not require the shell to fork for a subshell environment, and we
use that possibility in various ways (command substitutions with a single
command and most subshells that are the final command of a shell process).
Therefore do not tie subshells to forking in the man page.
Command substitutions with expansions are a bit strange, causing a fork for
$(...$(($x))...) because $x might expand to y=2; they will probably be
changed later but this is how they work now.
If execve() returns an [ENOEXEC] error, check if the file is binary before
trying to execute it using sh. A file is considered binary if at least one
of the first 256 bytes is '\0'.
In particular, trying to execute ELF binaries for the wrong architecture now
fails with an "Exec format error" message instead of syntax errors and
potentially strange results.
These are called "shell procedures" in the source.
If execve() failed with [ENOEXEC], the shell would reinitialize itself
and execute the program as a script. This requires a fair amount of code
which is not frequently used (most scripts have a #! magic number).
Therefore just execute a new instance of sh (_PATH_BSHELL) to run the
script.
This allows specifying a %job (which is equivalent to the corresponding
process group).
Additionally, it improves reliability of kill from sh in high-load
situations and ensures "kill" finds the correct utility regardless of PATH,
as required by POSIX (unless the undocumented %builtin mechanism is used).
Side effect: fatal errors (any error other than kill(2) failure) now return
exit status 2 instead of 1. (This is consistent with other sh builtins, but
not in NetBSD.)
Code size increases about 1K on i386.
Obtained from: NetBSD
Make sure all built-in commands are in the subsection named such, except
exp, let and wordexp which are deliberately undocumented. The text said only
built-ins that really need to be a built-in were documented there but in
fact almost all of them were already documented.
This was removed in 2001 but I think it is appropriate to add it back:
* I do not want to encourage people to write fragile and non-portable echo
commands by making printf much slower than echo.
* Recent versions of Autoconf use it a lot.
* Almost no software still wants to support systems that do not have
printf(1) at all.
* In many other shells printf is already a builtin.
Side effect: printf is now always the builtin version (which behaves
identically to /usr/bin/printf) and cannot be overridden via PATH (except
via the undocumented %builtin mechanism).
Code size increases about 5K on i386. Embedded folks might want to replace
/usr/bin/printf with a hard link to /usr/bin/alias.
In particular, remove the text about ksh-like features, which are usually
taken for granted nowadays. The original Bourne shell is fading away and for
most users our /bin/sh is one of the most minimalistic they know.
For multi-command pipelines,
1. all commands are direct children of the shell (unlike the original
Bourne shell)
2. all commands are executed in a subshell (unlike the real Korn shell)
MFC after: 1 week
expr(1) should usually not be used as various forms of parameter expansion
and arithmetic expansion replicate most of its functionality in an easier
way.
getopt(1) should not be used at all in new code. Instead, getopts(1) or
entirely manual parsing should be used.
MFC after: 1 week
Unless $! has been referenced for a particular job or $! still contains that
job's pid, forget about it after it has terminated. If $! has been
referenced, remember the job until the wait builtin has reported its
completion (either with the pid as parameter or without parameters).
In interactive mode, jobs are forgotten after termination has been reported,
which happens before primary prompts and through the jobs builtin. Even
then, though, remember a job if $! has been referenced.
This is similar to what is suggested by POSIX and should fix most memory
leaks (which also tend to cause sh to use more CPU time) with long running
scripts that start background jobs.
Caveats:
* Repeatedly referencing $! without ever doing 'wait', like
while :; do foo & echo started foo: $!; sleep 60; done
will still use a lot of memory and CPU time in the long run.
* The jobs and jobid builtins do not cause a job to be remembered for longer
like expanding $! does.
PR: bin/55346
* Move the "environment variables" that do not need exporting to be
effective or that are set by the shell without exporting to a new section
"Special Variables".
* Add special variables LINENO and PPID.
* Add environment variables LANG, LC_* and PWD; also describe ENV under
environment variables.
This makes sh a bit more friendly in single user mode, make buildenv, chroot
and the like, and matches other shells.
The -o emacs can be overridden on the command line or in the ENV file.
Note that the following sentence
> Enclosing the full parameter expansion string in double-quotes does not
> cause the following four varieties of pattern characters to be quoted,
> whereas quoting characters within the braces has this effect.
is now true, but used to be incorrect.
* avoid unnecessary fork
* allow executing builtins via command
* executing a special builtin via command removes its special properties
Obtained from: NetBSD (parts)
This seems more useful and will likely be in the next POSIX standard.
Also document more precisely in the man page what set -u does (note that
$@, $* and $! are the only special parameters that can ever be unset, all
the others are always set, although they may be empty).
I do not consider this a bug because POSIX permits it and argument strings
and environment variables cannot contain '\0' anyway.
PR: bin/25542
MFC after: 2 weeks
The exit status may exceed 255 in some cases (return); even though it seems
unwise to rely on this, it is also unwise to assume that $? is always
between 0 and 255.
This resolves bin/124748 by documenting that 'exit -1' is not valid.
PR: bin/124748
Approved by: ed (mentor)
character.
This avoids using non-standard behaviour of the old (upto FreeBSD 7) TTY
layer: it reprocesses the input queue when switching to canonical mode. The
new TTY layer does not provide this functionality and so read -t worked
very poorly (first character is not echoed, cannot be backspaced but is
still read).
This also agrees with what most other shells with read -t do.
PR: bin/129566
Reviewed by: stefanf
Approved by: ed (mentor)
When I imported the MPSAFE TTY code, I added the -p flag to sh(1)'s
ulimit, but I forgot to document it in the appropriate manual page.
Requested by: stefanf