The information in sh(1) about the echo builtin is equivalent, though less
extensive.
The echo(1) man page (bin/echo/echo.1) is different.
Unfortunately, sh's echo builtin and /bin/echo have gone out of sync and
this probably cannot be fixed any more.
Reported by: uqs (list of untouched files)
MFC after: 1 week
In particular, remove the text about ksh-like features, which are usually
taken for granted nowadays. The original Bourne shell is fading away and for
most users our /bin/sh is one of the most minimalistic they know.
Convert the tests to the perl prove format.
Remove obsolete TEST.README (results of an old TEST.sh for some old Unices)
and TEST.csh (old tests without correct values, far less complete than
TEST.sh).
MFC after: 1 week
This moves the function of the noaliases variable into the checkkwd
variable. This way it is properly reset on errors and aliases can be used
normally in the commands for each case (the case labels recognize the
keyword esac but no aliases).
The new code is clearer as well.
Obtained from: dash
I've noticed various terminal emulators that need to obtain a sane
default termios structure use very complex `hacks'. Even though POSIX
doesn't provide any functionality for this, extend our termios API with
cfmakesane(3), which is similar to the commonly supported cfmakeraw(3),
except that it fills the termios structure with sane defaults.
Change all code in our base system to use this function, instead of
depending on <sys/ttydefaults.h> to provide TTYDEF_*.
These do something else in ksh: name=(...) is an array or compound variable
assignment and the others are extended patterns.
This is the last patch of the ones tested in the exp run.
Exp-run done by: pav (with some other sh(1) changes)
Apart from detecting breakage earlier or at all, this also fixes a segfault
in the testsuite. The "handling" of the breakage left an invalid internal
representation in some cases.
Examples:
echo a; do echo b
echo `) echo a`
echo `date; do do do`
Exp-run done by: pav (with some other sh(1) changes)
subevalvar() incorrectly assumed that CTLESC bytes were present iff the
expansion was quoted. However, they are present iff various processing such
as word splitting is to be done later on.
Example:
v=@$e@$e@$e@
y="${v##*"$e"}"
echo "$y"
failed if $e contained the magic CTLESC byte.
Exp-run done by: pav (with some other sh(1) changes)
The code is inspired by NetBSD sh somewhat, but different because we
preserve the old Almquist/Bourne/Korn ability to have an unquoted part in a
quoted ${v+word}. For example, "${v-"*"}" expands to $v as a single field if
v is set, but generates filenames otherwise.
Note that this is the only place where we split text literally from the
script (the similar ${v=word} assigns to v and then expands $v). The parser
must now add additional markers to allow the expansion code to know whether
arbitrary characters in substitutions are quoted.
Example:
for i in ${$+a b c}; do echo $i; done
Exp-run done by: pav (with some other sh(1) changes)
If double-quote state does not match, treat the '}' literally.
This ensures double-quote state remains the same before and after a
${v+-=?...} which helps with expand.c.
It makes things like
${foo+"\${bar}"}
which I have seen in the wild work as expected.
Exp-run done by: pav (with some other sh(1) changes)
This is a syntax error.
POSIX does not say explicitly whether defining a function with the same name
as a special builtin is allowed, but it does say that it is impossible to
call such a function.
A special builtin can still be overridden with an alias.
This commit is part of a set of changes that will ensure that when
something looks like a special builtin to the parser, it is one. (Not the
other way around, as it remains possible to call a special builtin named
by a variable or other substitution.)
Exp-run done by: pav (with some other sh(1) changes)
Add some conservative checks on function names:
- Disallow expansions or quoting characters; these can only be called via
strange control characters
- Disallow '/'; these functions cannot be called anyway, as exec.c assumes
they are pathnames
- Make the CTL* bytes work properly in function names.
These are syntax errors.
POSIX does not require us to support more than names (letters, digits and
underscores, not starting with a digit), but I do not want to restrict it
that much at this time.
Exp-run done by: pav (with some other sh(1) changes)
This is how ksh93 treats ! within a pipeline and makes the ! in
a | ! b | c
negate the exit status of the pipeline, as if it were
a | { ! b | c; }
Side effect: something like
f() ! a
is now a syntax error, because a function definition takes a command,
not a pipeline.
Exp-run done by: pav (with some other sh(1) changes)
For multi-command pipelines,
1. all commands are direct children of the shell (unlike the original
Bourne shell)
2. all commands are executed in a subshell (unlike the real Korn shell)
MFC after: 1 week
immediately written into the stack after the call. Instead let the caller
manage the "space left".
Previously, growstackstr()'s assumption causes problems with STACKSTRNUL()
where we want to be able to turn a stack into a C string, and later
pretend the NUL is not there.
This fixes a bug in STACKSTRNUL() (that grew the stack) where:
1. STADJUST() called after a STACKSTRNUL() results in an improper adjust.
This can be seen in ${var%pattern} and ${var%%pattern} evaluation.
2. Memory leak in STPUTC() called after a STACKSTRNUL().
Reviewed by: jilles
- Use %t to print ptrdiff_t values.
- Cast a ptrdiff_t value explicitly to int for a field width specifier.
While here, sort includes.
Submitted by: Garrett Cooper
frobbing CFLAGS directly. DEBUG_FLAGS is something that can be specified
on the make command line without having to edit the Makefile directly.
Submitted by: Garrett Cooper
Add directory names directly and sort at the end.
Include bsd.arch.inc.mk so we can, in the future, more easily make arch
dependent changes in /bin (unlikely, but is needed for symmetry).
expr(1) should usually not be used as various forms of parameter expansion
and arithmetic expansion replicate most of its functionality in an easier
way.
getopt(1) should not be used at all in new code. Instead, getopts(1) or
entirely manual parsing should be used.
MFC after: 1 week
The three examples are better done using sh(1) itself these days.
The example
expr -- "$a" : ".*"
is incorrect in the general case, as "$a" may be an operator.
MFC after: 2 weeks
This makes it impossible to use locale-specific characters in variable
names.
Names containing locale-specific characters make scripts only work with the
correct locale setting. Also, they did not even work in many practical cases
because multibyte character sets such as utf-8 are not supported.
This also avoids weirdness if LC_CTYPE is changed in the middle of a script.
are too long. Filenames escaping this test are caught later on,
so the bug doesn't cause any breakage.
Document the correct ustar limitations in pax. As I have no access
to the IEEE 1003.2 spec, I can only assume that the limitations
imposed are in fact correct.
Add regression tests for the filename limitations imposed by pax.
MFC after: 3 weeks
This Almquist extension was disabled long ago.
In pathname generation, components starting with '!!' were treated as
containing wildcards, causing unnecessary readdir (which could fail, causing
pathname generation to fail while it should not).
In our implementation and most others, a break or continue in a dot script
can break or continue a loop outside the dot script. This should cause all
further commands in the dot script to be skipped. However, cmdloop() did not
know about this and continued to parse and execute commands from the dot
script.
As described in the man page, a return in a dot script in a function returns
from the function, not only from the dot script. There was a similar issue
as with break and continue. In various other shells, the return appears to
return from the dot script, but POSIX seems not very clear about this.
The buffer for generated pathnames could be too small in some cases. It
happened to be always at least PATH_MAX long, so there was never an overflow
if the resulting pathnames would be usable.
This bug may be abused if a script subjects input from an untrusted source
to pathname generation, which a bad idea anyhow. Most shell scripts do not
work on untrusted data. secteam@ says no advisory is necessary.
PR: bin/148733
Reported by: Changming Sun snnn119 at gmail com
MFC after: 10 days
This makes a difference if there is a command substitution.
To make this work, evalstring() has been changed to set exitstatus to 0 if
no command was executed (the string contained only whitespace).
Example:
eval $(false); echo $?
should print 0.
refusing to use stdio.
Reduce nesting level in the sleep loop by returning earlier for negative
timeouts.
Limit the maximum timeout to INT_MAX seconds.
Submitted by: bde
MFC after: 3 weeks
This simply sets a flag in libedit. It has a shortcoming in that it does not
apply to multi-line commands.
Note that a configuration option for this is not going to happen, but always
having this seems better than not having it. NetBSD has done the same.
PR: bin/54683
Obtained from: NetBSD
MFC after: 1 month
So a command like
kill _HUP 1
now fails without sending SIGTERM to init.
The behaviour when kill(2) fails remains unchanged: processing continues.
This matches other implementations and POSIX and is useful for killing
multiple processes at once when some of them may already be gone.
PR: bin/40282
If an ; or & token was followed by an EOF token, pending here-documents were
left uninitialized. Execution would crash, either in the main shell process
for literal here-documents or in a child process for expanded
here-documents. In the latter case the problem is hard to detect apart from
the core dumps and log messages.
Side effect: slightly different retries on inputs where EOF is not
persistent.
Note that tools/regression/bin/sh/parser/heredoc6.0 still causes a similar
crash in a child process. The text passed to eval is malformed and should be
rejected.
simplecmd() only handles simple commands and function definitions, neither
of which involves the ! keyword. The initial token on entry to simplecmd()
is one of the following: TSEMI, TAND, TOR, TNL, TEOF, TWORD, TRP.
Unless $! has been referenced for a particular job or $! still contains that
job's pid, forget about it after it has terminated. If $! has been
referenced, remember the job until the wait builtin has reported its
completion (either with the pid as parameter or without parameters).
In interactive mode, jobs are forgotten after termination has been reported,
which happens before primary prompts and through the jobs builtin. Even
then, though, remember a job if $! has been referenced.
This is similar to what is suggested by POSIX and should fix most memory
leaks (which also tend to cause sh to use more CPU time) with long running
scripts that start background jobs.
Caveats:
* Repeatedly referencing $! without ever doing 'wait', like
while :; do foo & echo started foo: $!; sleep 60; done
will still use a lot of memory and CPU time in the long run.
* The jobs and jobid builtins do not cause a job to be remembered for longer
like expanding $! does.
PR: bin/55346
The LINENO code uses snprintf() and relied on "myhistedit.h" to pull in the
necessary <stdio.h>.
Compiling with -DNO_HISTORY disables all editing and history support and
allows linking without -ledit -ltermcap. This may be useful for embedded
systems.
MFC after: 2 weeks
This uses the new libedit completion function with quoting support.
Unlike NetBSD, there is no 'set +o tabcomplete' option to disable
completion. I do not see any reason for such a special treatment, as
completion is rather useful and it is possible to do
bind ^I ed-insert
to disable completion and insert a tab character instead.
Submitted by: Guy Yur
- .Nd in section NAME is not optional
- .Ed was missing
- "indent" is not a flag, but a literal argument for -offset
- stop switching font sizes for acronyms
- use .Brq instead of rolling our own
is enabled.
This already worked if without job control.
In either case, this depends on it that a process that terminates due to
SIGINT exits on it (so not with status 1, or worse, 0).
Example:
sleep 5; echo continued
This does not print "continued" any more if sleep is aborted via ctrl+c.
MFC after: 1 month
Previously, it would either try to copy it anyway and fail (without -R),
or create fifo instead of the socket (with -R).
Found with: Coverity Prevent
CID: 5623
MFC after: 2 weeks
usually be set first when using -v.
Adjust an example that sets the day to 30 before setting the month to 3 in
accordance with this approach as the example would always fail in February!
PR: 147354
MFC after: 2 weeks
Example (in interactive mode):
cat <<EOF && )
The next command typed caused sh to segfault, because the state for the here
document was not reset.
Like parser_temp, this uses the fact that the parser is not re-entered.
If a command substitution contains a newline token, this no longer starts
here documents of outer commands. This way, we follow POSIX's idea of the
command substitution being a separate script more closely. It also matches
other shells better and is consistent with newline characters in quotes not
starting here documents.
The extension tested in parser/heredoc3.0 ($(cat <<EOF)\ntext\nEOF\n)
continues to be supported.
In particular, this change allows things like
cat <<EOF && echo `pwd`
(a `` command substitution after a here document)
which formerly silently used an empty file as the here document, because the
EOF of the inner command "pwd" also forced an empty here document.
Although "--" historically has not been required to be recognized for
certain special builtins that do not take options in POSIX, some other
implementations recognize options for them, requiring scripts to use "--" or
avoid operands starting with "-".
Operands starting with "-" can be avoided with eval by prepending a space,
and cannot occur with break, continue, exit, return and shift as they only
take numbers, nor with times as it does not take operands. With . and exec,
avoiding "-" is not so easy as it may require reimplementing the PATH
search; therefore the current proposal for POSIX is to require recognition
of "--" for them.
We continue to accept other strings starting with "-" as operands to . and
exec, and also "--" if it is alone to . (which would otherwise be invalid
anyway).
* Move the "environment variables" that do not need exporting to be
effective or that are set by the shell without exporting to a new section
"Special Variables".
* Add special variables LINENO and PPID.
* Add environment variables LANG, LC_* and PWD; also describe ENV under
environment variables.
This prevents accumulating huge amounts of zombies if a script executes
many background commands but no external commands or subshells.
Note that zombies will not be reaped during long calculations (within
the shell process) or read builtins, but those actions do not create
more zombies.
The terminated background commands will also still be remembered by the
shell.
PR: bin/55346
pax(1) was trying to copy the back-referenced data from
the match pattern, not the matched data.
PR: bin/118132
Obtained from: Debian bug #451361
Reviewed by: jilles
MFC after: 3 weeks
These are git commits 36f0fa8fcbc8c7b2b194addd29100fb40e73e4e9 and
d6d06ff5c2ea0fa44becc5ef4340e5f2f15073e4 in dash.
Because this is the first code I'm importing from dash to expand.c, add the
Herbert Xu copyright notice which is in dash's expand.c.
When pathname expanding *\/, the CTLESC representing the quoted state was
erroneously taken as part of the * pathname component. This CTLESC was then
seen by the pattern matching code as escaping the '\0' terminating the
string.
The code is slightly different because dash converts the CTLESC characters
to backslashes and removes all the other CTL* characters to allow
substituting glob(3).
The effect of the bug was also slightly different from dash (where nothing
matched at all). Because a CTLESC can escape a '\0' in some way, whether
files were included despite the bug depended on memory that should not be
read. In particular, on many machines /*\/ expanded to a strict subset of
what /*/ expanded to.
Example:
echo /*"/null"
This should print /dev/null, not /*/null.
PR: bin/146378
Obtained from: dash
This allows doing things like LC_ALL=C some_builtin to run a builtin under a
different locale, just like is possible with external programs. The
immediate reason is that this allows making printf(1) a builtin without
breaking things like LC_NUMERIC=C printf '%f\n' 1.2
This change also affects special builtins, as even though the assignment is
persistent, the export is only to the builtin (unless the variable was
already exported).
Note: for this to work for builtins that also exist as external programs
such as /bin/test, the setlocale() call must be under #ifndef SHELL. The
shell will do the setlocale() calls which may not agree with the environment
variables.
in at least three ways, so do not say it is ignored:
* who may delete/rename a symlink in a sticky directory
* who may do lchflags(2)/lchown(2)/lchmod(2)
* whose inode quota is charged
MFC after: 1 week
In the 'ln source... directory' synopsis, the basename of each source
determines the name of the created link. Determine this using basename(3)
instead of strrchr(..., '/') which is incorrect if the pathname ends in a
slash.
The patch is somewhat changed to allow for basename(3) implementations that
change the passed pathname, and to fix the -w option's checking also.
The code to compare directory entries only applies to hard links, which
cannot be created to directories using ln.
Example:
ln -s /etc/defaults/ /tmp
This should create a symlink named defaults.
PR: 121568
Submitted by: Ighighi
MFC after: 1 week
Two pathnames refer to the same directory entry iff the directories match
and the final components' names match.
Example: (assuming file1 is an existing file)
ln -f file1 file1
This now fails while leaving file1 intact. It used to delete file1 and then
complain it cannot be linked because it is gone.
With -i, this error is detected before the question is asked.
MFC after: 2 weeks
Unset PWD if it is incorrect and no value for it can be determined.
This preserves the logical current directory across shell invocations.
Example (assuming /home is a symlink):
$ cd
$ pwd
/home/foo
$ sh
$ pwd
/home/foo
Formerly the second pwd would show the physical path (symlinks resolved).
Although groff_mdoc(7) gives another impression, this is the ordering
most widely used and also required by mdocml/mandoc.
Reviewed by: ru
Approved by: philip, ed (mentors)
These do pretty much nothing (except that parentheses are ignored), but
people seem to use them and allowing them does not hurt much.
Single-quotes seem not to be used and cause silently different behaviour
with ksh93 character constants.
This makes sh a bit more friendly in single user mode, make buildenv, chroot
and the like, and matches other shells.
The -o emacs can be overridden on the command line or in the ENV file.
Note that the following sentence
> Enclosing the full parameter expansion string in double-quotes does not
> cause the following four varieties of pattern characters to be quoted,
> whereas quoting characters within the braces has this effect.
is now true, but used to be incorrect.
This applies to word in ${v-word}, ${v+word}, ${v=word}, ${v?word} (which
inherits quoting from the outside) and in ${v%word}, ${v%%word}, ${v#word},
${v##word} (which does not inherit any quoting).
In all cases tilde expansion is only attempted at the start of word, even if
word contains spaces. This agrees with POSIX and other shells.
This is the last part of the patch tested in the exp-run.
Exp-run done by: erwin (with some other sh(1) changes)
Note that this depends on r206145 for allowing pattern match characters to
have their special meaning inside a double-quoted expansion like "${v%pat}".
PR: bin/117748
Exp-run done by: erwin (with some other sh(1) changes)
They will be treated like normal characters, resulting in a runtime
arithmetic expression error.
Exp-run done by: erwin (with some other sh(1) changes)
* remove the backslash from \} inside double quotes inside +-=?
substitutions, e.g. "${$+\}a}"
* maintain separate double-quote state for ${v#...} and ${v%...};
single and double quotes are special inside, even in a double-quoted
string or here document
* keep track of correct order of substitutions and arithmetic
This is different from dash's approach, which does not track individual
double quotes in the parser, trying to fix this up during expansion.
This treats single quotes inside "${v#...}" incorrectly, however.
This is similar to NetBSD's approach (as submitted in PR bin/57554), but
recognizes the difference between +-=? and #% substitutions hinted at in
POSIX and is more refined for arithmetic expansion and here documents.
PR: bin/57554
Exp-run done by: erwin (with some other sh(1) changes)
The old approach was wrong because PS2 was not used and seems unlikely to
parse extensions (ksh93's ${ COMMAND} may well fail to parse).
Exp-run done by: erwin (with some other sh(1) changes)
Redirection errors on subshells already did not abort the shell because
the redirection is executed in the subshell.
Other shells seem to agree that these redirection errors should not abort
the shell.
Also ensure that the redirections will be cleaned up properly in cases like
command eval '{ shift x; } 2>/dev/null'
Example:
{ echo bad; } </var/empty/x; echo good
Although simple commands without a command word (only assignments and/or
redirections) are much like special builtins, POSIX and most shells seem to
agree that redirection errors should not abort the shell in this case. Of
course, the assignments persist and assignment errors are fatal.
To get the old behaviour portably, use the ':' special builtin.
To get the new behaviour portably, given that there are no assignments, use
the 'true' regular builtin.
Make parsebackq a function instead of an emulated nested function.
This puts the setjmp usage in a smaller function where it is easier to avoid
bad optimizations.
* avoid unnecessary fork
* allow executing builtins via command
* executing a special builtin via command removes its special properties
Obtained from: NetBSD (parts)
Although argc and argv are never read after the longjmp is complete,
gcc is not clever enough to see that and needlessly warns about it.
So add volatile to silence the compiler.
Approved by: ed (the co-mentor, not ed(1))
Otherwise the -i option will show the inode number of the referenced file
for symbolic links given on the command line. Similarly, the file color
was printed according to the link target in colorized output.
PR: bin/102394
Reviewed by: jilles
MFC after: 2 weeks
All the elements of these structs are char anyway, so it won't hurt
performance.
Bump warns back up to the default.
# we likely should have CTASSERTS to make sure they are the right size.
# but with libarchive based tar maybe we shouldn't bother.
- Allow -h option to work if the listing contains at least one device
file.
- Align major and minor device numbers correctly to the size field.
PR: bin/125678
Approved by: trasz (mentor)
MFC after: 1 month
provides an empty fts_name and reporting the full path is more
appropriate especially with the -R option.
PR: bin/107515
Submitted by: bde
Approved by: trasz (mentor)
MFC after: 1 week
feature parity with du(1) and similar: When set, cp(1) will not traverse
mount points.
Initial patch by: Graham J Lee leeg teaching.physics.ox.ac.uk
PR: bin/88056
Initial patch by: Graham J Lee leeg teaching.physics.ox.ac.uk
Approved by: ed (mentor)
MFC after: 1 month