Commit Graph

2597 Commits

Author SHA1 Message Date
jilles
c9be2081e0 sh: Add \u/\U support (in $'...') for UTF-8.
Because we have no iconv in base, support for other charsets is not
possible.

Note that \u/\U are processed using the locale that was active when the
shell started. This is necessary to avoid behaviour that depends on the
parse/execute split (for example when placing braces around an entire
script). Therefore, UTF-8 encoding is implemented manually.
2011-05-08 17:40:10 +00:00
jilles
bf0e27838c sh: Optimize variable code by storing the length of the name.
Obtained from:	NetBSD
2011-05-08 16:15:50 +00:00
jilles
b838671bb4 sh(1): Update BUGS section for UTF-8 support. 2011-05-08 14:03:44 +00:00
jilles
8ac39aa5be sh: Add UTF-8 support to pattern matching.
?, [...] patterns match codepoints instead of bytes. They do not match
invalid sequences. [...] patterns must not contain invalid sequences
otherwise they will not match anything. This is so that ${var#?} removes the
first codepoint, not the first byte, without putting UTF-8 knowledge into
the ${var#pattern} code. However, * continues to match any string and an
invalid sequence matches an identical invalid sequence. (This differs from
fnmatch(3).)
2011-05-08 11:32:20 +00:00
jilles
8bbce85526 sh: Add UTF-8 support to ${#var}.
If the current locale uses UTF-8, ${#var} counts codepoints (more precisely,
bytes b with (b & 0xc0) != 0x80).
2011-05-07 14:32:16 +00:00
jilles
6672b66ab7 sh: Track if the current locale's charset is UTF-8 or not. 2011-05-06 22:31:27 +00:00
jilles
e4f71c5640 sh: Change the CTL* bytes to ones invalid in UTF-8.
This ensures that mbrtowc(3) can be used directly once it has been verified
that there is no CTL* byte. Dealing with a CTLESC byte within a multibyte
character would be complicated.

The new values do occur in iso-8859-* encodings. This decreases efficiency
slightly but should not affect correctness.

Caveat: Updating across this change and rebuilding without cleaning may
yield a subtly broken sh binary. By default, make buildworld will clean and
avoid problems.
2011-05-06 20:45:50 +00:00
jilles
5a49f52603 sh: Add $'quoting' (C-style escape sequences).
A string between $' and ' may contain backslash escape sequences similar to
the ones in a C string constant (except that a single-quote must be escaped
and a double-quote need not be). Details are in the sh(1) man page.

This construct is useful to include unprintable characters, tabs and
newlines in strings; while this can be done with a command substitution
containing a printf command, that needs ugly workarounds if the result is to
end with a newline as command substitution removes all trailing newlines.

The construct may also be useful in future to describe unprintable
characters without needing to write those characters themselves in 'set -x',
'export -p' and the like.

The implementation attempts to comply to the proposal for the next issue of
the POSIX specification. Because this construct is not in POSIX.1-2008,
using it in scripts intended to be portable is unwise.

Matching the minimal locale support in the rest of sh, the \u and \U
sequences are currently not useful.

Exp-run done by: pav (with some other sh(1) changes)
2011-05-05 20:55:55 +00:00
jilles
ba0d29571f sh: Apply set -u to variables in arithmetic.
Note that this only applies to variables that are actually used.
Things like (0 && unsetvar) do not cause an error.

Exp-run done by: pav (with some other sh(1) changes)
2011-05-04 22:12:22 +00:00
jilles
fa0f3c42ef sh: Detect an error for ${#var<GARBAGE>}.
In particular, this makes things like ${#foo[0]} and ${#foo[@]} errors
rather than silent equivalents of ${#foo}.

PR:		bin/151720
Submitted by:	Mark Johnston
Exp-run done by: pav (with some other sh(1) changes)
2011-05-04 21:49:34 +00:00
ru
a1d4dd2a37 Don't call -f option's argument "stdin".
MFC after:	3 days
2011-05-03 10:08:11 +00:00
jilles
7b50330e01 sh: Set $? to 0 for background commands.
For backgrounded pipelines and subshells, the previous value of $? was being
preserved, which is incorrect.

For backgrounded simple commands containing a command substitution, the
status of the last command substitution was returned instead of 0.

If fork() fails, this is an error.
2011-04-25 20:54:12 +00:00
jilles
836a99923b sh: Check setuid()/setgid() return values.
If the -p option is turned off, privileges from a setuid or setgid binary
are dropped. Make sure to check if this succeeds. If it fails, this is an
error which will cause the shell to abort except in interactive mode or if
'command' was used to make 'set' or an outer 'eval' or '.' non-special.

Note that taking advantage of this feature and writing setuid shell scripts
seems unwise.

MFC after:	1 week
2011-04-25 10:14:29 +00:00
jilles
54847e6220 sh: Remove duplicate code resetting uid/gid for set +p/+o privileged.
MFC after:	1 week
2011-04-25 10:08:34 +00:00
jilles
f250dc2f44 sh: Allow EV_EXIT through function calls, make {...} <redir more consistent.
If EV_EXIT causes an exit, use the exception mechanism to unwind
redirections and local variables. This way, if the final command is a
redirected command, an EXIT trap now executes without the redirections.

Because of these changes, EV_EXIT can now be inherited by the body of a
function, so do so. This means that a function no longer prevents a fork
before an exec being skipped, such as in
  f() { head -1 /etc/passwd; }; echo $(f)

Wrapping a single builtin in a function may still cause an otherwise
unnecessary fork with command substitution, however.

An exit command or -e failure still invokes the EXIT trap with the
original redirections and local variables in place.

Note: this depends on SHELLPROC being gone. A SHELLPROC depended on
keeping the redirections and local variables and only cleaning up the
state to restore them.
2011-04-23 22:28:56 +00:00
jilles
1347144ea4 sh: Do not word split "${#parameter}".
This is only a problem if IFS contains digits, which is unusual but valid.

Because of an incorrect fix for PR bin/12137, "${#parameter}" was treated
as ${#parameter}. The underlying problem was that "${#parameter}"
erroneously added CTLESC bytes before determining the length. This
was properly fixed for PR bin/56147 but the incorrect fix was not backed
out.

Reported by:	Seeker on forums.freebsd.org
MFC after:	2 weeks
2011-04-20 22:24:54 +00:00
trasz
3ba2f4e3f2 Document problems with -d/-w and the fact that -X is the default.
Suggested by:	arundel@
Reviewed by:	arundel@
2011-04-18 19:20:47 +00:00
trasz
205b535d3e Get rid of DSIZ; instead just call the sizing function if provided. 2011-04-12 20:10:15 +00:00
trasz
53df99cb04 Make it possible to use permission sets (full_set, modify_set, read_set
and write_set) with setfacl(1).

PR:		kern/154113
Submitted by:	Shawn Webb <lattera at gmail dot com> (earlier version)
MFC after:	1 month
2011-04-09 07:42:25 +00:00
trasz
e94d4d2ed6 Add proper width calculation for time fields (time, cputime and usertime).
This fixes the ugly overflow in "ps aux" output for "[idle]".
2011-03-24 20:15:42 +00:00
trasz
7a8bd43974 Make "LOGIN" and "CLASS" columns width scale properly instead of wasting space. 2011-03-24 17:20:24 +00:00
jilles
f4c860e408 sh(1): Describe subshell environment, command substitution more correctly.
POSIX does not require the shell to fork for a subshell environment, and we
use that possibility in various ways (command substitutions with a single
command and most subshells that are the final command of a shell process).
Therefore do not tie subshells to forking in the man page.

Command substitutions with expansions are a bit strange, causing a fork for
$(...$(($x))...) because $x might expand to y=2; they will probably be
changed later but this is how they work now.
2011-03-20 23:52:45 +00:00
kib
425e556ac9 Implement the usertime and systime keywords for ps, printing the
corresponding times reported by getrusage().

Submitted by:	Dan Nelson <dnelson allantgroup com>
MFC after:	1 week
2011-03-17 11:25:32 +00:00
jilles
2a22eeb6a2 bin: Prefer strrchr() to rindex().
This removes the last index/rindex usage from /bin.
2011-03-15 22:22:11 +00:00
jilles
161663c247 sh: Fix some parameter expansion variants ${#...}.
These already worked: $# ${#} ${##} ${#-} ${#?}
These now work as well: ${#+word} ${#-word} ${##word} ${#%word}

There is an ambiguity in the standard with ${#?}: it could be the length of
$? or it could be $# giving an error in the (impossible) case that it is not
set. We continue to use the former interpretation as it seems more useful.
2011-03-13 20:02:39 +00:00
stefanf
0d4be9304a Remove unnecessary cast.
Reviewed by:	jilles
2011-03-07 07:31:15 +00:00
jilles
75dda0ff36 sh(1): Reduce excessive semicolon-separated sentences.
Reported by:	Benjamin Kaduk
2011-03-06 21:20:53 +00:00
trasz
1618438630 Export login class information via kinfo and make it possible to view
it using "ps -o class".
2011-03-05 14:41:49 +00:00
jilles
1a2c2ccf00 sh: Fix some warnings in code for arithmetic expressions.
Submitted by:	eadler
2011-03-05 13:27:13 +00:00
jilles
fe3682faa7 kill: Note that this is used both as a normal program and a shell builtin. 2011-03-01 21:48:22 +00:00
delphij
fbf06de861 Accept == as an alias of = which is a popular GNU extension.
This is intentionally undocumented for now since it's not part
of any standard.

MFC after:	1 month
2011-02-27 12:28:06 +00:00
ume
77ef92289d When WITH_ICONV is set, use our in-tree iconv. 2011-02-26 18:54:54 +00:00
pluknet
4005b565ff mdoc(7) markup.
Approved by:	avg (mentor), kib (mentor)
MFC after:	3 days
2011-02-21 16:03:39 +00:00
brucec
6d9b42b486 Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
jilles
9cea8aad2d test: Note that this is used both as a normal program and a shell builtin.
MFC after:	1 week
2011-02-15 22:17:47 +00:00
jilles
2fb0603686 sh: Detect dividing the smallest integer by -1.
This overflows and on some architectures such as amd64 it generates SIGFPE.
Generate an error on all architectures.
2011-02-12 23:44:05 +00:00
brucec
d03fb1381f Fix typos.
PR:	docs/131625
Submitted by:	Andrew Wright <andrew at qemg.org>
MFC after:	1 month
2011-02-12 20:28:15 +00:00
jilles
a0549c0f22 sh(1): Update description of arithmetic. 2011-02-08 23:19:40 +00:00
jilles
1cbab8a321 sh: Import arithmetic expression code from dash.
New features:
* proper lazy evaluation of || and &&
* ?: ternary operator
* executable is considerably smaller (8K on i386) because lex and yacc are
  no longer used

Differences from dash:
* arith_t instead of intmax_t
* imaxdiv() not used
* unset or null variables default to 0
* let/exp builtin (undocumented, will probably be removed later)

Obtained from:	dash
2011-02-08 23:18:06 +00:00
jilles
ff6aee65ce sh: Fix two things about {(...)} <redir:
* In {(...) <redir1;} <redir2, do not drop redir1.
* Maintain the difference between (...) <redir and {(...)} <redir:
  In (...) <redir, the redirection is performed in the child, while in
  {(...)} <redir it should be performed in the parent (like {(...); :;}
  <redir)
2011-02-05 15:02:19 +00:00
jilles
9a75a8c404 sh: Remove clearcmdentry()'s now unused argument. 2011-02-05 14:08:51 +00:00
jilles
852a80acf7 sh: Forget all cached command locations on any PATH change.
POSIX requires this and it is simpler than the previous code that remembered
command locations when appending directories to PATH.

In particular,
  PATH=$PATH
is no longer a no-op but discards all cached command locations.
2011-02-05 14:01:46 +00:00
jilles
a81357fbe9 sh: Do not try to execute binary files as scripts.
If execve() returns an [ENOEXEC] error, check if the file is binary before
trying to execute it using sh. A file is considered binary if at least one
of the first 256 bytes is '\0'.

In particular, trying to execute ELF binaries for the wrong architecture now
fails with an "Exec format error" message instead of syntax errors and
potentially strange results.
2011-02-05 12:54:59 +00:00
jilles
95ad413d4a sh: Remove special code for shell scripts without magic number.
These are called "shell procedures" in the source.

If execve() failed with [ENOEXEC], the shell would reinitialize itself
and execute the program as a script. This requires a fair amount of code
which is not frequently used (most scripts have a #! magic number).
Therefore just execute a new instance of sh (_PATH_BSHELL) to run the
script.
2011-02-04 22:47:55 +00:00
jilles
dbecc33067 Make sys_signame upper case.
This matches the constants from <signal.h> with 'SIG' removed, which POSIX
requires kill and trap to accept and 'kill -l' to write.

'kill -l', 'trap', 'trap -l' output is now upper case.

In Turkish locales, signal names with an upper case 'I' are now accepted,
while signal names with a lower case 'i' are no longer accepted, and the
output of 'killall -l' now contains proper capital 'I' without dot instead
of a dotted capital 'I'.
2011-02-04 16:40:50 +00:00
jilles
86ccb3f9c0 sh: Return only 126 or 127 for execve() failures.
Do not return 2 for errors other than [EACCES] or [ENOENT].
2011-02-03 23:38:11 +00:00
jilles
e252925aeb sh: Remove comment mentioning herefd, which is gone. 2011-02-02 21:48:53 +00:00
jilles
8605caacbf sh: Send messages about signals to stderr.
This is required by POSIX and seems to make more sense.

See also r217557.
2011-01-30 22:57:52 +00:00
jilles
a123f0aac0 sh: Clean up some old comments:
* There is no plan for an alternative to the command "set".
* Attempting to unset a readonly variable has not raised an error for quite
  a while, so the order of unsetting a variable and a function with the same
  name does not matter.

MFC after:	1 week
2011-01-25 20:56:18 +00:00
kib
ca24f47263 Document P_FOLLOWFORK.
MFC after:	2 weeks
2011-01-25 11:04:16 +00:00