sh: Update TOUR and comments for some code changes, some of them old.

Also, improve some terminology in TOUR and comments.
Jilles Tjoelker 2017-05-06 13:28:42 +00:00
parent 1fc317e374
commit b98072777f
5 changed files with 58 additions and 44 deletions

@@ -24,7 +24,7 @@ programs is:
 program      input files    generates
 -------      -----------    ---------
-mkbuiltins   builtins       builtins.h builtins.c
+mkbuiltins   builtins.def   builtins.h builtins.c
 mknodes      nodetypes      nodes.h nodes.c
 mksyntax     -              syntax.h syntax.c
 mktokens     -              token.h
@@ -108,10 +108,12 @@ The text field of a NARG structure points to the text of the
 word. The text consists of ordinary characters and a number of
 special codes defined in parser.h. The special codes are:
-CTLVAR              Variable substitution
-CTLENDVAR           End of variable substitution
+CTLVAR              Parameter expansion
+CTLENDVAR           End of parameter expansion
 CTLBACKQ            Command substitution
 CTLBACKQ|CTLQUOTE   Command substitution inside double quotes
+CTLARI              Arithmetic expansion
+CTLENDARI           End of arithmetic expansion
 CTLESC              Escape next character
 A variable substitution contains the following elements:
@@ -130,18 +132,31 @@ stitution. The possible types are:
 VSQUESTION|VSNUL    ${var:?text}
 VSASSIGN            ${var=text}
 VSASSIGN|VSNUL      ${var:=text}
+VSTRIMLEFT          ${var#text}
+VSTRIMLEFTMAX       ${var##text}
+VSTRIMRIGHT         ${var%text}
+VSTRIMRIGHTMAX      ${var%%text}
+VSLENGTH            ${#var}
+VSERROR             delayed error
 In addition, the type field will have the VSQUOTE flag set if the
-variable is enclosed in double quotes. The name of the variable
-comes next, terminated by an equals sign. If the type is not
-VSNORMAL, then the text field in the substitution follows, ter-
-minated by a CTLENDVAR byte.
+variable is enclosed in double quotes and the VSLINENO flag if
+LINENO is being expanded (the parameter name is the decimal line
+number). The parameter's name comes next, terminated by an equals
+sign. If the type is not VSNORMAL (including when it is VSLENGTH),
+then the text field in the substitution follows, terminated by a
+CTLENDVAR byte.
+The type VSERROR is used to allow parsing bad substitutions like
+${var[7]} and generate an error when they are expanded.
 Commands in back quotes are parsed and stored in a linked list.
 The locations of these commands in the string are indicated by
 CTLBACKQ and CTLBACKQ+CTLQUOTE characters, depending upon whether
 the back quotes were enclosed in double quotes.
+Arithmetic expansion starts with CTLARI and ends with CTLENDARI.
 The character CTLESC escapes the next character, so that in case
 any of the CTL characters mentioned above appear in the input,
 they can be passed through transparently. CTLESC is also used to
@@ -153,11 +168,11 @@ right. In the case of here documents which are not subject to
 variable and command substitution, the parser doesn't insert any
 CTLESC characters to begin with (so the contents of the text
 field can be written without any processing). Other here docu-
-ments, and words which are not subject to splitting and file name
-generation, have the CTLESC characters removed during the vari-
-able and command substitution phase. Words which are subject to
-splitting and file name generation have the CTLESC characters re-
-moved as part of the file name phase.
+ments, and words which are not subject to file name generation,
+have the CTLESC characters removed during the variable and command
+substitution phase. Words which are subject to file name
+generation have the CTLESC characters removed as part of the file
+name phase.
 EXECUTION: Command execution is handled by the following files:
 eval.c The top level routines.
@@ -199,10 +214,10 @@ later.)
 The routine shellexec is the interface to the exec system call.
-EXPAND.C: Arguments are processed in three passes. The first
-(performed by the routine argstr) performs variable and command
-substitution. The second (ifsbreakup) performs word splitting
-and the third (expandmeta) performs file name generation.
+EXPAND.C: As the routine argstr generates words by parameter
+expansion, command substitution and arithmetic expansion, it
+performs word splitting on the result. As each word is output,
+the routine expandmeta performs file name generation (if enabled).
 VAR.C: Variables are stored in a hash table. Probably we should
 switch to extensible hashing. The variable name is stored in the
@@ -221,8 +236,8 @@ BUILTIN COMMANDS: The procedures for handling these are scat-
 tered throughout the code, depending on which location appears
 most appropriate. They can be recognized because their names al-
 ways end in "cmd". The mapping from names to procedures is
-specified in the file builtins, which is processed by the mkbuilt-
-ins command.
+specified in the file builtins.def, which is processed by the
+mkbuiltins command.
 A builtin command is invoked with argc and argv set up like a
 normal program. A builtin command is allowed to overwrite its
@@ -230,22 +245,20 @@ arguments. Builtin routines can call nextopt to do option pars-
 ing. This is kind of like getopt, but you don't pass argc and
 argv to it. Builtin routines can also call error. This routine
 normally terminates the shell (or returns to the main command
-loop if the shell is interactive), but when called from a builtin
-command it causes the builtin command to terminate with an exit
-status of 2.
+loop if the shell is interactive), but when called from a non-
+special builtin command it causes the builtin command to
+terminate with an exit status of 2.
 The directory bltins contains commands which can be compiled in-
 dependently but can also be built into the shell for efficiency
-reasons. The makefile in this directory compiles these programs
-in the normal fashion (so that they can be run regardless of
-whether the invoker is ash), but also creates a library named
-bltinlib.a which can be linked with ash. The header file bltin.h
-takes care of most of the differences between the ash and the
-stand-alone environment. The user should call the main routine
-"main", and #define main to be the name of the routine to use
-when the program is linked into ash. This #define should appear
-before bltin.h is included; bltin.h will #undef main if the pro-
-gram is to be compiled stand-alone.
+reasons. The header file bltin.h takes care of most of the
+differences between the ash and the stand-alone environment.
+The user should call the main routine "main", and #define main to
+be the name of the routine to use when the program is linked into
+ash. This #define should appear before bltin.h is included;
+bltin.h will #undef main if the program is to be compiled
+stand-alone. A similar approach is used for a few utilities from
+bin and usr.bin.
 CD.C: This file defines the cd and pwd builtins.
@@ -258,7 +271,7 @@ is called at appropriate points to actually handle the signal.
 When an interrupt is caught and no trap has been set for that
 signal, the routine "onint" in error.c is called.
-OUTPUT: Ash uses it's own output routines. There are three out-
+OUTPUT: Ash uses its own output routines. There are three out-
 put structures allocated. "Output" represents the standard out-
 put, "errout" the standard error, and "memout" contains output
 which is to be stored in memory. This last is used when a buil-

@@ -1222,7 +1222,7 @@ bltincmd(int argc, char **argv)
 		return 127;
 	}
 	/*
-	 * Preserve exitstatus of a previous possible redirection
+	 * Preserve exitstatus of a previous possible command substitution
 	 * as POSIX mandates
 	 */
 	return exitstatus;

@@ -338,7 +338,7 @@ find_command(const char *name, struct cmdentry *entry, int act,
 	cd = 0;
-	/* If name is in the table, and not invalidated by cd, we're done */
+	/* If name is in the table, we're done */
 	if ((cmdp = cmdlookup(name, 0)) != NULL) {
 		if (cmdp->cmdtype == CMDFUNCTION && act & DO_NOFUNC)
 			cmdp = NULL;
@@ -485,8 +485,7 @@ changepath(const char *newval __unused)
 /*
- * Clear out command entries. The argument specifies the first entry in
- * PATH which has changed.
+ * Clear out cached utility locations.
  */
 void

@@ -222,9 +222,9 @@ stputs_split(const char *data, const char *syntax, int flag, char *p,
  * The result is left in the stack string.
  * When arglist is NULL, perform here document expansion.
  *
- * Caution: this function uses global state and is not reentrant.
- * However, a new invocation after an interrupted invocation is safe
- * and will reset the global state for the new call.
+ * When doing something that may cause this to be re-entered, make sure
+ * the stack string is empty via grabstackstr() and do not assume expdest
+ * remains valid.
  */
 void
 expandarg(union node *arg, struct arglist *arglist, int flag)
@@ -476,7 +476,7 @@ expbackq(union node *cmd, int quoted, int flag, struct worddest *dst)
 		ifs = ifsset() ? ifsval() : " \t\n";
 	else
 		ifs = "";
-	/* Don't copy trailing newlines */
+	/* Remove trailing newlines */
 	for (;;) {
 		if (--in.nleft < 0) {
 			if (in.fd < 0)
@@ -821,7 +821,7 @@ evalvar(const char *p, struct nodelist **restrict argbackq, int flag,
 /*
- * Test whether a specialized variable is set.
+ * Test whether a special or positional parameter is set.
  */
 static int
@@ -918,7 +918,7 @@ reprocess(int startloc, int flag, int subtype, int quoted,
 }
 /*
- * Add the value of a specialized variable to the stack string.
+ * Add the value of a special or positional parameter to the stack string.
  */
 static void

@@ -141,6 +141,8 @@ optschanged(void)
 /*
  * Process shell options. The global variable argptr contains a pointer
  * to the argument list; we advance it past the options.
+ * If cmdline is true, process the shell's argv; otherwise, process arguments
+ * to the set special builtin.
  */
 static void
@@ -392,7 +394,7 @@ shiftcmd(int argc, char **argv)
 /*
- * The set command builtin.
+ * The set builtin command.
  */
 int
@@ -558,7 +560,7 @@ out:
 /*
  * Standard option processing (a la getopt) for builtin routines. The
  * only argument that is passed to nextopt is the option string; the
- * other arguments are unnecessary. It return the character, or '\0' on
+ * other arguments are unnecessary. It returns the option, or '\0' on
  * end of input.
  */