awk: bring in vendor branch from upstream 20210727

Changes since the last import:

July 27, 2021:
	As per IEEE Std 1003.1-2008, -F "str" is now consistent with
	-v FS="str" when str is null. Thanks to Warner Losh.

July 24, 2021:
	Fix readrec's definition of a record. This fixes an issue
	with NetBSD's RS regular expression support that can cause
	an infinite read loop. Thanks to Miguel Pineiro Jr.

	Fix regular expression RS ^-anchoring. RS ^-anchoring needs to
	know if it is reading the first record of a file. This change
	restores a missing line that was overlooked when porting NetBSD's
	RS regex functionality. Thanks to Miguel Pineiro Jr.

	Fix size computation in replace_repeat() for special case
	REPEAT_WITH_Q. Thanks to Todd C. Miller.

Also, for the first time, import all the tests.

Sponsored by:		Netflix
Warner Losh 2021-08-01 10:02:22 -06:00
parent 746b7396bb
commit f9002b8561
330 changed files with 71928 additions and 13 deletions

17
FIXES

@ -25,6 +25,23 @@ THIS SOFTWARE.
 This file lists all bug fixes, changes, etc., made since the AWK book
 was sent to the printers in August, 1987.
+July 27, 2021:
+	As per IEEE Std 1003.1-2008, -F "str" is now consistent with
+	-v FS="str" when str is null. Thanks to Warner Losh.
+July 24, 2021:
+	Fix readrec's definition of a record. This fixes an issue
+	with NetBSD's RS regular expression support that can cause
+	an infinite read loop. Thanks to Miguel Pineiro Jr.
+	Fix regular expression RS ^-anchoring. RS ^-anchoring needs to
+	know if it is reading the first record of a file. This change
+	restores a missing line that was overlooked when porting NetBSD's
+	RS regex functionality. Thanks to Miguel Pineiro Jr.
+	Fix size computation in replace_repeat() for special case
+	REPEAT_WITH_Q. Thanks to Todd C. Miller.
 February 15, 2021:
 	Small fix so that awk will compile again with g++. Thanks to
 	Arnold Robbins.

123
README.md Normal file

@ -0,0 +1,123 @@
# The One True Awk
This is the version of `awk` described in _The AWK Programming Language_,
by Al Aho, Brian Kernighan, and Peter Weinberger
(Addison-Wesley, 1988, ISBN 0-201-07981-X).
## Copyright
Copyright (C) Lucent Technologies 1997<br/>
All Rights Reserved
Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.
LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
## Distribution and Reporting Problems
Changes, mostly bug fixes and occasional enhancements, are listed
in `FIXES`. If you distribute this code further, please please please
distribute `FIXES` with it.
If you find errors, please report them
to bwk@cs.princeton.edu.
Please _also_ open an issue in the GitHub issue tracker, to make
it easy to track issues.
Thanks.
## Submitting Pull Requests
Pull requests are welcome. Some guidelines:
* Please do not use functions or facilities that are not standard (e.g.,
`strlcpy()`, `fpurge()`).
* Please run the test suite and make sure that your changes pass before
posting the pull request. To do so:
1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
1. Run `oldawk=nawk make check > check.out 2>&1`.
1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.
* Please create the pull request with a request
to merge into the `staging` branch instead of into the `master` branch.
This allows us to do testing, and to make any additional edits or changes
after the merge but before merging to `master`.
## Building
The program itself is created by
	make
which should produce a sequence of messages roughly like this:
	yacc -d awkgram.y
	conflicts: 43 shift/reduce, 85 reduce/reduce
	mv y.tab.c ytab.c
	mv y.tab.h ytab.h
	cc -c ytab.c
	cc -c b.c
	cc -c main.c
	cc -c parse.c
	cc maketab.c -o maketab
	./maketab >proctab.c
	cc -c proctab.c
	cc -c tran.c
	cc -c lib.c
	cc -c run.c
	cc -c lex.c
	cc ytab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm
This produces an executable `a.out`; you will eventually want to
move this to some place like `/usr/bin/awk`.
If your system does not have `yacc` or `bison` (the GNU
equivalent), you need to install one of them first.
NOTE: This version uses ANSI C (C 99), as you should also. We have
compiled this without any changes using `gcc -Wall` and/or local C
compilers on a variety of systems, but new systems or compilers
may raise some new complaint; reports of difficulties are
welcome.
This compiles without change on Macintosh OS X using `gcc` and
the standard developer tools.
You can also use `make CC=g++` to build with the GNU C++ compiler,
should you choose to do so.
The version of `malloc` that comes with some systems is sometimes
astonishingly slow. If `awk` seems slow, you might try fixing that.
More generally, turning on optimization can significantly improve
`awk`'s speed, perhaps by 1/3 for highest levels.
## A Note About Releases
We don't do releases.
## A Note About Maintenance
NOTICE! Maintenance of this program is on a ''best effort''
basis. We try to get to issues and pull requests as quickly
as we can. Unfortunately, however, keeping this program going
is not at the top of our priority list.
#### Last Updated
Sat Jul 25 14:00:07 EDT 2021

19
TODO Normal file

@ -0,0 +1,19 @@
Wed Jan 22 02:10:35 MST 2020
============================
Here are some things that it'd be nice to have volunteer
help on.
1. Rework the test suite so that it's easier to maintain
and see exactly which tests fail:
A. Extract beebe.tar into separate file and update scripts
B. Split apart multiple tests into separate tests with input
and "ok" files for comparisons.
2. Pull in more of the tests from gawk that only test standard features.
The beebe.tar file appears to be from sometime in the 1990s.
3. Make the One True Awk valgrind clean. In particular add a
test suite target that runs valgrind on all the tests and
reports if there are any definite losses or any invalid reads
or writes (similar to gawk's test of this nature).

9
b.c

@ -935,7 +935,7 @@ replace_repeat(const uschar *reptok, int reptoklen, const uschar *atom,
 	if (special_case == REPEAT_PLUS_APPENDED) {
 		size++; /* for the final + */
 	} else if (special_case == REPEAT_WITH_Q) {
-		size += init_q + (atomlen+1)* n_q_reps;
+		size += init_q + (atomlen+1)* (n_q_reps-init_q);
 	} else if (special_case == REPEAT_ZERO) {
 		size += 2; /* just a null ERE: () */
 	}
@ -964,11 +964,8 @@ replace_repeat(const uschar *reptok, int reptoklen, const uschar *atom,
 		}
 	}
 	memcpy(&buf[j], reptok+reptoklen, suffix_length);
-	if (special_case == REPEAT_ZERO) {
-		buf[j+suffix_length] = '\0';
-	} else {
-		buf[size] = '\0';
-	}
+	j += suffix_length;
+	buf[j] = '\0';
 	/* free old basestr */
 	if (firstbasestr != basestr) {
 		if (basestr)

28
bugs-fixed/REGRESS Executable file
View File

@ -0,0 +1,28 @@
#! /bin/bash
if [ ! -f ../a.out ]
then
echo Making executable
(cd .. ; make) || exit 0
fi
for i in *.awk
do
echo === $i
OUT=${i%.awk}.OUT
OK=${i%.awk}.ok
IN=${i%.awk}.in
input=
if [ -f $IN ]
then
input=$IN
fi
../a.out -f $i $input > $OUT 2>&1
if cmp -s $OK $OUT
then
rm -f $OUT
else
echo ++++ $i failed!
fi
done


@ -0,0 +1 @@
foo


@ -0,0 +1,4 @@
{
for (i = 1; i <= NF; i++)
print i, $i, $i + 0
}


@ -0,0 +1 @@
-inf -inform inform -nan -nancy nancy -123 0 123 +123 nancy +nancy +nan inform +inform +inf


@ -0,0 +1,16 @@
1 -inf -inf
2 -inform 0
3 inform 0
4 -nan -nan
5 -nancy 0
6 nancy 0
7 -123 -123
8 0 0
9 123 123
10 +123 123
11 nancy 0
12 +nancy 0
13 +nan +nan
14 inform 0
15 +inform 0
16 +inf +inf


@ -0,0 +1 @@
\


@ -0,0 +1,4 @@
../a.out: syntax error at source line 1 source file pfile-overflow.awk
context is
>>> <<<
../a.out: bailing out at source line 1 source file pfile-overflow.awk


@ -0,0 +1 @@
BEGIN { RS="zx" } { print $1 }


@ -0,0 +1 @@
<EFBFBD>


@ -0,0 +1 @@
<EFBFBD>

4
lib.c

@ -176,6 +176,7 @@ int getrec(char **pbuf, int *pbufsize, bool isrecord) /* get next input record *
 			infile = stdin;
 		else if ((infile = fopen(file, "r")) == NULL)
 			FATAL("can't open file %s", file);
+		innew = true;
 		setfval(fnrloc, 0.0);
 	}
 	c = readrec(&buf, &bufsize, infile, innew);
@ -241,6 +242,7 @@ int readrec(char **pbuf, int *pbufsize, FILE *inf, bool newflag) /* read one rec
 		}
 		if (found)
 			setptr(patbeg, '\0');
+		isrec = (found == 0 && *buf == '\0') ? false : true;
 	} else {
 		if ((sep = *rs) == 0) {
 			sep = '\n';
@ -270,10 +272,10 @@ int readrec(char **pbuf, int *pbufsize, FILE *inf, bool newflag) /* read one rec
 		if (!adjbuf(&buf, &bufsize, 1+rr-buf, recsize, &rr, "readrec 3"))
 			FATAL("input record `%.30s...' too long", buf);
 		*rr = 0;
+		isrec = (c == EOF && rr == buf) ? false : true;
 	}
 	*pbuf = buf;
 	*pbufsize = bufsize;
-	isrec = *buf || !feof(inf);
 	DPRINTF("readrec saw <%s>, returns %d\n", buf, isrec);
 	return isrec;
 }

8
main.c

@ -22,7 +22,7 @@ ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
 THIS SOFTWARE.
 ****************************************************************/
-const char *version = "version 20210215";
+const char *version = "version 20210724";
 #define DEBUG
 #include <stdio.h>
@ -91,9 +91,7 @@ setfs(char *p)
 	/* wart: t=>\t */
 	if (p[0] == 't' && p[1] == '\0')
 		return "\t";
-	else if (p[0] != '\0')
-		return p;
-	return NULL;
+	return p;
 }
 static char *
@ -169,8 +167,6 @@ int main(int argc, char *argv[])
 			break;
 		case 'F': /* set field separator */
 			fs = setfs(getarg(&argc, &argv, "no field separator"));
-			if (fs == NULL)
-				WARNING("field separator FS is empty");
 			break;
 		case 'v': /* -v a=1 to be done NOW. one -v for each */
 			vn = getarg(&argc, &argv, "no variable name");

10
testdir/Compare.T1 Executable file

@ -0,0 +1,10 @@
oldawk=${oldawk-awk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
for i in T.*
do
$i
done

35
testdir/Compare.drek Executable file

@ -0,0 +1,35 @@
# an arbitrary collection of input data
cat td.1 td.1 >foo.td
sed 's/^........................//' td.1 >>foo.td
pr -m td.1 td.1 td.1 >>foo.td
pr -2 td.1 >>foo.td
wc foo.td
td=foo.td
>footot
for i in $*
do
echo $i >/dev/tty
echo $i '<<<'
cd ..
echo testdir/$i:
ind <testdir/$i
a.out -f testdir/$i >drek.c
cat drek.c
make drek || ( echo $i ' ' bad compile; echo $i ' ' bad compile >/dev/tty; continue )
cd testdir
time /usr/bin/awk -f $i $td >foo2 2>foo2t
cat foo2t
time ../drek $td >foo1 2>foo1t
cat foo1t
cmp foo1 foo2 || ( echo $i ' ' bad; echo $i ' ' bad >/dev/tty; diff foo1 foo2 | sed 20q )
echo '>>>' $i
echo
echo $i: >>footot
cat foo1t foo2t >>footot
done
ctimes footot

17
testdir/Compare.p Executable file

@ -0,0 +1,17 @@
oldawk=${oldawk-awk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
for i
do
echo "$i:"
$oldawk -f $i test.countries test.countries >foo1
$awk -f $i test.countries test.countries >foo2
if cmp -s foo1 foo2
then true
else echo -n "$i: BAD ..."
fi
diff -b foo1 foo2 | sed -e 's/^/ /' -e 10q
done

17
testdir/Compare.t Executable file

@ -0,0 +1,17 @@
oldawk=${oldawk-myawk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
for i
do
echo "$i:"
$oldawk -f $i test.data >foo1
$awk -f $i test.data >foo2
if cmp -s foo1 foo2
then true
else echo -n "$i: BAD ..."
fi
diff -b foo1 foo2 | sed -e 's/^/ /' -e 10q
done

49
testdir/Compare.tt Executable file

@ -0,0 +1,49 @@
#!/bin/sh
oldawk=${oldawk-awk}
awk=${awk-../a.out}
echo compiling time.c
gcc time.c -o time
time=./time
echo time command = $time
#case `uname` in
#SunOS)
# time=/usr/bin/time ;;
#Linux)
# time=/usr/bin/time ;;
#*)
# time=time ;;
#esac
echo oldawk = $oldawk, awk = $awk, time command = $time
# an arbitrary collection of input data
cat td.1 td.1 >foo.td
sed 's/^........................//' td.1 >>foo.td
pr -m td.1 td.1 td.1 >>foo.td
pr -2 td.1 >>foo.td
cat bib >>foo.td
wc foo.td
td=foo.td
>footot
for i in $*
do
echo $i "($oldawk vs $awk)":
# ind <$i
$time $oldawk -f $i $td >foo2 2>foo2t
cat foo2t
$time $awk -f $i $td >foo1 2>foo1t
cat foo1t
cmp foo1 foo2
echo $i: >>footot
cat foo1t foo2t >>footot
done
ctimes footot

10
testdir/NOTES Normal file

@ -0,0 +1,10 @@
Need some tests for octal, hex, various string escapes.
Need to complete the sub and gsub tests.
more on printf, especially weird formats
more on operators
never throw away a test

44
testdir/README.TESTS Normal file

@ -0,0 +1,44 @@
The archive of test files contains
- A shell file called REGRESS that controls the testing process.
- Several shell files called Compare* that control sub-parts
of the testing.
- About 160 small tests called t.* that constitute a random
sampling of awk constructions collected over the years.
Not organized, but they touch almost everything.
- About 60 small tests called p.* that come from the first
two chapters of The AWK Programming Language. This is
basic stuff -- they have to work.
These two sets are intended as regression tests, to be sure
that a new version produces the same results as a previous one.
There are a couple of standard data files used with them,
test.data and test.countries, but others would work too.
- About 20 files called T.* that are self-contained and
more systematic tests of specific language features.
For example, T.clv tests command-line variable handling.
These tests are not regressions -- they compute the right
answer by separate means, then compare the awk output.
A specific test for each new bug found shows up in at least
one of these, most often T.misc. There are about 220 tests
total in these files.
- Two of these files, T.re and T.sub, are systematic tests
of the regular expression and substitution code. They express
tests in a small language, then generate awk programs that
verify behavior.
- About 20 files called tt.* that are used as timing tests;
they use the most common awk constructions in straightforward
ways, against a large input file constructed by Compare.tt.
There is undoubtedly more stuff in the archive; it's been
collecting for years and may need pruning. Suggestions for
improvement, additional tests (especially systematic ones),
and the like are all welcome.

21
testdir/REGRESS Executable file

@ -0,0 +1,21 @@
#!/bin/sh
uname -a
gcc echo.c -o echo && echo echo compiled
oldawk=${oldawk-awk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
oldawk=$oldawk awk=$awk Compare.t t.*
echo `ls t.* | wc -l` tests; echo
oldawk=$oldawk awk=$awk Compare.p p.? p.??*
echo `ls p.* | wc -l` tests; echo
oldawk=$oldawk awk=$awk Compare.T1
echo `grep '\$awk' T.* | wc -l` tests; echo
oldawk=$oldawk awk=$awk Compare.tt tt.*
echo `ls tt.* | wc -l` tests; echo

35
testdir/T.-f-f Executable file

@ -0,0 +1,35 @@
#!/bin/sh
echo T.-f-f: check multiple -f arguments
awk=${awk-../a.out}
echo 'begin
end' >foo
echo 'BEGIN { print "begin" }' >foo1
echo 'END { print "end" }' >foo2
echo xxx | $awk -f foo1 -f foo2 >foo3
diff foo foo3 || echo 'BAD: T.-f-f multiple -fs'
echo '/a/' | $awk -f - /etc/passwd >foo1
$awk '/a/' /etc/passwd >foo2
diff foo1 foo2 || echo 'BAD: T.-f-f -f -'
cp /etc/passwd foo1
echo '/./ {' >foo2
echo 'print' >foo3
echo '}' >foo4
$awk -f foo2 -f foo3 -f foo4 /etc/passwd >foo5
diff foo1 foo5 || echo 'BAD: T.-f-f 3 files'
echo '/./ {' >foo2
echo 'print' >foo3
echo '
]' >foo4
$awk -f foo2 -f foo3 -f foo4 /etc/passwd >foo5 2>foo6
grep 'syntax error.*file foo4' foo6 >/dev/null 2>&1 || echo 'BAD: T.-f-f source file name'

144
testdir/T.argv Executable file

@ -0,0 +1,144 @@
echo T.argv: misc tests of argc and argv
awk=${awk-../a.out}
echo >foo1
echo >foo2
$awk '
BEGIN {
for (i = 1; i < ARGC-1; i++)
printf "%s ", ARGV[i]
if (ARGC > 1)
printf "%s", ARGV[i]
printf "\n"
exit
}' * >foo1
echo * >foo2
diff foo1 foo2 || echo 'BAD: T.argv (echo1 *)'
$awk '
BEGIN {
for (i = 1; i < ARGC; i++) {
printf "%s", ARGV[i]
if (i < ARGC-1)
printf " "
}
printf "\n"
exit
}' * >foo1
echo * >foo2
diff foo1 foo2 || echo 'BAD: T.argv (echo2 *)'
$awk '
BEGIN {
print ARGC
ARGV[ARGC-1] = ""
for (i=0; i < ARGC; i++)
print ARGV[i]
exit
}
' a bc def gh >foo1
echo "5
$awk
a
bc
def
" >foo2
diff foo1 foo2 || echo 'BAD: T.argv (argc *)'
echo '1
2
3' >foo0
echo 'foo1
foo2
foo3' >foo1
$awk '{print L $0}' L=foo <foo0 >foo2
diff foo1 foo2 || echo 'BAD: T.argv (L=foo <foo1)'
echo '1
2
3' >foo0
echo 'foo1
foo2
foo3' >foo1
$awk '{print L $0}' L=foo foo0 >foo2
diff foo1 foo2 || echo 'BAD: T.argv (L=foo foo1)'
echo '1
2
3' >foo0
echo 'foo1
foo2
foo3' >foo1
cat foo0 | $awk '{print L $0}' L=foo - >foo2
diff foo1 foo2 || echo 'BAD: T.argv (L=foo -)'
echo '1
2
3' >foo0
echo 'foo1
foo2
foo3
glop1
glop2
glop3' >foo1
$awk '{print L $0}' L=foo foo0 L=glop foo0 >foo2
diff foo1 foo2 || echo 'BAD: T.argv (L=foo L=glop)'
echo '1
2
3' >foo0
echo '111
112
113
221
222
223' >foo1
$awk '{print L $0}' L=11 foo0 L=22 foo0 >foo2
diff foo1 foo2 || echo 'BAD: T.argv (L=11 L=22)'
echo 3.345 >foo1
$awk 'BEGIN { print ARGV[1] + ARGV[2]}' 1 2.345 >foo2
diff foo1 foo2 || echo 'BAD: T.argv (ARGV[1] + ARGV[2])'
echo 3.345 >foo1
x1=1 x2=2.345 $awk 'BEGIN { print ENVIRON["x1"] + ENVIRON["x2"]}' 1 2.345 >foo2
diff foo1 foo2 || echo 'BAD: T.argv (ENVIRON[x1] + ENVIRON[x2])'
echo 'foo1' >foo1
echo 'foo2' >foo2
echo 'foo3' >foo3
$awk 'BEGIN { ARGV[2] = "" }
{ print }' foo1 foo2 foo3 >foo4
echo 'foo1
foo3' >foo5
diff foo4 foo5 || echo 'BAD: T.argv zap ARGV[2]'
echo hi > foo1 ; mv foo1 foo2
$awk 'BEGIN { ARGV[1] = "foo2" ; print FILENAME }
{ print FILENAME }' foo1 >foo3
echo '
foo2' >foo4
diff foo3 foo4 || echo 'BAD: T.argv startup FILENAME'
# assumes that startup FILENAME is ""
# test data balanced on pinhead...
echo 'ARGV[3] is /dev/null
ARGV[0] is ../a.out
ARGV[1] is /dev/null' >foo1
$awk 'BEGIN { # this is a variant of arnolds original example
ARGV[1] = "/dev/null"
ARGV[2] = "glotch" # file open must skipped deleted argv
ARGV[3] = "/dev/null"
ARGC = 4
delete ARGV[2]
}
# note that input is read here
END {
for (i in ARGV)
printf("ARGV[%d] is %s\n", i, ARGV[i])
}' >foo2
diff foo1 foo2 || echo 'BAD: T.argv delete ARGV[2]'

19
testdir/T.arnold Executable file

@ -0,0 +1,19 @@
echo T.arnold: test fixes by Arnold Robbins 8/18
# for which many thanks
rm -rf arnold-fixes
tar xf arnold-fixes.tar
cd arnold-fixes
pwd
awk=../../a.out
ls -l $awk
for i in *.awk
do
name=$(basename $i .awk)
#echo $name:
$awk -f $name.awk >foo.$name
diff $name.ok foo.$name || echo "BAD: T.arnold ($name)"
done

8
testdir/T.beebe Executable file

@ -0,0 +1,8 @@
echo T.beebe: tests from nelson beebe from gawk test suite
# for which thanks.
rm -rf beebe
tar xf beebe.tar # creates beebe
cd beebe
make all | sed 's/^/ /' | grep -v cmp

90
testdir/T.builtin Executable file

@ -0,0 +1,90 @@
echo T.builtin: test miscellaneous builtin functions
awk=${awk-../a.out}
$awk 'BEGIN { print index(123, substr(123, 2)) }' >foo1
echo 2 >foo2
diff foo1 foo2 || echo 'BAD: T.builtin (index/substr)'
$awk 'BEGIN {
pi = 2 * atan2(1, 0)
printf("%.5f %.3f %.3f %.5f %.3f\n",
pi, sin(pi), cos(pi/2), exp(log(pi)), log(exp(10)))
}' >foo1
echo '3.14159 0.000 0.000 3.14159 10.000' >foo2
diff foo1 foo2 || echo 'BAD: T.builtin (sin/cos)'
$awk 'BEGIN {
s = srand(1) # set a real random start
for (i = 1; i <= 10; i++)
print rand() >"foo1"
srand(s) # reset it
for (i = 1; i <= 10; i++)
print rand() >"foo2"
}'
diff foo1 foo2 || echo 'BAD: T.builtin (rand)'
echo 'hello, WORLD!' |
$awk '{ printf("%s|%s|%s\n", tolower($0), toupper($0), $0)}' >foo1
echo 'hello, world!|HELLO, WORLD!|hello, WORLD!' >foo2
diff foo1 foo2 || echo 'BAD: T.builtin (toupper/tolower)'
if locale -a | grep -qsi de_DE.UTF-8; then
(export LANG=de_DE.UTF-8 && echo 'Dürst' |
$awk '{ printf("%s|%s|%s\n", tolower($0), toupper($0), $0)}') >foo1
echo 'dürst|DÜRST|Dürst' >foo2
diff foo1 foo2 || echo 'BAD: T.builtin (toupper/tolower) for utf-8'
(export LC_NUMERIC=de_DE.UTF-8 && $awk 'BEGIN { print 0.01 }' /dev/null) >foo1
echo "0.01" >foo2
diff foo1 foo2 || echo 'BAD: T.builtin LC_NUMERIC radix (.) handling'
fi
$awk 'BEGIN {
j = 1; sprintf("%d", 99, ++j) # does j get incremented?
if (j != 2)
print "BAD: T.builtin (printf arg list not evaluated)"
}'
$awk 'BEGIN {
j = 1; substr("", 1, ++j) # does j get incremented?
if (j != 2)
print "BAD: T.builtin (substr arg list not evaluated)"
}'
$awk 'BEGIN {
j = 1; sub(/1/, ++j, z) # does j get incremented?
if (j != 2)
print "BAD: T.builtin (sub() arg list not evaluated)"
}'
$awk 'BEGIN {
j = 1; length("zzzz", ++j, ++j) # does j get incremented?
if (j != 3)
print "BAD: T.builtin (excess length args not evaluated)"
}' 2>foo
grep 'too many arg' foo >/dev/null || echo 'T.bad: too many args not caught'
echo 'a
a b
a b c' >foo0
echo '1
2
3' >foo1
$awk '{ n = split($0, x); print length(x) }' <foo0 >foo2
diff foo1 foo2 || echo 'BAD: T.builtin length array'
# Test for backslash handling
cat << \EOF >foo0
BEGIN {
print "A\
B";
print "CD"
}
EOF
$awk -f foo0 /dev/null >foo1
cat << \EOF >foo2
AB
CD
EOF
diff foo1 foo2 || echo 'BAD: T.builtin continuation handling (backslash)'

11
testdir/T.chem Executable file

@ -0,0 +1,11 @@
echo T.chem: test chem.awk
awk=${awk-../a.out}
oldawk=${oldawk-awk}
for i in lsd1.p penicil.p res.p
do
$awk -f chem.awk $i >foo1
$oldawk -f chem.awk $i >foo2
diff foo1 foo2 || echo "BAD: T.chem on $i"
done

36
testdir/T.close Executable file

@ -0,0 +1,36 @@
echo T.close: test close built-in
awk=${awk-../a.out}
rm -f foo
$awk '{ print >>"foo"; close("foo") }' /etc/passwd
diff /etc/passwd foo || echo 'BAD: T.close (1)'
ls -l >foo
tail -1 foo >foo1
$awk '{ print >"foo2"; close("foo2") }' foo
diff foo1 foo2 || echo 'BAD: T.close (2)'
echo 0 >foo1
$awk ' # non-accessible file
BEGIN { getline <"/etc/passwd"; print close("/etc/passwd"); }
' >foo2
diff foo1 foo2 || echo 'BAD: T.close (3)'
echo -1 >foo1
$awk ' # file not opened
BEGIN { print close("glotch"); }
' >foo2
diff foo1 foo2 || echo 'BAD: T.close (4)'
echo 0 >foo1
$awk ' # normal close
BEGIN { print "hello" > "foo"; print close("foo"); }
' >foo2
diff foo1 foo2 || echo 'BAD: T.close (5)'
echo 0 >foo1
$awk ' # normal close
BEGIN { print "hello" | "cat >foo"; print close("cat >foo"); }
' >foo2
diff foo1 foo2 || echo 'BAD: T.close (6)'

181
testdir/T.clv Executable file

@ -0,0 +1,181 @@
#!/bin/sh
echo T.clv: check command-line variables
awk=${awk-../a.out}
rm -f core
# stdin only, no cmdline asgn
echo 'hello
goodbye' | $awk '
BEGIN { x=0; print x; getline; print x, $0 }
' >foo1
echo '0
0 hello' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (stdin only)'
# cmdline asgn then stdin
echo 'hello
goodbye' | $awk '
BEGIN { x=0; print x; getline; print x, $0 }
' x=1 >foo1
echo '0
1 hello' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=1 only)'
# several cmdline asgn, then stdin
echo 'hello
goodbye' | $awk '
BEGIN { x=0; print x; getline; print x, $0 }
' x=1 x=2 x=3 >foo1
echo '0
3 hello' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=3 only)'
# several cmdline asgn, then file
echo 'hello
goodbye' >foo
$awk '
BEGIN { x=0; print x; getline; print x, $0 }
' x=1 x=2 x=3 foo >foo1
echo '0
3 hello' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=3 only)'
# cmdline asgn then file
echo 4 >foo1
$awk 'BEGIN { getline; print x}' x=4 /dev/null >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=4 /dev/null)'
#cmdline asgn then file but no read of it
echo 0 >foo1
$awk 'BEGIN { x=0; getline <"/dev/null"; print x}' x=5 /dev/null >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=5 /dev/null)'
#cmdline asgn then file then read
echo 'xxx
yyy
zzz' >foo
echo '6
end' >foo1
$awk 'BEGIN { x=0; getline; print x}
END { print x }' x=6 foo x=end >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=6 /dev/null)'
#cmdline asgn then file then read
echo '0
end' >foo1
$awk 'BEGIN { x=0; getline <"/dev/null"; print x}
END { print x }' x=7 /dev/null x=end >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=7 /dev/null)'
#cmdline asgn then file then read; _ in commandname
echo '0
end' >foo1
$awk 'BEGIN { _=0; getline <"/dev/null"; print _}
END { print _ }' _=7A /dev/null _=end >foo2
diff foo1 foo2 || echo 'BAD: T.clv (_=7A /dev/null)'
# illegal varname in commandname
$awk '{ print }' 99_=foo /dev/null >foo 2>foo2
grep "can't open.*foo" foo2 >/dev/null 2>&1 || echo 'BAD: T.clv (7B: illegal varname)'
# these test the new -v option: awk ... -v a=1 -v b=2 'prog' does before BEGIN
echo 123 >foo1
$awk -v x=123 'BEGIN { print x }' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=11)'
echo 123 >foo1
$awk -vx=123 'BEGIN { print x }' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=11a)'
echo 123 abc 10.99 >foo1
$awk -v x=123 -v y=abc -v z1=10.99 'BEGIN { print x, y, z1 }' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=12)'
echo 123 abc 10.99 >foo1
$awk -vx=123 -vy=abc -vz1=10.99 'BEGIN { print x, y, z1 }' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=12a)'
echo 123 abc 10.99 >foo1
$awk -v x=123 -v y=abc -v z1=10.99 -- 'BEGIN { print x, y, z1 }' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=12a)'
echo 'BEGIN { print x, y, z1 }' >foo0
echo 123 abc 10.99 >foo1
$awk -v x=123 -v y=abc -f foo0 -v z1=10.99 >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=13)'
echo 'BEGIN { print x, y, z1 }' >foo0
echo 123 abc 10.99 >foo1
$awk -vx=123 -vy=abc -f foo0 -vz1=10.99 >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=13a)'
echo 'BEGIN { print x, y, z1 }' >foo0
echo 123 abc 10.99 >foo1
$awk -f foo0 -v x=123 -v y=abc -v z1=10.99 >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=14)'
echo 'BEGIN { print x, y, z1 }' >foo0
echo 123 abc 10.99 >foo1
$awk -f foo0 -vx=123 -vy=abc -vz1=10.99 >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=14a)'
echo 'BEGIN { print x, y, z1 }
END { print x }' >foo0
echo '123 abc 10.99
4567' >foo1
$awk -f foo0 -v x=123 -v y=abc -v z1=10.99 /dev/null x=4567 /dev/null >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=15)'
echo 'BEGIN { print x, y, z1 }
END { print x }' >foo0
echo '123 abc 10.99
4567' >foo1
$awk -f foo0 -vx=123 -vy=abc -vz1=10.99 /dev/null x=4567 /dev/null >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=15a)'
echo 'BEGIN { print x, y, z1 }
NR==1 { print x }' >foo0
echo '123 abc 10.99
4567' >foo1
$awk -v x=123 -v y=abc -v z1=10.99 -f foo0 x=4567 /etc/passwd >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=16)'
echo 'BEGIN { print x, y, z1 }
NR==1 { print x }' >foo0
echo '123 abc 10.99
4567' >foo1
$awk -vx=123 -vy=abc -vz1=10.99 -f foo0 x=4567 /etc/passwd >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=16a)'
# special chars in commandline assigned value;
# have to use local echo to avoid quoting problems.
./echo 'a\\b\z' >foo1
./echo 'hello' | $awk '{print x}' x='\141\\\\\142\\z' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=17)'
./echo "a
z" >foo1
./echo 'hello' | $awk '{print x}' x='a\nz' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=18)'
# a bit circular here...
$awk 'BEGIN { printf("a%c%c%cz\n", "\b", "\r", "\f") }' >foo1
./echo 'hello' | $awk '{print x}' x='a\b\r\fz' >foo2
diff foo1 foo2 || echo 'BAD: T.clv (x=19)'
### newer -v tests
$awk -vx 'BEGIN {print x}' >foo 2>&1
grep 'invalid -v option argument: x' foo >/dev/null || echo 'BAD: T.clv (x=20)'
$awk -v x 'BEGIN {print x}' >foo 2>&1
grep 'invalid -v option argument: x' foo >/dev/null || echo 'BAD: T.clv (x=20a)'

29
testdir/T.csconcat Executable file

@ -0,0 +1,29 @@
echo T.csconcat: test constant string concatenation
awk=${awk-../a.out}
$awk '
BEGIN {
$0 = "aaa"
print "abcdef" " " $0
}
BEGIN { print "hello" "world"; print helloworld }
BEGIN {
print " " "hello"
print "hello" " "
print "hello" " " "world"
print "hello" (" " "world")
}
' > foo1
cat << \EOF > foo2
abcdef aaa
helloworld
hello
hello
hello world
hello world
EOF
diff foo1 foo2 || echo 'BAD: T.csconcat (1)'

21
testdir/T.delete Executable file

@ -0,0 +1,21 @@
echo T.delete: misc tests of array deletion
awk=${awk-../a.out}
echo '1 2 3 4
1 2 3
1
' >foo0
echo '4 3 0
3 2 0
1 0 0
0 0 0' >foo2
$awk '
{ n = split($0, x)
delete x[1]
n1 = 0; for (i in x) n1++
delete x;
n2 = 0; for (i in x) n2++
print n, n1, n2
}' foo0 >foo1
diff foo1 foo2 || echo 'BAD: T.delete (1)'

215
testdir/T.errmsg Executable file

@ -0,0 +1,215 @@
echo T.errmsg: check some error messages
awk=${awk-../a.out}
ls >glop
awk=$awk awk '
{ pat = $0
prog = ""
while (getline x > 0 && x != "")
prog = prog "\n" x
print sprintf("\n%s '"'"'%s'"'"' <glop >>devnull 2>foo",
ENVIRON["awk"], prog)
print sprintf("grep '"'"'%s'"'"' foo >>devnull || echo '"'"'BAD: %s'"'"' failed", pat, pat)
}
' >foo.sh <<\!!!!
illegal primary in regular expression
/(/
illegal break, continue, next or nextfile from BEGIN
BEGIN { nextfile }
illegal break, continue, next or nextfile from END
END { nextfile }
nextfile is illegal inside a function
function foo() { nextfile }
duplicate argument
function f(i,j,i) { return i }
nonterminated character class
/[[/
nonterminated character class
/[]/
nonterminated character class
/[\
nonterminated character class
BEGIN { s = "[x"; if (1 ~ s) print "foo"}
syntax error in regular expression
BEGIN { if ("x" ~ /$^/) print "ugh" }
syntax error in regular expression
/((.)/
division by zero
BEGIN { print 1/0 }
division by zero in /=
BEGIN { x = 1; print x /= 0 }
division by zero in %=
BEGIN { x = 1; print x %= 0 }
division by zero in mod
BEGIN { print 1%0 }
can.t read value.* array name.
BEGIN { x[1] = 0; split("a b c", y, x) }
can.t read value.* function
function f(){}; {split($0, x, f)}
can.t assign.* a function
function f(){}; {f = split($0, x)}
can.t assign to x; it.s an array name.
{x = split($0, x)}
is a function, not an array
function f(){}; {split($0, f)}
function f called with 1 args, uses only 0
BEGIN { f(f) }
function f() { print "x" }
can.t use function f as argument in f
BEGIN { f(f) }
function f() { print "x" }
x is an array, not a function
{ split($0, x) }; function x() {}
illegal nested function
function x() { function g() {} }
return not in function
{ return }
break illegal outside
{ break }
continue illegal outside
{ continue }
non-terminated string
{ print "abc
}
illegal field $(foo)
BEGIN { print $"foo" }
next is illegal inside a function
BEGIN { f() }
function f() { next }
not enough args in printf(%s)
BEGIN { printf("%s") }
weird printf conversion
BEGIN { printf("%z", "foo")}
function f has .* arguments, limit .*
function f(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,b1,b2,b3,b4,b5,b6,b7,b8,b9,b10,
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,d1,d2,d3,d4,d5,d6,d7,d8,d9,d10,
e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10) {}
BEGIN { f(123) }
bailing out
])}
bailing out
{ print }}
bailing out
{ print }}}
bailing out
]
bailing out
[
bailing out
a & b
extra )
{ x = 1) }
illegal statement
{ print ))}
illegal statement
{{ print }
illegal statement
{{{ print }
illegal .*next.* from BEGIN
BEGIN { next }
illegal .*next.* from END
END { next; print NR }
can.t open file ./nonexistentdir/foo
BEGIN { print "abc" >"./nonexistentdir/foo" }
you can.t define function f more than once
function f() { print 1 }
function f() { print 2 }
function mp called with 1 args, uses only 0
function mp(){ cnt++;}
BEGIN { mp(xx) }
index.*doesn.t permit regular expressions
BEGIN { index("abc", /a/) }
log argument out of domain
BEGIN { print log(-1) }
exp result out of range
BEGIN {print exp(1000)}
null file name in print or getline
BEGIN { print >foo }
function has too many arguments
BEGIN { length("abc", "def") }
calling undefined function foo
BEGIN { foo() }
this should print a BAD message
BEGIN { print }
!!!!
echo ' running tests in foo.sh'
sh foo.sh
test -r core && echo BAD: someone dropped core 1>&2
echo xxx >foo0
$awk '{print x}' x='a
b' foo0 >foo1 2>foo2
grep 'newline in string' foo2 >/dev/null || echo 'BAD: T.errmsg newline in string'
$awk -safe 'BEGIN{"date" | getline}' >foo 2>foo2
grep 'cmd | getline is unsafe' foo2 >/dev/null || echo 'BAD: T.errmsg cmd|getline unsafe'
$awk -safe 'BEGIN{print >"foo"}' >foo 2>foo2
grep 'print > is unsafe' foo2 >/dev/null || echo 'BAD: T.errmsg print > unsafe'
$awk -safe 'BEGIN{print >> "foo"}' >foo 2>foo2
grep 'print >> is unsafe' foo2 >/dev/null || echo 'BAD: T.errmsg print >> unsafe'
$awk -safe 'BEGIN{print | "foo"}' >foo 2>foo2
grep 'print | is unsafe' foo2 >/dev/null || echo 'BAD: T.errmsg print | unsafe'
$awk -safe 'BEGIN {system("date")}' >foo 2>foo2
grep 'system is unsafe' foo2 >/dev/null || echo 'BAD: T.errmsg system unsafe'

235
testdir/T.expr Executable file

@ -0,0 +1,235 @@
#!/bin/sh
echo T.expr: tests of miscellaneous expressions
awk=${awk-../a.out}
$awk '
BEGIN {
FS = "\t"
awk = "../a.out"
}
NF == 0 || $1 ~ /^#/ {
next
}
$1 ~ /try/ { # new test
nt++
sub(/try /, "")
prog = $0
printf("%3d %s\n", nt, prog)
prog = sprintf("%s -F\"\\t\" '"'"'%s'"'"'", awk, prog)
# print "prog is", prog
nt2 = 0
while (getline > 0) {
if (NF == 0) # blank line terminates a sequence
break
input = $1
for (i = 2; i < NF; i++) # input data
input = input "\t" $i
test = sprintf("./echo '"'"'%s'"'"' | %s >foo1; ",
input, prog)
if ($NF == "\"\"")
output = ">foo2;"
else
output = sprintf("./echo '"'"'%s'"'"' >foo2; ", $NF)
gsub(/\\t/, "\t", output)
gsub(/\\n/, "\n", output)
run = sprintf("cmp foo1 foo2 || echo test %d.%d failed",
nt, ++nt2)
# print "input is", input
# print "test is", test
# print "output is", output
# print "run is", run
system(test output run)
}
tt += nt2
}
END { print tt, "tests" }
' <<\!!!!
# General format:
# try program as rest of line
# $1 $2 $3 output1 (\t for tab, \n for newline,
# $1 $2 $3 output2 ("" for null)
# ... terminated by blank line
# try another program...
try { print ($1 == 1) ? "yes" : "no" }
1 yes
1.0 yes
1E0 yes
0.1E1 yes
10E-1 yes
01 yes
10 no
10E-2 no
try $1 > 0
1 1
2 2
0 ""
-1 ""
1e0 1e0
0e1 ""
-2e64 ""
3.1e4 3.1e4
try { print NF }
0
x 1
x y 2
y 2
x 2
try { print NF, $NF }
0
x 1 x
x y 2 y
x yy zzz 3 zzz
# this horror prints $($2+1)
try { i=1; print ($++$++i) }
1 1
1 2 3 3
abc abc
# concatenate $1 and ++$2; print new $1 and concatenated value
try { x = $1++++$2; print $1, x }
1 3 2 14
# do we get the precedence of ! right?
try $1 !$2
0 0 0\t0
0 1 0\t1
1 0 1\t0
1 1 1\t1
# another ava special
try { print ($1~/abc/ !$2) }
0 0 01
0 1 00
abc 0 11
xabcd 1 10
try { print !$1 + $2 }
1 3 3
0 3 4
-1 3 3
# aside: !$1 = $2 is now a syntax error
# the definition of "number" changes with isnumber.
# 2e100 is ok according to strtod.
# try 1
try { print ($1 == $2) }
0 0 1
0 1 0
0 00 1
0 "" 0
+0 -0 1
1 1.0 1
1 1e0 1
2e10 2.00e10 1
2e10 2e+10 1
2e-10 2e-10 1
2e10 2e-10 0
2e10 20e9 1
2e100 2.000e100 1
2e1000 2.0e1000 0
# this one (3 & 4) may "fail" if a negative 0 is printed as -0,
# but i think this might be a type-coercion problem.
try { print $1, +$1, -$1, - -$1 }
1 1 1 -1 1
-1 -1 -1 1 -1
0 0 0 0 0
x x 0 0 0
try { printf("a%*sb\n", $1, $2) }
1 x axb
2 x a xb
3 x a xb
try { printf("a%-*sb\n", $1, $2) }
1 x axb
2 x ax b
3 x ax b
try { printf("a%*.*sb\n", $1, $2, "hello") }
1 1 ahb
2 1 a hb
3 1 a hb
try { printf("a%-*.*sb\n", $1, $2, "hello") }
1 1 ahb
2 1 ah b
3 1 ah b
try { printf("%d %ld %lld %zd %jd %hd %hhd\n", $1, $1, $1, $1, $1, $1, $1) }
1 1 1 1 1 1 1 1
10 10 10 10 10 10 10 10
10000 10000 10000 10000 10000 10000 10000 10000
try { printf("%x %lx %llx %zx %jx %hx %hhx\n", $1, $1, $1, $1, $1, $1, $1) }
1 1 1 1 1 1 1 1
10 a a a a a a a
10000 2710 2710 2710 2710 2710 2710 2710
try { if ($1 ~ $2) print 1; else print 0 }
a \141 1
a \142 0
a \x61 1
a \x061 0
a \x62 0
0 \060 1
0 \60 1
0 \0060 0
Z \x5a 1
Z \x5A 1
try { print $1 ~ $2 }
a \141 1
a \142 0
a \x61 1
a \x061 0
a \x62 0
0 \060 1
0 \60 1
0 \0060 0
Z \x5a 1
Z \x5A 1
try { print $1 || $2 }
0
1 1
0 0 0
1 0 1
0 1 1
1 1 1
a b 1
try { print $1 && $2 }
0
1 0
0 0 0
1 0 0
0 1 0
1 1 1
a b 1
try { $1 = $2; $1 = $1; print $1 }
abc def def
abc def ghi def
# $f++ => ($f)++
try { f = 1; $f++; print f, $f }
11 22 33 1 12
# $f[1]++ => ($f[1])++
try { f[1]=1; f[2]=2; print $f[1], $f[1]++, $f[2], f[1], f[2] }
111 222 333 111 111 222 2 2
!!!!

21
testdir/T.exprconv Executable file

@ -0,0 +1,21 @@
echo T.exprconv: check conversion of expr to number
awk=${awk-../a.out}
$awk '
BEGIN { x = (1 > 0); print x
x = (1 < 0); print x
x = (1 == 1); print x
print ("a" >= "b")
print ("b" >= "a")
print (0 == 0.0)
# x = ((1 == 1e0) && (1 == 10e-1) && (1 == .1e2)); print x
exit
}' >foo1
echo '1
0
1
0
1
1' >foo2
cmp foo1 foo2 || echo 'BAD: T.exprconv (1 > 0, etc.)'

24
testdir/T.flags Executable file

@ -0,0 +1,24 @@
echo T.flags: test some commandline flags
awk=${awk-../a.out}
$awk >foo 2>&1
grep '[Uu]sage' foo >/dev/null || echo 'T.flags: bad usage'
$awk -f >foo 2>&1
grep 'no program' foo >/dev/null || echo 'T.flags: bad no program'
$awk -f glop/glop >foo 2>&1
grep 'can.t open' foo >/dev/null || echo 'T.flags: bad can.t open program'
$awk -fglop/glop >foo 2>&1
grep 'can.t open' foo >/dev/null || echo 'T.flags: bad can.t open program 2'
$awk -zz 'BEGIN{}' >foo 2>&1
grep 'unknown option' foo >/dev/null || echo 'T.flags: bad unknown option'
$awk -F >foo 2>&1
grep 'no field separator' foo >/dev/null || echo 'T.flags: bad missing field separator'
$awk -F '' >foo 2>&1
grep 'field separator FS is empty' foo >/dev/null || echo 'T.flags: bad empty field separator'

196
testdir/T.func Executable file

@ -0,0 +1,196 @@
echo T.func: test user-defined functions
awk=${awk-../a.out}
echo '10 2
2 10
10 10
10 1e1
1e1 9' | $awk '
# tests whether function returns sensible type bits
function assert(cond) { # assertion
if (cond) print 1; else print 0
}
function i(x) { return x }
{ m=$1; n=i($2); assert(m>n) }
' >foo1
echo '1
0
0
0
1' >foo2
diff foo1 foo2 || echo 'BAD: T.func (function return type)'
echo 'data: data' >foo1
$awk '
function test1(array) { array["test"] = "data" }
function test2(array) { return(array["test"]) }
BEGIN { test1(foo); print "data: " test2(foo) }
' >foo2
diff foo1 foo2 || echo 'BAD: T.func (array type)'
$awk '
BEGIN { code() }
END { codeout("x") }
function code() { ; }
function codeout(ex) { print ex }
' /dev/null >foo1
echo x >foo2
diff foo1 foo2 || echo 'BAD: T.func (argument passing)'
$awk '
BEGIN { unireghf() }
function unireghf(hfeed) {
hfeed[1]=0
rcell("foo",hfeed)
hfeed[1]=0
rcell("bar",hfeed)
}
function rcell(cellname,hfeed) {
print cellname
}
' >foo1
echo "foo
bar" >foo2
diff foo1 foo2 || echo 'BAD: T.func (convert arg to array)'
$awk '
function f(n) {
if (n <= 1)
return 1
else
return n * f(n-1)
}
{ print f($1) }
' <<! >foo2
0
1
2
3
4
5
6
7
8
9
!
cat <<! >foo1
1
1
2
6
24
120
720
5040
40320
362880
!
diff foo1 foo2 || echo 'BAD: T.func (factorial)'
$awk '
function ack(m,n) {
k = k+1
if (m == 0) return n+1
if (n == 0) return ack(m-1, 1)
return ack(m-1, ack(m, n-1))
}
{ k = 0; print ack($1,$2), "(" k " calls)" }
' <<! >foo2
0 0
1 1
2 2
3 3
3 4
3 5
!
cat <<! >foo1
1 (1 calls)
3 (4 calls)
7 (27 calls)
61 (2432 calls)
125 (10307 calls)
253 (42438 calls)
!
diff foo1 foo2 || echo 'BAD: T.func (ackermann)'
$awk '
END { print "end" }
{ print fib($1) }
function fib(n) {
if (n <= 1) return 1
else return add(fib(n-1), fib(n-2))
}
function add(m,n) { return m+n }
BEGIN { print "begin" }
' <<! >foo2
1
3
5
10
!
cat <<! >foo1
begin
1
3
8
89
end
!
diff foo1 foo2 || echo 'BAD: T.func (fib)'
$awk '
function foo() {
for (i = 1; i <= 2; i++)
return 3
print "should not see this"
}
BEGIN { foo(); exit }
' >foo1
grep 'should not' foo1 && echo 'BAD: T.func (return)'
# this exercises multiple free of temp cells
echo 'eqn
eqn2' >foo1
$awk 'BEGIN { eprocess("eqn", "x", contig)
process("tbl" )
eprocess("eqn" "2", "x", contig)
}
function eprocess(file, first, contig) {
print file
}
function process(file) {
close(file)
}' >foo2
diff foo1 foo2 || echo 'BAD: T.func (eqn)'
echo 1 >foo1
$awk 'function f() { n = 1; exit }
BEGIN { n = 0; f(); n = 2 }; END { print n}' >foo2
diff foo1 foo2 || echo 'BAD: T.func (exit in function)'
echo 1 >foo1
$awk '
BEGIN { n = 10
for (i = 1; i <= n; i++)
for (j = 1; j <= n; j++)
x[i,j] = n * i + j
for (i = 1; i <= n; i++)
for (j = 1; j <= n; j++)
if ((i,j) in x)
k++
print (k == n^2)
}
' >foo2
diff foo1 foo2 || echo 'BAD: T.func (multi-dim subscript)'
echo '<> 0' >foo1
$awk '
function foo() { i = 0 }
BEGIN { x = foo(); printf "<%s> %d\n", x, x }' >foo2
diff foo1 foo2 || echo 'BAD: T.func (fall off end)'

390
testdir/T.gawk Executable file

@ -0,0 +1,390 @@
echo T.gawk: tests adapted from gawk test suite
# for which thanks.
awk=${awk-../a.out}
# arrayref:
./echo '1
1' >foo1
$awk '
BEGIN { # foo[10] = 0 # put this line in and it will work
test(foo); print foo[1]
test2(foo2); print foo2[1]
}
function test(foo) { test2(foo) }
function test2(bar) { bar[1] = 1 }
' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk arrayref'
# asgext
./echo '1 2 3
1
1 2 3 4' >foo
./echo '3
1 2 3 a
1 a
3
1 2 3 a' >foo1
$awk '{ print $3; $4 = "a"; print }' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk asgext'
# backgsub:
./echo 'x\y
x\\y' >foo
./echo 'x\y
xAy
xAy
xAAy' >foo1
$awk '{ x = y = $0
gsub( /\\\\/, "A", x); print x
gsub( "\\\\", "A", y); print y
}' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk backgsub'
# backgsub2:
./echo 'x\y
x\\y
x\\\y' >foo
./echo ' x\y
 x\y
 x\y
 x\y
 x\\y
 x\\\y
 x\\y
 x\\\y
 x\\\\y' >foo1
$awk '{ w = x = y = z = $0
gsub( /\\\\/, "\\", w); print " " w
gsub( /\\\\/, "\\\\", x); print " " x
gsub( /\\\\/, "\\\\\\", y); print " " y
}
' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk backgsub2'
# backgsub3:
./echo 'xax
xaax' >foo
./echo ' xax
 x&x
 x&x
 x\ax
 x\ax
 x\&x
 xaax
 x&&x
 x&&x
 x\a\ax
 x\a\ax
 x\&\&x' >foo1
$awk '{ w = x = y = z = z1 = z2 = $0
gsub( /a/, "\&", w); print " " w
gsub( /a/, "\\&", x); print " " x
gsub( /a/, "\\\&", y); print " " y
gsub( /a/, "\\\\&", z); print " " z
gsub( /a/, "\\\\\&", z1); print " " z1
gsub( /a/, "\\\\\\&", z2); print " " z2
}
' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk backgsub3'
# backsub3:
./echo 'xax
xaax' >foo
./echo ' xax
 x&x
 x&x
 x\ax
 x\ax
 x\&x
 xaax
 x&ax
 x&ax
 x\aax
 x\aax
 x\&ax' >foo1
$awk '{ w = x = y = z = z1 = z2 = $0
sub( /a/, "\&", w); print " " w
sub( /a/, "\\&", x); print " " x
sub( /a/, "\\\&", y); print " " y
sub( /a/, "\\\\&", z); print " " z
sub( /a/, "\\\\\&", z1); print " " z1
sub( /a/, "\\\\\\&", z2); print " " z2
}
' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk backsub3'
# backsub:
./echo 'x\y
x\\y' >foo
./echo 'x\y
x\\y
x\\y
x\\\y' >foo1
$awk '{ x = y = $0
sub( /\\\\/, "\\\\", x); print x
sub( "\\\\", "\\\\", y); print y
}' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk backsub'
# dynlj:
./echo 'hello               world' >foo1
$awk 'BEGIN { printf "%*sworld\n", -20, "hello" }' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk dynlj'
# fsrs:
./echo 'a b
c d
e f
1 2
3 4
5 6' >foo
# note -n:
./echo -n 'a b
c d
e f1 2
3 4
5 6' >foo1
$awk '
BEGIN {
RS=""; FS="\n";
ORS=""; OFS="\n";
}
{
split ($2,f," ")
print $0;
}' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk fsrs'
# intest
./echo '0 1' >foo1
$awk 'BEGIN {
bool = ((b = 1) in c);
print bool, b # gawk-3.0.1 prints "0 "; should print "0 1"
}' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk intest'
# intprec:
./echo '0000000005:000000000e' >foo1
$awk 'BEGIN { printf "%.10d:%.10x\n", 5, 14 }' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk intprec'
# litoct:
./echo 'axb
ab
a*b' >foo
./echo 'no match
no match
match' >foo1
$awk '{ if (/a\52b/) print "match" ; else print "no match" }' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk litoct'
# math:
./echo 'cos(0.785398) = 0.707107
sin(0.785398) = 0.707107
e = 2.718282
log(e) = 1.000000
sqrt(pi ^ 2) = 3.141593
atan2(1, 1) = 0.785398' >foo1
$awk 'BEGIN {
pi = 3.1415927
printf "cos(%f) = %f\n", pi/4, cos(pi/4)
printf "sin(%f) = %f\n", pi/4, sin(pi/4)
e = exp(1)
printf "e = %f\n", e
printf "log(e) = %f\n", log(e)
printf "sqrt(pi ^ 2) = %f\n", sqrt(pi ^ 2)
printf "atan2(1, 1) = %f\n", atan2(1, 1)
}' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk math'
# nlfldsep:
./echo 'some stuff
more stuffA
junk
stuffA
final' >foo
./echo '4
some
stuff
more
stuff
2
junk
stuff
1
final
' >foo1
$awk 'BEGIN { RS = "A" }
{print NF; for (i = 1; i <= NF; i++) print $i ; print ""}
' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk nlfldsep'
# numsubstr:
./echo '5000
10000
5000' >foo
./echo '000
1000
000' >foo1
$awk '{ print substr(1000+$1, 2) }' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk numsubstr'
# pcntplus:
./echo '+3 4' >foo1
$awk 'BEGIN { printf "%+d %d\n", 3, 4 }' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk pcntplus'
# prt1eval:
./echo 1 >foo1
$awk 'function tst () {
sum += 1
return sum
}
BEGIN { OFMT = "%.0f" ; print tst() }
' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk prt1eval'
# reparse:
./echo '1 axbxc 2' >foo
./echo '1
1 a b c 2
1 a b' >foo1
$awk '{ gsub(/x/, " ")
$0 = $0
print $1
print $0
print $1, $2, $3
}' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk reparse'
# rswhite:
./echo ' a b
c d' >foo
./echo '< a b
c d>' >foo1
$awk 'BEGIN { RS = "" }
{ printf("<%s>\n", $0) }' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk rswhite'
# splitvar:
./echo 'Here===Is=Some=====Data' >foo
./echo 4 >foo1
$awk '{ sep = "=+"
n = split($0, a, sep)
print n
}' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk splitvar'
# splitwht:
./echo '4
5' >foo1
$awk 'BEGIN {
str = "a   b\t\tc d"
n = split(str, a, " ")
print n
m = split(str, b, / /)
print m
}' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk splitwht'
# sprintfc:
./echo '65
66
foo' >foo
./echo 'A 65
B 66
f foo' >foo1
$awk '{ print sprintf("%c", $1), $1 }' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk sprintfc'
# substr:
./echo 'xxA
xxab
xxbc
xxab
xx
xx
xxab
xx
xxef
xx' >foo1
$awk 'BEGIN {
x = "A"
printf("xx%-39s\n", substr(x,1,39))
print "xx" substr("abcdef", 0, 2)
print "xx" substr("abcdef", 2.3, 2)
print "xx" substr("abcdef", -1, 2)
print "xx" substr("abcdef", 1, 0)
print "xx" substr("abcdef", 1, -3)
print "xx" substr("abcdef", 1, 2.3)
print "xx" substr("", 1, 2)
print "xx" substr("abcdef", 5, 5)
print "xx" substr("abcdef", 7, 2)
exit (0)
}' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk substr'
# fldchg:
./echo 'aa aab c d e f' >foo
./echo '1: + +b c d e f
2: + +b <c> d e f
2a:%+%+b%<c>%d%e' >foo1
$awk '{ gsub("aa", "+")
print "1:", $0
$3 = "<" $3 ">"
print "2:", $0
print "2a:" "%" $1 "%" $2 "%" $3 "%" $4 "%" $5
}' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk fldchg'
# fldchgnf:
./echo 'a b c d' >foo
./echo 'a::c:d
4' >foo1
$awk '{ OFS = ":"; $2 = ""; print $0; print NF }' foo >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk fldchgnf'
# funstack:
# ./echo ' funstack test takes 5-10 sec, replicates part of T.beebe'
$awk -f funstack.awk funstack.in >foo 2>&1
cmp -s foo funstack.ok || ./echo 'BAD: T.gawk funstack'
# OFMT from arnold robbins 6/02:
# 5.7 with OFMT = %.0f is 6
./echo '6' >foo1
$awk 'BEGIN {
OFMT = "%.0f"
print 5.7
}' >foo2
cmp -s foo1 foo2 || ./echo 'BAD: T.gawk ofmt'
### don't know what this is supposed to do now.
### # convfmt:
### ./echo 'a = 123.46
### a = 123.456
### a = 123.456' >foo1
### $awk 'BEGIN {
### CONVFMT = "%2.2f"
### a = 123.456
### b = a "" # give a string value also
### a += 0 # make a numeric only again
### print "a = ", a
### CONVFMT = "%.6g"
### print "a = ", a
### a += 0 # make a numeric only again
### print "a = ", a # use a as string
### }' >foo2
### cmp -s foo1 foo2 || ./echo 'BAD: T.gawk convfmt'

98
testdir/T.getline Executable file

@ -0,0 +1,98 @@
echo T.getline: test getline function
awk=${awk-../a.out}
who >foo1
cat foo1 | $awk '
BEGIN {
while (getline)
print
exit
}
' >foo
cmp -s foo1 foo || echo 'BAD: T.getline (bare getline)'
who >foo1
cat foo1 | $awk '
BEGIN {
while (getline xxx)
print xxx
exit
}
' >foo
cmp -s foo1 foo || echo 'BAD: T.getline (getline xxx)'
$awk '
BEGIN {
while (getline <"/etc/passwd")
print
exit
}
' >foo
cmp -s /etc/passwd foo || echo 'BAD: T.getline (getline <file)'
cat /etc/passwd | $awk '
BEGIN {
while (getline <"-") # stdin
print
exit
}
' >foo
cmp -s /etc/passwd foo || echo 'BAD: T.getline (getline <"-")'
$awk '
BEGIN {
while (getline <ARGV[1])
print
exit
}
' /etc/passwd >foo
cmp -s /etc/passwd foo || echo 'BAD: T.getline (getline <arg)'
$awk '
BEGIN {
while (getline x <ARGV[1])
print x
exit
}
' /etc/passwd >foo
cmp -s /etc/passwd foo || echo 'BAD: T.getline (getline x <arg)'
$awk '
BEGIN {
while (("cat " ARGV[1]) | getline)
print
exit
}
' /etc/passwd >foo
cmp -s /etc/passwd foo || echo 'BAD: T.getline (cat arg | getline)'
$awk '
BEGIN {
while (("cat " ARGV[1]) | getline x)
print x
exit
}
' /etc/passwd >foo
cmp -s /etc/passwd foo || echo 'BAD: T.getline (cat arg | getline x)'
$awk ' BEGIN { print getline <"/glop/glop/glop" } ' >foo
echo '-1' >foo1
cmp -s foo foo1 || echo 'BAD: T.getline (non-existent file)'
echo 'false false equal' >foo1
$awk 'BEGIN {
"echo 0" | getline
if ($0) printf "true "
else printf "false "
if ($1) printf "true "
else printf "false "
if ($0==$1) printf "equal\n"
else printf "not equal\n"
}' >foo2
cmp -s foo1 foo2 || echo 1>&2 'BAD: T.getline bad $0 type in cmd|getline'
echo 'L1
L2' | $awk 'BEGIN { $0="old stuff"; $1="new"; getline x; print}' >foo1
echo 'new stuff' >foo2
cmp -s foo1 foo2 || echo 1>&2 'BAD: T.getline bad update $0'

124
testdir/T.int-expr Executable file

@ -0,0 +1,124 @@
echo T.int-expr: test interval expressions
awk=${awk-../a.out}
rm -f foo
cat << \EOF > prog
NF == 0 { next }
$1 == "pat" { pattern = $2; next }
{
check = ($1 ~ pattern)
printf("%s ~ /%s/ -> should be %d, is %d\n", $1, pattern, $2, check)
}
EOF
cat << \EOF > foo.in
pat ab{0}c
ac 1
abc 0
pat a(b{0})c
ac 1
abc 0
pat ab{0}*c
ac 1
abc 0
pat a(b{0})*c
ac 1
abc 0
pat ab{0,}c
ac 1
abc 1
pat a(b{0,})c
ac 1
abc 1
pat ab{0,}*c
ac 1
abc 1
pat a(b{0,})*c
ac 1
abc 1
pat ab{1}c
ac 0
abc 1
abbc 0
pat ab{1,}c
ac 0
abc 1
abbc 1
abbbc 1
abbbbc 1
pat ab{0,1}c
ac 1
abc 1
abbc 0
pat ab{0,3}c
ac 1
abc 1
abbc 1
abbbc 1
abbbbc 0
pat ab{1,3}c
ac 0
abc 1
abbc 1
abbbc 1
abbbbc 0
EOF
cat << \EOF > foo1
ac ~ /ab{0}c/ -> should be 1, is 1
abc ~ /ab{0}c/ -> should be 0, is 0
ac ~ /a(b{0})c/ -> should be 1, is 1
abc ~ /a(b{0})c/ -> should be 0, is 0
ac ~ /ab{0}*c/ -> should be 1, is 1
abc ~ /ab{0}*c/ -> should be 0, is 0
ac ~ /a(b{0})*c/ -> should be 1, is 1
abc ~ /a(b{0})*c/ -> should be 0, is 0
ac ~ /ab{0,}c/ -> should be 1, is 1
abc ~ /ab{0,}c/ -> should be 1, is 1
ac ~ /a(b{0,})c/ -> should be 1, is 1
abc ~ /a(b{0,})c/ -> should be 1, is 1
ac ~ /ab{0,}*c/ -> should be 1, is 1
abc ~ /ab{0,}*c/ -> should be 1, is 1
ac ~ /a(b{0,})*c/ -> should be 1, is 1
abc ~ /a(b{0,})*c/ -> should be 1, is 1
ac ~ /ab{1}c/ -> should be 0, is 0
abc ~ /ab{1}c/ -> should be 1, is 1
abbc ~ /ab{1}c/ -> should be 0, is 0
ac ~ /ab{1,}c/ -> should be 0, is 0
abc ~ /ab{1,}c/ -> should be 1, is 1
abbc ~ /ab{1,}c/ -> should be 1, is 1
abbbc ~ /ab{1,}c/ -> should be 1, is 1
abbbbc ~ /ab{1,}c/ -> should be 1, is 1
ac ~ /ab{0,1}c/ -> should be 1, is 1
abc ~ /ab{0,1}c/ -> should be 1, is 1
abbc ~ /ab{0,1}c/ -> should be 0, is 0
ac ~ /ab{0,3}c/ -> should be 1, is 1
abc ~ /ab{0,3}c/ -> should be 1, is 1
abbc ~ /ab{0,3}c/ -> should be 1, is 1
abbbc ~ /ab{0,3}c/ -> should be 1, is 1
abbbbc ~ /ab{0,3}c/ -> should be 0, is 0
ac ~ /ab{1,3}c/ -> should be 0, is 0
abc ~ /ab{1,3}c/ -> should be 1, is 1
abbc ~ /ab{1,3}c/ -> should be 1, is 1
abbbc ~ /ab{1,3}c/ -> should be 1, is 1
abbbbc ~ /ab{1,3}c/ -> should be 0, is 0
EOF
$awk -f prog foo.in > foo2
diff foo1 foo2 || echo 'BAD: T.int-expr (1)'
rm -f prog

37
testdir/T.latin1 Executable file

@ -0,0 +1,37 @@
echo T.latin1: tests of 8-bit input
awk=${awk-../a.out}
$awk '
{ print $0 }
' latin1 >foo1
diff latin1 foo1 || echo 'BAD: T.latin1 1'
$awk '{ gsub(/\351/, "\370"); print }' latin1 >foo0
$awk '{ gsub(/é/, "ø"); print }' latin1 >foo1
diff foo0 foo1 || echo 'BAD: T.latin1 3'
$awk '{ gsub(/[^\300-\370]/, ""); print }' latin1 >foo0
$awk '{ gsub(/[^À-ø]/, ""); print } ' latin1 >foo1
diff foo0 foo1 || echo 'BAD: T.latin1 4'
echo '/á/' >foo1
$awk -f foo1 foo1 >foo2
diff foo1 foo2 || echo 'BAD: T.latin1 5'
echo /[áé]/ >foo1
$awk -f foo1 foo1 >foo2
diff foo1 foo2 || echo 'BAD: T.latin1 6'
echo 'This is a line.
Patterns like /[áé]/ do not work yet. Example, run awk /[áé]/
over a file containing just á.
This is another line.' >foo0
echo 'Patterns like /[áé]/ do not work yet. Example, run awk /[áé]/
over a file containing just á.' >foo1
$awk '/[áé]/' foo0 >foo2
diff foo1 foo2 || echo 'BAD: T.latin1 7'

28
testdir/T.lilly Executable file

@ -0,0 +1,28 @@
echo T.lilly: miscellaneous RE tests from Bruce Lilly
awk=${awk-../a.out}
rm -f foo
awk '
/./ {
print $0 >"foo"
close("foo")
print "###", NR, $0
system("awk -f foo <\"lilly.ifile\" ")
}' <lilly.progs >foo1 2>&1
rm -f foo
$awk '
/./ {
print $0 >"foo"
close("foo")
print "###", NR, $0
system("../a.out -f foo <\"lilly.ifile\" ")
}' <lilly.progs >foo2 2>&1
echo `cat lilly.progs | wc -l` tests
sed -e 's/awk://' -e 's/Syntax/syntax/' -e '/warning:/d' foo1 >glop1
sed 's/..\/a.out://' foo2 >glop2
diff glop1 glop2 >lilly.diff || echo 'bad: T.lilly is different'
echo

32
testdir/T.main Executable file

@ -0,0 +1,32 @@
echo T.main: misc tests of arguments in main
awk=${awk-../a.out}
rm -f core
# test -d option
echo hello | $awk -d '{print}' >foo1
if test -r core; then echo 1>&2 "BAD: T.main awk -d dropped core"; fi
echo 'a::b::c' >foo
$awk -F:: '{print NF}' foo >foo1
echo '3' >foo2
diff foo1 foo2 || echo 'bad: awk -F::'
echo 'a::b::c' >foo
$awk -F :: '{print NF}' foo >foo1
echo '3' >foo2
diff foo1 foo2 || echo 'bad: awk -F ::'
printf 'a\tb\tc\n' >foo
$awk -F t '{print NF}' foo >foo1
echo '3' >foo2
diff foo1 foo2 || echo 'bad: awk -F (tab)'
echo 'atabbtabc' >foo
$awk -F tab '{print NF}' foo >foo1
echo '3' >foo2
diff foo1 foo2 || echo 'bad: awk -F tab'

506
testdir/T.misc Executable file

@ -0,0 +1,506 @@
#!/bin/sh
echo T.misc: miscellaneous buglets now watched for
awk=${awk-../a.out}
rm -f core
echo 'The big brown over the lazy doe
The big brown over the lazy dog
x
The big brown over the lazy dog' >foo
echo 'failed
succeeded
failed
succeeded' >foo1
$awk '{ if (match($0, /^The big brown over the lazy dog/) == 0) {
printf("failed\n")
} else {
printf("succeeded\n")
}
} ' foo >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc ghosh RE bug'
echo '123
1234567890
12345678901' >foo
echo '12345678901' >foo1
$awk 'length($0) > 10' foo >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc last number bug'
# check some \ sequences in strings (ascii)
echo HIJKL >foo1
echo foo | $awk '{ print "H\x49\x4a\x4BL" }' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc hex string cvt'
echo 012x45 >foo1
$awk 'BEGIN { print "0\061\62x\0645" }' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc oct string cvt'
# $i++ means ($i)++
echo 3 5 | $awk '{ i = 1; print $i++ ; print $1, i }' >foo1
echo '3
4 1' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc bad field increment'
# make sure that fields are recomputed even on self-assignment;
# take into account that subtracting from NF now rebuilds the record
echo 'a b c
s p q r
x y z' >foo
echo 'a
s p
x' >foo1
$awk '{ NF -= 2; $1 = $1; print }' <foo >foo2
diff foo1 foo2 || echo 1>&2 "BAD: T.misc bad field self-assignment"
echo '1
1' >foo1
$awk 'BEGIN {x = 1; print x; x = x; print x}' >foo2
diff foo1 foo2 || echo 1>&2 "BAD: T.misc bad self-assignment"
echo 573109312 | $awk '{print $1*4}' >foo1
echo 2292437248 >foo2
diff foo1 foo2 || echo 1>&2 "BAD: T.misc bad overflow"
# note that there are 8-bit characters in the echo
# some shells will probably screw this up.
echo '#
code € 1
code € 2' |
$awk '/^#/' >foo1
echo '#' >foo2
diff foo1 foo2 || echo 1>&2 "BAD: T.misc bad match of 8-bit char"
echo hello |
$awk 'BEGIN { FILENAME = "/etc/passwd" }
{ print $0 }' >/dev/null
if test -r core; then echo 1>&2 "BAD: T.misc /etc/passwd dropped core"; fi
echo hello |
$awk ' function foo(foo) {
foo = 1
foo()
}
{ foo(bar) }
' >/dev/null 2>&1
if test -r core; then
echo 1>&2 "BAD: T.misc function foo(foo) dropped core"
rm -f core
fi
echo '2
10' |
$awk '{ x[NR] = $0 } # test whether $0 is NUM as well as STR
END { if (x[1] > x[2]) print "BAD: T.misc: $0 is not NUM" }'
$awk 'BEGIN {
npad = substr("alexander" " ",1,15)
print npad
}' >foo
grep '\\' foo && echo 1>&2 "BAD: T.misc alexander fails"
# This should give an error about function arguments
$awk '
function foo(x) { print "x is" x }
BEGIN { foo(foo) }
' 2>foo
grep "can't use function foo" foo >/dev/null || echo 1>&2 "BAD: T.misc fcn args"
# gawk defref test; should give error about undefined function
$awk 'BEGIN { foo() }' 2>foo
grep "calling undefined function foo" foo >/dev/null || echo 1>&2 "BAD: T.misc undefined function"
# gawk arrayparm test; should give error about function
$awk '
BEGIN {
foo[1]=1;
foo[2]=2;
bug1(foo);
}
function bug1(i) {
for (i in foo) {
bug2(i);
delete foo[i];
print i,1,bot[1];
}
}
function bug2(arg) {
bot[arg]=arg;
}
' 2>foo
grep "can.t assign to foo" foo >/dev/null || echo 1>&2 "BAD: T.misc foo bug"
# This should be a syntax error
$awk '
!x = y
' 2>foo
grep "syntax error" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error !x=y fails"
# This should print bbb
$awk '
BEGIN { up[1] = "a"
for (i in up) gsub("a", "A", x)
print x x "bbb"
exit
}
' >foo
grep bbb foo >/dev/null || echo 1>&2 "BAD: T.misc gsub failed"
echo yes |
$awk '
BEGIN {
printf "push return" >"/dev/null"
getline ans <"/dev/null"
} '
if test -r core; then echo 1>&2 "BAD: T.misc getline ans dropped core"; fi
$awk 'BEGIN { unireghf() }
function unireghf(hfeed) { hfeed[1] = 0 }'
if test -r core; then echo 1>&2 "BAD: T.misc unireghf dropped core"; fi
echo x | $awk '/[/]/' 2>foo
grep 'nonterminated character class' foo >/dev/null || echo 1>&2 'BAD: T.misc nonterminated fails'
if test -r core; then echo 1>&2 "BAD: T.misc nonterminated dropped core"; fi
$awk '
function f() { return 12345 }
BEGIN { printf "<%s>\n", f() }
' >foo
grep '<12345>' foo >/dev/null || echo 'BAD: T.misc <12345> fails'
echo 'abc
def
ghi
jkl' >foo
$awk '
BEGIN { RS = ""
while (getline <"foo")
print
}' >foo1
$awk 'END {print NR}' foo1 | grep 4 >/dev/null || echo 'BAD: T.misc abcdef fails'
# Test for RS regex matching an empty record at EOF
echo a | $awk 1 RS='a\n' > foo1
cat << \EOF > foo2
EOF
diff foo1 foo2 || echo 'BAD: T.misc RS regex matching an empty record at EOF fails'
# Test for RS regex being reapplied
echo aaa1a2a | $awk 1 RS='^a' >foo1
cat << \EOF > foo2
aa1a2a
EOF
diff foo1 foo2 || echo 'BAD: T.misc ^regex reapplied fails'
# ^-anchored RS matching should be active at the start of each input file
tee foo1 foo2 >foo3 << \EOF
aaa
EOF
$awk 1 RS='^a' foo1 foo2 foo3 >foo4
cat << \EOF > foo5
aa
aa
aa
EOF
diff foo4 foo5 || echo 'BAD: T.misc ^RS matches the start of every input file fails'
# The following should not produce a warning about changing a constant
# nor about a curdled tempcell list
$awk 'function f(x) { x = 2 }
BEGIN { f(1) }' >foo
grep '^' foo && echo 'BAD: test constant change fails'
# The following should not produce a warning about a curdled tempcell list
$awk 'function f(x) { x }
BEGIN { f(1) }' >foo
grep '^' foo && echo 'BAD: test tempcell list fails'
$awk 'BEGIN { print 9, a=10, 11; print a; exit }' >foo1
echo '9 10 11
10' >foo2
diff foo1 foo2 || echo 'BAD: T.misc (embedded expression)'
echo "abc defgh ijkl" | $awk '
{ $1 = ""; line = $0; print line; print $0; $0 = line; print $0 }' >foo1
echo " defgh ijkl
 defgh ijkl
 defgh ijkl" >foo2
diff foo1 foo2 || echo 'BAD: T.misc (assignment to $0)'
$awk '
function min(a, b)
{
if (a < b)
return a
else
return b
}
BEGIN { exit }
'
if test -r core; then echo 1>&2 "BAD: T.misc function min dropped core"; fi
# The following should not give a syntax error message:
$awk '
function expand(chart) {
getline chart < "CHAR.ticks"
}
' >foo
grep '^' foo >/dev/null && echo 'BAD: T.misc expand error'
$awk 'BEGIN { print 1e40 }' >/dev/null
if test -r core; then echo 1>&2 "BAD: T.misc 1E40 dropped core"; fi
# The following syntax error should not dump core:
$awk '
$NF==3 {first=1}
$NF==2 && first==0 && (abs($1-o1)>120||abs($2-o2)>120) {print $0}
$NF==2 {o1=%1; o2=$2; first=0}
' 2>/dev/null
if test -r core; then echo 1>&2 "BAD: T.misc first/abs dropped core"; fi
# The following syntax error should not dump core:
$awk '{ n = split($1, address, !); print address[1] }' 2>foo
grep 'illegal statement' foo >/dev/null || echo 'BAD: T.misc split error'
if test -r core; then echo 1>&2 "BAD: T.misc split! dropped core"; fi
# The following should cause a syntax error message
$awk 'BEGIN {"hello"}' 2>foo
grep 'illegal statement' foo >/dev/null || echo 'BAD: T.misc hello error'
# The following should give a syntax error message:
$awk '
function pile(c, r) {
r = ++pile[c]
}
{ pile($1) }
' 2>foo
grep 'context is' foo >/dev/null || echo 'BAD: T.misc pile error'
# This should complain about missing atan2 argument:
$awk 'BEGIN { atan2(1) }' 2>foo
grep 'requires two arg' foo >/dev/null || echo 'BAD: T.misc atan2 error'
# This should not core dump:
$awk 'BEGIN { f() }
function f(A) { delete A[1] }
'
if test -r core; then echo 1>&2 "BAD: T.misc delete dropped core"; fi
# nasty one: should not be able to overwrite constants
$awk 'BEGIN { gsub(/ana/,"anda","banana")
printf "the monkey ate a %s\n", "banana" }
' >/dev/null 2>foo
grep 'syntax error' foo >/dev/null || echo 'BAD: T.misc gsub banana error'
# nasty one: should not be able to overwrite constants
$awk 'BEGIN { sub(/ana/,"anda","banana")
printf "the monkey ate a %s\n", "banana" }
' >/dev/null 2>foo
grep 'syntax error' foo >/dev/null || echo 'BAD: T.misc sub banana error'
# line numbers used to double-count comments
$awk '#
#
#
/x
' >/dev/null 2>foo
grep 'line [45]' foo >/dev/null || echo 'BAD: T.misc lineno'
printf 'x\f\r\b\v\a\\y\n' >foo1
$awk 'BEGIN { print "x\f\r\b\v\a\\y" }' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc weird chars'
echo 0 >foo1
$awk ' BEGIN { exit }
{ print }
END { print NR }' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc BEGIN exit'
echo 1 >foo1
$awk ' { exit }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc immediate exit'
echo 1 >foo1
$awk ' {i = 1; while (i <= NF) {if (i == NF) exit; i++ } }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc immediate exit 2'
echo 1 >foo1
$awk ' function f() {
i = 1; while (i <= NF) {if (i == NF) return NR; i++ }
}
{ if (f() == 1) exit }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc while return'
echo 1 >foo1
$awk ' function f() {
split("a b c", arr)
for (i in arr) {if (i == 3) return NR; i++ }
}
{ if (f() == 1) exit }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc for-in return'
echo 1 >foo1
$awk ' {i = 1; do { if (i == NF) exit; i++ } while (i <= NF) }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc immediate exit 3'
echo 1 >foo1
$awk ' function f() {
i = 1; do { if (i == NF) return NR; i++ } while (i <= NF)
}
{ if (f() == 1) exit }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc do return'
echo 1 >foo1
$awk ' {i = 1; do { if (i == NF) break; i++ } while (i <= NF); exit }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc immediate exit 4'
echo 1 >foo1
$awk ' { n = split($0, x)
for (i in x) {
if (i == 1)
exit } }
END { print NR }' /etc/passwd >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc immediate exit 5'
echo XXXXXXXX >foo1
$awk 'BEGIN { s = "ab\fc\rd\be"
t = s; gsub("[" s "]", "X", t); print t }' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc weird escapes in char class'
$awk '{}' /etc/passwd glop/glop >foo 2>foo2
grep "can't open.*glop" foo2 >/dev/null || echo "BAD: T.misc can't open"
echo '
a
aa
b
c
' >foo
echo 3 >foo1
$awk 'BEGIN { RS = "" }; END { print NR }' foo >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc RS botch'
$awk 'BEGIN \
{
print "hello, world"
}
}}}' >foo1 2>foo2
grep 'source line 4' foo2 >/dev/null 2>&1 || echo 'BAD: T.misc continuation line number'
echo 111 222 333 >foo
$awk '{ f[1]=1; f[2]=2; print $f[1], $f[1]++, $f[2], f[1], f[2] }' foo >foo2
echo 111 111 222 2 2 >foo1
cmp -s foo1 foo2 || echo 'BAD: T.misc $f[1]++'
# These should be syntax errors
$awk . 2>foo
grep "syntax error" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error . fails"
$awk .. 2>foo
grep "syntax error" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error .. fails"
$awk .E. 2>foo
grep "syntax error" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error .E. fails"
$awk .++. 2>foo
grep "syntax error" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error .++. fails"
# These should be syntax errors
$awk '$' 2>foo
grep "unexpected" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error $ fails"
$awk '{print $' 2>foo
grep "unexpected" foo >/dev/null || echo 1>&2 "BAD: T.misc syntax error \$2 fails"
$awk '"' 2>foo
grep "non-terminated" foo >/dev/null || echo 1>&2 "BAD: T.misc bare quote fails"
# %c of 0 is explicit null byte
./echo '3' >foo1
$awk 'BEGIN {printf("%c%c\n", 0, 0) }' | wc | $awk '{print $3}' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc null byte'
# non-terminated RE
$awk /xyz >foo 2>&1
grep "non-terminated" foo >/dev/null || echo 1>&2 "BAD: T.misc non-terminated RE"
# next several were infinite loops, found by brian tsang.
# this is his example:
$awk 'BEGIN {
switch (substr("x",1,1)) {
case /ask.com/:
break
case "google":
break
}
}' >foo 2>&1
grep "illegal statement" foo >/dev/null || echo 1>&2 "BAD: T.misc looping syntax error 1"
$awk 'BEGIN { s { c /./ } }' >foo 2>&1
grep "illegal statement" foo >/dev/null || echo 1>&2 "BAD: T.misc looping syntax error 2"
$awk 'BEGIN { s { c /../ } }' >foo 2>&1
grep "illegal statement" foo >/dev/null || echo 1>&2 "BAD: T.misc looping syntax error 3"
$awk 'BEGIN {printf "%2$s %1$s\n", "a", "b"}' >foo 2>&1
grep "'$' not permitted in awk formats" foo >/dev/null || echo 1>&2 "BAD: T.misc '$' not permitted in formats"
echo 'a
b c
de fg hi' >foo0
$awk 'END { print NF, $0 }' foo0 >foo1
awk '{ print NF, $0 }' foo0| tail -1 >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc END must preserve $0'
echo 'fg hi' >foo0
$awk 'END { print NF, $0 }' foo0 >foo1
awk '{ print NF, $0 }' foo0| tail -1 >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc END must preserve $0'
echo '' >foo0
$awk 'END { print NF, $0 }' foo0 >foo1
awk '{ print NF, $0 }' foo0| tail -1 >foo2
cmp -s foo1 foo2 || echo 'BAD: T.misc END must preserve $0'
# Check for nonzero exit status on I/O error.
echo 'E 2' >foo1
(trap '' PIPE; "$awk" 'BEGIN { print "hi"; }' 2>/dev/null; echo "E $?" >foo2) | :
cmp -s foo1 foo2 || echo 'BAD: T.misc exit status on I/O error'

86
testdir/T.nextfile Executable file

@ -0,0 +1,86 @@
echo T.nextfile: tests of nextfile command
awk=${awk-../a.out}
# 1st lines of some files
rm -f foo0
for i in T.*
do
sed 1q $i >>foo0
done
$awk '
{ print $0; nextfile } # print first line, quit
' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 1'
$awk ' # same test but in a for loop
{ print $0;
for (i = 1; i < 10; i++)
if (i == 1)
nextfile
print "nextfile for error"
} # print first line, quit
' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 1f'
$awk ' # same test but in a while loop
{ print $0;
i = 1
while (i < 10)
if (i++ == 1)
nextfile
print "nextfile while error"
} # print first line, quit
' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 1w'
$awk ' # same test but in a do loop
{ print $0;
i = 1
do {
if (i++ == 1)
nextfile # print first line, quit
} while (i < 10)
print "nextfile do error"
}
' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 1d'
# 100 lines of some files
rm -f foo0
for i in T.*
do
sed 100q $i >>foo0
done
$awk '
{ print }
FNR == 100 { nextfile } # after 100 lines, skip to the next file
' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 2'
>foo0 # empty
$awk ' { nextfile; print $0 }' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 3'
# skip weird args
rm -f foo0
for i in T.*
do
sed 1q $i >>foo0
done
$awk '
{ print $0; nextfile } # print first line, quit
' T.* >foo1
diff foo0 foo1 || echo 'BAD: T.nextfile 4'

86
testdir/T.overflow Executable file

@ -0,0 +1,86 @@
echo T.overflow: test some overflow conditions
awk=${awk-../a.out}
$awk 'BEGIN {
for (i = 0; i < 1000; i++) printf("abcdefghijklmnopqsrtuvwxyz")
printf("\n")
exit
}' >foo1
$awk '{print}' foo1 >foo2
cmp -s foo1 foo2 || echo 'BAD: T.overflow record 1'
echo 'abcdefghijklmnopqsrtuvwxyz' >foo1
echo hello | $awk '
{ for (i = 1; i < 500; i++) s = s "abcdefghijklmnopqsrtuvwxyz "
$0 = s
print $1
}' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.overflow abcdef'
# default input record 3072, fields 200:
$awk '
BEGIN {
for (j = 0; j < 2; j++) {
for (i = 0; i < 500; i++)
printf(" 123456789")
printf("\n");
}
} ' >foo1
$awk '{$1 = " 123456789"; print}' foo1 >foo2
cmp -s foo1 foo2 || echo 'BAD: T.overflow -mr -mf set $1'
$awk '
BEGIN {
for (j = 0; j < 2; j++) {
for (i = 0; i < 500; i++)
printf(" 123456789")
printf("\n");
}
} ' >foo
$awk '{print NF}' foo >foo1
echo '500
500' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.overflow -mr -mf NF'
rm -f core
# this should not drop core
$awk 'BEGIN {
for (i = 1; i < 1000; i++) s = s "a-z"
if ("x" ~ "[" s "]")
print "ugh"
}' >foo 2>&1
test -r core && echo 1>&2 "BAD: T.overflow too long char class dropped core"
echo 4000004 >foo1
$awk '
BEGIN {
x1 = sprintf("%1000000s\n", "hello")
x2 = sprintf("%-1000000s\n", "world")
x3 = sprintf("%1000000.1000000s\n", "goodbye")
x4 = sprintf("%-1000000.1000000s\n", "goodbye")
print length(x1 x2 x3 x4)
}' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.overflow huge sprintfs'
echo 0 >foo1
$awk '
BEGIN {
for (i = 0; i < 100000; i++)
x[i] = i
for (i in x)
delete x[i]
n = 0
for (i in x)
n++
print n
}' >foo2
cmp -s foo1 foo2 || echo 'BAD: T.overflow big array'
echo x >foo1
$awk '{print $40000000000000}' <foo1 >foo2 2>foo
grep "out of range field" foo >/dev/null || echo 1>&2 "BAD: T.overflow \$400000"
rm -rf /tmp/awktestfoo*
$awk 'BEGIN { for (i=1; i <= 1000; i++) print i >("/tmp/awktestfoo" i) }'
ls /tmp/awktestfoo* | grep '1000' >/dev/null || echo 1>&2 "BAD: T.overflow openfiles"

340
testdir/T.re Executable file

@ -0,0 +1,340 @@
echo T.re: tests of regular expression code
awk '
BEGIN {
FS = "\t"
awk = "../a.out"
}
NF == 0 {
next
}
$1 != "" { # new test
re = $1
}
$2 != "" { # either ~ or !~
op = $2
if (op == "~")
neg = "!"
else if (op == "!~")
neg = ""
}
$3 != "" { # new test string
str = $3
}
$3 == "\"\"" { # explicit empty line
$3 = ""
}
NF > 2 { # generate a test
input = $3
test = sprintf("./echo '"'"'%s'"'"' | %s '"'"'%s/%s/ {print \"%d fails %s %s %s\"}'"'"'",
input, awk, neg, re, NR, re, op, input)
# printf(" %3d %s %s %s:\n", NR, re, op, input)
# print "test is |" test "|"
system(test)
# system("bprint -c ../a.out")
nt++
}
END { print " " nt, "tests" }
' <<\!!!!
~ a
aa
aaa
""
a ~ a
ba
bab
!~ ""
x
xxxxx
= ~ =
b=
b=b
!~ ""
x
xxxxx
. ~ x
xxx
!~ ""
.a ~ xa
xxa
xax
!~ a
ax
""
$ ~ x
""
.$ ~ x
xx
!~ ""
a$ ~ a
ba
bbba
!~ ab
x
""
^ ~ x
""
^
^a$ ~ a
!~ xa
ax
xax
""
^a.$ ~ ax
aa
!~ xa
aaa
axy
""
^$ ~ ""
!~ x
^
^.a ~ xa
xaa
!~ a
""
^.*a ~ a
xa
xxxxxxa
!~ ""
^.+a ~ xa
xxxxxxa
!~ ""
a
ax
a* ~ ""
a
aaaa
xa
xxxx
aa* ~ a
aaa
xa
!~ xxxx
""
\$ ~ x$
$
$x
x$x
!~ ""
x
\. ~ .
!~ x
""
xr+y ~ xry
xrry
xrrrrrry
!~ ry
xy
xr
xr?y ~ xy
xry
!~ xrry
a?b?c? ~ ""
x
^a?b?x ~ x
ax
bx
abx
xa
!~ ""
ab
aba
[0-9] ~ 1
567
x0y
!~ abc
""
[^0-9] !~ 1
567
""
~ abc
x0y
[0-25-69] ~ 0
1
2
5
6
9
!~ 3
4
7
8
[[:lower:]] ~ a
b
z
!~ A
Z
1
:
[
]
[[:upper:]] ~ A
B
Z
!~ a
z
1
:
[
]
[[:digit:]] ~ 0
1
9
!~ d
:
[
]
x[0-9]+y ~ x0y
x23y
x12345y
!~ 0y
xy
x[0-9]?y ~ xy
x1y
!~ x23y
x[[]y ~ x[y
!~ xy
x[[]y
x]y
x[[-]y ~ x[y
x-y
!~ xy
x[[]y
x]y
x[[-a]y ~ x[y
xay
x]y
!~ xy
x[[]y
x-y
x[]-a]y ~ x]y
xay
!~ xy
x[y
x-y
x[]]y ~ x]y
!~ xy
x[]]y
x[y
x[^[]y ~ xay
!~ x[y
x[-]y ~ x-y
!~ xy
x+y
x[^-]y ~ x+y
!~ x-y
xy
x[][]y ~ x[y
x]y
!~ xy
x][y
x[]y
x[z-a]y ~ xy
!~ x
y
xay
xzy
x-y
[0\-9] ~ 0
-
9
!~ 1
""
[-1] ~ -
1
!~ 0
[0-] ~ 0
-
!~ 1
[^-0] ~ x
^
!~ -
0
""
[^0-] ~ x
^
!~ -
0
""
x|y ~ x
y
xy
!~ a
""
^abc|xyz$ ~ abc
abcd
axyz
xyz
!~ xabc
xyza
^(abc|xyz)$ ~ abc
xyz
!~ abcxyz
abcx
cxyz
^x\|y$ ~ x|y
!~ xy
^x\\y$ ~ x\y
!~ xy
x\\y
xay
\141\142 ~ ab
xab
abx
!~ a
b
ax
axb
x\056y ~ x.y
!~ x.
.x
xxx
xby because \056 is not the metacharacter .
xcy ditto
[\60-\62\65-6\71] ~ 0
1
2
5
6
9
!~ 3
4
7
8
[\60-2\65-6\71] ~ 0
1
2
5
6
9
!~ 3
4
7
8
[\x30-\x32\x35-6\71] ~ 0
1
2
5
6
9
!~ 3
4
7
8
[\x30-2\x35-6\x39] ~ 0
1
2
5
6
9
!~ 3
4
7
8
\f !~ x
\b !~ x
\r !~ x
\n !~ x
...) ~ abc)
!!!!

33
testdir/T.recache Executable file

@ -0,0 +1,33 @@
echo T.recache: test re cache in b.c
# thanks to ross ridge for this horror
awk=${awk-../a.out}
echo b >foo1
$awk '
BEGIN {
#
# Fill up DFA cache with run-time REs that have all been
# used twice.
#
CACHE_SIZE=64
for(i = 0; i < CACHE_SIZE; i++) {
for(j = 0; j < 2; j++) {
"" ~ i "";
}
}
#
# Now evaluate an expression that uses two run-time REs
# that have never been used before. The second RE will
# push the first out of the cache while the first RE is
# still needed.
#
x = "a"
reg1 = "[Aa]"
reg2 = "A"
sub(reg1, x ~ reg2 ? "B" : "b", x)
print x
}
' >foo2
diff foo1 foo2 || echo 'BAD: T.recache'

38
testdir/T.redir Executable file

@ -0,0 +1,38 @@
echo T.redir: test redirections
awk=${awk-../a.out}
$awk '{ print >"foo" }' /etc/passwd
diff foo /etc/passwd || echo 'BAD: T.redir (print >"foo")'
rm -f foo
$awk '{ print >>"foo" }' /etc/passwd
diff foo /etc/passwd || echo 'BAD: T.redir (print >>"foo")'
rm -f foo
$awk 'NR%2 == 1 { print >>"foo" }
NR%2 == 0 { print >"foo" }' /etc/passwd
diff foo /etc/passwd || echo 'BAD: T.redir (print > and >>"foo")'
rm -f foo
$awk '{ print | "cat >foo" }' /etc/passwd
diff foo /etc/passwd || echo 'BAD: T.redir (print | "cat >foo")'
# tests flush of stdout before opening pipe
echo ' head
1
2' >foo1
$awk 'BEGIN { print " head"
for (i = 1; i < 3; i++)
print i | "sort" }' >foo2
diff foo1 foo2 || echo 'BAD: T.redir (buffering)'
rm -f foo[12]
$awk '{ print >"/dev/stderr" }' /etc/passwd 1>foo1 2>foo2
diff foo2 /etc/passwd || echo 'BAD: T.redir (print >"/dev/stderr")'
diff foo1 /dev/null || echo 'BAD: T.redir (print >"/dev/stderr")'
rm -f foo[12]
$awk '{ print >"/dev/stdout" }' /etc/passwd 1>foo1 2>foo2
diff foo1 /etc/passwd || echo 'BAD: T.redir (print >"/dev/stdout")'
diff foo2 /dev/null || echo 'BAD: T.redir (print >"/dev/stdout")'

224
testdir/T.split Executable file

@ -0,0 +1,224 @@
#!/bin/sh
awk=${awk-../a.out}
WORKDIR=$(mktemp -d /tmp/nawktest.XXXXXX)
TEMP0=$WORKDIR/test.temp.0
TEMP1=$WORKDIR/test.temp.1
TEMP2=$WORKDIR/test.temp.2
RESULT=0
fail() {
echo "$1" >&2
RESULT=1
}
echo T.split: misc tests of field splitting and split command
$awk 'BEGIN {
# Assign string to $0, then change FS.
FS = ":"
$0="a:bc:def"
FS = "-"
print FS, $1, NF
# Assign number to $0, then change FS.
FS = "2"
$0=1212121
FS="3"
print FS, $1, NF
}' > $TEMP1
echo '- a 3
3 1 4' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split 0.1'
$awk 'BEGIN {
# FS changes after getline.
FS = ":"
"echo a:bc:def" | getline
FS = "-"
print FS, $1, NF
}' > $TEMP1
echo '- a 3' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split 0.2'
echo '
a
a:b
c:d:e
e:f:g:h' > $TEMP0
$awk 'BEGIN {
FS = ":"
while (getline <"'$TEMP0'" > 0)
print NF
}' > $TEMP1
echo '0
1
2
3
4' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split 0.3'
# getline var shouldn't impact fields.
echo 'f b a' > $TEMP0
$awk '{
FS = ":"
getline a < "/etc/passwd"
print $1
}' $TEMP0 > $TEMP1
echo 'f' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split 0.4'
echo 'a b c d
foo
e f g h i
bar' > $TEMP0
$awk '{
FS=":"
getline v
print $2, NF
FS=" "
}' $TEMP0 > $TEMP1
echo 'b 4
f 5' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split 0.5'
echo 'a.b.c=d.e.f
g.h.i=j.k.l
m.n.o=p.q.r' > $TEMP0
echo 'b
h
n' > $TEMP1
$awk 'BEGIN { FS="=" } { FS="."; $0=$1; print $2; FS="="; }' $TEMP0 > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split (record assignment 1)'
echo 'a.b.c=d.e.f
g.h.i=j.k.l
m.n.o=p.q.r' > $TEMP0
echo 'd.e.f
b
j.k.l
h
p.q.r
n' > $TEMP1
$awk 'BEGIN { FS="=" } { print $2; FS="."; $0=$1; print $2; FS="="; }' $TEMP0 > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split (record assignment 2)'
echo 'abc
de
f
' > $TEMP0
who | sed 10q >> $TEMP0
sed 10q /etc/passwd >> $TEMP0
$awk '
{ n = split($0, x, "")
m = length($0)
if (m != n) print "error 1", NR
s = ""
for (i = 1; i <= m; i++)
s = s x[i]
if (s != $0) print "error 2", NR
print s
}' $TEMP0 > $TEMP1
diff $TEMP0 $TEMP1 || fail 'BAD: T.split 1'
# assumes same test.temp.0! bad design
$awk '
{ n = split($0, x, //)
m = length($0)
if (m != n) print "error 1", NR
s = ""
for (i = 1; i <= m; i++)
s = s x[i]
if (s != $0) print "error 2", NR
print s
}' $TEMP0 > $TEMP1
diff $TEMP0 $TEMP1 || fail 'BAD: T.split //'
$awk '
BEGIN { FS = "" }
{ n = split($0, x) # will be split with FS
m = length($0)
if (m != n) print "error 1", NR
s = ""
for (i = 1; i <= m; i++)
s = s x[i]
if (s != $0) print "error 2", NR
print s
}' $TEMP0 > $TEMP2
diff $TEMP0 $TEMP2 || fail 'BAD: T.split 2'
# assumes same test.temp.0!
$awk '
BEGIN { FS = "" }
{ n = NF
m = length($0)
if (m != n) print "error 1", NR
s = ""
for (i = 1; i <= m; i++)
s = s $i
if (s != $0) print "error 2", NR
print s
}' $TEMP0 > $TEMP2
diff $TEMP0 $TEMP2 || fail 'BAD: T.split 3'
$awk '
{ n = split( $0, temp, /^@@@ +/ )
print n
}' > $TEMP1 <<XXX
@@@ xxx
@@@ xxx
@@@ xxx
XXX
echo '2
2
2' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split 4'
rm -f $WORKDIR/test.temp*
echo '
a
bc
def' > $TEMP0
$awk '
{ print split($0, x, "")
}' $TEMP0 > $TEMP1
echo '0
1
2
3' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split null 3rd arg'
rm -f $WORKDIR/test.temp*
$awk 'BEGIN {
a[1]="a b"
print split(a[1],a),a[1],a[2]
}' > $TEMP1
echo '2 a b' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split(a[1],a)'
$awk 'BEGIN {
a = "cat\n\n\ndog"
split(a, b, "[\r\n]+")
print b[1], b[2]
}' > $TEMP1
echo 'cat dog' > $TEMP2
diff $TEMP1 $TEMP2 || fail 'BAD: T.split(a, b, "[\r\n]+")'
exit $RESULT

315
testdir/T.sub Executable file

@ -0,0 +1,315 @@
echo T.sub: tests of sub and gsub code
# input lines are of form
# pattern replacement input-string sub-output gsub-output
awk '
BEGIN {
FS = "\t"
awk = "../a.out"
}
NF == 0 { next }
$1 ~ /^#/ { next }
$1 != "" { # new pattern
pat = $1
}
$2 != "" { # new replacement
repl = $2
}
$3 != "" { # new input string
str = $3
}
$4 != "" { # new sub output
subout = $4
}
$5 != "" { # new gsub output
gsubout = $5
}
NF < 5 { # weird input line
printf("weird test spec `%s` ignored\n", $0) | "cat 1>&2"
next
}
{ # "" => explicitly empty
# printf(" %3d: %s %s %s %s %s:\n", NR, pat, repl, str, subout, gsubout)
if (pat == "\"\"") pat = ""
if (repl == "\"\"") repl = ""
if (str == "\"\"") str = ""
if (subout == "\"\"") subout = ""
if (gsubout == "\"\"") gsubout = ""
}
{ # generate a test
nt++
gsub(/\\/, "&&", repl) # in case of \ enclosed
test = sprintf("echo '"'"'%s'"'"' | %s '"'\n"'", str, awk) \
sprintf("{ temp = $0; sub(/%s/, \"%s\", temp)\n", pat, repl) \
sprintf(" if (temp != \"%s\") print \" sub %d fails:\", temp, \"should be %s in %s\"\n",
subout, nt, subout, (pat " " repl " " str " " subout)) \
sprintf(" gsub(/%s/, \"%s\")\n", pat, repl) \
sprintf(" if ($0 != \"%s\") print \"gsub %d fails:\", $0, \"should be %s in %s\"\n}",
gsubout, nt, gsubout, (pat " " repl " " str " " gsubout)) \
"'" '"'"
# if (nt >= 55) print "test is: " test
system(test)
# system("bprint -c ../a.out")
}
END { print nt, "tests" }
' <<\!!!!
a x aaa xaa xxx
axa xxa xxx
bbb bbb bbb
"" "" ""
a xy aaa xyaa xyxyxy
axa xyxa xyxxy
bbb bbb bbb
"" "" ""
. x aaa xaa xxx
axa xxa xxx
bbb xbb xxx
"" "" ""
.a x a a a
ax ax ax
aa x x
aaab xab xab
aaaa xaa xx
"" "" ""
$ x a ax ax
"" x x
.$ x "" "" ""
a x x
ab ax ax
a$ x "" "" ""
a x x
b b b
ab ab ab
^ x "" x x
a xa xa
^a$ xx a xx xx
"" "" ""
b b b
aa aa aa
^a.$ xy a a a
"" "" ""
ab xy xy
ba ba ba
^$ x "" x x
a a a
^.a x aa x x
ba x x
ab ab ab
a a a
^.*a xy "" "" ""
a xy xy
b b b
ba xy xy
^.+a xy "" "" ""
a a a
bb bb bb
ba xy xy
a &x&y a axay axay
aa axaya axayaxay
a* x "" x x
z xz xzx
az xz xzx
aza xza xzx
b xxx bxxx bxbxbxb
x& paq xpaq xpxaqx
x\& paq x&paq x&px&qx&
x&y paq xypaq xypxayqxy
x\&y paq x&ypaq x&ypx&yqx&y
a+ x& paq pxaq pxaq
x\& paq px&q px&q
x&y paq pxayq pxayq
x\&y paq px&yq px&yq
aa* x a x x
aa x x
wawa wxwa wxwx
\$ x "" "" ""
a a a
a$ ax ax
$$$ x$$ xxx
z$z$z zxz$z zxzxz
\. x "" "" ""
a a a
a. ax ax
... x.. xxx
z.z.z zxz.z zxzxz
xr+y q xy xy xy
xry q q
xrry q q
xryWxry qWxry qWq
xr?y q AxyB AqB AqB
AxryB AqB AqB
Axrry Axrry Axrry
a?b?c? x "" x x
a x x
b x x
c x x
ac x x
acc xc xx
^a?b?q x "" "" ""
q x x
a a a
aq x x
bq x x
abq x x
qab xab xab
abqabq xabq xabq
[0-9] xyz 0 xyz xyz
00 xyz0 xyzxyz
000 xyz00 xyzxyzxyz
0a xyza xyza
a0 axyz axyz
0a0 xyza0 xyzaxyz
xx xx xx
"" "" ""
^[0-9] xyz 0 xyz xyz
00 xyz0 xyz0
000 xyz00 xyz00
0a xyza xyza
a0 a0 a0
xx xx xx
"" "" ""
[0-9]$ xyz 0 xyz xyz
00 0xyz 0xyz
000 00xyz 00xyz
0a 0a 0a
a0 axyz axyz
xx xx xx
"" "" ""
[0-9]* xyz 0 xyz xyz
000 xyz xyz
0a xyza xyzaxyz
a0 xyza0 xyzaxyz
0a0 xyza0 xyzaxyz
pq xyzpq xyzpxyzqxyz
"" xyz xyz
"" <&> abc <>abc <>a<>b<>c<> fixed 2/07, we think
"" <\&> abc <&>abc <&>a<&>b<&>c<&>
"" <&&> abc <>abc <>a<>b<>c<>
"" <&> "" <> <>
d?abc <&> abc <abc> <abc>
d? <&> abc <>abc <>a<>b<>c<>
x[0-9]+y Q xy xy xy no change
x0y Q Q
x12y Q Q
x1y2 Q2 Q2
x1yax23y Qax23y QaQ
# x[0-9]?y ~ xy
# x1y
# !~ x23y
# x[[]y ~ x[y
# !~ xy
# x[[]y
# x]y
# x[^[]y ~ xay
# !~ x[y
# x[-]y ~ x-y
# !~ xy
# x+y
# x[^-]y ~ x+y
# !~ x-y
# xy
# [0\-9] ~ 0
# -
# 9
# !~ 1
# ""
# [-1] ~ -
# 1
# !~ 0
# [0-] ~ 0
# -
# !~ 1
# [^-0] ~ x
# ^
# !~ -
# 0
# ""
# [^0-] ~ x
# ^
# !~ -
# 0
# ""
# x|y ~ x
# y
# xy
# !~ a
# ""
# ^abc|xyz$ ~ abc
# abcd
# axyz
# xyz
# !~ xabc
# xyza
# ^(abc|xyz)$ ~ abc
# xyz
# !~ abcxyz
# abcx
# cxyz
# ^x\|y$ ~ x|y
# !~ xy
# ^x\\y$ ~ x\y
# !~ xy
# x\\y
# xay
# \141\142 ~ ab
# xab
# abx
# !~ a
# b
# ax
# axb
# x\056y ~ x.y
# !~ x.
# .x
# xxx
# xby because \056 is not the metacharacter .
# xcy ditto
# [\60-\62\65-6\71] ~ 0
# 1
# 2
# 5
# 6
# 9
# !~ 3
# 4
# 7
# 8
# [\60-2\65-6\71] ~ 0
# 1
# 2
# 5
# 6
# 9
# !~ 3
# 4
# 7
# 8
# [\x30-\x32\x35-6\71] ~ 0
# 1
# 2
# 5
# 6
# 9
# !~ 3
# 4
# 7
# 8
# [\x30-2\x35-6\x39] ~ 0
# 1
# 2
# 5
# 6
# 9
# !~ 3
# 4
# 7
# 8
!!!!
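# A minimal sketch, not part of the upstream suite, of what the "&" rows in
# the table above exercise: in a sub/gsub replacement, a bare & stands for
# the matched text, while \& (written "\\&" inside an awk string literal)
# yields a literal ampersand. The function name amp_demo is hypothetical,
# and the system awk is assumed to follow POSIX here.

```shell
amp_demo() {
    printf '%s\n' paq | awk '{
        s = $0; sub(/a+/, "x&", s)      # & expands to the matched "a"
        t = $0; sub(/a+/, "x\\&", t)    # \& inserts a literal ampersand
        print s, t
    }'
}
amp_demo
```

# This mirrors why the generator doubles backslashes in repl with
# gsub(/\\/, "&&", repl) before embedding it in a quoted awk string.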

15
testdir/T.system Executable file

@ -0,0 +1,15 @@
echo T.system: test system built-in
awk=${awk-../a.out}
rm -f foo
$awk 'BEGIN {
n = system("exit 3")
print n
exit n+1
}
' >foo
echo $? >>foo
echo "3
4" >foo1
diff foo foo1 || echo 'BAD: T.system (1)'

BIN
testdir/arnold-fixes.tar Normal file

Binary file not shown.

BIN
testdir/beebe.tar Normal file

Binary file not shown.

31102
testdir/bib Normal file

File diff suppressed because it is too large Load Diff

3
testdir/bundle.awk Normal file

@ -0,0 +1,3 @@
# bundle - combine multiple files into one
{ print FILENAME, $0 }

492
testdir/chem.awk Normal file

@ -0,0 +1,492 @@
BEGIN {
macros = "/usr/bwk/chem/chem.macros" # CHANGE ME!!!!!
macros = "/dev/null" # since originals are lost
pi = 3.141592654
deg = 57.29578
setparams(1.0)
set(dc, "up 0 right 90 down 180 left 270 ne 45 se 135 sw 225 nw 315")
set(dc, "0 n 30 ne 45 ne 60 ne 90 e 120 se 135 se 150 se 180 s")
set(dc, "300 nw 315 nw 330 nw 270 w 210 sw 225 sw 240 sw")
}
function init() {
printf ".PS\n"
if (firsttime++ == 0) {
printf "copy \"%s\"\n", macros
printf "\ttextht = %g; textwid = .1; cwid = %g\n", textht, cwid
printf "\tlineht = %g; linewid = %g\n", lineht, linewid
}
printf "Last: 0,0\n"
RING = "R"; MOL = "M"; BOND = "B"; OTHER = "O" # manifests
last = OTHER
dir = 90
}
function setparams(scale) {
lineht = scale * 0.2
linewid = scale * 0.2
textht = scale * 0.16
db = scale * 0.2 # bond length
cwid = scale * 0.12 # character width
cr = scale * 0.08 # rad of invis circles at ring vertices
crh = scale * 0.16 # ht of invis ellipse at ring vertices
crw = scale * 0.12 # wid
dav = scale * 0.015 # vertical shift up for atoms in atom macro
dew = scale * 0.02 # east-west shift for left of/right of
ringside = scale * 0.3 # side of all rings
dbrack = scale * 0.1 # length of bottom of bracket
}
{ lineno++ }
/^(\.cstart)|(begin chem)/ { init(); inchem = 1; next }
/^(\.cend)|(end)/ { inchem = 0; print ".PE"; next }
/^\./ { print; next } # troff
inchem == 0 { print; next } # everything else
$1 == "pic" { shiftfields(1); print; next } # pic pass-thru
$1 ~ /^#/ { next } # comment
$1 == "textht" { textht = $NF; next }
$1 == "cwid" { cwid = $NF; next }
$1 == "db" { db = $NF; next }
$1 == "size" { if ($NF <= 4) size = $NF; else size = $NF/10
setparams(size); next }
{ print "\n#", $0 } # debugging, etc.
{ lastname = "" }
$1 ~ /^[A-Z].*:$/ { # label; falls thru after shifting left
lastname = substr($1, 1, length($1)-1)
print $1
shiftfields(1)
}
$1 ~ /^\"/ { print "Last: ", $0; last = OTHER; next }
$1 ~ /bond/ { bond($1); next }
$1 ~ /^(double|triple|front|back)$/ && $2 == "bond" {
$1 = $1 $2; shiftfields(2); bond($1); next }
$1 == "aromatic" { temp = $1; $1 = $2; $2 = temp }
$1 ~ /ring|benz/ { ring($1); next }
$1 == "methyl" { $1 = "CH3" } # left here as an example
$1 ~ /^[A-Z]/ { molecule(); next }
$1 == "left" { left[++stack] = fields(2, NF); printf("Last: [\n"); next }
$1 == "right" { bracket(); stack--; next }
$1 == "label" { label(); next }
/./ { print "Last: ", $0; last = OTHER }
END { if (firsttime == 0) error("did you forget .cstart and .cend?")
if (inchem) printf ".PE\n"
}
function bond(type, i, goes, from) {
goes = ""
for (i = 2; i <= NF; i++)
if ($i == ";") {
goes = $(i+1)
NF = i - 1
break
}
leng = db
from = ""
for (cf = 2; cf <= NF; ) {
if ($cf ~ /(\+|-)?[0-9]+|up|down|right|left|ne|se|nw|sw/)
dir = cvtdir(dir)
else if ($cf ~ /^leng/) {
leng = $(cf+1)
cf += 2
} else if ($cf == "to") {
leng = 0
from = fields(cf, NF)
break
} else if ($cf == "from") {
from = dofrom()
break
} else if ($cf ~ /^#/) {
cf = NF+1
break;
} else {
from = fields(cf, NF)
break
}
}
if (from ~ /( to )|^to/) # said "from ... to ...", so zap length
leng = 0
else if (from == "") # no from given at all
from = "from Last." leave(last, dir) " " fields(cf, NF)
printf "Last: %s(%g, %g, %s)\n", type, leng, dir, from
last = BOND
if (lastname != "")
labsave(lastname, last, dir)
if (goes) {
$0 = goes
molecule()
}
}
function dofrom( n, s) {
cf++ # skip "from"
n = $cf
if (n in labtype) # "from Thing" => "from Thing.V.s"
return "from " n "." leave(labtype[n], dir)
if (n ~ /^\.[A-Z]/) # "from .V" => "from Last.V.s"
return "from Last" n "." corner(dir)
if (n ~ /^[A-Z][^.]*\.[A-Z][^.]*$/) # "from X.V" => "from X.V.s"
return "from " n "." corner(dir)
return fields(cf-1, NF)
}
function bracket( t) {
printf("]\n")
if ($2 == ")")
t = "spline"
else
t = "line"
printf("%s from last [].sw+(%g,0) to last [].sw to last [].nw to last [].nw+(%g,0)\n",
t, dbrack, dbrack)
printf("%s from last [].se-(%g,0) to last [].se to last [].ne to last [].ne-(%g,0)\n",
t, dbrack, dbrack)
if ($3 == "sub")
printf("\" %s\" ljust at last [].se\n", fields(4,NF))
}
function molecule( n, type) {
n = $1
if (n == "BP") {
$1 = "\"\" ht 0 wid 0"
type = OTHER
} else {
$1 = atom(n)
type = MOL
}
gsub(/[^A-Za-z0-9]/, "", n) # for stuff like C(OH3): zap non-alnum
if ($2 == "")
printf "Last: %s: %s with .%s at Last.%s\n", \
n, $0, leave(type,dir+180), leave(last,dir)
else if ($2 == "below")
printf("Last: %s: %s with .n at %s.s\n", n, $1, $3)
else if ($2 == "above")
printf("Last: %s: %s with .s at %s.n\n", n, $1, $3)
else if ($2 == "left" && $3 == "of")
printf("Last: %s: %s with .e at %s.w+(%g,0)\n", n, $1, $4, dew)
else if ($2 == "right" && $3 == "of")
printf("Last: %s: %s with .w at %s.e-(%g,0)\n", n, $1, $4, dew)
else
printf "Last: %s: %s\n", n, $0
last = type
if (lastname != "")
labsave(lastname, last, dir)
labsave(n, last, dir)
}
function label( i, v) {
if (substr(labtype[$2], 1, 1) != RING)
error(sprintf("%s is not a ring", $2))
else {
v = substr(labtype[$2], 2, 1)
for (i = 1; i <= v; i++)
printf("\"\\s-3%d\\s0\" at 0.%d<%s.C,%s.V%d>\n", i, v+2, $2, $2, i)
}
}
function ring(type, typeint, pt, verts, i) {
pt = 0 # points up by default
if (type ~ /[1-8]$/)
verts = substr(type, length(type), 1)
else if (type ~ /flat/)
verts = 5
else
verts = 6
fused = other = ""
for (i = 1; i <= verts; i++)
put[i] = dbl[i] = ""
nput = aromatic = withat = 0
for (cf = 2; cf <= NF; ) {
if ($cf == "pointing")
pt = cvtdir(0)
else if ($cf == "double" || $cf == "triple")
dblring(verts)
else if ($cf ~ /arom/) {
aromatic++
cf++ # handled later
} else if ($cf == "put") {
putring(verts)
nput++
} else if ($cf ~ /^#/) {
cf = NF+1
break;
} else {
if ($cf == "with" || $cf == "at")
withat = 1
other = other " " $cf
cf++
}
}
typeint = RING verts pt # RING | verts | dir
if (withat == 0)
fused = joinring(typeint, dir, last)
printf "Last: [\n"
makering(type, pt, verts)
printf "] %s %s\n", fused, other
last = typeint
if (lastname != "")
labsave(lastname, last, dir)
}
function makering(type, pt, v, i, a, r) {
if (type ~ /flat/)
v = 6
# vertices
r = ringside / (2 * sin(pi/v))
printf "\tC: 0,0\n"
for (i = 0; i <= v+1; i++) {
a = ((i-1) / v * 360 + pt) / deg
printf "\tV%d: (%g,%g)\n", i, r * sin(a), r * cos(a)
}
if (type ~ /flat/) {
printf "\tV4: V5; V5: V6\n"
v = 5
}
# sides
if (nput > 0) { # hetero ...
for (i = 1; i <= v; i++) {
c1 = c2 = 0
if (put[i] != "") {
printf("\tV%d: ellipse invis ht %g wid %g at V%d\n",
i, crh, crw, i)
printf("\t%s at V%d\n", put[i], i)
c1 = cr
}
j = i+1
if (j > v)
j = 1
if (put[j] != "")
c2 = cr
printf "\tline from V%d to V%d chop %g chop %g\n", i, j, c1, c2
if (dbl[i] != "") { # should check i<j
if (type ~ /flat/ && i == 3) {
rat = 0.75; fix = 5
} else {
rat = 0.85; fix = 1.5
}
if (put[i] == "")
c1 = 0
else
c1 = cr/fix
if (put[j] == "")
c2 = 0
else
c2 = cr/fix
printf "\tline from %g<C,V%d> to %g<C,V%d> chop %g chop %g\n",
rat, i, rat, j, c1, c2
if (dbl[i] == "triple")
printf "\tline from %g<C,V%d> to %g<C,V%d> chop %g chop %g\n",
2-rat, i, 2-rat, j, c1, c2
}
}
} else { # regular
for (i = 1; i <= v; i++) {
j = i+1
if (j > v)
j = 1
printf "\tline from V%d to V%d\n", i, j
if (dbl[i] != "") { # should check i<j
if (type ~ /flat/ && i == 3) {
rat = 0.75
} else
rat = 0.85
printf "\tline from %g<C,V%d> to %g<C,V%d>\n",
rat, i, rat, j
if (dbl[i] == "triple")
printf "\tline from %g<C,V%d> to %g<C,V%d>\n",
2-rat, i, 2-rat, j
}
}
}
# punt on triple temporarily
# circle
if (type ~ /benz/ || aromatic > 0) {
if (type ~ /flat/)
r *= .4
else
r *= .5
printf "\tcircle rad %g at 0,0\n", r
}
}
function putring(v) { # collect "put Mol at n"
cf++
mol = $(cf++)
if ($cf == "at")
cf++
if ($cf >= 1 && $cf <= v) {
m = mol
gsub(/[^A-Za-z0-9]/, "", m)
put[$cf] = m ":" atom(mol)
}
cf++
}
function joinring(type, dir, last) { # join a ring to something
if (substr(last, 1, 1) == RING) { # ring to ring
if (substr(type, 3) == substr(last, 3)) # fails if not 6-sided
return "with .V6 at Last.V2"
}
# if all else fails
return sprintf("with .%s at Last.%s", \
leave(type,dir+180), leave(last,dir))
}
function leave(last, d, c, c1) { # return vertex of last in dir d
if (last == BOND)
return "end"
d = reduce(d)
if (substr(last, 1, 1) == RING)
return ringleave(last, d)
if (last == MOL) {
if (d == 0 || d == 180)
c = "C"
else if (d > 0 && d < 180)
c = "R"
else
c = "L"
if (d in dc)
c1 = dc[d]
else
c1 = corner(d)
return sprintf("%s.%s", c, c1)
}
if (last == OTHER)
return corner(d)
return "c"
}
function ringleave(last, d, rd, verts) { # return vertex of ring in dir d
verts = substr(last, 2, 1)
rd = substr(last, 3)
return sprintf("V%d.%s", int(reduce(d-rd)/(360/verts)) + 1, corner(d))
}
function corner(dir) {
return dc[reduce(45 * int((dir+22.5)/45))]
}
function labsave(name, type, dir) {
labtype[name] = type
labdir[name] = dir
}
function dblring(v, d, v1, v2) { # should canonicalize to i,i+1 mod v
d = $cf
for (cf++; $cf ~ /^[1-9]/; cf++) {
v1 = substr($cf,1,1)
v2 = substr($cf,3,1)
if (v2 == v1+1 || v1 == v && v2 == 1) # e.g., 2,3 or 5,1
dbl[v1] = d
else if (v1 == v2+1 || v2 == v && v1 == 1) # e.g., 3,2 or 1,5
dbl[v2] = d
else
error(sprintf("weird %s bond in\n\t%s", d, $0))
}
}
function cvtdir(d) { # maps "[pointing] somewhere" to degrees
if ($cf == "pointing")
cf++
if ($cf ~ /^[+\-]?[0-9]+/)
return reduce($(cf++))
else if ($cf ~ /left|right|up|down|ne|nw|se|sw/)
return reduce(dc[$(cf++)])
else {
cf++
return d
}
}
function reduce(d) { # reduces d to 0 <= d < 360
while (d >= 360)
d -= 360
while (d < 0)
d += 360
return d
}
function atom(s, c, i, n, nsub, cloc, nsubc) { # convert CH3 to atom(...)
if (s == "\"\"")
return s
n = length(s)
nsub = nsubc = 0
cloc = index(s, "C")
if (cloc == 0)
cloc = 1
for (i = 1; i <= n; i++)
if (substr(s, i, 1) !~ /[A-Z]/) {
nsub++
if (i < cloc)
nsubc++
}
gsub(/([0-9]+\.[0-9]+)|([0-9]+)/, "\\s-3\\d&\\u\\s+3", s)
if (s ~ /([^0-9]\.)|(\.[^0-9])/) # centered dot
gsub(/\./, "\\v#-.3m#.\\v#.3m#", s)
return sprintf("atom(\"%s\", %g, %g, %g, %g, %g, %g)",
s, (n-nsub/2)*cwid, textht, (cloc-nsubc/2-0.5)*cwid, crh, crw, dav)
}
function in_line( i, n, s, s1, os) {
s = $0
os = ""
while ((n = match(s, /!?[A-Z][A-Za-z]*(([0-9]+\.[0-9]+)|([0-9]+))/)) > 0) {
os = os substr(s, 1, n-1) # prefix
s1 = substr(s, n, RLENGTH) # molecule
if (substr(s1, 1, 1) == "!") { # !mol => leave alone
s1 = substr(s1, 2)
} else {
gsub(/([0-9]+\.[0-9]+)|([0-9]+)/, "\\s-3\\d&\\u\\s+3", s1)
if (s1 ~ /([^0-9]\.)|(\.[^0-9])/) # centered dot
gsub(/\./, "\\v#-.3m#.\\v#.3m#", s1)
}
os = os s1
s = substr(s, n + RLENGTH) # tail
}
os = os s
print os
return
}
function shiftfields(n, i) { # move $n+1..$NF to $n..$NF-1, zap $NF
for (i = n; i < NF; i++)
$i = $(i+1)
$NF = ""
NF--
}
function fields(n1, n2, i, s) {
if (n1 > n2)
return ""
s = ""
for (i = n1; i <= n2; i++) {
if ($i ~ /^#/)
break;
s = s $i " "
}
return s
}
function set(a, s, i, n, q) {
n = split(s, q)
for (i = 1; i <= n; i += 2)
a[q[i]] = q[i+1]
}
function error(s) {
printf "chem\007: error on line %d: %s\n", lineno, s | "cat 1>&2"
}
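# A hedged illustration, not part of chem.awk, of how reduce() and corner()
# combine: an angle is rounded to the nearest multiple of 45 degrees, wrapped
# into 0 <= d < 360, and looked up in a degrees-to-compass table. The wrapper
# name nearest_corner and the cut-down dc table are assumptions made for this
# sketch; the real table is built by set() in the BEGIN block.

```shell
nearest_corner() {
    awk -v dir="$1" '
    function reduce(d) {            # same logic as reduce() in chem.awk
        while (d >= 360) d -= 360
        while (d < 0) d += 360
        return d
    }
    BEGIN {
        # cut-down degrees -> compass-point table (assumed for the example)
        n = split("0 n 45 ne 90 e 135 se 180 s 225 sw 270 w 315 nw", q)
        for (i = 1; i <= n; i += 2)
            dc[q[i]] = q[i+1]
        # same expression as corner(): round to a multiple of 45, then wrap
        print dc[reduce(45 * int((dir + 22.5) / 45))]
    }'
}
nearest_corner 100
```

# Non-negative inputs are assumed, since the callers in chem.awk normally
# pass directions that cvtdir() or leave() have already reduced.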

5
testdir/cleanup Executable file

@ -0,0 +1,5 @@
rm -f core foo* junk* glop* *temp* *.p bigpop smallpop tt.* countries td.1
rm -f T.* t.* p.* u.* chem.awk test.data test.countries Compare*
rm -f *.awk *.out testall try ind NOTES cleanup xc yc

11
testdir/countries Normal file

@ -0,0 +1,11 @@
USSR 8649 275 Asia
Canada 3852 25 North America
China 3705 1032 Asia
USA 3615 237 North America
Brazil 3286 134 South America
India 1267 746 Asia
Mexico 762 78 North America
France 211 55 Europe
Japan 144 120 Asia
Germany 96 61 Europe
England 94 56 Europe

40
testdir/ctimes Executable file

@ -0,0 +1,40 @@
awk '
BEGIN {
OFS = "\t"
print " new old new/old"
print ""
}
/differ/
/:$/ { name = $1; cnt = 0; next }
$1 ~ /user|sys/ {
n = split($2, x, "m") # 0m0.23s
if (n == 1)
time[cnt] += x[1]
else
time[cnt] += 60 * x[1] + x[2]
}
$1 ~ /sys/ {
cnt++
if (cnt == 2)
dump()
}
function dump() {
old = time[1]
new = time[0]
if (old > 0) {
printf "%8.2f %8.2f %8.3f %s\n", new, old, new/old, name
rat += new/old
}
nrat++
totnew += new
totold += old
time[0] = time[1] = cnt = 0
}
END {
print ""
printf "%8.2f %8.2f\n\n", totnew, totold
printf "avg new/old = %.3f\n", rat/nrat
printf "total new/old = %.3f\n", totnew/totold
print nrat " tests"
}
' $*
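# A small sketch, not part of ctimes, of the time parsing used above:
# splitting "1m2.5s" on "m" leaves "2.5s" in x[2], and awk's string-to-number
# conversion uses the leading numeric prefix, so the trailing "s" is ignored.
# The function name to_seconds is hypothetical.

```shell
to_seconds() {
    awk -v t="$1" 'BEGIN {
        n = split(t, x, "m")            # "1m2.5s" -> x[1]="1", x[2]="2.5s"
        if (n == 1)
            print x[1] + 0              # no minutes part, e.g. "0.23s"
        else
            print 60 * x[1] + x[2]      # minutes plus seconds
    }'
}
to_seconds 1m2.5s
```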

19
testdir/echo.c Normal file

@ -0,0 +1,19 @@
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
int i, start, minusn;
start = 1;
minusn = 0;
if (argc > 1 && strcmp(argv[1], "-n") == 0) {
start = 2;
minusn = 1;
}
for (i = start; i < argc; i++)
printf("%s%s", argv[i], i==argc-1 ? "" : " ");
if (minusn == 0)
printf("\n");
}

977
testdir/funstack.awk Normal file

@ -0,0 +1,977 @@
### ====================================================================
### @Awk-file{
### author = "Nelson H. F. Beebe",
### version = "1.00",
### date = "09 October 1996",
### time = "15:57:06 MDT",
### filename = "journal-toc.awk",
### address = "Center for Scientific Computing
### Department of Mathematics
### University of Utah
### Salt Lake City, UT 84112
### USA",
### telephone = "+1 801 581 5254",
### FAX = "+1 801 581 4148",
### URL = "http://www.math.utah.edu/~beebe",
### checksum = "25092 977 3357 26493",
### email = "beebe@math.utah.edu (Internet)",
### codetable = "ISO/ASCII",
### keywords = "BibTeX, bibliography, HTML, journal table of
### contents",
### supported = "yes",
### docstring = "Create a journal cover table of contents from
### <at>Article{...} entries in a journal BibTeX
### .bib file for checking the bibliography
### database against the actual journal covers.
### The output can be either plain text, or HTML.
###
### Usage:
### bibclean -max-width 0 BibTeX-file(s) | \
### bibsort -byvolume | \
### awk -f journal-toc.awk \
### [-v HTML=nnn] [-v INDENT=nnn] \
### [-v BIBFILEURL=url] >foo.toc
###
### or if the bibliography is already sorted
### by volume,
###
### bibclean -max-width 0 BibTeX-file(s) | \
### awk -f journal-toc.awk \
### [-v HTML=nnn] [-v INDENT=nnn] \
### [-v BIBFILEURL=url] >foo.toc
###
### A non-zero value of the command-line option,
### HTML=nnn, results in HTML output instead of
### the default plain ASCII text (corresponding
### to HTML=0). The
###
### The INDENT=nnn command-line option specifies
### the number of blanks to indent each logical
### level of HTML. The default is INDENT=4.
### INDENT=0 suppresses indentation. The INDENT
### option has no effect when the default HTML=0
### (plain text output) option is in effect.
###
### When HTML output is selected, the
### BIBFILEURL=url command-line option provides a
### way to request hypertext links from table of
### contents page numbers to the complete BibTeX
### entry for the article. These links are
### created by appending a sharp (#) and the
### citation label to the BIBFILEURL value, which
### conforms with the practice of
### bibtex-to-html.awk.
###
### The HTML output form may be useful as a more
### compact representation of journal article
### bibliography data than the original BibTeX
### file provides. Of course, the
### table-of-contents format provides less
### information, and is considerably more
### troublesome for a computer program to parse.
###
### When URL key values are provided, they will
### be used to create hypertext links around
### article titles. This supports journals that
### provide article contents on the World-Wide
### Web.
###
### For parsing simplicity, this program requires
### that BibTeX
###
### key = "value"
###
### and
###
### @String{name = "value"}
###
### specifications be entirely contained on
### single lines, which is readily provided by
### the `bibclean -max-width 0' filter. It also
### requires that bibliography entries begin and
### end at the start of a line, and that
### quotation marks, rather than balanced braces,
### delimit string values. This is a
### conventional format that again can be
### guaranteed by bibclean.
###
### This program requires `new' awk, as described
### in the book
###
### Alfred V. Aho, Brian W. Kernighan, and
### Peter J. Weinberger,
### ``The AWK Programming Language'',
### Addison-Wesley (1988), ISBN
### 0-201-07981-X,
###
### such as provided by programs named (GNU)
### gawk, nawk, and recent AT&T awk.
###
### The checksum field above contains a CRC-16
### checksum as the first value, followed by the
### equivalent of the standard UNIX wc (word
### count) utility output of lines, words, and
### characters. This is produced by Robert
### Solovay's checksum utility.",
### }
### ====================================================================
BEGIN { initialize() }
/^ *@ *[Ss][Tt][Rr][Ii][Nn][Gg] *{/ { do_String(); next }
/^ *@ *[Pp][Rr][Ee][Aa][Mm][Bb][Ll][Ee]/ { next }
/^ *@ *[Aa][Rr][Tt][Ii][Cc][Ll][Ee]/ { do_Article(); next }
/^ *@/ { do_Other(); next }
/^ *author *= *\"/ { do_author(); next }
/^ *journal *= */ { do_journal(); next }
/^ *volume *= *\"/ { do_volume(); next }
/^ *number *= *\"/ { do_number(); next }
/^ *year *= *\"/ { do_year(); next }
/^ *month *= */ { do_month(); next }
/^ *title *= *\"/ { do_title(); next }
/^ *pages *= *\"/ { do_pages(); next }
/^ *URL *= *\"/ { do_URL(); next }
/^ *} *$/ { if (In_Article) do_end_entry(); next }
END { terminate() }
########################################################################
# NB: The programming conventions for variables in this program are: #
# UPPERCASE global constants and user options #
# Initialuppercase global variables #
# lowercase local variables #
# Any deviation is an error! #
########################################################################
function do_Article()
{
In_Article = 1
Citation_label = $0
sub(/^[^\{]*{/,"",Citation_label)
sub(/ *, *$/,"",Citation_label)
Author = ""
Title = ""
Journal = ""
Volume = ""
Number = ""
Month = ""
Year = ""
Pages = ""
Url = ""
}
function do_author()
{
Author = TeX_to_HTML(get_value($0))
}
function do_end_entry( k,n,parts)
{
n = split(Author,parts," and ")
if (Last_number != Number)
do_new_issue()
for (k = 1; k < n; ++k)
print_toc_line(parts[k] " and", "", "")
Title_prefix = html_begin_title()
Title_suffix = html_end_title()
if (html_length(Title) <= (MAX_TITLE_CHARS + MIN_LEADERS)) # complete title fits on line
print_toc_line(parts[n], Title, html_begin_pages() Pages html_end_pages())
else # need to split long title over multiple lines
do_long_title(parts[n], Title, html_begin_pages() Pages html_end_pages())
}
function do_journal()
{
if ($0 ~ /[=] *"/) # have journal = "quoted journal name",
Journal = get_value($0)
else # have journal = journal-abbreviation,
{
Journal = get_abbrev($0)
if (Journal in String) # replace abbrev by its expansion
Journal = String[Journal]
}
gsub(/\\-/,"",Journal) # remove discretionary hyphens
}
function do_long_title(author,title,pages, last_title,n)
{
title = trim(title) # discard leading and trailing space
while (length(title) > 0)
{
n = html_breakpoint(title,MAX_TITLE_CHARS+MIN_LEADERS)
last_title = substr(title,1,n)
title = substr(title,n+1)
sub(/^ +/,"",title) # discard any leading space
print_toc_line(author, last_title, (length(title) == 0) ? pages : "")
author = ""
}
}
function do_month( k,n,parts)
{
Month = ($0 ~ /[=] *"/) ? get_value($0) : get_abbrev($0)
gsub(/[\"]/,"",Month)
gsub(/ *# *\\slash *# */," / ",Month)
gsub(/ *# *-+ *# */," / ",Month)
n = split(Month,parts," */ *")
Month = ""
for (k = 1; k <= n; ++k)
Month = Month ((k > 1) ? " / " : "") \
((parts[k] in Month_expansion) ? Month_expansion[parts[k]] : parts[k])
}
function do_new_issue()
{
Last_number = Number
if (HTML)
{
if (Last_volume != Volume)
{
Last_volume = Volume
print_line(prefix(2) "<BR>")
}
html_end_toc()
html_begin_issue()
print_line(prefix(2) Journal "<BR>")
}
else
{
print_line("")
print_line(Journal)
}
print_line(strip_html(vol_no_month_year()))
if (HTML)
{
html_end_issue()
html_toc_entry()
html_begin_toc()
}
else
print_line("")
}
function do_number()
{
Number = get_value($0)
}
function do_Other()
{
In_Article = 0
}
function do_pages()
{
Pages = get_value($0)
sub(/--[?][?]/,"",Pages)
}
function do_String()
{
sub(/^[^\{]*\{/,"",$0) # discard up to and including open brace
sub(/\} *$/,"",$0) # discard trailing brace and any whitespace that follows it at end of line
String[get_key($0)] = get_value($0)
}
function do_title()
{
Title = TeX_to_HTML(get_value($0))
}
function do_URL( parts)
{
Url = get_value($0)
split(Url,parts,"[,;]") # in case we have multiple URLs
Url = trim(parts[1])
}
function do_volume()
{
Volume = get_value($0)
}
function do_year()
{
Year = get_value($0)
}
function get_abbrev(s)
{ # return abbrev from ``key = abbrev,''
sub(/^[^=]*= */,"",s) # discard text up to start of non-blank value
sub(/ *,? *$/,"",s) # discard trailing optional whitespace,
# optional comma, and optional space
return (s)
}
function get_key(s)
{ # return key from ``key = "value",''
sub(/^ */,"",s) # discard leading space
sub(/ *=.*$/,"",s) # discard everything after key
return (s)
}
function get_value(s)
{ # return value from ``key = "value",''
sub(/^[^\"]*\" */,"",s) # discard text up to start of non-blank value
sub(/ *\",? *$/,"",s) # discard trailing optional whitespace, quote,
# optional comma, and optional space
return (s)
}
function html_accents(s)
{
if (index(s,"\\") > 0) # important optimization
{
# Convert common lower-case accented letters according to the
# table on p. 169 of Peter Flynn's ``The World Wide Web
# Handbook'', International Thomson Computer Press, 1995, ISBN
# 1-85032-205-8. The official table of ISO Latin 1 SGML
# entities used in HTML can be found in the file
# /usr/local/lib/html-check/lib/ISOlat1.sgml (your path
# may differ).
gsub(/{\\\a}/, "\\&agrave;", s)
gsub(/{\\'a}/, "\\&aacute;", s)
gsub(/{\\[\^]a}/,"\\&acirc;", s)
gsub(/{\\~a}/, "\\&atilde;", s)
gsub(/{\\\"a}/, "\\&auml;", s)
gsub(/{\\aa}/, "\\&aring;", s)
gsub(/{\\ae}/, "\\&aelig;", s)
gsub(/{\\c{c}}/,"\\&ccedil;", s)
gsub(/{\\\e}/, "\\&egrave;", s)
gsub(/{\\'e}/, "\\&eacute;", s)
gsub(/{\\[\^]e}/,"\\&ecirc;", s)
gsub(/{\\\"e}/, "\\&euml;", s)
gsub(/{\\\i}/, "\\&igrave;", s)
gsub(/{\\'i}/, "\\&iacute;", s)
gsub(/{\\[\^]i}/,"\\&icirc;", s)
gsub(/{\\\"i}/, "\\&iuml;", s)
# ignore eth and thorn
gsub(/{\\~n}/, "\\&ntilde;", s)
gsub(/{\\\o}/, "\\&ograve;", s)
gsub(/{\\'o}/, "\\&oacute;", s)
gsub(/{\\[\^]o}/, "\\&ocirc;", s)
gsub(/{\\~o}/, "\\&otilde;", s)
gsub(/{\\\"o}/, "\\&ouml;", s)
gsub(/{\\o}/, "\\&oslash;", s)
gsub(/{\\\u}/, "\\&ugrave;", s)
gsub(/{\\'u}/, "\\&uacute;", s)
gsub(/{\\[\^]u}/,"\\&ucirc;", s)
gsub(/{\\\"u}/, "\\&uuml;", s)
gsub(/{\\'y}/, "\\&yacute;", s)
gsub(/{\\\"y}/, "\\&yuml;", s)
# Now do the same for upper-case accents
gsub(/{\\\A}/, "\\&Agrave;", s)
gsub(/{\\'A}/, "\\&Aacute;", s)
gsub(/{\\[\^]A}/, "\\&Acirc;", s)
gsub(/{\\~A}/, "\\&Atilde;", s)
gsub(/{\\\"A}/, "\\&Auml;", s)
gsub(/{\\AA}/, "\\&Aring;", s)
gsub(/{\\AE}/, "\\&AElig;", s)
gsub(/{\\c{C}}/,"\\&Ccedil;", s)
gsub(/{\\\E}/, "\\&Egrave;", s)
gsub(/{\\'E}/, "\\&Eacute;", s)
gsub(/{\\[\^]E}/, "\\&Ecirc;", s)
gsub(/{\\\"E}/, "\\&Euml;", s)
gsub(/{\\\I}/, "\\&Igrave;", s)
gsub(/{\\'I}/, "\\&Iacute;", s)
gsub(/{\\[\^]I}/, "\\&Icirc;", s)
gsub(/{\\\"I}/, "\\&Iuml;", s)
# ignore eth and thorn
gsub(/{\\~N}/, "\\&Ntilde;", s)
gsub(/{\\\O}/, "\\&Ograve;", s)
gsub(/{\\'O}/, "\\&Oacute;", s)
gsub(/{\\[\^]O}/, "\\&Ocirc;", s)
gsub(/{\\~O}/, "\\&Otilde;", s)
gsub(/{\\\"O}/, "\\&Ouml;", s)
gsub(/{\\O}/, "\\&Oslash;", s)
gsub(/{\\\U}/, "\\&Ugrave;", s)
gsub(/{\\'U}/, "\\&Uacute;", s)
gsub(/{\\[\^]U}/, "\\&Ucirc;", s)
gsub(/{\\\"U}/, "\\&Uuml;", s)
gsub(/{\\'Y}/, "\\&Yacute;", s)
gsub(/{\\ss}/, "\\&szlig;", s)
# Others not mentioned in Flynn's book
gsub(/{\\'\\i}/,"\\&iacute;", s)
gsub(/{\\'\\j}/,"j", s)
}
return (s)
}
function html_begin_issue()
{
print_line("")
print_line(prefix(2) "<HR>")
print_line("")
print_line(prefix(2) "<H1>")
print_line(prefix(3) "<A NAME=\"" html_label() "\">")
}
function html_begin_pages()
{
return ((HTML && (BIBFILEURL != "")) ? ("<A HREF=\"" BIBFILEURL "#" Citation_label "\">") : "")
}
function html_begin_pre()
{
In_PRE = 1
print_line("<PRE>")
}
function html_begin_title()
{
return ((HTML && (Url != "")) ? ("<A HREF=\"" Url "\">") : "")
}
function html_begin_toc()
{
html_end_toc()
html_begin_pre()
}
function html_body( k)
{
for (k = 1; k <= BodyLines; ++k)
print Body[k]
}
function html_breakpoint(title,maxlength, break_after,k)
{
# Return the largest character position in title AFTER which we
# can break the title across lines, without exceeding maxlength
# visible characters.
if (html_length(title) > maxlength) # then need to split title across lines
{
# In the presence of HTML markup, the initialization of
# k here is complicated, because we need to advance it
# until html_length(title) is at least maxlength,
# without invoking the expensive html_length() function
# too frequently. The need to split the title makes the
# alternative of delayed insertion of HTML markup much
# more complicated.
break_after = 0
for (k = min(maxlength,length(title)); k < length(title); ++k)
{
if (substr(title,k+1,1) == " ")
{ # could break after position k
if (html_length(substr(title,1,k)) <= maxlength)
break_after = k
else # advanced too far, retreat back to last break_after
break
}
}
if (break_after == 0) # no breakpoint found by forward scan
{ # so switch to backward scan
for (k = min(maxlength,length(title)) - 1; \
(k > 0) && (substr(title,k+1,1) != " "); --k)
; # find space at which to break title
if (k < 1) # no break point found
k = length(title) # so must print entire string
}
else
k = break_after
}
else # title fits on one line
k = length(title)
return (k)
}
function html_end_issue()
{
print_line(prefix(3) "</A>")
print_line(prefix(2) "</H1>")
}
function html_end_pages()
{
return ((HTML && (BIBFILEURL != "")) ? "</A>" : "")
}
function html_end_pre()
{
if (In_PRE)
{
print_line("</PRE>")
In_PRE = 0
}
}
function html_end_title()
{
return ((HTML && (Url != "")) ? "</A>" : "")
}
function html_end_toc()
{
html_end_pre()
}
function html_fonts(s, arg,control_word,k,level,n,open_brace)
{
open_brace = index(s,"{")
if (open_brace > 0) # important optimization
{
level = 1
for (k = open_brace + 1; (level != 0) && (k <= length(s)); ++k)
{
if (substr(s,k,1) == "{")
level++
else if (substr(s,k,1) == "}")
level--
}
# {...} is now found at open_brace ... (k-1)
for (control_word in Font_decl_map) # look for {\xxx ...}
{
if (substr(s,open_brace+1,length(control_word)+1) ~ \
("\\" control_word "[^A-Za-z]"))
{
n = open_brace + 1 + length(control_word)
arg = trim(substr(s,n,k - n))
if (Font_decl_map[control_word] == "toupper") # arg -> ARG
arg = toupper(arg)
else if (Font_decl_map[control_word] != "") # arg -> <TAG>arg</TAG>
arg = "<" Font_decl_map[control_word] ">" arg "</" Font_decl_map[control_word] ">"
return (substr(s,1,open_brace-1) arg html_fonts(substr(s,k)))
}
}
for (control_word in Font_cmd_map) # look for \xxx{...}
{
if (substr(s,open_brace - length(control_word),length(control_word)) ~ \
("\\" control_word))
{
n = open_brace + 1
arg = trim(substr(s,n,k - n))
if (Font_cmd_map[control_word] == "toupper") # arg -> ARG
arg = toupper(arg)
else if (Font_cmd_map[control_word] != "") # arg -> <TAG>arg</TAG>
arg = "<" Font_cmd_map[control_word] ">" arg "</" Font_cmd_map[control_word] ">"
n = open_brace - length(control_word) - 1
return (substr(s,1,n) arg html_fonts(substr(s,k)))
}
}
}
return (s)
}
function html_header()
{
USER = ENVIRON["USER"]
if (USER == "")
USER = ENVIRON["LOGNAME"]
if (USER == "")
USER = "????"
"hostname" | getline HOSTNAME
"date" | getline DATE
("ypcat passwd | grep '^" USER ":' | awk -F: '{print $5}'") | getline PERSONAL_NAME
if (PERSONAL_NAME == "")
("grep '^" USER ":' /etc/passwd | awk -F: '{print $5}'") | getline PERSONAL_NAME
print "<!-- WARNING: Do NOT edit this file. It was converted from -->"
print "<!-- BibTeX format to HTML by journal-toc.awk version " VERSION_NUMBER " " VERSION_DATE " -->"
print "<!-- on " DATE " -->"
print "<!-- for " PERSONAL_NAME " (" USER "@" HOSTNAME ") -->"
print ""
print ""
print "<!DOCTYPE HTML public \"-//IETF//DTD HTML//EN\">"
print ""
print "<HTML>"
print prefix(1) "<HEAD>"
print prefix(2) "<TITLE>"
print prefix(3) Journal
print prefix(2) "</TITLE>"
print prefix(2) "<LINK REV=\"made\" HREF=\"mailto:" USER "@" HOSTNAME "\">"
print prefix(1) "</HEAD>"
print ""
print prefix(1) "<BODY>"
}
function html_label( label)
{
label = Volume "(" Number "):" Month ":" Year
gsub(/[^A-Za-z0-9():,;.\/\-]/,"",label)
return (label)
}
function html_length(s)
{ # Return visible length of s, ignoring any HTML markup
if (HTML)
{
gsub(/<\/?[^>]*>/,"",s) # remove SGML tags
gsub(/&[A-Za-z0-9]+;/,"",s) # remove SGML entities
}
return (length(s))
}
function html_toc()
{
print prefix(2) "<H1>"
print prefix(3) "Table of contents for issues of " Journal
print prefix(2) "</H1>"
print HTML_TOC
}
function html_toc_entry()
{
HTML_TOC = HTML_TOC " <A HREF=\"#" html_label() "\">"
HTML_TOC = HTML_TOC vol_no_month_year()
HTML_TOC = HTML_TOC "</A><BR>" "\n"
}
function html_trailer()
{
html_end_pre()
print prefix(1) "</BODY>"
print "</HTML>"
}
function initialize()
{
# NB: Update these when the program changes
VERSION_DATE = "[09-Oct-1996]"
VERSION_NUMBER = "1.00"
HTML = (HTML == "") ? 0 : (0 + HTML)
if (INDENT == "")
INDENT = 4
if (HTML == 0)
INDENT = 0 # indentation suppressed in ASCII mode
LEADERS = " . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ."
MAX_TITLE_CHARS = 36 # 36 produces a 79-char output line when there is
# just an initial page number. If this is
# increased, the LEADERS string may need to be
# lengthened.
MIN_LEADERS = 4 # Minimum number of characters from LEADERS
# required when leaders are used. The total
# number of characters that can appear in a
# title line is MAX_TITLE_CHARS + MIN_LEADERS.
# Leaders are omitted when the title length is
# between MAX_TITLE_CHARS and this sum.
MIN_LEADERS_SPACE = "    " # must be at least MIN_LEADERS characters long
Month_expansion["jan"] = "January"
Month_expansion["feb"] = "February"
Month_expansion["mar"] = "March"
Month_expansion["apr"] = "April"
Month_expansion["may"] = "May"
Month_expansion["jun"] = "June"
Month_expansion["jul"] = "July"
Month_expansion["aug"] = "August"
Month_expansion["sep"] = "September"
Month_expansion["oct"] = "October"
Month_expansion["nov"] = "November"
Month_expansion["dec"] = "December"
Font_cmd_map["\\emph"] = "EM"
Font_cmd_map["\\textbf"] = "B"
Font_cmd_map["\\textit"] = "I"
Font_cmd_map["\\textmd"] = ""
Font_cmd_map["\\textrm"] = ""
Font_cmd_map["\\textsc"] = "toupper"
Font_cmd_map["\\textsl"] = "I"
Font_cmd_map["\\texttt"] = "TT"
Font_cmd_map["\\textup"] = ""
Font_decl_map["\\bf"] = "B"
Font_decl_map["\\em"] = "EM"
Font_decl_map["\\it"] = "I"
Font_decl_map["\\rm"] = ""
Font_decl_map["\\sc"] = "toupper"
Font_decl_map["\\sf"] = ""
Font_decl_map["\\tt"] = "TT"
Font_decl_map["\\itshape"] = "I"
Font_decl_map["\\upshape"] = ""
Font_decl_map["\\slshape"] = "I"
Font_decl_map["\\scshape"] = "toupper"
Font_decl_map["\\mdseries"] = ""
Font_decl_map["\\bfseries"] = "B"
Font_decl_map["\\rmfamily"] = ""
Font_decl_map["\\sffamily"] = ""
Font_decl_map["\\ttfamily"] = "TT"
}
function min(a,b)
{
return (a < b) ? a : b
}
function prefix(level)
{
# Return a prefix of up to 60 blanks
if (In_PRE)
return ("")
else
return (substr("                                                            ", \
1, INDENT * level))
}
function print_line(line)
{
if (HTML) # must buffer in memory so that we can accumulate TOC
Body[++BodyLines] = line
else
print line
}
function print_toc_line(author,title,pages, extra,leaders,n,t)
{
# When we have a multiline title, the hypertext link goes only
# on the first line. A multiline hypertext link looks awful
# because of long underlines under the leading indentation.
if (pages == "") # then no leaders needed in title lines other than last one
t = sprintf("%31s %s%s%s", author, Title_prefix, title, Title_suffix)
else # last title line, with page number
{
n = html_length(title) # potentially expensive
extra = n % 2 # extra space for aligned leader dots
if (n <= MAX_TITLE_CHARS) # then need leaders
leaders = substr(LEADERS, 1, MAX_TITLE_CHARS + MIN_LEADERS - extra - \
min(MAX_TITLE_CHARS,n))
else # title (almost) fills line, so no leaders
leaders = substr(MIN_LEADERS_SPACE,1, \
(MAX_TITLE_CHARS + MIN_LEADERS - extra - n))
t = sprintf("%31s %s%s%s%s%s %4s", \
author, Title_prefix, title, Title_suffix, \
(extra ? " " : ""), leaders, pages)
}
Title_prefix = "" # forget any hypertext
Title_suffix = "" # link material
# Efficiency note: an earlier version accumulated the body in a
# single scalar like this: "Body = Body t". Profiling revealed
# this statement as the major hot spot, and the change to array
# storage made the program more than twice as fast. This
# suggests that awk might benefit from an optimization of
# "s = s t" that uses realloc() instead of malloc().
if (HTML)
Body[++BodyLines] = t
else
print t
}
function protect_SGML_characters(s)
{
gsub(/&/,"\\&amp;",s) # NB: this one MUST be first
gsub(/</,"\\&lt;",s)
gsub(/>/,"\\&gt;",s)
gsub(/\"/,"\\&quot;",s)
return (s)
}
function strip_braces(s, k)
{ # strip non-backslashed braces from s and return the result
return (strip_char(strip_char(s,"{"),"}"))
}
function strip_char(s,c, k)
{ # strip non-backslashed instances of c from s, and return the result
k = index(s,c)
if (k > 0) # then found the character
{
if (substr(s,k-1,1) != "\\") # then not backslashed char
s = substr(s,1,k-1) strip_char(substr(s,k+1),c) # so remove it (recursively)
else # preserve backslashed char
s = substr(s,1,k) strip_char(substr(s,k+1),c)
}
return (s)
}
function strip_html(s)
{
gsub(/<\/?[^>]*>/,"",s)
return (s)
}
function terminate()
{
if (HTML)
{
html_end_pre()
HTML = 0 # NB: stop line buffering
html_header()
html_toc()
html_body()
html_trailer()
}
}
function TeX_to_HTML(s, k,n,parts)
{
# First convert the SGML reserved characters >, <, and " to SGML entities
# (TeX ampersands, \&, are converted later)
if (HTML)
{
gsub(/>/, "\\&gt;", s)
gsub(/</, "\\&lt;", s)
gsub(/"/, "\\&quot;", s)
}
gsub(/[$][$]/,"$$$",s) # change display math to triple dollars for split
n = split(s,parts,/[$]/)# split into non-math (odd) and math (even) parts
s = ""
for (k = 1; k <= n; ++k) # unbrace non-math part, leaving math mode intact
s = s ((k > 1) ? "$" : "") \
((k % 2) ? strip_braces(TeX_to_HTML_nonmath(parts[k])) : \
TeX_to_HTML_math(parts[k]))
gsub(/[$][$][$]/,"$$",s) # restore display math
return (s)
}
function TeX_to_HTML_math(s)
{
# Mostly a dummy for now, but HTML 3 could support some math translation
gsub(/\\&/,"\\&amp;",s) # reduce TeX ampersands to SGML entities
return (s)
}
function TeX_to_HTML_nonmath(s)
{
if (index(s,"\\") > 0) # important optimization
{
gsub(/\\slash +/,"/",s) # replace TeX slashes with conventional ones
gsub(/ *\\emdash +/," --- ",s) # replace BibNet emdashes with conventional ones
gsub(/\\%/,"%",s) # reduce TeX percents to conventional ones
gsub(/\\[$]/,"$",s) # reduce TeX dollars to conventional ones
gsub(/\\#/,"#",s) # reduce TeX sharps to conventional ones
if (HTML) # translate TeX markup to HTML
{
gsub(/\\&/,"\\&amp;",s) # reduce TeX ampersands to SGML entities
s = html_accents(s)
s = html_fonts(s)
}
else # plain ASCII text output: discard all TeX markup
{
gsub(/\\\&/, "\\&", s) # reduce TeX ampersands to conventional ones
gsub(/\\[a-z][a-z] +/,"",s) # remove TeX font changes
gsub(/\\[^A-Za-z]/,"",s) # remove remaining TeX control symbols
}
}
return (s)
}
function trim(s)
{
gsub(/^[ \t]+/,"",s)
gsub(/[ \t]+$/,"",s)
return (s)
}
function vol_no_month_year()
{
return ("Volume " wrap(Volume) ", Number " wrap(Number) ", " wrap(Month) ", " wrap(Year))
}
function wrap(value)
{
return (HTML ? ("<STRONG>" value "</STRONG>") : value)
}

27220
testdir/funstack.in Normal file

File diff suppressed because it is too large

3705
testdir/funstack.ok Normal file

File diff suppressed because it is too large

1
testdir/ind Executable file

@ -0,0 +1 @@
exec sed '/./s/^/ /' $*

11
testdir/latin1 Normal file

@ -0,0 +1,11 @@
Ich studiere Rechtswissenschaft an der Juristischen Fakultät der LMU
München, arbeite als Aufsicht und Postmaster im CIP-Pool der
Universitätsbibliothek und bin Mitglied von Mensa in Deutschland und
Greenpeace.
Außerdem bin ich im Ortsverband München-Ost des THW für die
Jugendarbeit zuständig.
Serveurs WWW Français et Francophones. Les références qui suivent sont en francais
dansk deutsch español français italiano
À propos de CANARIE
Accélérer l'émergence de la société de l'information
Jysk Åbent universiteit På dansk Fluemønstre Jørgensens

16
testdir/lilly.ifile Executable file

@ -0,0 +1,16 @@
foo=bar
foo==bar
foo+bar
foo+=bar
foo-=bar
foo*=bar
foo/=bar
foo^=bar
foo%=bar
foo!=bar
foo<=bar
foo>=bar
foo bar
foo/bar
foo=bar=fribble
=foo=bar

1258
testdir/lilly.out Normal file

File diff suppressed because it is too large

126
testdir/lilly.progs Normal file

@ -0,0 +1,126 @@
BEGIN{foo=6;print foo/2}
BEGIN{foo=10;foo/=2;print foo}
/=/ {print $0}
/==/ {print $0}
/\+=/ {print $0}
/\*=/ {print $0}
/-=/ {print $0}
/\/=/ {print $0}
/%=/ {print $0}
/^=/ {print $0}
/!=/ {print $0}
/<=/ {print $0}
/>=/ {print $0}
!/=/ {print $0}
!/==/ {print $0}
!/\+=/ {print $0}
!/\*=/ {print $0}
!/-=/ {print $0}
!/\/=/ {print $0}
!/%=/ {print $0}
!/^=/ {print $0}
!/!=/ {print $0}
!/<=/ {print $0}
!/>=/ {print $0}
$0~/=/ {print $0}
$0~/==/ {print $0}
$0~/\+=/ {print $0}
$0~/\*=/ {print $0}
$0~/-=/ {print $0}
$0~/\/=/ {print $0}
$0~/%=/ {print $0}
$0~/^=/ {print $0}
$0~/!=/ {print $0}
$0~/<=/ {print $0}
$0~/>=/ {print $0}
$0!~/=/ {print $0}
$0!~/==/ {print $0}
$0!~/\+=/ {print $0}
$0!~/\*=/ {print $0}
$0!~/-=/ {print $0}
$0!~/%=/ {print $0}
$0!~/^=/ {print $0}
$0!~/!=/ {print $0}
$0!~/<=/ {print $0}
$0!~/>=/ {print $0}
{if(match($0,/=/))print $0}
{if(match($0,/\=/))print $0}
{if(match($0,/==/))print $0}
{if(match($0,/\+=/))print $0}
{if(match($0,/\*=/))print $0}
{if(match($0,/-=/))print $0}
{if(match($0,/%=/))print $0}
{if(match($0,/^=/))print $0}
{if(match($0,/!=/))print $0}
{if(match($0,/<=/))print $0}
{if(match($0,/>=/))print $0}
{if(!match($0,/=/))print $0}
{if(!match($0,/==/))print $0}
{if(!match($0,/\+=/))print $0}
{if(!match($0,/\*=/))print $0}
{if(!match($0,/-=/))print $0}
{if(!match($0,/%=/))print $0}
{if(!match($0,/^=/))print $0}
{if(!match($0,/!=/))print $0}
{if(!match($0,/<=/))print $0}
{if(!match($0,/>=/))print $0}
{if(split($0,foo,/=/))print $0}
{if(split($0,foo,/\=/))print $0}
{if(split($0,foo,/==/))print $0}
{if(split($0,foo,/\+=/))print $0}
{if(split($0,foo,/\*=/))print $0}
{if(split($0,foo,/-=/))print $0}
{if(split($0,foo,/\/=/))print $0}
{if(split($0,foo,/%=/))print $0}
{if(split($0,foo,/^=/))print $0}
{if(split($0,foo,/!=/))print $0}
{if(split($0,foo,/<=/))print $0}
{if(split($0,foo,/>=/))print $0}
{if(sub(/=/,"#"))print $0}
{if(sub(/\=/,"#"))print $0}
{if(sub(/==/,"#"))print $0}
{if(sub(/\+=/,"#"))print $0}
{if(sub(/\*=/,"#"))print $0}
{if(sub(/-=/,"#"))print $0}
{if(sub(/\/=/,"#"))print $0}
{if(sub(/%=/,"#"))print $0}
{if(sub(/^=/,"#"))print $0}
{if(sub(/!=/,"#"))print $0}
{if(sub(/<=/,"#"))print $0}
{if(sub(/>=/,"#"))print $0}
{if(gsub(/=/,"#"))print $0}
{if(gsub(/\=/,"#"))print $0}
{if(gsub(/==/,"#"))print $0}
{if(gsub(/\+=/,"#"))print $0}
{if(gsub(/\*=/,"#"))print $0}
{if(gsub(/-=/,"#"))print $0}
{if(gsub(/\/=/,"#"))print $0}
{if(gsub(/%=/,"#"))print $0}
{if(gsub(/^=/,"#"))print $0}
{if(gsub(/!=/,"#"))print $0}
{if(gsub(/<=/,"#"))print $0}
{if(gsub(/>=/,"#"))print $0}
{if(sub(/=/,"#",$0))print $0}
{if(sub(/\=/,"#",$0))print $0}
{if(sub(/==/,"#",$0))print $0}
{if(sub(/\+=/,"#",$0))print $0}
{if(sub(/\*=/,"#",$0))print $0}
{if(sub(/-=/,"#",$0))print $0}
{if(sub(/\/=/,"#",$0))print $0}
{if(sub(/%=/,"#",$0))print $0}
{if(sub(/^=/,"#",$0))print $0}
{if(sub(/!=/,"#",$0))print $0}
{if(sub(/<=/,"#",$0))print $0}
{if(sub(/>=/,"#",$0))print $0}
{if(sub(/=/,"#",$0))print $0}
{if(gsub(/\=/,"#",$0))print $0}
{if(gsub(/==/,"#",$0))print $0}
{if(gsub(/\+=/,"#",$0))print $0}
{if(gsub(/\*=/,"#",$0))print $0}
{if(gsub(/-=/,"#",$0))print $0}
{if(gsub(/\/=/,"#",$0))print $0}
{if(gsub(/%=/,"#",$0))print $0}
{if(gsub(/^=/,"#",$0))print $0}
{if(gsub(/!=/,"#",$0))print $0}
{if(gsub(/<=/,"#",$0))print $0}
{if(gsub(/>=/,"#",$0))print $0}

15
testdir/lsd1.p Normal file

@ -0,0 +1,15 @@
.cstart
B: benzene pointing right
F: flatring pointing left put N at 5 double 3,4 with .V1 at B.V2
H below F.N
R: ring pointing right with .V4 at B.V6
front bond right from R.V6 ; H
W: ring pointing right with .V2 at R.V6 put N at 1 double 3,4
bond right from W.N ; CH3
back bond -60 from W.V5 ; H
bond up from W.V5 ; C
doublebond up from C ; O
bond right from C ; N
bond 45 from N ; C2H5
bond 135 from N ; C2H5
.cend

1
testdir/p.1 Normal file

@ -0,0 +1 @@
{ print }

1
testdir/p.10 Normal file

@ -0,0 +1 @@
$1 == $4

1
testdir/p.11 Normal file

@ -0,0 +1 @@
/Asia/

1
testdir/p.12 Normal file

@ -0,0 +1 @@
$4 ~ /Asia/ { print $1 }

1
testdir/p.13 Normal file

@ -0,0 +1 @@
$4 !~ /Asia/ {print $1 }

1
testdir/p.14 Normal file

@ -0,0 +1 @@
/\$/

1
testdir/p.15 Normal file

@ -0,0 +1 @@
/\\/

1
testdir/p.16 Normal file

@ -0,0 +1 @@
/^.$/

1
testdir/p.17 Normal file

@ -0,0 +1 @@
$2 !~ /^[0-9]+$/

1
testdir/p.18 Normal file

@ -0,0 +1 @@
/(apple|cherry) (pie|tart)/

2
testdir/p.19 Normal file

@ -0,0 +1,2 @@
BEGIN { digits = "^[0-9]+$" }
$2 !~ digits

1
testdir/p.2 Normal file

@ -0,0 +1 @@
{ print $1, $3 }

1
testdir/p.20 Normal file

@ -0,0 +1 @@
$4 == "Asia" && $3 > 500

1
testdir/p.21 Normal file

@ -0,0 +1 @@
$4 == "Asia" || $4 == "Europe"

1
testdir/p.21a Normal file

@ -0,0 +1 @@
/Asia/ || /Africa/

1
testdir/p.22 Normal file

@ -0,0 +1 @@
$4 ~ /^(Asia|Europe)$/

1
testdir/p.23 Normal file

@ -0,0 +1 @@
/Canada/, /Brazil/

1
testdir/p.24 Normal file

@ -0,0 +1 @@
FNR == 1, FNR == 5 { print FILENAME, $0 }

1
testdir/p.25 Normal file

@ -0,0 +1 @@
{ printf "%10s %6.1f\n", $1, 1000 * $3 / $2 }

3
testdir/p.26 Normal file

@ -0,0 +1,3 @@
/Asia/ { pop = pop + $3; n = n + 1 }
END { print "population of", n,\
"Asian countries in millions is", pop }

3
testdir/p.26a Normal file

@ -0,0 +1,3 @@
/Asia/ { pop += $3; ++n }
END { print "population of", n,\
"Asian countries in millions is", pop }

2
testdir/p.27 Normal file

@ -0,0 +1,2 @@
maxpop < $3 { maxpop = $3; country = $1 }
END { print country, maxpop }

1
testdir/p.28 Normal file

@ -0,0 +1 @@
{ print NR ":" $0 }

1
testdir/p.29 Normal file

@ -0,0 +1 @@
{ gsub(/USA/, "United States"); print }

1
testdir/p.3 Normal file

@ -0,0 +1 @@
{ printf "[%10s] [%-16d]\n", $1, $3 }

1
testdir/p.30 Normal file

@ -0,0 +1 @@
{ print length, $0 }

2
testdir/p.31 Normal file

@ -0,0 +1,2 @@
length($1) > max { max = length($1); name = $1 }
END { print name }

1
testdir/p.32 Normal file

@ -0,0 +1 @@
{ $1 = substr($1, 1, 3); print }

Some files were not shown because too many files have changed in this diff