The sgmls SGML parser. Support for our hypertext documents.

Reviewed by:	Paul Richards, Garrett Wollman
This commit is contained in:
jfieber 1995-04-27 16:03:47 +00:00
commit fb558ed8ec
84 changed files with 22756 additions and 0 deletions

43
usr.bin/sgmls/LICENSE Normal file
View File

@ -0,0 +1,43 @@
LICENSE AND DISCLAIMER OF WARRANTIES
Standard Generalized Markup Language Users' Group (SGMLUG)
SGML Parser Materials
1. License
SGMLUG hereby grants to any user: (1) an irrevocable royalty-free,
worldwide, non-exclusive license to use, execute, reproduce, display,
perform and distribute copies of, and to prepare derivative works
based upon these materials; and (2) the right to authorize others to
do any of the foregoing.
2. Disclaimer of Warranties
(a) The SGML Parser Materials are provided "as is" to any USER. USER
assumes responsibility for determining the suitability of the SGML
Parser Materials for its use and for results obtained. SGMLUG makes
no warranty that any errors have been eliminated from the SGML Parser
Materials or that they can be eliminated by USER. SGMLUG shall not
provide any support maintenance or other aid to USER or its licensees
with respect to SGML Parser Materials. SGMLUG shall not be
responsible for losses of any kind resulting from use of the SGML
Parser Materials including (without limitation) any liability for
business expense, machine downtime, or damages caused to USER or third
parties by any deficiency, defect, error, or malfunction.
(b) SGMLUG DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, ARISING OUT
OF OR RELATING TO THE SGML PARSER MATERIALS OR ANY USE THEREOF,
INCLUDING (WITHOUT LIMITATION) ANY WARRANTY WHATSOEVER AS TO THE
FITNESS FOR A PARTICULAR USE OR THE MERCHANTABILITY OF THE SGML PARSER
MATERIALS.
(c) In no event shall SGMLUG be liable to USER or third parties
licensed by USER for any indirect, special, incidental, or
consequential damages (including lost profits).
(d) SGMLUG has no knowledge of any conditions that would impair its right
to license the SGML Parser Materials. Notwithstanding the foregoing,
SGMLUG does not make any warranties or representations that the
SGML Parser Materials are free of claims by third parties of patent,
copyright infringement or the like, nor does SGMLUG assume any
liability in respect of any such infringement of rights of third
parties due to USER's operation under this license.

9
usr.bin/sgmls/Makefile Normal file
View File

@ -0,0 +1,9 @@
#
# Bmake file for sgmls
# $Id:$
#
SUBDIR= libsgmls sgmls sgmlsasp rast
.include <bsd.subdir.mk>

View File

@ -0,0 +1,13 @@
#
# Bmakefile for rast
#
# $id$
#
.include "${.CURDIR}/../../Makefile.inc"
.if exists(${.CURDIR}/../libsgmls/obj)
LIBSGMLS= ${.CURDIR}/../libsgmls/obj/libsgmls.a
.else
LIBSGMLS= ${.CURDIR}/../libsgmls/libsgmls.a
.endif

138
usr.bin/sgmls/README Normal file
View File

@ -0,0 +1,138 @@
$Id:$
This the sgmls release 1.1 SGML parser written by James Clark
jjc@jclark.com, repackaged for FreeBSD. The original source may be
obtained from ftp://ftp.jclark.com/.
Pieces removed include:
* Test documents: Compiled on FreeBSD, sgmls passes all tests.
* sgml-mode.el: The sole file covered by the GNU GPL. This is not
installed anyway and anyone wishing to do serious SGML editing
would be best to get the psgml package.
* Makefiles and config files for other operating systems (vms, dos,
cms).
* Formatted versions of the man pages.
20-Apr-1995 John Fieber <jfieber@freebsd.org>
The original README and TODO follow.
----------------------------------------------------------------------
This is sgmls, an SGML parser derived from the ARCSGML parser
materials which were written by Charles F. Goldfarb. (These are
available for anonymous ftp from ftp.ifi.uio.no [128.240.88.1] in the
directory SIGhyper/SGMLUG/distrib.)
The version number is given in the file version.c.
The file INSTALL contains installation instructions.
The file NEWS describes recent user-visible changes.
The file sgmls.man contains a Unix manual page; sgmls.txt is the
formatted version of this.
The file sgml-mode.el contains a very simple SGML mode for GNU Emacs.
The files sgmls.c and sgmls.h contain a small library for parsing the
output of sgmls. This is used by sgmlsasp, which translates the
output of sgmls using an ASP replacement file, and by rast, which
translates the output of sgmls to the format of a RAST result. The
files sgmlsasp.man and rast.man contain Unix manual pages for sgmlsasp
and rast; sgmlsasp.txt and rast.txt are the formatted versions of
these.
The file LICENSE contains the license which applies to arcsgml and
accordingly to those parts of sgmls derived from arcsgml. See also
the copyright notice at the beginning of sgmlxtrn.c. The parts that
were written by me are in the public domain (any files that were
written entirely by me contain a comment to that effect.) The file
sgml-mode.el is covered by the GNU GPL.
Please report any bugs to me. When reporting bugs, please include the
version number, details of your machine, OS and compiler, and a
complete self-contained file that will allow me to reproduce the bug.
James Clark
jjc@jclark.com
----------------------------------------------------------------------
Warn about mixed content models where #PCDATA can't occur everywhere.
Perhaps there should be a configuration option saying what a control
character is for the purpose of SHUNCHAR CONTROLS.
Should the current character that is printed in error messages be
taken from be taken from the file entity or the current entity?
Refine SYS_ action. If we distinguish DELNONCH in lexmark, lexgrp,
lexsd, we can have separate action that ignores the following
character as well.
Should RSs in CDATA/SDATA entities be ignored as specified in 322:1-2?
Similarily, do the rules on REs in 322:3-11 apply to CDATA/SDATA
entities? (I don't think they count as being `in content'.)
What should the entity manager do when it encounters code 13 in an
input file? (Currently it treats it as an RE.)
Document when invalid exclusions are detected.
Option not to perform capacity checking.
Give a warning if the recommendation of 422:1-3 is contravened.
Should an empty CDATA/RCDATA marked section be allowed in the document
type declaration subset?
Include example of use of SGML_PATH in documentation.
Try to detect the situation in 310:8-10 (but see 282:1-2).
Resize hash tables if they become too full.
Say something in the man page about message catalogues.
Consider whether support for SHORTREF NONE requires further changes
(other than disallowing short reference mapping declaration).
Fake /dev/fd/N and /dev/stdin for systems that don't provide it.
Improve the effficiency of the entity manager by not closing and
reopening files. If we run out of FILEs choose the stream with the
fewest bytes remaining to be read, and read the rest of it into
memory. Each entity level will have its own read buffer.
Support multi-line error messages: automatically indent after
newline. (We could output to a temporary file first, then copy to
stderr replacing newlines by newline+indent).
Option that says to output out of context things.
Divide up formal public identifier errors. Give these errors their
own type code.
Consider whether, when OMITTAG is NO, we need to change interpretation
of an empty start-tag (7.4.1.1).
Possibly turn errors 70 and 136 into warnings.
Make things work with NORMSEP > 2. Would need to keep track of number
of CDATA and SDATA entities in CDATA attributes.
Handle `SCOPE INSTANCE'.
In entgen.c, truncate filenames for OSs that don't do this themselves.
Provide an option that specifies that maximum number of errors; when
this limit is exceeded sgmls would exit.
Document non-portable assumptions in the code.
Option to write out SGML declaration. In this case make it write out
APPINFO parameter.
Allow there to be catalogs mapping public ids to filenames.
Environment variable SGML_CATALOG containing list of filenames of
catalogs.

617
usr.bin/sgmls/configure vendored Executable file
View File

@ -0,0 +1,617 @@
#!/bin/sh
# Generate config.h from unix.cfg.
trap 'rm -f doit doit.c doit.o doit.log config.out; exit 1' 1 2 3 15
on=
off=
CC=${CC-cc}
# Normally we use VARARGS if __STDC__ is not defined.
# Test whether this assumption is wrong.
cat >doit.c <<\EOF
#ifdef __STDC__
#include <stdarg.h>
int foo(char *s,...)
{
va_list ap;
va_start(ap, s);
va_end(ap);
return 0;
}
#else
int foo = 0;
#endif
EOF
$CC $CFLAGS -c doit.c >/dev/null 2>&1
if test $? -ne 0
then
on="$on VARARGS"
fi
cat >doit.c <<\EOF
#include <stddef.h>
int foo = 0;
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
off="$off STDDEF_H_MISSING"
else
on="$on STDDEF_H_MISSING"
fi
cat >doit.c <<\EOF
#include <stdlib.h>
int foo = 0;
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
off="$off STDLIB_H_MISSING"
else
on="$on STDLIB_H_MISSING"
fi
cat >doit.c <<\EOF
#include <limits.h>
int foo = 0;
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
off="$off LIMITS_H_MISSING"
else
on="$on LIMITS_H_MISSING"
fi
cat >doit.c <<\EOF
#include <vfork.h>
int foo = 0;
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
on="$on HAVE_VFORK_H"
else
off="$off HAVE_VFORK_H"
fi
cat >doit.c <<\EOF
#include <unistd.h>
int foo = 0;
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
on="$on HAVE_UNISTD_H"
else
off="$off HAVE_UNISTD_H"
fi
cat >doit.c <<\EOF
#include <sys/types.h>
#include <sys/stat.h>
int foo = 0;
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
on="$on HAVE_SYS_STAT_H"
else
off="$off HAVE_SYS_STAT_H"
fi
cat >doit.c <<\EOF
/* Exit normally unless we need to use isascii. */
#include <ctype.h>
#include <signal.h>
static int whoops()
{
_exit(1);
}
main()
{
int c;
#ifdef isascii
#ifdef SIGSEGV
signal(SIGSEGV, whoops);
#endif
#ifdef SIGBUS
signal(SIGBUS, whoops);
#endif
#ifdef SIGIOT
signal(SIGIOT, whoops);
#endif
for (c = 128; c < 256; c++) {
if (c == '0' || c == '1' || c == '2' || c == '3' || c == '4' || c == '5'
|| c == '6' || c == '7' || c == '8' || c == '9') {
if (!isdigit(c) || isalpha(c) || iscntrl(c) || isspace(c) || ispunct(c))
exit(1);
}
else if (isdigit(c))
exit(1);
else if (isalpha(c)) {
if (iscntrl(c) || isspace(c) || ispunct(c)
|| (islower(c) && toupper(c) != c && !isupper(toupper(c)))
|| (isupper(c) && tolower(c) != c && !islower(tolower(c))))
exit(1);
}
else if (islower(c) || isupper(c))
exit(1);
else if (iscntrl(c)) {
if (ispunct(c))
exit(1);
}
}
#endif /* isascii */
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
# This tries to find the symbol that looks like the array
# used by <ctype.h>, and sees if its length appears to be 128
# rather than 256.
if test 1 = `(nm -n doit 2>/dev/null) | awk '
BEGIN {
weight["0"] = 0;
weight["1"] = 1;
weight["2"] = 2;
weight["3"] = 3;
weight["4"] = 4;
weight["5"] = 5;
weight["6"] = 6;
weight["7"] = 7;
weight["8"] = 8;
weight["9"] = 9;
weight["a"] = weight["A"] = 10;
weight["b"] = weight["B"] = 11;
weight["c"] = weight["C"] = 12;
weight["d"] = weight["D"] = 13;
weight["e"] = weight["E"] = 14;
weight["f"] = weight["F"] = 15;
}
/^[0-9a-zA-Z]* D .*ctype/ && ctype_nr == 0 {
ctype_nr = NR;
addr = 0;
len = length($1);
for (i = 1; i <= len; i++)
addr = addr*16 + weight[substr($1, i, 1)];
}
/^[0-9a-zA-Z]* D / && NR == ctype_nr + 1 {
next_addr = 0;
len = length($1);
for (i = 1; i <= len; i++)
next_addr = next_addr*16 + weight[substr($1, i, 1)];
}
END {
size = next_addr - addr;
if (size >= 128 && size < 256)
print "1";
else
print "0";
}'`
then
on="$on USE_ISASCII"
else
if ((yes | man 3 ctype) 2>/dev/null) \
| sed -e 's/.//g' -e 's/ *$//' -e '/de-$/N' \
-e 's/-\n//g' -e '/defined$/N' -e '/only$/N' \
-e '/where$/N' -e '/isascii$/N' -e '/is$/N' \
-e 's/\n/ /g' -e 's/ */ /g' \
| grep "defined only where isascii is true" >/dev/null
then
on="$on USE_ISASCII"
else
off="$off USE_ISASCII"
fi
fi
else
on="$on USE_ISASCII"
fi
cat >doit.c <<\EOF
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
remove("foo");
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
off="$off REMOVE_MISSING"
else
on="$on REMOVE_MISSING"
fi
cat >doit.c <<\EOF
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
getopt(argc, argv, "v");
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on HAVE_GETOPT"
else
off="$off HAVE_GETOPT"
fi
cat >doit.c <<\EOF
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
access("foo", 4);
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on HAVE_ACCESS"
else
off="$off HAVE_ACCESS"
fi
cat >doit.c <<\EOF
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
vfork();
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on HAVE_VFORK"
else
off="$off HAVE_VFORK"
fi
cat >doit.c <<\EOF
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0) {
int status;
waitpid(-1, &status, 0);
}
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on HAVE_WAITPID"
else
off="$off HAVE_WAITPID"
fi
cat >doit.c <<\EOF
#include <string.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
strerror(0);
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
off="$off STRERROR_MISSING"
else
on="$on STRERROR_MISSING"
fi
cat >doit.c <<\EOF
#include <strings.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
bcopy((char *)0, (char *)0, 0);
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
# Only use BSD_STRINGS if ANSI string functions don't work.
cat >doit.c <<\EOF
#include <string.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
memcpy((char *)0, (char *)0, 0);
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
off="$off BSD_STRINGS"
else
on="$on BSD_STRINGS"
fi
else
off="$off BSD_STRINGS"
fi
cat >doit.c <<\EOF
#include <signal.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0)
raise(SIGINT);
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
off="$off RAISE_MISSING"
else
on="$on RAISE_MISSING"
fi
cat >doit.c <<\EOF
#include <stdio.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0) {
fpos_t pos;
fsetpos(stdin, &pos);
fgetpos(stdin, &pos);
}
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
off="$off FPOS_MISSING"
else
on="$on FPOS_MISSING"
fi
cat >doit.c <<\EOF
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0) {
pid_t pid;
int status;
long n = sysconf(_SC_OPEN_MAX);
pid = waitpid(-1, &status, 0);
WIFSTOPPED(status);
WIFSIGNALED(status);
WIFEXITED(status);
WEXITSTATUS(status);
WTERMSIG(status);
WSTOPSIG(status);
}
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on POSIX"
else
off="$off POSIX"
fi
cat >doit.c <<\EOF
#include <stdio.h>
#include <signal.h>
static int whoops()
{
_exit(1);
}
main()
{
char buf[30];
#ifdef SIGSEGV
signal(SIGSEGV, whoops);
#endif
#ifdef SIGBUS
signal(SIGBUS, whoops);
#endif
#ifdef SIGIOT
signal(SIGIOT, whoops);
#endif
sprintf(buf, "%2$s%2$s%1$s%1$s", "bar", "foo");
exit(!!strcmp(buf, "foofoobarbar"));
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on HAVE_EXTENDED_PRINTF"
else
off="$off HAVE_EXTENDED_PRINTF"
fi
cat >doit.c <<\EOF
#include <nl_types.h>
main(argc, argv)
int argc;
char **argv;
{
if (argc == 0) {
nl_catd d = catopen("foo", 0);
catgets(d, 1, 1, "default");
catclose(d);
}
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
on="$on HAVE_CAT"
else
off="$off HAVE_CAT"
fi
cat >doit.c <<\EOF
#include <limits.h>
char c = UCHAR_MAX;
main(argc, argv)
int argc;
char **argv;
{
#if CHAR_MIN < 0
exit(!(c < 0));
#else
exit(!(c > 0));
#endif
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
char_signed=
else
cat >doit.c <<\EOF
main()
{
int i;
for (i = 0; i < 512; i++) {
char c = (char)i;
if (c < 0)
exit(1);
}
exit(0);
}
EOF
if $CC $CFLAGS -o doit doit.c $LIBS >/dev/null 2>&1 && ./doit 2>/dev/null
then
char_signed=0
else
char_signed=1
fi
fi
cat >doit.c <<\EOF
typedef void VOID;
extern VOID bar();
VOID foo()
{
}
EOF
if $CC $CFLAGS -c doit.c >/dev/null 2>&1
then
void_ret=void
else
void_ret=int
fi
cat >doit.c <<\EOF
void *foo()
{
static char *buf;
return buf;
}
EOF
if $CC $CFLAGS -c doit.c >doit.log 2>&1
then
if test -s doit.log
then
void_star="char \*"
else
void_star="void \*"
fi
else
void_star="char \*"
fi
edit=
rm -f doit.c doit doit.log doit.o
for var in $on
do
edit="$edit -e 's;^/\\* *\\(#define $var [^/]*\\) *\\*/;\\1;'"
done
for var in $off
do
edit="$edit -e 's;^#define $var [^/]*;/* & */;'"
done
if test -n "$char_signed"
then
edit="$edit -e 's;^/\\* *\\(#define CHAR_SIGNED $char_signed\\) *\\*/;\\1;'"
fi
edit="$edit -e 's/^typedef .*VOID;/typedef $void_ret VOID;/'"
edit="$edit -e 's/^typedef .*UNIV;/typedef ${void_star}UNIV;/'"
if test "X$(PREFIX)" != "X/usr/local"
then
edit="$edit -e '/DEFAULT_PATH/s;/usr/local;$PREFIX;g'"
fi
eval sed $edit unix.cfg ">config.out"
mv config.out config.h
exit 0

View File

@ -0,0 +1,17 @@
#
# Bmakefile for libsgmls
#
# $id$
#
LIB= sgmls
SRCS= sgmls.c
CFLAGS+= -I${.CURDIR}/../sgmls
NOMAN= noman
NOPROFILE= noprofile
install:
.include <bsd.lib.mk>

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,127 @@
/* sgmls.h
Interface to a library for reading output of sgmls. */
struct sgmls_data {
char *s;
unsigned len;
char is_sdata;
};
struct sgmls_notation {
char *name;
char *sysid;
char *pubid;
};
struct sgmls_internal_entity {
char *name;
struct sgmls_data data;
};
enum sgmls_external_entity_type {
SGMLS_ENTITY_CDATA,
SGMLS_ENTITY_SDATA,
SGMLS_ENTITY_NDATA,
SGMLS_ENTITY_SUBDOC
};
struct sgmls_external_entity {
char *name;
enum sgmls_external_entity_type type;
char **filenames;
int nfilenames;
char *pubid;
char *sysid;
struct sgmls_attribute *attributes;
struct sgmls_notation *notation;
};
struct sgmls_entity {
union {
struct sgmls_internal_entity internal;
struct sgmls_external_entity external;
} u;
char is_internal;
};
enum sgmls_attribute_type {
SGMLS_ATTR_IMPLIED,
SGMLS_ATTR_CDATA,
SGMLS_ATTR_TOKEN,
SGMLS_ATTR_ENTITY,
SGMLS_ATTR_NOTATION
};
struct sgmls_attribute {
struct sgmls_attribute *next;
char *name;
enum sgmls_attribute_type type;
union {
struct {
struct sgmls_data *v;
int n;
} data;
struct {
struct sgmls_entity **v;
int n;
} entity;
struct {
char **v;
int n;
} token;
struct sgmls_notation *notation;
} value;
};
enum sgmls_event_type {
SGMLS_EVENT_DATA, /* data */
SGMLS_EVENT_ENTITY, /* external entity reference */
SGMLS_EVENT_PI, /* processing instruction */
SGMLS_EVENT_START, /* element start */
SGMLS_EVENT_END, /* element end */
SGMLS_EVENT_SUBSTART, /* subdocument start */
SGMLS_EVENT_SUBEND, /* subdocument end */
SGMLS_EVENT_APPINFO, /* appinfo */
SGMLS_EVENT_CONFORMING /* the document was conforming */
};
struct sgmls_event {
enum sgmls_event_type type;
union {
struct {
struct sgmls_data *v;
int n;
} data;
struct sgmls_external_entity *entity;
struct {
char *s;
unsigned len;
} pi;
struct {
char *gi;
struct sgmls_attribute *attributes;
} start;
struct {
char *gi;
} end;
char *appinfo;
} u;
char *filename; /* SGML filename */
unsigned long lineno; /* SGML lineno */
};
#ifdef __STDC__
void sgmls_free_attributes(struct sgmls_attribute *);
struct sgmls *sgmls_create(FILE *);
int sgmls_next(struct sgmls *, struct sgmls_event *);
void sgmls_free(struct sgmls *);
typedef void sgmls_errhandler(int, char *, unsigned long);
sgmls_errhandler *sgmls_set_errhandler(sgmls_errhandler *);
#else /* not __STDC__ */
void sgmls_free_attributes();
struct sgmls *sgmls_create();
int sgmls_next();
void sgmls_free();
typedef void sgmls_errhandler();
sgmls_errhandler *sgmls_set_errhandler();
#endif /* not __STDC__ */

View File

@ -0,0 +1,18 @@
#
# Bmakefile for rast
#
# $id$
#
PROG= rast
SRCS+= rast.c
CFLAGS+= -I${.CURDIR}/../libsgmls -I${.CURDIR}/../sgmls
LDADD= ${LIBSGMLS}
DPADD= ${LIBSGMLS}
.include "../Makefile.inc"
.include <bsd.prog.mk>

75
usr.bin/sgmls/rast/rast.1 Normal file
View File

@ -0,0 +1,75 @@
.\" -*- nroff -*-
.tr \(ts"
.TH RAST 1
.SH NAME
rast \- translate output of sgmls to RAST format
.SH SYNOPSIS
.B rast
[
.BI \-o output_file
]
[
.I input_file
]
.SH DESCRIPTION
.I Rast
translates the output of sgmls to the format of a RAST result.
RAST is the Reference Application for SGML Testing defined in the Proposed
American National Standard on Conformance Testing for Standard Generalized
Markup Language (SGML) Systems (X3.190-199X).
.I Rast
reads from
.I input_file
or from standard input if
.I input_file
is not specified.
It writes to
.I output_file
or to standard output if
.I output_file
is not specified;
use of the
.B \-o
option avoids the need for
.I rast
to use a temporary file.
.LP
Note that the
.B -c
option of
.I sgmls
can generate a capacity report in RACT format.
.SH BUGS
Production [9] in clause 14.5.5 of the draft standard is clearly wrong;
.I rast
corrects it by appending
.RI `,\ LE '.
An alternative way to correct it would be to delete the
.RB `, \(tsEND-ENTITY\(ts '.
.LP
In production [18] in clause 14.5.9,
.RI ` markup\ data +'
should be
.RI ` markup\ data *'
since internal sdata entities need not contain any characters (14.5.11),
and
.I markup\ data
cannot be empty (14.5.9, 14.5.12).
.LP
The RAST result for the example in Annex B.4 is incorrect.
The line
.B G03-A1=
should be immediately followed by a line
.BR !g03-e1! .
(The problem with production [9] also applies to this example.)
.LP
.I Rast
outputs a newline after
.B #ERROR
in order to avoid producing files with partial lines.
.SH "SEE ALSO"
.IR sgmls (1)
.br
.I
Conformance Testing for Standard Generalized Markup Language (SGML) Systems,
(X3.190-199X), Draft July 1991

534
usr.bin/sgmls/rast/rast.c Normal file
View File

@ -0,0 +1,534 @@
/* rast.c
Translate sgmls output to RAST result format.
Written by James Clark (jjc@jclark.com). */
#include "config.h"
#include "std.h"
#include "sgmls.h"
#include "getopt.h"
#ifdef USE_PROTOTYPES
#define P(parms) parms
#else
#define P(parms) ()
#endif
#ifdef __GNUC__
#define NO_RETURN volatile
#else
#define NO_RETURN /* as nothing */
#endif
#ifdef VARARGS
#define VP(parms) ()
#else
#define VP(parms) P(parms)
#endif
#ifdef USE_ISASCII
#define ISASCII(c) isascii(c)
#else
#define ISASCII(c) (1)
#endif
NO_RETURN void error VP((char *,...));
static void input_error P((int, char *, unsigned long));
static int do_file P((FILE *));
static void usage P((void));
static void output_processing_instruction P((char *, unsigned));
static void output_data P((struct sgmls_data *, int));
static void output_data_lines P((char *, unsigned));
static void output_internal_sdata P((char *, unsigned));
static void output_external_entity P((struct sgmls_external_entity *));
static void output_external_entity_info P((struct sgmls_external_entity *));
static void output_element_start P((char *, struct sgmls_attribute *));
static void output_element_end P((char *));
static void output_attribute P((struct sgmls_attribute *));
static void output_tokens P((char **, int));
static void output_markup_chars P((char *, unsigned));
static void output_markup_string P((char *));
static void output_char P((int, int));
static void output_flush P((int));
static void output_external_id P((char *, char *));
static void output_entity P((struct sgmls_entity *));
static void output_external_entity_info P((struct sgmls_external_entity *));
static void output_internal_entity P((struct sgmls_internal_entity *));
#define output_flush_markup() output_flush('!')
#define output_flush_data() output_flush('|')
static FILE *outfp;
static int char_count = 0;
static char *program_name;
int main(argc, argv)
int argc;
char **argv;
{
int c;
int opt;
char *output_file = 0;
program_name = argv[0];
while ((opt = getopt(argc, argv, "o:")) != EOF)
switch (opt) {
case 'o':
output_file = optarg;
break;
case '?':
usage();
default:
abort();
}
if (output_file) {
errno = 0;
outfp = fopen(output_file, "w");
if (!outfp)
error("couldn't open `%s' for output: %s", strerror(errno));
}
else {
outfp = tmpfile();
if (!outfp)
error("couldn't create temporary file: %s", strerror(errno));
}
if (argc - optind > 1)
usage();
if (argc - optind == 1) {
if (!freopen(argv[optind], "r", stdin))
error("couldn't open `%s' for input: %s", argv[optind], strerror(errno));
}
(void)sgmls_set_errhandler(input_error);
if (!do_file(stdin)) {
fclose(outfp);
if (output_file) {
if (!freopen(output_file, "w", stdout))
error("couldn't reopen `%s' for output: %s", strerror(errno));
}
fputs("#ERROR\n", stdout);
exit(EXIT_FAILURE);
}
if (output_file) {
errno = 0;
if (fclose(outfp) == EOF)
error("error closing `%s': %s", output_file, strerror(errno));
}
else {
errno = 0;
if (fseek(outfp, 0L, SEEK_SET))
error("couldn't rewind temporary file: %s", strerror(errno));
while ((c = getc(outfp)) != EOF)
if (putchar(c) == EOF)
error("error writing standard output: %s", strerror(errno));
}
exit(EXIT_SUCCESS);
}
static
void usage()
{
fprintf(stderr, "usage: %s [-o output_file] [input_file]\n", program_name);
exit(EXIT_FAILURE);
}
static
int do_file(fp)
FILE *fp;
{
struct sgmls *sp;
struct sgmls_event e;
int conforming = 0;
sp = sgmls_create(fp);
while (sgmls_next(sp, &e))
switch (e.type) {
case SGMLS_EVENT_DATA:
output_data(e.u.data.v, e.u.data.n);
break;
case SGMLS_EVENT_ENTITY:
output_external_entity(e.u.entity);
break;
case SGMLS_EVENT_PI:
output_processing_instruction(e.u.pi.s, e.u.pi.len);
break;
case SGMLS_EVENT_START:
output_element_start(e.u.start.gi, e.u.start.attributes);
sgmls_free_attributes(e.u.start.attributes);
break;
case SGMLS_EVENT_END:
output_element_end(e.u.end.gi);
break;
case SGMLS_EVENT_SUBSTART:
{
int level = 1;
output_external_entity(e.u.entity);
while (level > 0) {
if (!sgmls_next(sp, &e))
return 0;
switch (e.type) {
case SGMLS_EVENT_SUBSTART:
level++;
break;
case SGMLS_EVENT_SUBEND:
level--;
break;
case SGMLS_EVENT_START:
sgmls_free_attributes(e.u.start.attributes);
break;
default:
/* prevent compiler warnings */
break;
}
}
}
break;
case SGMLS_EVENT_APPINFO:
break;
case SGMLS_EVENT_CONFORMING:
conforming = 1;
break;
default:
abort();
}
sgmls_free(sp);
return conforming;
}
static
void output_processing_instruction(s, len)
char *s;
unsigned len;
{
fputs("[?", outfp);
if (len > 0) {
putc('\n', outfp);
output_data_lines(s, len);
output_flush_data();
}
fputs("]\n", outfp);
}
static
void output_data(v, n)
struct sgmls_data *v;
int n;
{
int i;
for (i = 0; i < n; i++) {
if (v[i].is_sdata)
output_internal_sdata(v[i].s, v[i].len);
else if (v[i].len > 0)
output_data_lines(v[i].s, v[i].len);
}
}
static
void output_data_lines(s, n)
char *s;
unsigned n;
{
assert(n > 0);
for (; n > 0; --n)
output_char((unsigned char)*s++, '|');
output_flush_data();
}
static
void output_internal_sdata(s, n)
char *s;
unsigned n;
{
fputs("#SDATA-TEXT\n", outfp);
output_markup_chars(s, n);
output_flush_markup();
fputs("#END-SDATA\n", outfp);
}
static
void output_external_entity(e)
struct sgmls_external_entity *e;
{
fprintf(outfp, "[&%s\n", e->name);
output_external_entity_info(e);
fputs("]\n", outfp);
}
static
void output_element_start(gi, att)
char *gi;
struct sgmls_attribute *att;
{
fprintf(outfp, "[%s", gi);
if (att) {
struct sgmls_attribute *p;
putc('\n', outfp);
for (p = att; p; p = p->next)
output_attribute(p);
}
fputs("]\n", outfp);
}
static
void output_element_end(gi)
char *gi;
{
fprintf(outfp, "[/%s]\n", gi);
}
static
void output_attribute(p)
struct sgmls_attribute *p;
{
fprintf(outfp, "%s=\n", p->name);
switch (p->type) {
case SGMLS_ATTR_IMPLIED:
fputs("#IMPLIED\n", outfp);
break;
case SGMLS_ATTR_CDATA:
{
struct sgmls_data *v = p->value.data.v;
int n = p->value.data.n;
int i;
for (i = 0; i < n; i++)
if (v[i].is_sdata)
output_internal_sdata(v[i].s, v[i].len);
else {
output_markup_chars(v[i].s, v[i].len);
output_flush_markup();
}
}
break;
case SGMLS_ATTR_TOKEN:
output_tokens(p->value.token.v, p->value.token.n);
break;
case SGMLS_ATTR_ENTITY:
{
int i;
for (i = 0; i < p->value.entity.n; i++) {
struct sgmls_entity *e = p->value.entity.v[i];
char *name;
if (e->is_internal)
name = e->u.internal.name;
else
name = e->u.external.name;
if (i > 0)
output_markup_string(" ");
output_markup_string(name);
}
output_flush_markup();
for (i = 0; i < p->value.entity.n; i++)
output_entity(p->value.entity.v[i]);
}
break;
case SGMLS_ATTR_NOTATION:
output_tokens(&p->value.notation->name, 1);
output_external_id(p->value.notation->pubid, p->value.notation->sysid);
break;
}
}
static void output_tokens(v, n)
char **v;
int n;
{
int i;
assert(n > 0);
output_markup_string(v[0]);
for (i = 1; i < n; i++) {
output_markup_string(" ");
output_markup_string(v[i]);
}
output_flush_markup();
}
static
void output_markup_chars(s, n)
char *s;
unsigned n;
{
for (; n > 0; --n)
output_char((unsigned char)*s++, '!');
}
static
void output_markup_string(s)
char *s;
{
while (*s)
output_char((unsigned char)*s++, '!');
}
static
void output_char(c, delim)
int c;
int delim;
{
if (ISASCII(c) && isprint(c)) {
if (char_count == 0)
putc(delim, outfp);
putc(c, outfp);
char_count++;
if (char_count == 60) {
putc(delim, outfp);
putc('\n', outfp);
char_count = 0;
}
}
else {
output_flush(delim);
switch (c) {
case RECHAR:
fputs("#RE\n", outfp);
break;
case RSCHAR:
fputs("#RS\n", outfp);
break;
case TABCHAR:
fputs("#TAB\n", outfp);
break;
default:
fprintf(outfp, "#%d\n", c);
}
}
}
static
void output_flush(delim)
int delim;
{
if (char_count > 0) {
putc(delim, outfp);
putc('\n', outfp);
char_count = 0;
}
}
static
void output_external_id(pubid, sysid)
char *pubid;
char *sysid;
{
if (!pubid && !sysid)
fputs("#SYSTEM\n#NONE\n", outfp);
else {
if (pubid) {
fputs("#PUBLIC\n", outfp);
if (*pubid) {
output_markup_string(pubid);
output_flush_markup();
}
else
fputs("#EMPTY\n", outfp);
}
if (sysid) {
fputs("#SYSTEM\n", outfp);
if (*sysid) {
output_markup_string(sysid);
output_flush_markup();
}
else
fputs("#EMPTY\n", outfp);
}
}
}
static
void output_entity(e)
struct sgmls_entity *e;
{
if (e->is_internal)
output_internal_entity(&e->u.internal);
else
output_external_entity_info(&e->u.external);
fputs("#END-ENTITY", outfp);
#ifndef ASIS
putc('\n', outfp);
#endif
}
static
void output_external_entity_info(e)
struct sgmls_external_entity *e;
{
switch (e->type) {
case SGMLS_ENTITY_CDATA:
fputs("#CDATA-EXTERNAL", outfp);
break;
case SGMLS_ENTITY_SDATA:
fputs("#SDATA-EXTERNAL", outfp);
break;
case SGMLS_ENTITY_NDATA:
fputs("#NDATA-EXTERNAL", outfp);
break;
case SGMLS_ENTITY_SUBDOC:
fputs("#SUBDOC", outfp);
break;
}
putc('\n', outfp);
output_external_id(e->pubid, e->sysid);
if (e->type != SGMLS_ENTITY_SUBDOC) {
struct sgmls_attribute *p;
fprintf(outfp, "#NOTATION=%s\n", e->notation->name);
output_external_id(e->notation->pubid, e->notation->sysid);
for (p = e->attributes; p; p = p->next)
output_attribute(p);
}
}
static
void output_internal_entity(e)
struct sgmls_internal_entity *e;
{
if (e->data.is_sdata)
fputs("#SDATA-INTERNAL", outfp);
else
fputs("#CDATA-INTERNAL", outfp);
putc('\n', outfp);
output_markup_chars(e->data.s, e->data.len);
output_flush_markup();
}
static
void input_error(num, str, lineno)
int num;
char *str;
unsigned long lineno;
{
error("Error at input line %lu: %s", lineno, str);
}
NO_RETURN
#ifdef VARARGS
void error(va_alist) va_dcl
#else
void error(char *message,...)
#endif
{
#ifdef VARARGS
char *message;
#endif
va_list ap;
fprintf(stderr, "%s: ", program_name);
#ifdef VARARGS
va_start(ap);
message = va_arg(ap, char *);
#else
va_start(ap, message);
#endif
vfprintf(stderr, message, ap);
va_end(ap);
fputc('\n', stderr);
fflush(stderr);
exit(EXIT_FAILURE);
}

247
usr.bin/sgmls/sgmls.pl Executable file
View File

@ -0,0 +1,247 @@
#! /usr/bin/perl
# This is a skeleton of a perl script for processing the output of
# sgmls. You must change the parts marked with "XXX".
# XXX This is for troff: in data, turn \ into \e (which prints as \).
# Backslashes in SDATA entities are left as backslashes.
$backslash_in_data = "\\e";
$prog = $0;
$prog =~ s|.*/||;
$level = 0;
while (<STDIN>) {
chop;
$command = substr($_, 0, 1);
substr($_, 0, 1) = "";
if ($command eq '(') {
&start_element($_);
$level++;
}
elsif ($command eq ')') {
$level--;
&end_element($_);
foreach $key (keys %attribute_value) {
@splitkey = split($;, $key);
if ($splitkey[0] == $level) {
delete $attribute_value{$key};
delete $attribute_type{$key};
}
}
}
elsif ($command eq '-') {
&unescape_data($_);
&data($_);
}
elsif ($command eq 'A') {
@field = split(/ /, $_, 3);
$attribute_type{$level,$field[0]} = $field[1];
&unescape_data($field[2]);
$attribute_value{$level,$field[0]} = $field[2];
}
elsif ($command eq '&') {
&entity($_);
}
elsif ($command eq 'D') {
@field = split(/ /, $_, 4);
$data_attribute_type{$field[0], $field[1]} = $field[2];
&unescape_data($field[3]);
$data_attribute_value{$field[0], $field[1]} = $field[3];
}
elsif ($command eq 'N') {
$notation{$_} = 1;
if (defined($sysid)) {
$notation_sysid{$_} = $sysid;
undef($sysid);
}
if (defined($pubid)) {
$notation_pubid{$_} = $pubid;
undef($pubid);
}
}
elsif ($command eq 'I') {
@field = split(/ /, $_, 3);
$entity_type{$field[0]} = $field[1];
&unescape($field[2]);
# You may want to substitute \e for \ if the type is CDATA.
$entity_text{$field[0]} = $field[2];
$entity_code{$field[0]} = 'I';
}
elsif ($command eq 'E') {
@field = split(/ /, $_);
$entity_code{$field[0]} = 'E';
$entity_type{$field[0]} = $field[1];
$entity_notation{$field[0]} = $field[2];
if (defined(@files)) {
foreach $i (0..$#files) {
$entity_filename{$field[0], $i} = $files[i];
}
undef(@files);
}
if (defined($sysid)) {
$entity_sysid{$field[0]} = $sysid;
undef($sysid);
}
if (defined($pubid)) {
$entity_pubid{$field[0]} = $pubid;
undef($pubid);
}
}
elsif ($command eq 'S') {
$entity_code{$_} = 'S';
if (defined(@files)) {
foreach $i (0..$#files) {
$entity_filename{$_, $i} = $files[i];
}
undef(@files);
}
if (defined($sysid)) {
$entity_sysid{$_} = $sysid;
undef($sysid);
}
if (defined($pubid)) {
$entity_pubid{$_} = $pubid;
undef($pubid);
}
}
elsif ($command eq '?') {
&unescape($_);
&pi($_);
}
elsif ($command eq 'L') {
@field = split(/ /, $_);
$lineno = $field[0];
if ($#field >= 1) {
&unescape($field[1]);
$filename = $field[1];
}
}
elsif ($command eq 'V') {
@field = split(/ /, $_, 2);
&unescape($field[1]);
$environment{$field[0]} = $field[1];
}
elsif ($command eq '{') {
&start_subdoc($_);
}
elsif ($command eq '}') {
&end_subdoc($_);
}
elsif ($command eq 'f') {
&unescape($_);
push(@files, $_);
}
elsif ($command eq 'p') {
&unescape($_);
$pubid = $_;
}
elsif ($command eq 's') {
&unescape($_);
$sysid = $_;
}
elsif ($command eq 'C') {
$conforming = 1;
}
else {
warn "$prog:$ARGV:$.: unrecognized command \`$command'\n";
}
}
sub unescape {
$_[0] =~ s/\\([0-7][0-7]?[0-7]?|.)/&esc($1)/eg;
}
sub esc {
local($_) = $_[0];
if ($_ eq '012' || $_ eq '12') {
""; # ignore RS
}
elsif (/^[0-7]/) {
sprintf("%c", oct);
}
elsif ($_ eq 'n') {
"\n";
}
elsif ($_ eq '|') {
"";
}
elsif ($_ eq "\\") {
"\\";
}
else {
$_;
}
}
sub unescape_data {
local($sdata) = 0;
$_[0] =~ s/\\([0-7][0-7]?[0-7]?|.)/&esc_data($1)/eg;
}
sub esc_data {
local($_) = $_[0];
if ($_ eq '012' || $_ eq '12') {
""; # ignore RS
}
elsif (/^[0-7]/) {
sprintf("%c", oct);
}
elsif ($_ eq 'n') {
"\n";
}
elsif ($_ eq '|') {
$sdata = !$sdata;
"";
}
elsif ($_ eq "\\") {
$sdata ? "\\" : $backslash_in_data;
}
else {
$_;
}
}
sub start_element {
local($gi) = $_[0];
# XXX
}
sub end_element {
local($gi) = $_[0];
# XXX
}
sub data {
local($data) = $_[0];
# XXX
}
# A processing instruction.
sub pi {
local($data) = $_[0];
# XXX
}
# A reference to an external entity.
sub entity {
local($name) = $_[0];
# XXX
}
sub start_subdoc {
local($name) = $_[0];
# XXX
}
sub end_subdoc {
local($name) = $_[0];
# XXX
}

View File

@ -0,0 +1,18 @@
#
# Bmakefile for sgmls
#
# $id$
#
PROG= sgmls
SRCS+= lexrf.c pcbrf.c synrf.c context.c md1.c md2.c pars1.c pars2.c serv.c
SRCS+= sgml1.c sgml2.c sgmlmsg.c sgmlxtrn.c traceset.c entgen.c sgmlio.c
SRCS+= xfprintf.c main.c unixproc.c sgmldecl.c version.c strerror.c getopt.c
SRCS+= msgcat.c lineout.c ambig.c exclude.c lextaba.c
CFLAGS+= -I${.CURDIR}/../libsgmls
.include "../Makefile.inc"
.include <bsd.prog.mk>

View File

@ -0,0 +1,179 @@
/* ACTION.H: Symbols for all PCB action codes. */
/* CONACT.H: Symbols for content parse action names (end with '_').
There must be no conflict with PARSEACT.H, which
uses 0 through 19, or SGMLACT.H, which uses 20 through 32
(except that 31 - 32 can be defined here because they are
used only by PARSEPRO and do not conflict with SGML.C).
*/
#define CIR_ 31 /* Invalid character(s) ignored in MDS; restarting parse. */
#define DTD_ 32 /* Process DOCTYPE declaration. */
#define DTE_ 33 /* End of DOCTYPE declaration. */
#define PEP_ 34 /* TEMP: Previous character ended prolog. */
#define DAS_ 35 /* Current character begins data. */
#define FCE_ 36 /* Process free character (SR12-18, 21-30). */
#define DCE_ 37 /* Data character in element text; change PCB. */
#define LAS_ 38 /* Start lookahead buffer with current character. */
#define LAM_ 39 /* Move character to lookahead buffer. */
#define LAF_ 40 /* Flush the lookahead buffer; REPEATCC. */
#define NED_ 41 /* Process null end-tag delimiter. */
#define NET_ 42 /* Process null end-tag. */
#define NST_ 43 /* Process null start-tag. */
#define NLF_ 44 /* Flush lookahead buffer except for trailing NET or SR. */
#define ETC_ 45 /* End-tag in CDATA or RCDATA; treat as data if invalid. */
#define SRMIN 46 /* Dummy for SHORT REFERENCES: srn = SRn - SRMIN. */
#define SR1_ 47 /* TAB */
#define SR2_ 48 /* RE */
#define SR3_ 49 /* RS */
#define SR4_ 50 /* Leading blanks */
#define SR5_ 51 /* Null record */
#define DAR_ 52 /* Flush data buffer after repeating current character. */
#define SR7_ 53 /* Trailing blanks */
#define SR8_ 54 /* Space */
#define SR9_ 55 /* Two or more blanks */
#define SR10 56 /* Quotation mark (first data character) */
#define SR11 57 /* Number sign */
#define SR12 58 /* FCE CHARACTERS start here */
/* _ 59 */
#define BSQ_ 60 /* Blank sequence begun; find its end. */
/* 61 In use by PARSEACT.H */
/* 62 In use by PARSEACT.H */
/* 63 In use by PARSEACT.H */
/* 64 In use by PARSEACT.H */
#define SR19 65 /* Hyphen */
#define SR20 66 /* Two hyphens */
#define SR25 71 /* Left bracket */
#define SR26 72 /* Right bracket */
#define RBR_ 73 /* Two right brackets. */
#define GTR_ 74 /* EOB with pending data character */
#define MSP_ 75 /* Marked section start in prolog outside DTD */
#define APP_ 76 /* APPINFO (other than NONE) */
#define STE_ 77 /* Start tag ended prolog */
/* GRPACT.H: Symbols for group tokenization action names (all alpha).
There must be no conflict with PARSEACT.H, which
uses 0 - 19.
*/
#define AND 20 /* AND connector found. */
#define DTAG 21 /* Data tag token group occurred (treat as #CHARS). */
#define GRPE 22 /* Group ended. */
#define GRP_ 23 /* Group started. */
#define NAS_ 24 /* Name started in content model or name group. */
#define NMT_ 25 /* Name or name token started in name token group. */
#define OPT 26 /* OPT occurrence indicator for previous token. */
#define OR 27 /* OR connector found. */
#define OREP 28 /* OREP occurrence indicator for previous token. */
#define REP 29 /* REP occurrence indicator for previous token. */
#define RNS_ 30 /* Reserved name started (#PCDATA). */
#define SEQ 31 /* SEQ connector found. */
/* LITACT.H: Symbols for content parse action names (end with '_').
There must be no conflict with PARSEACT.H, which
uses 0 through 19.
*/
#define MLA_ 20 /* Move character to look-aside data buffer. */
#define LPR_ 21 /* Move previous character to data buffer. */
#define RSM_ 22 /* Process record start and move it to data buffer. */
#define FUN_ 23 /* Replace function character with a space. */
#define LP2_ 24 /* Move previous two characters to data buffer. */
#define MLE_ 25 /* Minimum literal error: invalid character ignored. */
#define RPR_ 26 /* Remove previous character from data buffer; terminate. */
#define TER_ 27 /* Terminate the parse. */
/* MDACT.H: Symbols for markup declaration parse action names (all alpha).
There must be no conflict with PARSEACT.H, which
uses 0 - 19.
*/
#define CDR 20 /* CD[1] (MINUS) occurred previously. */
#define EMD 21 /* End of markup declaration. */
#define GRPS 22 /* Group started. */
#define LIT 23 /* Literal started: character data. */
#define LITE 24 /* Literal started: character data; LITA is delimiter. */
#define MGRP 25 /* Minus exception group (MINUS,GRPO). */
#define NAS 26 /* Name started. */
#define NMT 27 /* Name token started. */
#define NUM 28 /* Number or number token started. */
#define PEN 29 /* Parameter entity name being defined (PERO found). */
#define PGRP 30 /* Plus exception group (PLUS,GRPO). */
#define RNS 31 /* Reserved name started. */
#define MDS 32 /* Markup declaration subset start. */
#define PENR 33 /* REPEATCC; PERO found. */
/* PARSEACT.H: Symbols for common parse action names (end with '_').
There must be no conflict with other action name
files, which use numbers greater than 19.
*/
#define CRA_ 1 /* Character reference: alphabetic. */
#define CRN_ 2 /* Character reference: numeric; non-char refs o.k.. */
#define NON_ 3 /* Single byte of non-character data found. */
#define EOF_ 4 /* Error: illegal entity end; resume old input; return. */
#define ER_ 5 /* Entity reference; start new input source; continue. */
#define GET_ 6 /* EOB, EOS, or EE: resume old input source; continue. */
#define INV_ 7 /* Error: invalid char terminated markup; repeat char. */
#define LEN_ 8 /* Error: length limit exceeded; end markup; repeat char. */
#define NOP_ 9 /* No action necessary. */
#define PCI_ 10 /* Previous character was invalid. */
#define PER_ 11 /* Parameter reference; start new input source; continue. */
#define RC2_ 12 /* Back up two characters. */
#define RCC_ 13 /* Repeat current character. */
#define RCR_ 14 /* Repeat current character and return to caller. */
#define EE_ 15 /* EOS or EE: resume old input source; return to caller. */
#define RS_ 16 /* Record start: ccnt=0; ++rcnt. */
#define ERX_ 17 /* Entity reference; start new input source; return. */
#define SYS_ 18 /* Error allowed: SYSCHAR in input stream; replace it. */
#define EOD_ 19 /* End of document. */
/* Number way out of order to avoid recompilation. */
#define NSC_ 58 /* Handle DELNONCH/DELXNONCH when NON_ is allowed */
#define PEX_ 61 /* Parameter entity ref; start new input source; return. */
#define DEF_ 62 /* Data entity found. */
#define PIE_ 63 /* PI entity found (needed in markup). */
#define LNR_ 64 /* LEN_ error with extra REPEATCC. */
/* SGMLACT.H: Symbols for content parse action names (end with '_')
that are returned to SGML.C for processing.
There must be no conflict with PARSEACT.H, which
uses 0 through 19, or CONACT.H, which uses 34 and above.
(Note: 31 is also used in CONACT.H, but no conflict
is created because they are tested only in PARSEPRO.C, which
completes before SGML.C starts to examine those codes.
Also, when EOD_ is returned from PARSECON, it is changed
to LOP_.)
*/
#define CON_ 20 /* Normal content action (one of the following). */
#define DAF_ 21 /* Data found. */
#define ETG_ 22 /* Process end-tag. */
#define MD_ 23 /* Process markup declaration (NAMESTRT found). */
#define MDC_ 24 /* Process markup declaration comment (CD found). */
#define MSS_ 25 /* Process marked section start. */
#define MSE_ 26 /* Process marked section end. */
#define PIS_ 27 /* Processing instruction (string). */
#define REF_ 28 /* Record end found. */
#define STG_ 29 /* Process start-tag. */
#define RSR_ 30 /* Return RS to effect SGML state transition. */
#define LOP_ 31 /* Loop for new content without returning anything. */
/* TAGACT.H: Symbols for tag parse action names (all alpha).
There must be no conflict with PARSEACT.H, which
uses 0 - 19.
*/
#define AVD 20 /* Delimited attribute value started: normal delimiter. */
#define AVU 21 /* Undelimited value started. */
#define ETIC 22 /* Tag closed with ETI. */
#define NVS 23 /* Name of attribute or value started. */
#define NASV 24 /* Saved NAS was actually an NTV. */
#define NTV 25 /* Name token value started; get name and full value. */
#define TAGC 26 /* Tag closed normally. */
#define TAGO 27 /* Tag closed implicitly by TAGO character. */
#define AVDA 28 /* Delimited attribute value started: alternative delim. */
#define DSC 29 /* Closed by DSC character. */
/* VALACT.H: Symbols for attribute value tokenization action names (all alpha).
*/
#define NOPA 0 /* No action necessary. */
#define INVA 1 /* Invalid character; terminate parse. */
#define LENA 2 /* Length limit of token exceeded; terminate parse. */
#define NASA 3 /* Name started. */
#define NMTA 4 /* Name token started. */
#define NUMA 5 /* Number or number token started. */
/* SGML declaration parsing actions. */
#define ESGD 20 /* End of SGML declaration. */
#define LIT1 21 /* Literal started. */
#define LIT2 22 /* Literal started with LITA delimiter. */
#define NUM1 23 /* Number started. */
#define NAS1 24 /* Name started. */
#define ISIG 25 /* Insignificant character occurred. */

118
usr.bin/sgmls/sgmls/adl.h Normal file
View File

@ -0,0 +1,118 @@
/* ADL.H: Definitions for attribute descriptor list processing.
*/
/* N/C/SDATA external entity types for nxetype member of ne structure. */
#define ESNCDATA 1 /* External character data entity. */
#define ESNNDATA 2 /* Non-SGML data entity. */
#define ESNSDATA 3 /* External specific character data entity. */
#define ESNSUB 4 /* SGML subdocument entity. */
/* N/C/SDATA control block for AENTITY attributes and NDATA returns.*/
struct ne { /* N/C/SDATA entity control block. */
UNIV neid; /* Files for NDATA entity. */
UNCH *nepubid; /* Public identifier if specified. */
UNCH *nesysid; /* System identifier if specified. */
PDCB nedcn; /* Data content notation control block. */
struct ad *neal; /* Data attribute list (NULL if none). */
UNCH *neename; /* Ptr to entity name (length and EOS). */
UNCH nextype; /* Entity type: NDATA SDATA CDATA SUBDOC. */
};
#define NESZ (sizeof(struct ne))
typedef struct ne *PNE;
/* NDATA entity control block fields. */
#define NEID(p) (((PNE)p)->neid) /* File ID of NDATA entity. */
#define NESYSID(p) (((PNE)p)->nesysid) /* System ID of NDATA entity. */
#define NEPUBID(p) (((PNE)p)->nepubid) /* Public ID of NDATA entity. */
#define NEDCN(p) (((PNE)p)->nedcn->ename) /* Data content notation name. */
#define NEDCNSYSID(p) (((PNE)p)->nedcn->sysid) /* Notation system ID.*/
#define NEDCNPUBID(p) (((PNE)p)->nedcn->pubid) /* Notation public ID.*/
#define NEDCNDEFINED(p) (((PNE)p)->nedcn->defined) /* Notation defined? */
#define NEDCNADL(p) (((PNE)p)->nedcn->adl) /* Data content notation attlist.*/
#define NEENAME(p) (((PNE)p)->neename) /* Entity name pointer. */
#define NEXTYPE(p) (((PNE)p)->nextype) /* External entity type. */
#define NEAL(p) (((PNE)p)->neal) /* Data attributes (if any). */
#define NEDCNMARK(p) DCNMARK(((PNE)p)->nedcn)
/* Attribute descriptor list entry. */
struct ad {
UNCH *adname; /* Attribute name with length and EOS. */
UNCH adflags; /* Attribute flags. */
UNCH adtype; /* Value type. */
UNS adnum; /* Group size or member pos in grp. */
UNS adlen; /* Length of default or value (for capacity). */
UNCH *addef; /* Default value (NULL if REQUIRED or IMPLIED). */
union {
PNE n; /* AENTITY: NDATA control block. */
PDCB x; /* ANOTEGRP: DCN control block. */
} addata; /* Special data associated with some attributes.*/
};
#define ADSZ (sizeof(struct ad)) /* Size of an ad structure. */
/* Attribute flags for entire list adflags: ADLF. */
#define ADLREQ 0x80 /* Attribute list: 1=REQUIRED att defined. */
#define ADLNOTE 0x40 /* Attribute list: 1=NOTATION att defined. */
#define ADLCONR 0x20 /* Attribute list: 1=CONREF att defined. */
/* Attribute flags for list member adflags: ADFLAGS(n). */
#define AREQ 0x80 /* Attribute: 0=null; 1=required. */
#define ACURRENT 0x40 /* Attribute: 0=normal; 1=current. */
#define AFIXED 0x20 /* Attribute: 0=normal; 1=must equal default. */
#define AGROUP 0x10 /* Attribute: 0=single; 1=group of ad's. */
#define ACONREF 0x08 /* Attribute: 0=normal; 1=att is CONREF. */
#define AINVALID 0x04 /* Attribute: 1=value is invalid; 0=o.k. */
#define AERROR 0x02 /* Attribute: 1=error was specified; 0=o.k. */
#define ASPEC 0x01 /* Attribute: 1=value was specified; 0=default. */
/* Attribute types for adtype. */
#define ANMTGRP 0x00 /* Attribute: Name token group or member. */
#define ANOTEGRP 0x01 /* Attribute: Notation (name group). */
#define ACHARS 0x02 /* Attribute: Character string. */
#define AENTITY 0x03 /* Attribute: Data entity (name). */
#define AID 0x04 /* Attribute: ID value (name). */
#define AIDREF 0x05 /* Attribute: ID reference value (name). */
#define ANAME 0x06 /* Attribute: Name. */
#define ANMTOKE 0x07 /* Attribute: Name token. */
#define ANUMBER 0x08 /* Attribute: Number. */
#define ANUTOKE 0x09 /* Attribute: Number token. */
#define ATKNLIST 0x0A /* Attribute: >= means value is a token list. */
#define AENTITYS 0x0A /* Attribute: Data entities (name list). */
#define AIDREFS 0x0B /* Attribute: ID reference value (name list). */
#define ANAMES 0x0C /* Attribute: Name list. */
#define ANMTOKES 0x0D /* Attribute: Name token list. */
#define ANUMBERS 0x0E /* Attribute: Number list. */
#define ANUTOKES 0x0F /* Attribute: Number token list. */
/* Field definitions for entries in an attribute list.
The first argument to all of these is the list address.
*/
/* Attribute list: flags. */
#define ADLF(a) ((a)[0].adflags)
/* Attribute list: number of list members. */
#define ADN(a) ((a)[0].adtype)
/* Attribute list: number of attributes. */
#define AN(a) ((a)[0].adnum)
/* Nth attribute in list: name. */
#define ADNAME(a, n) (((a)[n].adname+1))
/* Nth att in list: number of val)ues. */
#define ADNUM(a, n) ((a)[n].adnum)
/* Nth attribute in list: flags. */
#define ADFLAGS(a, n) ((a)[n].adflags)
/* Nth attribute in list: type. */
#define ADTYPE(a, n) ((a)[n].adtype)
/* Nth attribute in list: len of def or val.*/
#define ADLEN(a, n) ((a)[n].adlen)
/* Nth attribute in list: def or value. */
#define ADVAL(a, n) ((a)[n].addef)
/* Nth attribute in list: special data. */
#define ADDATA(a, n) ((a)[n].addata)
/* Nth att: token at Pth pos in value. */
#define ADTOKEN(a, n, p)(((a)[n].addef+(p)))
#define IDHASH 101 /* Size of ID hash table. Must be prime. */
struct id { /* ID attribute control block. */
struct id *idnext; /* Next ID in chain. */
UNCH *idname; /* ID name with length prefix and EOS. */
UNCH iddefed; /* Non-zero if it has been defined. */
struct fwdref *idrl; /* Chain of forward references to this ID. */
};
#define IDSZ sizeof(struct id)
typedef struct id *PID; /* Ptr to ID attribute control block. */

438
usr.bin/sgmls/sgmls/ambig.c Normal file
View File

@ -0,0 +1,438 @@
/* ambig.c -
Content model ambiguity checking.
Written by James Clark (jjc@jclark.com).
*/
/*
This uses the construction in pp8-9 of [1], extended to deal with AND
groups.
Note that it is not correct for the purposes of ambiguity analysis to
handle AND groups by turning them into an OR group of SEQ groups
(consider (a&b?)).
We build an automaton for the entire content model by adding the
following case for AND:
nullable(v) := nullable(left child) and nullable(right child)
if nullable(right child) then
for each x in last(left child) do
follow(v,x) = follow(left child,x) U first(right child);
if nullable(left child) then
for each x in last(right child) do
follow(v,x) = follow(right child,x) U first(left child);
first(v) := first(left child) U first(right child);
last(v) := first(left child) U first(right child);
We also build an automaton for each AND group by building automata for
each of the members of the AND group using the above procedure and
then combine the members using:
for each x in last(left child) do
follow(v,x) = follow(left child,x) U first(right child);
for each x in last(right child) do
follow(v,x) = follow(right child,x) U first(left child);
first(v) := first(left child) U first(right child);
The content model is ambiguous just in case one of these automata is
non-deterministic. (Note that when checking determinism we need to
check the `first' set as well as all the `follow' sets.)
Why is this correct? Consider a primitive token in a member of an AND
group. There are two worst cases for ambiguity: firstly, when none of
the other members of AND group have been matched; secondly, when just
the nullable members remain to be matched. The first case is not
affected by context of the AND group (unless the first case is
identical to the second case.)
Note that inclusions are not relevant for the purposes of determining
the ambiguity of content models. Otherwise the case in clause
11.2.5.1:
An element that can satisfy an element in the content model is
considered to do so, even if the element is also an inclusion.
could never arise.
[1] Anne Brueggemann-Klein, Regular Expressions into Finite Automata,
Universitaet Freiburg, Institut fur Informatik, 33 July 1991.
*/
#include "sgmlincl.h"
/* Sets of states are represented by 0-terminated, ordered lists of
indexes in gbuf. */
#define MAXSTATES (GRPGTCNT+2)
#define listcat(x, y) strcat((char *)(x), (char *)(y))
#define listcpy(x, y) strcpy((char *)(x), (char *)(y))
/* Information about a content token. */
struct contoken {
UNCH size;
UNCH nullable;
UNCH *first;
UNCH *last;
};
static VOID contoken P((int, int, struct contoken *));
static VOID andgroup P((int, int, struct contoken *));
static VOID orgroup P((int, int, struct contoken *));
static VOID seqgroup P((int, int, struct contoken *));
static VOID andambig P((int));
static int listambig P((UNCH *));
static VOID listmerge P((UNCH *, UNCH *));
static struct contoken *newcontoken P((void));
static VOID freecontoken P((struct contoken *));
/* Dynamically allocated vector of follow sets. */
static UNCH **follow;
static UNCH *mergebuf; /* for use by listmerge */
/* Set to non-zero if the content model is ambiguous. */
static int ambigsw;
/* Check the current content model (in gbuf) for ambiguity. */
VOID ambig()
{
struct contoken *s;
int i;
if (!follow) {
/* We can't allocate everything in one chunk, because that would
overflow a 16-bit unsigned if GRPGTCNT was 253. */
UNCH *ptr;
follow = (UNCH **)rmalloc(MAXSTATES*sizeof(UNCH *));
follow[0] = 0;
ptr = (UNCH *)rmalloc((MAXSTATES - 1)*MAXSTATES);
for (i = 1; i < MAXSTATES; i++) {
follow[i] = ptr;
ptr += MAXSTATES;
}
mergebuf = (UNCH *)rmalloc(MAXSTATES);
}
for (i = 1; i < MAXSTATES; i++)
follow[i][0] = 0;
ambigsw = 0;
s = newcontoken();
contoken(1, 1, s);
ambigsw = ambigsw || listambig(s->first);
freecontoken(s);
for (i = 1; !ambigsw && i < MAXSTATES; i++)
if (listambig(follow[i]))
ambigsw = 1;
if (ambigsw)
mderr(137, (UNCH *)0, (UNCH *)0);
}
/* Free memory used for ambiguity checking. */
VOID ambigfree()
{
if (follow) {
frem((UNIV)follow[1]);
frem((UNIV)follow);
frem((UNIV)mergebuf);
follow = 0;
}
}
/* Determine whether a list of primitive content tokens (each
represented by its index in gbuf) is ambiguous. */
static
int listambig(list)
UNCH *list;
{
UNCH *p;
int chars = 0;
int rc = 0;
for (p = list; *p; p++) {
if ((gbuf[*p].ttype & TTMASK) == TTETD) {
struct etd *e = gbuf[*p].tu.thetd;
if (e->mark) {
rc = 1;
break;
}
e->mark = 1;
}
else {
assert((gbuf[*p].ttype & TTMASK) == TTCHARS);
if (chars) {
rc = 1;
break;
}
chars = 1;
}
}
for (p = list; *p; p++)
if ((gbuf[*p].ttype & TTMASK) == TTETD)
gbuf[*p].tu.thetd->mark = 0;
return rc;
}
/* Analyze a content token. The `checkand' argument is needed to ensure
that the algorithm is not exponential in the AND-group nesting depth.
*/
static
VOID contoken(m, checkand, res)
int m; /* Index of content token in gbuf */
int checkand; /* Non-zero if AND groups should be checked */
struct contoken *res; /* Result */
{
UNCH flags = gbuf[m].ttype;
switch (flags & TTMASK) {
case TTCHARS:
case TTETD:
res->first[0] = m;
res->first[1] = 0;
res->last[0] = m;
res->last[1] = 0;
res->size = 1;
res->nullable = 0;
break;
case TTAND:
if (checkand)
andambig(m);
andgroup(m, checkand, res);
break;
case TTOR:
orgroup(m, checkand, res);
break;
case TTSEQ:
seqgroup(m, checkand, res);
break;
default:
abort();
}
if (flags & TREP) {
UNCH *p;
for (p = res->last; *p; p++)
listmerge(follow[*p], res->first);
}
if (flags & TOPT)
res->nullable = 1;
}
/* Check an AND group for ambiguity. */
static
VOID andambig(m)
int m;
{
int i, tnum;
int lim;
struct contoken *curr;
struct contoken *next;
tnum = gbuf[m].tu.tnum;
assert(tnum > 0);
curr = newcontoken();
next = newcontoken();
contoken(m + 1, 0, curr);
i = m + 1 + curr->size;
curr->size += 1;
for (--tnum; tnum > 0; --tnum) {
UNCH *p;
contoken(i, 0, next);
curr->size += next->size;
i += next->size;
for (p = curr->last; *p; p++)
listcat(follow[*p], next->first);
for (p = next->last; *p; p++)
listmerge(follow[*p], curr->first);
listcat(curr->first, next->first);
listcat(curr->last, next->last);
}
lim = m + curr->size;
for (i = m + 1; i < lim; i++) {
if (listambig(follow[i]))
ambigsw = 1;
follow[i][0] = 0;
}
freecontoken(curr);
freecontoken(next);
}
/* Handle an AND group. */
static
VOID andgroup(m, checkand, res)
int m;
int checkand;
struct contoken *res;
{
int i, tnum;
/* union of the first sets of nullable members of the group */
UNCH *nullablefirst;
struct contoken *next;
tnum = gbuf[m].tu.tnum;
assert(tnum > 0);
contoken(m + 1, checkand, res);
nullablefirst = (UNCH *)rmalloc(MAXSTATES);
if (res->nullable)
listcpy(nullablefirst, res->first);
else
nullablefirst[0] = 0;
i = m + 1 + res->size;
res->size += 1;
next = newcontoken();
for (--tnum; tnum > 0; --tnum) {
UNCH *p;
contoken(i, checkand, next);
res->size += next->size;
i += next->size;
if (next->nullable)
for (p = res->last; *p; p++)
listcat(follow[*p], next->first);
for (p = next->last; *p; p++)
listmerge(follow[*p], nullablefirst);
listcat(res->first, next->first);
if (next->nullable)
listcat(nullablefirst, next->first);
listcat(res->last, next->last);
res->nullable &= next->nullable;
}
frem((UNIV)nullablefirst);
freecontoken(next);
}
/* Handle a SEQ group. */
static
VOID seqgroup(m, checkand, res)
int m;
int checkand;
struct contoken *res;
{
int i, tnum;
struct contoken *next;
tnum = gbuf[m].tu.tnum;
assert(tnum > 0);
contoken(m + 1, checkand, res);
i = m + 1 + res->size;
res->size += 1;
next = newcontoken();
for (--tnum; tnum > 0; --tnum) {
UNCH *p;
contoken(i, checkand, next);
res->size += next->size;
i += next->size;
for (p = res->last; *p; p++)
listcat(follow[*p], next->first);
if (res->nullable)
listcat(res->first, next->first);
if (next->nullable)
listcat(res->last, next->last);
else
listcpy(res->last, next->last);
res->nullable &= next->nullable;
}
freecontoken(next);
}
/* Handle an OR group. */
static
VOID orgroup(m, checkand, res)
int m;
int checkand;
struct contoken *res;
{
int i, tnum;
struct contoken *next;
tnum = gbuf[m].tu.tnum;
assert(tnum > 0);
contoken(m + 1, checkand, res);
i = m + 1 + res->size;
res->size += 1;
next = newcontoken();
for (--tnum; tnum > 0; --tnum) {
contoken(i, checkand, next);
res->size += next->size;
i += next->size;
listcat(res->first, next->first);
listcat(res->last, next->last);
res->nullable |= next->nullable;
}
freecontoken(next);
}
/* Merge the second ordered list into the first. */
static
VOID listmerge(p, b)
UNCH *p, *b;
{
UNCH *a = mergebuf;
strcpy((char *)a, (char *)p);
for (;;) {
if (*a) {
if (*b) {
if (*a < *b)
*p++ = *a++;
else if (*a > *b)
*p++ = *b++;
else
a++;
}
else
*p++ = *a++;
}
else if (*b)
*p++ = *b++;
else
break;
}
*p = '\0';
}
static
struct contoken *newcontoken()
{
struct contoken *p = (struct contoken *)rmalloc(sizeof(struct contoken)
+ MAXSTATES*2);
p->first = (UNCH *)(p + 1);
p->last = p->first + MAXSTATES;
return p;
}
static
VOID freecontoken(p)
struct contoken *p;
{
frem((UNIV)p);
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,33 @@
/* appl.h */
enum {
E_NOMEM = 1,
E_DOC,
E_EXEC,
E_FORK,
E_WAIT,
E_SIGNAL,
E_OPEN,
E_CAPBOTCH,
E_SUBDOC
};
VOID process_document P((int));
VOID output_conforming P((void));
UNIV xmalloc P((UNS));
UNIV xrealloc P((UNIV, UNS));
VOID appl_error VP((int, ...));
#ifdef SUPPORT_SUBDOC
int run_process P((char **));
char **make_argv P((UNIV));
VOID get_subcaps P((void));
#endif
#ifdef SUPPORT_SUBDOC
extern int suberr;
#endif
extern int suppsw;
extern int locsw;

View File

@ -0,0 +1,147 @@
/* unix.cfg: Configuration file for sgmls on Unix. */
/* A list of filename templates to use for searching for external entities.
The filenames are separated by the character specified in PATH_FILE_SEP.
See sgmls.man for details. */
#define DEFAULT_PATH "/usr/share/sgml/%O/%C/%T:%N.%X:%N.%D"
/* The character that separates the filenames templates. */
#define PATH_FILE_SEP ':'
/* The character that separates filenames in a system identifier.
Usually the same as PATH_FILE_SEP. */
#define SYSID_FILE_SEP ':'
/* The environment variable that contains the list of filename templates. */
#define PATH_ENV_VAR "SGML_PATH"
/* MIN_DAT_SUBS_FROM and MIN_DATS_SUBS_TO tell sgmls how to transform a name
or system identifier into a legal filename. A character in
MIN_DAT_SUBS_FROM will be transformed into the character in the
corresponding position in MIN_DAT_SUBS_TO. If there is no such
position, then the character is removed. */
/* This says that spaces should be transformed to underscores, and
slashes to percents. */
#define MIN_DAT_SUBS_FROM " /"
#define MIN_DAT_SUBS_TO "_%"
/* Define this to allow tracing. */
/* #define TRACE 1 */
/* Define this you want support for subdocuments. This is implemented
using features that are not part of Standard C, so you might not want
to define it if you are porting to a new system. Otherwise I suggest
you leave it defined. */
#define SUPPORT_SUBDOC 1
/* Define HAVE_EXTENDED_PRINTF if your *printf functions supports
X/Open extensions; if they do, then, for example,
printf("%2$s%1$s", "bar", "foo")
should print `foobar'. */
/* #define HAVE_EXTENDED_PRINTF 1 */
/* Define HAVE_CAT if your system provides the X/Open message
catalogue functions catopen() and catgets(), and you want to use them.
An implementations of these functions is included and will be used if
you don't define this. On SunOS 4.1.1, if you do define this you
should set CC=/usr/xpg2bin/cc in the makefile. */
/* #define HAVE_CAT 1 */
#ifdef __STDC__
/* Define this if your compiler supports prototypes. */
#define USE_PROTOTYPES 1
#endif
/* Can't use <stdarg.h> without prototypes. */
#ifndef USE_PROTOTYPES
#define VARARGS 1
#endif
/* If your compiler defines __STDC__ but doesn't provide <stdarg.h>,
you must define VARARGS yourself here. */
/* #define VARARGS 1 */
/* Define this if you do not have strerror(). */
/* #define STRERROR_MISSING 1 */
/* Define this unless the character testing functions in ctype.h
are defined for all values representable as an unsigned char. You do
not need to define this if your system is ANSI C conformant. You
should define for old Unix systems. */
/* #define USE_ISASCII 1 */
/* Define this if your system provides the BSD style string operations
rather than ANSI C ones (eg bcopy() rather than memcpy(), and index()
rather than strchr()). */
/* #define BSD_STRINGS 1 */
/* Define this if you have getopt(). */
#define HAVE_GETOPT 1
/* Define this if you have access(). */
#define HAVE_ACCESS 1
/* Define this if you have <unistd.h>. */
#define HAVE_UNISTD_H 1
/* Define this if you have <sys/stat.h>. */
#define HAVE_SYS_STAT_H 1
/* Define this if you have waitpid(). */
#define HAVE_WAITPID 1
/* Define this if your system is POSIX.1 (ISO 9945-1:1990) compliant. */
#define POSIX 1
/* Define this if you have the vfork() system call. */
#define HAVE_VFORK 1
/* Define this if you have <vfork.h>. */
/* #define HAVE_VFORK_H 1 */
/* Define this if you don't have <stdlib.h> */
/* #define STDLIB_H_MISSING 1 */
/* Define this if you don't have <stddef.h> */
/* #define STDDEF_H_MISSING 1 */
/* Define this if you don't have <limits.h> */
/* #define LIMITS_H_MISSING 1 */
/* Define this if you don't have remove(); unlink() will be used instead. */
/* #define REMOVE_MISSING 1 */
/* Define this if you don't have raise(); kill() will be used instead. */
/* #define RAISE_MISSING 1 */
/* Define this if you don't have fsetpos() and fgetpos(). */
/* #define FPOS_MISSING 1 */
/* Universal pointer type. */
/* If your compiler doesn't fully support void *, change `void' to `char'. */
typedef void *UNIV;
/* If your compiler doesn't support void as a function return type,
change `void' to `int'. */
typedef void VOID;
/* If you don't have an ANSI C conformant <limits.h>, define
CHAR_SIGNED as 1 or 0 according to whether the `char' type is signed.
The <limits.h> on some versions of System Release V 3.2 is not ANSI C
conformant: the value of CHAR_MIN is 0 even though the `char' type is
signed. */
/* #define CHAR_SIGNED 1 */
/* #define CHAR_SIGNED 0 */
#ifndef CHAR_SIGNED
#include <limits.h>
#if CHAR_MIN < 0
#define CHAR_SIGNED 1
#else
#define CHAR_SIGNED 0
#endif
#endif /* not CHAR_SIGNED */
/* Assume the system character set is ISO Latin-1. */
#include "latin1.h"

View File

@ -0,0 +1,444 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
#include "context.h"
#define GI (tags[ts].tetd->etdgi+1) /* GI of current element. */
#define NEWGI (newetd->etdgi+1) /* GI of new tag. */
#define STATUS (*statuspt) /* Token status: RCHIT RCMISS RCEND RCREQ RCNREQ*/
#define PEX (-1) /* GI is a plus exception and not a minus. */
#define ANYHIT(h) (grplongs == 1 ? ((h)[0] != 0) : anyhit(h))
#define HITSET(h, n) (h[(unsigned)(n-1)>>LONGPOW] \
|= (1L<<((n-1)&(LONGBITS-1))))
#define HITON(h, n) (h[(unsigned)(n-1)>>LONGPOW] & (1L<<((n-1)&(LONGBITS-1))))
#define HITOFF(h, n) (!(HITON(h, n)))
#define TOKENHIT HITON(H,T)
static
VOID copypos(to, from)
struct mpos *to, *from;
{
int i;
for (i = 0; i <= (int)from[0].t; i++) {
to[i].g = from[i].g;
to[i].t = from[i].t;
memcpy(to[i].h, from[i].h, grplongs*sizeof(unsigned long));
}
}
/* CONTEXT: Determine whether a GI is valid in the present structural context.
Returns RCHIT if valid, RCEND if element has ended, RCREQ if a
different element is required, and RCMISS if it is totally invalid.
On entry, pos points to the model token to be tested against the GI.
TO DO: Save allowed GIs for an error message on an RCMISS.
Support a "query" mode (what is allowed now?) by working
with a copy of pos.
*/
int context(gi, mod, pos, statuspt, mexts)
struct etd *gi; /* ETD of new GI. */
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
UNCH *statuspt; /* Token status: RCHIT RCMISS RCEND RCREQ RCNREQ*/
int mexts; /* >0=stack level of minus grp; -1=plus; 0=none.*/
{
UNCH toccsv, gtypesv; /* Save token's TOCC and GTYPE in case grp ends.*/
if (mexts == -1) {
if (STATUS == RCEND)
return RCPEX;
copypos(savedpos, pos);
}
Tstart = T; /* Save starting token for AND group testing. */
while (STATUS!=RCMISS && STATUS!=RCEND) {
TRACEGI("CONTEXT", gi, mod, pos, Tstart);
while (TTYPE==TTOR || TTYPE==TTSEQ || TTYPE==TTAND) {
pos[P+1].g = M++; pos[++P].t = 1; HITCLEAR(H);
Tstart = T; /* Save starting token for AND group testing. */
TRACEGI("OPENGRP", gi, mod, pos, Tstart);
}
STATUS = (UNCH)tokenreq(gi, mod, pos);
TRACEGI("STATUS", gi, mod, pos, Tstart);
if (gi==TOKEN.tu.thetd) { /* Hit in model. */
STATUS = (UNCH)RCHIT;
gtypesv = GTYPE; toccsv = TOCC;
newtoken(mod, pos, statuspt);
return(mexts<=0 ? RCHIT : (gtypesv==TTOR || BITON(toccsv, TOPT))
? RCMEX : RCHITMEX);
}
if (STATUS==RCREQ) {
if (mexts == -1)
break;
STATUS = RCHIT;
nextetd = TOKEN.tu.thetd;
newtoken(mod, pos, statuspt);
return(RCREQ);
}
/* else if (STATUS==RCNREQ) */
if (mexts>0) return(RCMEX);
newtoken(mod, pos, statuspt);
}
if (mexts == -1) {
copypos(pos, savedpos);
return STATUS = RCPEX;
}
return((int)STATUS);
}
/* ECONTEXT: Determine whether the current element can be ended, or whether
non-optional tokens remain at the current level or higher.
Returns 1 if element can be ended, or 0 if tokens remain.
On entry, STATUS==RCEND if there are no tokens left; if not,
pos points to the next model token to be tested.
TO DO: Support a "query" mode (what is required now?) by working
with a copy of pos.
*/
int econtext(mod, pos, statuspt)
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
UNCH *statuspt; /* Token status: RCHIT RCMISS RCEND RCREQ RCNREQ*/
{
unsigned next; /* Position in AND group of next testable token.*/
Tstart = T;
TRACEEND("ECONT", mod, pos, 0, 0, Tstart);
if (P<=1) {nextetd = 0; return(TOKENHIT || BITON(TOCC, TOPT));}
nextetd = TTYPE == TTETD ? TOKEN.tu.thetd : 0;
while (STATUS!=RCMISS && STATUS!=RCEND) {
STATUS = (UNCH)testend(mod, pos, 0, 0);
TRACEEND("ECONTEND", mod, pos, 0, 0, Tstart);
nextetd = P<=1 || TTYPE != TTETD ? 0 : TOKEN.tu.thetd;
if (STATUS==RCEND) return(1);
if (P<=1) return(TOKENHIT || BITON(TOCC, TOPT));
if (STATUS==RCMISS) {
if (BITON(TOCC, TOPT)) nextetd = 0;
return(0);
}
if (!tokenopt(mod, pos)) return(0);
STATUS = RCNREQ;
if (GTYPE!=TTAND) ++T; /* T!=GNUM or group would have ended. */
else T = (UNCH)(((next = (UNS)offbit(H, (int)T, GNUM))!=0) ?
next : offbit(H, 0, GNUM));
M = G + grpsz(&GHDR, (int)T-1) + 1;
TRACEEND("ECONTNEW", mod, pos, 0, 0, Tstart);
}
if (STATUS==RCMISS) {
if (BITON(TOCC, TOPT)) nextetd = 0;
return(0);
}
return(1); /* STATUS==RCEND */
}
/* NEWTOKEN: Find the next token to test. Set STATUS to indicate results:
RCEND if element has ended (no more tokens to test);
RCREQ if required new token was found;
RCNREQ if non-required new token was found;
RCHIT if a hit token was repeated (now non-required);
and RCMISS if a new token can't be found because current token
(which was not hit) was neither unconditionally required nor
optional.
*/
VOID newtoken(mod, pos, statuspt)
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
UNCH *statuspt; /* Token status: RCHIT RCMISS RCEND RCREQ RCNREQ*/
{
unsigned nextand = 0; /* Position in AND group of next testable token.*/
int currhit = (STATUS==RCHIT); /* 1=current GI hit; 0=not. */
/* If the GI was a hit, turn on the hit bit and set the status to
assume that the token to be tested against the next GI will
be non-required. If the current token is repeatable, exit so
it will stand as the next token to test.
*/
if (STATUS==RCHIT) {
HITSET(H, T);
STATUS = RCNREQ;
if (BITON(TOCC, TREP)) return;
}
/* At this point, we must determine the next token to test:
either against the next GI, if this one was a hit, or
against the same GI if conditions permit a retry.
To find the next token, we must first end the current group,
if possible, and any we can that contain it.
If the outermost group was a hit and is repeatable, or
if the element has ended, we exit now.
If it hasn't ended, or was optional and ended with a miss,
we can retry the GI against the next token.
*/
if ((STATUS = (UNCH)testend(mod, pos, 1, 1))!=RCNREQ) return;
/* At this point, the "current token" is either the original one,
or the token for the highest level unhit group that it ended.
We will retry a missed GI, by testing it against the next
token, if the current token:
1. Is optional;
2. Was hit (i.e., because it is repeatable and was hit by a
previous GI or because it is a hit group that just ended);
3. Is in an AND or OR group and is not the last testable token.
It will be the next sequential one (unhit one, in an AND group);
if there are none left, use the first unhit token in the group.
In either case, set M to correspond to the new T.
*/
retest:
TRACEEND("RETEST", mod, pos, (int)nextand, 1, Tstart);
if (GTYPE==TTAND) {
nextand = offbit(H, (int)T, GNUM);
if (!nextand)
nextand = offbit(H, 0, GNUM);
}
if ( BITON(TOCC, TOPT)
|| TOKENHIT
|| GTYPE==TTOR /* T!=GNUM or group would have ended. */
|| nextand ) {
if (GTYPE!=TTAND) ++T; /* T!=GNUM or group would have ended. */
else T = nextand;
M = G + grpsz(&GHDR, (int)T-1) + 1;
if (GTYPE==TTAND) {
/* If AND group wrapped, it can end if all non-optionals were
hit. */
if (T==Tstart && !currhit) {
UNCH Psave = P;
int rc = testend(mod, pos, 0, 1);
if (Psave!=P) {if ((STATUS = (UNCH)rc)==RCNREQ) goto retest;}
else STATUS = RCMISS;
}
/* We only test unhit tokens, so we must use an unhit token
as Tstart (which is used to detect when the AND group has
wrapped). */
else if (HITON(H,Tstart)) Tstart = T;
}
}
else STATUS = RCMISS;
TRACEEND("NEWTOKEN", mod, pos, (int)nextand, 1, Tstart);
}
/* TESTEND: End the current group, if possible, and any that it is nested in.
The current token will either be a group header, or some token
that could not end its group. Return 1 if the (possibly new)
current token is repeatable; 0 if it is not.
*/
int testend(mod, pos, andoptsw, newtknsw)
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
int andoptsw; /* 1=test optional AND members; 0=ignore. */
int newtknsw; /* 1=new token test; 0=end element test. */
{
int rc = 0; /* Return code: RCNREQ RCHIT RCMISS RCEND */
while (!rc) {
TRACEEND("TRACEEND", mod, pos, rc, andoptsw, Tstart);
/* TESTMISS:
If we've hit no tokens yet in the current group, and
the current token is the last unhit one in the group we can test,
we will end the group (it may never really have started!)
because we might be able to try the token that follows it.
In any group, a token is the last testable unhit token if it
is the last sequential one, as the GI was already tested against
the preceding unhit tokens. In addition,
in a SEQ group, it is the last testable unhit token if it isn't
optional, because we can't skip past it to the following ones.
If we end the group, before popping the level, set M to G, as this
level`s group header will be the next level's current token.
*/
if (!ANYHIT(H) && (T==GNUM
|| (GTYPE==TTSEQ && BITOFF(TOCC, TOPT)))) {
M = G; --P; Tstart = T;
if (P<=1) {
if (BITON(TOCC, TOPT) || TOKENHIT) rc = RCEND;
else rc = RCMISS;
}
continue;
}
/* TESTHIT:
See if we've hit all the non-optional tokens in the group.
If so, pop to the previous level and set the group's hit bit.
If we were called from NEWTOKEN we are trying to find the token
to test against the next start-tag, so if the group is repeatable,
process it again. (If not, we were called from ECONTEXT and
are testing whether the element can be ended.)
Otherwise, if we are at the first level, the element is over.
*/
if ((GTYPE==TTOR && TOKENHIT)
|| (GTYPE==TTSEQ && T==(UNCH)GNUM
&& (TOKENHIT || BITON(TOCC, TOPT)))
|| (GTYPE==TTAND && allhit(&GHDR, H, 0, andoptsw))) {
M = G;
--P;
HITSET(H, T);
Tstart = T;
if (newtknsw && BITON(TOCC, TREP)) rc = RCHIT;
else if (P<=1) rc = RCEND;
/* If we are looking for a new token to test against the next
start-tag, then we need to consider optional and members
in this group, even if we didn't need to consider them
in the group that we just ended because that group had
wrapped. */
else if (newtknsw) andoptsw = 1;
/* Else loop to test new outer group. */
}
else rc = RCNREQ; /* No group ended this time, so return. */
}
TRACEEND("ENDFOUND", mod, pos, rc, andoptsw, Tstart);
return(rc);
}
/* TOKENOPT: Return 1 if current token is contextually optional;
otherwise, return 0.
*/
int tokenopt(mod, pos)
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
{
TRACEEND("TOKENOPT", mod, pos, 0, 0, Tstart);
return (BITON(TOCC, TOPT) /* Inherently optional. */
|| TOKENHIT /* Was hit (handles "plus" suffix case). */
|| (!ANYHIT(H) && groupopt(mod, pos)));
/* In optional group with no hits. */
}
/* GROUPOPT: Temporarily makes the current group be the current token so that
TOKENOPT() can be applied to it. Returns the value returned
by TOKENOPT.
*/
int groupopt(mod, pos)
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
{
UNCH saveM; /* Save M when testing if group is not required.*/
int rc; /* 1=contextually optional; 0=not. */
if (P==1) return(BITON(GOCC, TOPT) || TOKENHIT);
saveM = M; M = G; --P;
rc = tokenopt(mod, pos);
++P; G = M; M = saveM;
return(rc);
}
/* TOKENREQ: Returns RCREQ if the current token is "contextually required".
That is, it is not contextually optional and
1) it is a member of a "seq" group that is either required
or has at least 1 hit token.
2) it is a member of an "and" group in which all other
tokens were hit.
Optional tokens are not counted
if GI is ETDCDATA, as we are looking for an
omitted start-tag. Otherwise, they are counted,
as the GI might match one of them.
Returns RCNREQ if the current token is "not required".
*/
int tokenreq(gi, mod, pos)
struct etd *gi; /* ETD of new GI. */
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
{
TRACEGI("TOKENREQ", gi, mod, pos, Tstart);
return( tokenopt(mod, pos) ? RCNREQ
: ( GTYPE==TTSEQ && (ANYHIT(H) || groupreq(gi, mod, pos)==RCREQ)
#if 0
|| (GTYPE==TTAND && allhit(&GHDR, H, T, \*gi!=ETDCDATA*\ 1))
#endif
)
? RCREQ : RCNREQ );
}
/* GROUPREQ: Temporarily makes the current group be the current token so that
TOKENREQ() can be applied to it. Returns the value returned
by TOKENREQ.
*/
int groupreq(gi, mod, pos)
struct etd *gi; /* ETD of new GI. */
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
{
UNCH saveM; /* Save M when testing if group is not required.*/
int rc; /* Return code: RCREQ RCNREQ */
if (P==1) return(BITOFF(GOCC, TOPT) ? RCREQ : RCNREQ);
saveM = M; M = G; --P;
rc = tokenreq(gi, mod, pos);
++P; G = M; M = saveM;
return(rc);
}
/* GRPSZ: Returns the number of tokens spanned by a group in the model (M),
from the group's start (G) to a specified index within the group (T).
M = 0, plus 1 for each token in the group, plus the size of
any subgroups (gotten by calling GRPSZ recursively). On entry,
M must be equal to G at the current level.
*/
int grpsz(g, t)
struct thdr *g; /* mod[G]: Ptr to group in the model. */
int t; /* T: Index of last token in the group. */
{
struct thdr *p = g; /* Ptr to current token in the model. */
int m = 0; /* Size of group (including nested groups). */
int i = 0; /* Number of group members (loop counter). */
UNS type; /* Token type (without TOREP bits). */
while (++i<=t) {
++p; ++m;
type = GET(p->ttype, TTMASK);
if (type==TTOR || type==TTSEQ || type==TTAND) {
m += grpsz(p, p->tu.tnum);
p = g+m;
}
}
return(m);
}
/* ALLHIT: Returns 1 if all hit bits for the specified group are turned on,
(other than those that correspond to optional tokens if "opt" is
0) and the "but" bit (all bits if "but" bit is zero). Otherwise,
returns 0. GRPSZ is used to skip past subgroup tokens.
*/
int allhit(p, hits, but, opt)
struct thdr *p; /* mod[G]: Ptr to group in the model. */
unsigned long *hits; /* H: Hit bits to be tested. */
int but; /* Index of bit to ignore; 0=test all. */
int opt; /* 1=optional tokens must be hit; 0=ignore. */
{
int b = 0; /* Index of bit being tested in hits. */
int e = p->tu.tnum; /* Ending index (number of bits to test). */
unsigned type; /* Token type (without TOREP bits). */
while (++p, ++b<=e) {
if (HITOFF(hits,b) && (opt || BITOFF(p->ttype,TOPT)) && b!=but)
return 0;
if ((type = GET(p->ttype,TTMASK))==TTOR || type==TTSEQ || type==TTAND)
p += grpsz(p, p->tu.tnum);
}
return 1;
}
/* OFFBIT: Returns the index of the first unset bit after (i.e., not including)
the caller's "first" bit. If all bits through the
specified last bit are on, it returns 0.
*/
int offbit(bits, first, last)
unsigned long *bits; /* Bits to be tested. */
int first; /* Index of first bit to be tested in bits. */
int last; /* Index of last bit to be tested in bits. */
{
while (++first <= last)
if (HITOFF(bits, first))
return first;
return 0;
}
/* ANYHIT: Return 1 if any bit is set. */
int anyhit(bits)
unsigned long *bits;
{
int i;
for (i = 0; i < grplongs; i++)
if (bits[i] != 0)
return 1;
return 0;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

View File

@ -0,0 +1,17 @@
/* context.h */
#define M pos[0].g /* Index of current token in model. */
#ifdef P
#undef P
#endif
#define P pos[0].t /* Index of current group in pos. */
#define G pos[P].g /* Index of current group in model. */
#define T pos[P].t /* Index of current token in its group. */
#define H pos[P].h /* Pointer to hit bits for current group. */
#define GHDR mod[G] /* Current group header. */
#define TOKEN mod[M] /* Current token. */
#define TTYPE (GET(TOKEN.ttype, TTMASK)) /* Token type of current token. */
#define TOCC (GET(TOKEN.ttype, TOREP)) /* Occurrence for current token. */
#define GTYPE (GET(GHDR.ttype, TTMASK)) /* Token type of current group. */
#define GOCC (GET(GHDR.ttype, TOREP)) /* Occurrence for current group. */
#define GNUM GHDR.tu.tnum /* Number of tokens in current grp. */

View File

@ -0,0 +1,40 @@
/* dosproc.c -
MS-DOS implementation of run_process().
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifdef SUPPORT_SUBDOC
#include "std.h"
#include "entity.h"
#include "appl.h"
#include <process.h>
int run_process(argv)
char **argv;
{
int ret;
fflush(stdout);
fflush(stderr);
ret = spawnvp(P_WAIT, argv[0], argv);
if (ret < 0)
appl_error(E_EXEC, argv[0], strerror(errno));
return ret;
}
#endif /* SUPPORT_SUBDOC */
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,42 @@
/* ASCII to EBCDIC (ISO 8859-1 to IBM CP 37v2) table. */
/* Contributed by C. M. Sperberg-McQueen <u35395@uicvm.uic.edu>. */
/* The mapping must be 1 to 1. The positions of *CHAR and *CH in the table
must not be changed, although the values in ebcdic.h can be. */
#include "ebcdic.h"
unsigned char charset[] = {
0, 1, 2, 3, 55, 45, 46, 47,
GENRECHAR, TABCHAR, RSCHAR, 11, 12, RECHAR, 14, 15,
16, 17, 18, 19, 60, 61, 50, 38,
24, 25, EOFCHAR, 39, EOBCHAR, DELCDATA, DELSDATA, DELNONCH,
SPCCHAR, 90, 127, 123, 91, 108, 80, 125,
77, 93, 92, 78, 107, 96, 75, 97,
240, 241, 242, 243, 244, 245, 246, 247,
248, 249, 122, 94, 76, 126, 110, 111,
124, 193, 194, 195, 196, 197, 198, 199,
200, 201, 209, 210, 211, 212, 213, 214,
215, 216, 217, 226, 227, 228, 229, 230,
231, 232, 233, 173, 224, 189, 176, 109,
121, 129, 130, 131, 132, 133, 134, 135,
136, 137, 145, 146, 147, 148, 149, 150,
151, 152, 153, 162, 163, 164, 165, 166,
167, 168, 169, 192, 79, 208, 161, 7,
4, 6, 8, 9, 10, 20, 21, 23,
26, 27, 32, 33, 34, 35, 36, 40,
41, 42, 43, 44, 48, 49, 51, 52,
53, 54, 56, 57, 58, 59, 62, 255,
65, 170, 74, 177, 159, 178, 106, 181,
187, 180, 154, 138, 95, 202, 175, 188,
144, 143, 234, 250, 190, 160, 182, 179,
157, 218, 155, 139, 183, 184, 185, 171,
100, 101, 98, 102, 99, 103, 158, 104,
116, 113, 114, 115, 120, 117, 118, 119,
172, 105, 237, 238, 235, 239, 236, 191,
128, 253, 254, 251, 252, 186, 174, 89,
68, 69, 66, 70, 67, 71, 156, 72,
84, 81, 82, 83, 88, 85, 86, 87,
140, 73, 205, 206, 203, 207, 204, 225,
112, 221, 222, 219, 220, 141, 142, 223,
};

View File

@ -0,0 +1,40 @@
/* SGML Character Use: EBCDIC
*/
#define EOFCHAR '\077' /* FUNCTION: EE (entity end: files). */
#define EOBCHAR '\034' /* NONCHAR: EOB (file entity: end of buffer. */
#define RSCHAR '\045' /* FUNCTION: RS (record start). */
#define RECHAR '\015' /* FUNCTION: RE (record end). */
#define TABCHAR '\005' /* FUNCTION: TAB (horizontal tab). */
#define SPCCHAR '\100' /* FUNCTION: SPACE (horizontal space). */
#define GENRECHAR '\026' /* NONCHAR: Generated RE. */
#define DELCDATA '\035' /* NONCHAR: Delimiter for CDATA entity in
attribute value. */
#define DELSDATA '\036' /* NONCHAR: Delimiter for SDATA entity in
attribute value. */
#define DELNONCH '\037' /* NONCHAR: non-SGML character prefix. */
/* This should work for EBCDIC. See comment in latin1.h. */
#define SHIFTNON(ch) ((UNCH)(ch) | 0200)
#define UNSHIFTNON(ch) ((UNCH)(ch) & ~0200)
/* See comment in latin1.h. */
#define CANON_NONSGML 255
/* See comment in latin1.h. */
#define CANON_DATACHAR 254
/* Components for a formal public identifier for the whole of the
system character set. Protect with ifndef so that it can be overriden
in config.h. */
/* Use a private escape sequence. */
#ifndef SYSTEM_CHARSET_DESIGNATING_SEQUENCE
#define SYSTEM_CHARSET_DESIGNATING_SEQUENCE "ESC 2/5 2/15 3/0"
#endif
#ifndef SYSTEM_CHARSET_OWNER
#define SYSTEM_CHARSET_OWNER "-//IBM"
#endif
#ifndef SYSTEM_CHARSET_DESCRIPTION
#define SYSTEM_CHARSET_DESCRIPTION "Code Page 1047"
#endif

View File

@ -0,0 +1,405 @@
/* entgen.c -
Implement entgen() which generates a list of filenames from a struct fpi.
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifdef HAVE_ACCESS
#ifdef HAVE_UNISTD_H
#include <unistd.h> /* For R_OK. */
#endif /* HAVE_UNISTD_H */
#ifndef R_OK
#define R_OK 4
#endif /* not R_OK */
#endif /* HAVE_ACCESS */
#include "sgmlaux.h"
/* Environment variable that contains path. */
#ifndef PATH_ENV_VAR
#define PATH_ENV_VAR "SGML_PATH"
#endif
/* Default search path. See field() for interpretation of %*. */
#ifndef DEFAULT_PATH
#define DEFAULT_PATH "/usr/local/lib/sgml/%O/%C/%T:%N.%X:%N.%D"
#endif
#ifndef PATH_FILE_SEP
#define PATH_FILE_SEP ':'
#endif
#ifndef SYSID_FILE_SEP
#define SYSID_FILE_SEP ':'
#endif
/* This says: change space to underscore, slash to percent. */
#ifndef MIN_DAT_SUBS_FROM
#define MIN_DAT_SUBS_FROM " /"
#endif
#ifndef MIN_DAT_SUBS_TO
#define MIN_DAT_SUBS_TO "_%"
#endif
static int field P((struct fpi *, int, char *));
static int mindatcpy P((char *, char *, int, int));
static int testopen P((char *));
static UNIV sysidgen P((char *));
static char *path = 0;
/* Non-zero if searching should be performed when a system identifier
is specified. */
static int sysidsrch = 0;
#define EMPTY_VERSION "default"
static char *classes[] = {
"capacity",
"charset",
"notation",
"syntax",
"document",
"dtd",
"elements",
"entities",
"lpd",
"nonsgml",
"shortref",
"subdoc",
"text"
};
/* This is mainly for compatibility with arcsgml. */
static char *genext[] = {
"nsd", /* Non-SGML data entity. */
"gml", /* GML document or text entity. */
"spe", /* System parameter entity. */
"dtd", /* Document type definition. */
"lpd", /* Link process definition. */
"pns", /* Public non-SGML data entity. */
"pge", /* Public general entity. */
"ppe", /* Public parameter entity. */
"pdt", /* Public document type definition. */
"plp", /* Public link process definition. */
"vns", /* Display version non-SGML data entity. */
"vge", /* Display version general entity. */
"vpe", /* Display version parameter entity. */
"vdt", /* Display version document type definition.*/
"vlp", /* Display version link process definition.*/
};
static char *ext[] = {
"sgml", /* SGML subdocument */
"data", /* Data */
"text", /* General text */
"parm", /* Parameter entity */
"dtd", /* Document type definition */
"lpd", /* Link process definition */
};
/* Like memcpy, but substitute, fold to lower case (if fold is
non-zero) and null terminate. This is used both for minimum data and
for names. If p is NULL, do nothing. Return len. */
static int mindatcpy(p, q, len, fold)
char *p, *q;
int len;
int fold;
{
static char subsfrom[] = MIN_DAT_SUBS_FROM;
static char substo[] = MIN_DAT_SUBS_TO;
int n;
if (!p)
return len;
for (n = len; --n >= 0; q++) {
char *r = strchr(subsfrom, *q);
if (!r) {
if (fold && ISASCII(*q) && isupper((UNCH)*q))
*p++ = tolower((UNCH)*q);
else
*p++ = *q;
}
else {
int i = r - subsfrom;
if (i < sizeof(substo) - 1)
*p++ = substo[i];
}
}
*p = '\0';
return len;
}
/* Return length of field. Copy into buf if non-NULL. */
static int field(f, c, buf)
struct fpi *f;
int c;
char *buf;
{
int n;
switch (c) {
case '%':
if (buf) {
buf[0] = '%';
buf[1] = '\0';
}
return 1;
case 'N': /* the entity, document or dcn name */
return mindatcpy(buf, (char *)f->fpinm, ustrlen(f->fpinm),
(f->fpistore != 1 && f->fpistore != 2 && f->fpistore != 3
? NAMECASE
: ENTCASE));
case 'D': /* dcn name */
if (f->fpistore != 1) /* not a external data entity */
return -1;
if (f->fpinedcn == 0) /* it's a SUBDOC */
return -1;
return mindatcpy(buf, (char *)f->fpinedcn, ustrlen(f->fpinedcn),
NAMECASE);
case 'X':
/* This is for compatibility with arcsgml */
if (f->fpistore < 1 || f->fpistore > 5)
return -1;
n = (f->fpipubis != 0)*(f->fpiversw > 0 ? 2 : 1)*5+f->fpistore - 1;
if (buf)
strcpy(buf, genext[n]);
return strlen(genext[n]);
case 'Y': /* tYpe */
n = f->fpistore;
if (n < 1 || n > 5)
return -1;
if (n == 1 && f->fpinedcn == 0) /* it's a SUBDOC */
n = 0;
if (buf)
strcpy(buf, ext[n]);
return strlen(ext[n]);
case 'P': /* public identifier */
if (!f->fpipubis)
return -1;
return mindatcpy(buf, (char *)f->fpipubis, ustrlen(f->fpipubis), 0);
case 'S': /* system identifier */
if (!f->fpisysis)
return -1;
else {
UNCH *p;
n = 0;
for (p = f->fpisysis; *p; p++)
if (*p != RSCHAR) {
if (buf)
buf[n] = *p == RECHAR ? '\n' : *p;
n++;
}
return n;
}
}
/* Other fields need a formal public identifier. */
/* return -1 if the formal public identifier was invalid or missing. */
if (f->fpiversw < 0 || !f->fpipubis)
return -1;
switch (c) {
case 'A': /* Is it available? */
return f->fpitt == '+' ? 0 : -1;
case 'I': /* Is it ISO? */
return f->fpiot == '!' ? 0 : -1;
case 'R': /* Is it registered? */
return f->fpiot == '+' ? 0 : -1;
case 'U': /* Is it unregistered? */
return f->fpiot == '-' ? 0 : -1;
case 'L': /* public text language */
if (f->fpic == FPICHARS)
return -1;
/* it's entered in all upper case letters */
return mindatcpy(buf, (char *)f->fpipubis + f->fpil, f->fpill, 1);
case 'O': /* owner identifier */
return mindatcpy(buf, (char *)f->fpipubis + f->fpio, f->fpiol, 0);
case 'C': /* public text class */
n = f->fpic - 1;
if (n < 0 || n >= sizeof(classes)/sizeof(classes[0]))
return -1;
if (buf)
strcpy(buf, classes[n]);
return strlen(classes[n]);
case 'T': /* text description */
return mindatcpy(buf, (char *)f->fpipubis + f->fpit, f->fpitl, 0);
case 'V':
if (f->fpic < FPICMINV) /* class doesn't have version */
return -1;
if (f->fpiversw > 0) /* no version */
return -1;
if (f->fpivl == 0) { /* empty version: */
/* use device-independent version*/
if (buf)
strcpy(buf, EMPTY_VERSION);
return strlen(EMPTY_VERSION);
}
return mindatcpy(buf, (char *)f->fpipubis + f->fpiv, f->fpivl, 0);
case 'E': /* public text designating (escape) sequence */
if (f->fpic != FPICHARS)
return -1;
return mindatcpy(buf, (char *)f->fpipubis + f->fpil, f->fpill, 0);
default:
break;
}
return -1;
}
static int testopen(pathname)
char *pathname;
{
#ifdef HAVE_ACCESS
return access(pathname, R_OK) >= 0;
#else /* not HAVE_ACCESS */
FILE *fp;
fp = fopen(pathname, "r");
if (!fp)
return 0;
fclose(fp);
return 1;
#endif /* not HAVE_ACCESS */
}
/* Return a pointer to an dynamically-allocated buffer that contains
the names of the files containing this entity, with each filename
terminated by a '\0', and with the list of filenames terminated by
another '\0'. */
UNIV entgen(f)
struct fpi *f;
{
char *file;
assert(f->fpistore != 6); /* Musn't call entgen for a notation. */
if (!path) {
char *p;
char c;
path = getenv(PATH_ENV_VAR);
if (!path)
path = DEFAULT_PATH;
p = path;
/* Only search for system identifiers if path uses %S. */
while ((c = *p++) != '\0')
if (c == '%') {
if (*p == 'S') {
sysidsrch = 1;
break;
}
if (*p != '\0' && *p != PATH_FILE_SEP)
p++;
}
}
if (f->fpisysis
&& (!sysidsrch
|| strchr((char *)f->fpisysis, SYSID_FILE_SEP)
|| strcmp((char *)f->fpisysis, STDINNAME) == 0))
return sysidgen((char *)f->fpisysis);
file = path;
for (;;) {
char *p;
int len = 0;
char *fileend = strchr(file, PATH_FILE_SEP);
if (!fileend)
fileend = strchr(file, '\0');
/* Check that all substitutions are non-null, and calculate
the resulting total length of the filename. */
for (p = file; p < fileend; p++)
if (*p == '%') {
int n;
/* Set len to -1 if a substitution is invalid. */
if (++p >= fileend) {
len = -1;
break;
}
n = field(f, *p, (char *)0);
if (n < 0) {
len = -1;
break;
}
len += n;
}
else
len++;
if (len > 0) {
/* We've got a valid non-empty filename. */
char *s;
char *buf;
s = buf = (char *)rmalloc(len + 2);
for (p = file; p < fileend; p++)
if (*p == '%')
s += field(f, *++p, s);
else
*s++ = *p;
*s++ = '\0';
if (testopen(buf)) {
/* Terminate the array of filenames. */
*s++ = '\0';
return buf;
}
free((UNIV)buf);
}
if (*fileend == '\0')
break;
file = ++fileend;
}
return 0;
}
/* Handle a system identifier without searching. */
static
UNIV sysidgen(s)
char *s;
{
char *buf, *p;
buf = (char *)rmalloc(strlen(s) + 2);
for (p = buf; *s; s++) {
if (*s == SYSID_FILE_SEP) {
if (p > buf && p[-1] != '\0')
*p++ = '\0';
}
else if (*s == RECHAR)
*p++ = '\n';
else if (*s != RSCHAR)
*p++ = *s;
}
/* Terminate this filename. */
if (p > buf && p[-1] != '\0')
*p++ = '\0';
if (p == buf) {
/* No filenames. */
frem((UNIV)buf);
return 0;
}
/* Terminate the list. */
*p++ = '\0';
return buf;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,189 @@
/* Struct dcncb: attribute list added to support data attributes. */
#ifndef ENTITY_H /* Don't include this file more than once. */
#define ENTITY_H
/* ENTITY.H: Definitions and control block templates for entity management.
*/
#include "tools.h" /* Definitions for type declarations, etc. */
#include "msgcat.h"
#define STDINNAME "-" /* File name that refers to standard input. */
#define EOS '\0' /* NONCHAR: EE (entity end: strings). */
#define AVALCASE 2 /* 2=untranslated string of name characters. */
#define REFNAMELEN 8 /* reference quantity set NAMELEN */
#define REFLITLEN 240 /* reference quantity set LITLEN */
/* Minimization status of returned tags.
*/
#define MINNONE 0 /* Minimization: tag not minimized. */
#define MINNULL 1 /* Minimization: tag was null. */
#define MINNET 2 /* Minimization: end-tag was NET delimiter. */
#define MINDATA 3 /* Minimization: end-tag was data tag. */
#define MINSTAG 4 /* Minimization: tag implied by start-tag. */
#define MINETAG 5 /* Minimization: end-tag implied by end-tag. */
/* Formal public identifier public text classes.
*/
#define FPICAP 1
#define FPICHARS 2
#define FPINOT 3
#define FPISYN 4
#define FPICMINV 5 /* Minimum fpic value for versionable text. */
#define FPIDOC 5
#define FPIDTD 6
#define FPIELEM 7
#define FPIENT 8
#define FPILPD 9
#define FPINON 10
#define FPISHORT 11
#define FPISUB 12
#define FPITEXT 13
struct fpi { /* Formal public identifier. */
UNCH fpiot; /* Owner type: + or - or ! (for ISO). */
UNS fpiol; /* Length of owner identifier. */
UNS fpio; /* Offset in pubis of owner identifier (no EOS).*/
int fpic; /* Public text class. */
UNCH fpitt; /* Text type: - or + (for available). */
UNS fpitl; /* Length of text identifier. */
UNS fpit; /* Offset in pubis of text identifier (no EOS). */
UNS fpill; /* Language/designating sequence length. */
UNS fpil; /* Offset in pubis of language. */
UNS fpivl; /* Length of display version . */
UNS fpiv; /* Offset in pubis of display version (no EOS). */
int fpiversw; /* 1=use best ver; 0=use stated ver; -1=error. */
UNCH *fpinm; /* Entity/DCN name (EOS, no length). */
UNCH fpistore; /* 1=NDATA 2=general 3=parm 4=DTD 5=LPD 6=DCN. */
/* Name of the entity's DCN. Valid only when fpistore == 1.
NULL if it's a SUBDOC. */
UNCH *fpinedcn;
UNCH *fpipubis; /* Public ID string (EOS). */
UNCH *fpisysis; /* System ID string (EOS). */
};
#define FPISZ sizeof(struct fpi)
typedef struct fpi *PFPI; /* Ptr to FPI control block. */
/* General control blocks.
*/
#define NONONCH 1 /* Character references to non-chars invalid. */
#define OKNONCH 0 /* Character references to non-chars allowed. */
struct parse { /* Parse control block. */
char *pname; /* Parse name; content, tag, etc. */
UNCH *plex; /* Lexical analysis table. */
UNCH **ptab; /* State and action table. */
UNS state; /* State. */
UNS input; /* Input. */
UNS action; /* Action. */
UNS newstate; /* Next state. */
};
struct restate {
UNS sstate; /* State. */
UNS sinput; /* Input. */
UNS saction; /* Action. */
UNS snext; /* Next state. */
};
struct map {
UNCH *mapnm; /* Name followed by EOS. */
int mapdata; /* Data associated with that name. */
};
struct hash { /* Dummy structure for function arguments. */
struct hash *enext; /* Next entry in chain. */
UNCH *ename; /* Entry name with size and EOS. */
};
typedef struct hash *PHASH; /* Ptr to hash table entry. */
typedef struct hash **THASH; /* Ptr to hash table. */
struct fwdref { /* A forward id reference. */
struct fwdref *next; /* Pt to next reference in chain. */
UNIV msg; /* Ptr to saved error messsage. */
};
#define FWDREFSZ sizeof(struct fwdref)
struct dcncb { /* Data content notation control block. */
struct dcncb *enext; /* Next DCN in chain. */
UNCH *ename; /* Notation name followed by EOS. */
UNCH mark; /* For use by application. */
UNCH entsw; /* Entity defined with this notation? */
UNCH defined; /* Has this notation been defined. */
UNCH *sysid; /* System identifier of notation. */
UNCH *pubid; /* Public identifier of notation. */
struct ad *adl; /* Data attribute list (NULL if none). */
};
#define DCBSZ sizeof(struct dcncb)
#define DCNMARK(p) ((p)->mark ? 1 : ((p)->mark = 1, 0))
typedef struct dcncb *PDCB; /* Ptr to DCN control block. */
/* Number of capacities in a capacity set. */
#define NCAPACITY 17
struct sgmlcap {
char **name;
UNCH *points;
long *number;
long *limit;
};
struct sgmlstat { /* Document statistics. */
UNS dcncnt; /* Number of data content notations defined. */
UNS pmexgcnt; /* Number of plus or minus exception groups. */
UNS etdcnt; /* Number of element types declared. */
UNS etdercnt; /* Number of element types defined by default. */
UNS pmexcnt; /* Number of plus/minus exception grp members. */
UNS modcnt; /* Number of content model tokens defined. */
UNS attcnt; /* Number of attributes defined. */
UNS attdef; /* Characters of attribute defaults defined. */
UNS attgcnt; /* Number of att value grp members (incl dcn). */
UNS idcnt; /* Number of ID attributes specified. */
UNS idrcnt; /* Number of ID references specified. */
UNS ecbcnt; /* Number of entities declared. */
UNS ecbtext; /* Characters of entity text defined. */
UNS srcnt; /* Number of short reference tables defined. */
UNS dcntext; /* Characters of notation identifiers defined. */
};
struct switches { /* Parser control switches (1=non-standard). */
int swdupent; /* 1=msg if duplicate ENTITY def attempted;0=no.*/
int swcommnt; /* 1=return comment declarations as data; 0=no. */
int swrefmsg; /* 1=msg if undeclared ref is defaulted; 0=no. */
UNS swbufsz; /* Size of source file buffer for READ(). */
int swenttr; /* 1=trace entity stack in error messages; 0=no.*/
int sweltr; /* 1=trace element stack in error messages; 0=no. */
int swambig; /* 1=check content model ambiguity */
int swundef; /* 1=warn about undefined elements and notations. */
char *prog; /* Program name for error messages. */
#ifdef TRACE
char *trace; /* What to trace in the body. */
char *ptrace; /* What to trace in the prolog. */
#endif /* TRACE */
nl_catd catd; /* Message catalog descriptor. */
long nopen; /* Number of open document entities */
int onlypro; /* Parse only the prolog. */
char **includes; /* List of parameter entities to be defined
as "INCLUDE"; NULL terminated.*/
VOID (*die) P((void)); /* Function to call on fatal error. */
};
struct markup { /* Delimiter strings for text processor. */
UNCH *cro; /* LEXCON markup string: CRO */
UNCH *dso; /* LEXCON markup string: DSO */
UNCH *ero; /* LEXCON markup string: ERO */
UNCH *etag; /* LEXMARK markup string: end-tag */
UNCH *lit; /* LEXMARK markup string: LIT */
UNCH *lita; /* LEXMARK markup string: LITA */
UNCH *mdc; /* LEXCON markup string: MDC */
UNCH *mdo; /* LEXCON markup string: MDO */
UNCH *mse; /* LEXCON markup string: mse */
UNCH *mss; /* LEXCON markup string: mss */
UNCH *mssc; /* LEXCON markup string: mss CDATA */
UNCH *mssr; /* LEXCON markup string: mss RCDATA */
UNCH *pic; /* LEXCON markup string: PIC */
UNCH *pio; /* LEXCON markup string: PIO */
UNCH *refc; /* LEXGRP markup string: REFC */
UNCH *stag; /* LEXMARK markup string: start-tag */
UNCH *tagc; /* LEXMARK markup string: TAGC */
UNCH *vi; /* LEXMARK markup string: VI */
int lennet; /* LEXMARK markup string length: null end-tag. */
int lennst; /* LEXMARK markup string length: null start-tag.*/
};
#endif /* ndef ENTITY_H */

View File

@ -0,0 +1,61 @@
/* ERROR.H: Symbols for SGML error codes (start with 'E_').
Numbers 46 - 56 are generated by ERROR.C.
Later numbers are coded directly.
*/
/* SGMLERR.C: General errors and syntax errors.
*/
#define E_CONTEXT 1 /* W GI not allowed at this point in structure. */
#define E_MDNAME 2 /* E Invalid markup declaration name. */
/*efine E_LEN 3 E Syntax error: length exceeded. */
#define E_SYS 4 /* W Illegal system character. */
#define E_ETAG 5 /* E End-tag does not match any open start-tag. */
#define E_STAGMAX 6 /* E Maximum number of open elements exceeded. */
/* E_ALLNULL 7 W Start- and end-tag omitted with null content. */
#define E_EOF 8 /* E/W Illegal entity end in markup or delimited text. */
/* fine E_INV 9 E Markup error: invalid character. */
#define E_CHARS 10 /* W Data found in content that doesn't allow it. */
/* fine E_NOETDE 11 E End-tag GI not defined by element declaration. */
#define E_BADNM 12 /* E Name is not syntactically valid. */
#define E_BADATT 13 /* E Attribute was not defined by element declaration. */
#define E_VALINV 14 /* W Att value/declaration conflict: invalid char. */
#define E_VALLEN 15 /* W Att value/declaration conflict: token too long. */
#define E_VALCNT 16 /* W Att value/declaration conflict: too many tokens. */
#define E_VALTYPE 17 /* W Att value/declaration conflict: wrong token type.*/
#define E_VALGRP 18 /* W Att value/declaration conflict: token not in grp.*/
#define E_VALREQ 19 /* W Att value/declaration conflict: req unspecified. */
/* E_EMIN 20 W End-tag implied by end-tag; not minimizable. */
/* E_SMIN 21 W Omitted start-tag was not minimizable. */
#define E_POSSATT 22 /* E Possible att found but not defined; used as data.*/
/* Late additions numbered out of order to avoid recompilation. */
/*efine E_ENTSYNC 37 E Entity and group nesting levels out of sync. */
#define E_BADVAL 25 /* W Att value omitted (null); default used. */
/* E_ECONTXT 30 W Element ended prematurely (some content omitted).*/
/* E_EMINST 39 W End-tag implied by start-tag; not minimizable. */
/* E_MEXTAG 40 W *** In Use *** */
#define E_MEXERR 41 /* W Attempt to exclude contextually required element.*/
#define E_DOCTYPE 42 /* W No document type defined; *DOCTYPE assumed. */
/* E_NOETDS 43 E Start-tag GI not defined by element declaration. */
#define E_RESTART 44 /* E Invalid chars ignored; trying to restart parse. */
/* MDERROR.C: Errors in markup declarations.
*/
/*efine E_DUP 23 E Duplicate specification. */
/*efine E_KEY 24 E Incorrect keyword for parameter. */
/*efine E_MSE 26 E MSE occurred with no corresponding MS. */
/*efine E_MSS 27 E MSS exceeded maximum nesting level. */
/*efine E_NUM 28 E Incorrect number of parameters. */
#define E_TYPE 29 /* E Incorrect parameter type. */
/* Late additions numbered out of order to avoid recompilation. */
/*efine E_VAL 38 W Incorrect parameter value. */
/* RESERROR.C: Errors in resource routines.
*/
/* Unused I End of primary source entity. */
/* fine E_FILBUF 31 E Could not read next buffer. */
/* fine E_ERFILE 32 E Could not open file. */
/* fine E_MALLOC 33 T Could not obtain required main storage. */
/* fine E_ERMAX 34 E Maximum number of open entities exceeded. */
/* fine E_ERNAME 35 E Referenced entity undeclared. */
/* fine E_ERLOOP 36 E Entity referenced within itself: ref ignored. */
/* Late additions numbered out of order to avoid recompilation. */
/* E_ERDEF 45 E Referenced entity undeclared; SYSTEM assumed. */

View File

@ -0,0 +1,91 @@
/* ETYPE.H: Definitions for element type and group processing.
*/
#define MCHARS 0x80 /* Model: contains #CHARS. */
#define MGI 0x40 /* Model: contains GI names. */
#define MPHRASE 0x20 /* Model: first token is #CHARS. */
#define MKEYWORD 0x1F /* Model: defined with single keyword. */
#define MNONE 0x10 /* Model: contains no GIs or #CHARS. */
#define MANY 0x08 /* Model: contains any GIs or #CHARS. */
#define MRCDATA 0x04 /* Model: contains RCDATA. */
#define MCDATA 0x02 /* Model: contains CDATA. */
#define TOREP (TOPT+TREP) /* 11000000 Optional and repeatable. */
#define TOPT 0x80 /* Token: 1=optional; 0=required. */
#define TREP 0x40 /* Token: 1=repeatable; 0=not. */
#define TXOREP (TXOPT+TXREP) /* * explicitly specified */
#define TXOPT 0x20 /* ? explicitly specified */
#define TXREP 0x10 /* + explicitly specified */
#define TTMASK 0x0F /* 00001111 Mask for testing token type. */
#define TTETD 4 /* 00000100 Token is an ETD. */
#define TTAND 3 /* 00000011 Token is an AND group. */
#define TTSEQ 2 /* 00000010 Token is a sequence group. */
#define TTOR 1 /* 00000001 Token is an OR group. */
#define TTCHARS 0 /* 00000000 Token is #CHARS. */
struct thdr { /* Token header or model header. */
UNCH ttype; /* Token type attributes or model content. */
union {
int tnum; /* Group token: tokens in group.
Model header: content tokens at any level. */
struct etd *thetd; /* GI token: ptr to etd. */
} tu;
};
#define THSZ (sizeof(struct thdr))
#define ETDHASH 211 /* Size of element hash table. Must be prime. */
#define SMO 0x40 /* ETDMIN: Start-tag O minimization. */
#define EMO 0x04 /* ETDMIN: End-tag O minimization. */
#define EMM 0x02 /* ETDMIN: End-tag minimization explicitly
specified to be minus */
#define ETDDCL 0x80 /* ETDMIN: Element was declared. */
#define ETDUSED 0x20 /* ETDMIN: Element used in another declaration. */
#define ETDOCC 0x10 /* ETDMIN: Element occurred in document. */
struct etd { /* Element type definition. */
struct etd *etdnext; /* Next element type definition in hash chain. */
UNCH *etdgi; /* GI preceded by its length, followed by EOS. */
UNCH etdmin; /* Flag bits: minimization. */
UNCH mark; /* Mark bit: for ambiguity checking */
struct thdr *etdmod; /* Content model. */
struct etd **etdmex; /* Minus exceptions. */
struct etd **etdpex; /* Plus exceptions. */
struct ad *adl; /* Attribute descriptor list. */
struct entity **etdsrm; /* Short reference map. */
};
#define ETDSZ (sizeof(struct etd))
typedef struct etd *PETD;
extern struct etd dumetd[];
/* Number of bits in a long must be >= 1<<LONGPOW */
#define LONGPOW 5
#define LONGBITS (1<<LONGPOW)
struct mpos { /* Position of current element in model. */
UNCH g; /* Index of this group in the model. */
UNCH t; /* Index of the current token in this group. */
unsigned long *h; /* Hit bits of this group's tokens. */
};
#define HITCLEAR(h) MEMZERO((UNIV)(h), grplongs*sizeof(unsigned long))
#define TAGCONER 0x01 /* 00000001 (contersw) Tag was out of context. */
#define TAGNET 0x02 /* 00000010 (etisw) Tag has NET enabled. */
#define TAGPEX 0x04 /* 00000100 (pexsw) Tag was plus exception. */
#define TAGREF 0x08 /* 00001000 (conrefsw) Tag had CONREF or EMPTY.*/
struct tag { /* Tag control block. */
UNCH status; /* Status of context check. */
UNCH tflags; /* Flags: TAGCONER TAGNET TAGPEX TAGREF */
struct etd *tetd; /* Element type definition for tag. */
struct entity **tsrm; /* Current short reference map. */
struct mpos *tpos; /* Position of next tag in this model. */
};
#define RCEND 1 /* No more tokens: end element and retry GI. */
#define RCREQ 2 /* Required GI must precede proposed GI. */
#define RCMISS 3 /* GI invalid: not element end; no required GI. */
#define RCHIT 4 /* GI is the one expected next. */
#define RCMEX 5 /* GI invalid: minus exception. */
#define RCHITMEX 6 /* RCMEX with invalid attempted minus exclusion.*/
#define RCPEX 7 /* GI is valid solely because of plus exclusion.*/
#define RCNREQ 8 /* Token is not required; can retry invalid GI. */

View File

@ -0,0 +1,121 @@
/* exclude.c -
Exclusion checking.
Written by James Clark (jjc@jclark.com).
*/
#include "sgmlincl.h"
static int excktok P((struct thdr *, int, int *));
static int exmark P((int));
/* Check that the current exclusions are legal for the content model
of the current element. */
VOID exclude()
{
struct thdr *mod = tags[ts].tetd->etdmod;
if ((mod->ttype & MKEYWORD) == 0 && exmark(1)) {
int excl;
excktok(mod + 1, 0, &excl);
exmark(0);
}
}
/* Set the mark field of all current exclusions to val. Return 1 if
there are some current exclusions. */
static
int exmark(val)
int val;
{
int i;
int gotone = 0;
for (i = ts; i > 0; --i) {
struct etd **p = tags[i].tetd->etdmex;
if (p) {
for (; *p; p++)
(*p)->mark = val;
gotone = 1;
}
}
return gotone;
}
/* Check exclusions for this token. Return size of token. */
static
int excktok(t, orgrp, excl)
struct thdr *t;
int orgrp; /* 1 if token is member of or group */
int *excl; /* Set to 1 if token is excluded. */
{
int size;
struct thdr *tem;
int tnum;
int optional = 0;
int hadopt, hadreq;
*excl = 0;
switch (t->ttype & TTMASK) {
case TTETD:
if (t->tu.thetd->mark) {
if (orgrp || (t->ttype & TOPT))
*excl = 1;
else
sgmlerr(217, &pcbstag, t->tu.thetd->etdgi + 1,
tags[ts].tetd->etdgi + 1);
}
/* fall through */
case TTCHARS:
size = 1;
break;
case TTOR:
case TTAND:
case TTSEQ:
tem = t + 1;
hadopt = 0;
hadreq = 0;
for (tnum = t->tu.tnum; tnum > 0; --tnum) {
int ex;
int n = excktok(tem, (t->ttype & TTMASK) == TTOR, &ex);
if (!ex) {
if (tem->ttype & TOPT)
hadopt = 1;
else
hadreq = 1;
}
tem += n;
}
size = tem - t;
if ((t->ttype & TTMASK) == TTOR)
optional = hadreq ? hadopt : 1;
else
optional = !hadreq;
break;
default:
abort();
}
/* Was required, but exclusions have made it optional.
eg <!element foo - - (a | b) -(a, b)> */
if (optional && !(t->ttype & TOPT))
sgmlerr(216, &pcbstag, tags[ts].tetd->etdgi + 1, (UNCH *)0);
return size;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,114 @@
/* genlex: Generate lexical tables for non-ASCII charsets. */
#include "config.h"
#include "std.h"
#include "tools.h"
#define CANON_ASCII_NONSGML 255 /* Canonical non-SGML character in ASCII. */
#define CANON_ASCII_DATACHAR 254 /* Canonical DATACHAR in ASCII. */
extern unsigned char charset[];
extern UNCH *lextabs[];
extern UNCH lextran[];
static char *lextabnames[] = {
"lexcnm", "lexcon", "lexgrp", "lexlms", "lexmark", "lexsd", "lextoke"
};
static VOID print_tab(s, t)
char *s;
UNCH *t;
{
int i;
printf("UNCH %s[] = {\n", s);
for (i = 0; i < 256; i++)
printf("%2d,%c", t[i], (i + 1) % 16 == 0 ? '\n' : ' ');
fputs("};\n\n", stdout);
}
int main(argc, argv)
int argc;
char **argv;
{
int i;
UNCH tab[256];
char special[256];
/* Shunned character numbers in the reference concrete syntax. */
static UNCH refshun[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 127, 255
};
char shunned[256];
char *program_name;
program_name = strrchr(argv[0], '/');
if (program_name)
program_name++;
else
program_name = argv[0];
/* Check that the mapping is 1-1. */
for (i = 0; i < 256; i++)
tab[i] = 0;
for (i = 0; i < 256; i++)
tab[charset[i]] = 1;
for (i = 0; i < 256; i++)
if (!tab[i]) {
fprintf(stderr, "%s: bad mapping: no character mapped to %d\n",
program_name, i);
exit(EXIT_FAILURE);
}
/* Compute special. */
for (i = 0; i < 256; i++)
special[i] = 0;
for (i = 0; lextabs[i]; i++) {
int j;
for (j = 0; j < 256; j++)
if (lextabs[i][j] != lextabs[i][CANON_ASCII_NONSGML]
&& lextabs[i][j] != lextabs[i][CANON_ASCII_DATACHAR])
special[charset[j]] = 1;
}
/* Compute shunned. */
for (i = 0; i < 256; i++)
shunned[i] = 0;
for (i = 0; i < sizeof(refshun); i++)
shunned[refshun[i]] = 1;
printf("/* This file was automatically generated by %s. Do not edit. */\n\n",
program_name);
fputs("#include \"config.h\"\n#include \"entity.h\"\n#include \"sgmldecl.h\"\n\n",
stdout);
/* Generate each of the lexical tables. */
for (i = 0; lextabs[i]; i++) {
int j;
for (j = 0; j < 256; j++)
tab[charset[j]] = lextabs[i][j];
for (j = 0; j < 256; j++)
if (!special[j]) {
if (shunned[j])
tab[j] = lextabs[i][CANON_ASCII_NONSGML];
else
tab[j] = lextabs[i][CANON_ASCII_DATACHAR];
}
print_tab(lextabnames[i], tab);
}
/* Generate lextran. */
for (i = 0; i < 256; i++)
tab[charset[i]] = charset[lextran[i]];
print_tab("lextran", tab);
/* Generate asciicharset. */
fputs("int asciicharset[] = {\n", stdout);
for (i = 0; i < 128; i++)
printf("%3d,%c", charset[i], (i + 1) % 16 == 0 ? '\n' : ' ');
for (i = 128; i < 256; i++)
printf("UNUSED,%c", (i + 1) % 8 == 0 ? '\n' : ' ');
fputs("};\n", stdout);
exit(EXIT_SUCCESS);
}

View File

@ -0,0 +1,166 @@
/* getopt.c -
getopt() for those systems that don't have it.
Derived from comp.sources.unix/volume3/att_getopt.
Modified by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifndef HAVE_GETOPT
#include "std.h"
#include "getopt.h"
#ifdef SWITCHAR
#include <dos.h>
#endif
int opterr = 1;
int optind = 1;
int optopt;
char *optarg;
#ifndef OPTION_CHAR
#define OPTION_CHAR '-'
#endif
int getopt(argc, argv, opts)
int argc;
char **argv;
char *opts;
{
#ifdef SWITCHAR
union REGS regs;
static char switchar = '\0';
#endif
static int sp = 1;
register int c;
register char *cp;
char *message;
#ifdef SWITCHAR
if (switchar == '\0') {
regs.x.ax = 0x3700;
intdos(&regs, &regs);
if (!regs.x.cflag)
switchar = regs.h.dl;
else
switchar = '/';
}
#endif
if (sp == 1) {
if (optind >= argc)
return EOF;
if ((
#ifdef SWITCHAR
argv[optind][0] != switchar &&
#endif
argv[optind][0] != OPTION_CHAR) || argv[optind][1] == '\0') {
#ifdef REORDER_ARGS
int i;
for (i = optind; i < argc; i++)
if ((
#ifdef SWITCHAR
argv[i][0] == switchar ||
#endif
argv[i][0] == OPTION_CHAR) && argv[i][1] != '\0')
break;
if (i < argc) {
c = argv[i][1];
#ifdef CASE_INSENSITIVE_OPTIONS
if (isupper(c))
c = tolower(c);
#endif
if (c != ':' && c != OPTION_CHAR && (cp = strchr(opts, c)) != NULL
&& cp[1] == ':' && argv[i][2] == 0 && i < argc - 1) {
int j;
char *temp1 = argv[i];
char *temp2 = argv[i+1];
for (j = i - 1; j >= optind; j--)
argv[j+2] = argv[j];
argv[optind] = temp1;
argv[optind+1] = temp2;
}
else {
int j;
char *temp = argv[i];
for (j = i - 1; j >= optind; j--)
argv[j+1] = argv[j];
argv[optind] = temp;
}
}
else
#endif
return EOF;
}
if ((argv[optind][0] == OPTION_CHAR && argv[optind][1] == OPTION_CHAR
&& argv[optind][2] == '\0')
#ifdef SWITCHAR
|| (argv[optind][0] == switchar && argv[optind][1] == switchar
&& argv[optind][2] == '\0')
#endif
) {
optind++;
return(EOF);
}
}
optopt = c = argv[optind][sp];
#ifdef CASE_INSENSITIVE_OPTIONS
if (
#ifdef USE_ISASCII
isascii(c) &&
#endif /* USE_ISASCII */
isupper((unsigned char)c))
optopt = c = tolower((unsigned char)c);
#endif /* CASE_INSENSITIVE_OPTIONS */
if (c == ':' || (cp = strchr(opts, c)) == NULL) {
if (argv[optind][++sp] == '\0') {
optind++;
sp = 1;
}
message = ": illegal option -- ";
goto bad;
}
if (*++cp == ':') {
if (argv[optind][sp+1] != '\0')
optarg = &argv[optind++][sp+1];
else if (++optind >= argc) {
sp = 1;
message = ": option requires an argument -- ";
goto bad;
}
else
optarg = argv[optind++];
sp = 1;
}
else {
if (argv[optind][++sp] == '\0') {
sp = 1;
optind++;
}
optarg = NULL;
}
return c;
bad:
if (opterr) {
fputs(argv[0], stderr);
fputs(message, stderr);
fputc(optopt, stderr);
fputc('\n', stderr);
}
return '?';
}
#endif /* not HAVE_GETOPT */
/*
Local Variables:
c-indent-level: 4
c-continued-statement-offset: 4
c-brace-offset: 4
c-argdecl-indent: 4
c-label-offset: -4
tab-width: 4
End:
*/

View File

@ -0,0 +1,11 @@
/* Declare getopt() and associated variables. */
/* Don't use prototypes in case some system header file has a
conflicting definition. Systems differ on how they declare the second
parameter. */
extern int getopt();
extern char *optarg;
extern int optind;
extern int opterr;

View File

@ -0,0 +1,22 @@
/* KEYWORD.H: Definitions for markup declaration keyword processing.
*/
/* Default value types for attribute definition list declaration.
*/
#define DNULL 1 /* Default value: implied attribute. */
#define DREQ 2 /* Default value: required attribute. */
#define DCURR 3 /* Default value: current attribute. */
#define DCONR 4 /* Default value: content reference attribute. */
#define DFIXED 5 /* Default value: fixed attribute. */
/* External identifier types for entity and notation declarations.
*/
#define EDSYSTEM 1 /* SYSTEM (but not PUBLIC) identifier specified.*/
#define EDPUBLIC 2 /* PUBLIC (but not SYSTEM) identifier specified.*/
#define EDBOTH 3 /* PUBLIC and also SYSTEM identifiers specified.*/
/* Marked section keywords.
*/
#define MSTEMP 1
#define MSRCDATA 2
#define MSCDATA 3
#define MSIGNORE 4

View File

@ -0,0 +1,51 @@
/* SGML Character Use: ISO Latin 1.
*/
#define EOFCHAR '\032' /* FUNCTION: EE (entity end: files). */
#define EOBCHAR '\034' /* NONCHAR: EOB (file entity: end of buffer. */
#define RSCHAR '\012' /* FUNCTION: RS (record start). */
#define RECHAR '\015' /* FUNCTION: RE (record end). */
#define TABCHAR '\011' /* FUNCTION: TAB (horizontal tab). */
#define SPCCHAR '\040' /* FUNCTION: SPACE (horizontal space). */
#define GENRECHAR '\010' /* NONCHAR: Generated RE. */
#define DELCDATA '\035' /* NONCHAR: Delimiter for CDATA entity in
attribute value. */
#define DELSDATA '\036' /* NONCHAR: Delimiter for SDATA entity in
attribute value. */
#define DELNONCH '\037' /* NONCHAR: non-SGML character prefix. */
/* These two macros are used to handle non-SGML characters. A non-SGML
by character is represented by a DELNONCH character followed by
SHIFTNON(original_character). SHIFTNON must transform any character
in the set 0, EOFCHAR, EOBCHAR, GENRECHAR, DELCDATA, DELSDATA,
DELNONCH into a character that is not one of the set 0, EOFCHAR,
EOBCHAR. Furthermore UNSHIFTNON(SHIFTNON(c)) must be equal to c for
every character c in the former set. */
/* This is a simple definition that works for ASCII-like character sets. */
#define SHIFTNON(ch) ((UNCH)(ch) | 0100)
#define UNSHIFTNON(ch) ((UNCH)(ch) & ~0100)
/* A canonical NONSGML character. The character number that is shunned
in the reference concrete syntax and is not the number of a
significant (in the reference concrete syntax) character nor one of
the above characters nor 0. */
#define CANON_NONSGML 255
/* A canonical DATACHAR character. The character number that is not
shunned in the reference concrete syntax and is not the number of a
significant (in the reference concrete syntax) SGML character nor one
of the above characters. */
#define CANON_DATACHAR 254
/* Components for a formal public identifier for the whole of the
system character set. Protect with ifndef so that it can be overriden
in config.h. */
#ifndef SYSTEM_CHARSET_DESIGNATING_SEQUENCE
#define SYSTEM_CHARSET_DESIGNATING_SEQUENCE "ESC 2/13 4/1"
#endif
#ifndef SYSTEM_CHARSET_OWNER
#define SYSTEM_CHARSET_OWNER "ISO Registration Number 100"
#endif
#ifndef SYSTEM_CHARSET_DESCRIPTION
#define SYSTEM_CHARSET_DESCRIPTION "ECMA-94 Right Part of Latin Alphabet Nr. 1"
#endif

View File

@ -0,0 +1,11 @@
/* Definitions of lexical codes needed by both lextaba.c and lexrf.c. */
#define FCE 27 /* FRE Free character in use as an entity reference */
#define FRE 0 /* FREECHAR that is not in a CON delimiter-in-context. */
#define LITC 21 /* LIT LITA PIC or EE in use as a literal terminator */
#define MSC3 15 /* ] Also MSC[2]. */
#define NET 17 /* / When enabled. */
#define ETI 16 /* / Actually ETAGO[2] */
#define SPCR 19 /* Space in use as SR8. */
#define TGO2 25 /* < TAGO; also MDO[1], PIO[1] */
#define CDE 11 /* NONSGML delcdata CDATA/SDATA delimiter */

124
usr.bin/sgmls/sgmls/lexrf.c Normal file
View File

@ -0,0 +1,124 @@
/* LEXRF: Lexical tables for reference concrete syntax.
*/
#include "config.h"
#include "entity.h" /* Templates for entity control blocks. */
#include "synxtrn.h" /* Declarations for concrete syntax constants. */
#include "action.h" /* Action names for all parsing. */
#include "lexcode.h"
static UNCH SRTAB[] = { TABCHAR, '\0' };
static UNCH SRRE[] = { RECHAR, '\0' };
static UNCH SRRS[] = { RSCHAR, '\0' };
static UNCH SRRSB[] = { RSCHAR, 'B', '\0' };
static UNCH SRRSRE[] = { RSCHAR, RECHAR, '\0' };
static UNCH SRRSBRE[] = { RSCHAR, 'B', RECHAR, '\0' };
static UNCH SRBRE[] = { 'B', RECHAR, '\0' };
struct lexical lex = { /* Delimiter set constants for parser use. */
{ /* Markup strings for text processor use. */
(UNCH *)"\4&#", /* LEXCON markup string: CRO */
(UNCH *)"[", /* LEXCON markup string: DSO */
(UNCH *)"\3&", /* LEXCON markup string: ERO */
(UNCH *)"\4</", /* LEXMARK markup string: end-tag */
(UNCH *)"\3\"", /* LEXMARK markup string: LIT */
(UNCH *)"\3'", /* LEXMARK markup string: LITA */
(UNCH *)"\3>", /* LEXCON markup string: MDC */
(UNCH *)"\4<!", /* LEXCON markup string: MDO */
(UNCH *)"\5]]>", /* LEXCON markup string: mse */
(UNCH *)"\5<![", /* LEXCON markup string: mss */
(UNCH *)"\13<![CDATA[", /* LEXCON markup string: mss CDATA */
(UNCH *)"\14<![RCDATA[", /* LEXCON markup string: mss RCDATA */
(UNCH *)"\3>", /* LEXCON markup string: PIC */
(UNCH *)"\4<?", /* LEXCON markup string: PIO */
(UNCH *)"\3;", /* LEXGRP markup string: ref close. */
(UNCH *)"\3<", /* LEXMARK markup string: start-tag */
(UNCH *)"\3>", /* LEXMARK markup string: TAGC */
(UNCH *)"\3=", /* LEXMARK markup string: VI */
3, /* LEXMARK: length of null end-tag. */
2 /* LEXMARK: length of null start-tag. */
},
{ /* Short reference delimiters. */
{ /* Short reference delimiter table. */
{(UNCH *)"", SRCT}, /* Dummy entry to store SR count. */
{SRTAB, 1}, /* TAB */
{SRRE, 2}, /* RE */
{SRRS, 3}, /* RS */
{SRRSB, 4}, /* Leading blanks */
{SRRSRE, 5}, /* Null record */
{SRRSBRE, 6}, /* Blank record */
{SRBRE, 7}, /* Trailing blanks */
{(UNCH *)" ", 8}, /* Space */
{(UNCH *)"BB", 9}, /* Two or more blanks */
{(UNCH *)"\"", 10}, /* Quotation mark (first data character) */
{(UNCH *)"#", 11}, /* Number sign */
{(UNCH *)"%", 12}, /* FCE CHARACTERS start here */
{(UNCH *)"'", 13},
{(UNCH *)"(", 14},
{(UNCH *)")", 15},
{(UNCH *)"*", 16},
{(UNCH *)"+", 17},
{(UNCH *)",", 18},
{(UNCH *)"-", 19}, /* Hyphen */
{(UNCH *)"--", 20}, /* Two hyphens */
{(UNCH *)":", 21},
{(UNCH *)";", 22},
{(UNCH *)"=", 23},
{(UNCH *)"@", 24},
{(UNCH *)"[", 25},
{(UNCH *)"]", 26},
{(UNCH *)"^", 27},
{(UNCH *)"_", 28}, /* Low line */
{(UNCH *)"{", 29},
{(UNCH *)"|", 30},
{(UNCH *)"}", 31},
{(UNCH *)"~", 32},
{0, 0}
},
{ /* Printable form of unprintable SR delims.*/
"", /* Dummy entry to balance s.dtb. */
"&#TAB;", /* TAB */
"&#RE;", /* RE */
"&#RS;", /* RS */
"&#RS;B", /* Leading blanks */
"&#RS;&#RE;", /* Null record */
"&#RS;B&#RE;", /* Blank record */
"B&#RE;", /* Trailing blanks */
"&#SPACE;" /* Space */
},
12, /* LEXCNM: Index of first FCE in srdeltab. */
20, /*LEXCNM:Index of "two hyphens" in srdeltab*/
10, /* LEXCNM: Index of first SR with data char. */
19, /* LEXCNM: Index of hyphen in srdeltab. */
SRNPRT+1, /* LEXCNM: Index of 1st printable SR. */
8, /* LEXCNM: Index of space in srdeltab. */
25, /* LEXCNM: Index of left bracket in srdeltab. */
26, /* LEXCNM: Index of right bracket in srdeltab. */
}, /* End of short reference delimiters. */
{ /* General delimiter characters. */
GENRECHAR, /*LEXCNM:(BS)Generated RE; can't be markup.*/
'"', /* LEXMARK: Char used as LIT delimiter.*/
'\'', /* LEXMARK: Char used as LITA delimiter.*/
'>', /* LEXLMS: Char used as MDC delimiter.*/
']', /* LEXLMS: Char used as MSC when enabled.*/
'/', /* LEXCON: Char used as NET when enabled.*/
'%', /* LEXMARK: Char used as PERO delimiter. */
'>', /* LEXCON: Char used as PIC delimiter.*/
'<' /* LEXCON: Char used as TAGO when enabled.*/
},
{ /* Lexical table code assignments. */
FCE, /* LEXCNM: FRE char as entity reference.*/
FRE, /* LEXLMS: Free character not an entity ref.*/
LITC, /* LEXLMS: Literal close delimiter enabled. */
MSC3, /* LEXLMS: Marked section close delim enabled. */
NET, /* LEXCON: Null end-tag delimiter enabled. */
ETI, /* LEXCON: NET disabled; still used as ETI. */
SPCR, /* LEXCNM: Space in use as SHORTREF delim. */
TGO2, /* LEXCON: Tag open delimiter enabled. */
CDE /* LEXLMS: CDATA/SDATA delimiters. */
}
};
UNCH *lextabs[] = {
lexcnm, lexcon, lexgrp, lexlms, lexmark, lexsd, lextoke, 0
};

View File

@ -0,0 +1,559 @@
/* lextaba.c: lexical tables for ASCII. */
/* These tables are munged by setnonsgml(). */
#include "config.h"
#include "entity.h"
#include "lexcode.h"
#include "sgmldecl.h"
/* LEXCNM: Lexical table for mixed content (PCBCONM) parse.
*/
/* Symbols for SGML character set divisions and function characters. */
#define NU 1 /* NUMERAL Numerals */
#define NMC 2 /* LC/UCNMCHAR . - Period and hyphen */
#define NMS 3 /* LC/UCNMSTRT Lower and uppercase letters */
#define SPC 4 /* SPACE 32 Space */
#define NON 5 /* NONSGML 0-31 127 255 Unused, except for: */
#define EE 6 /* NONSGML 00 26 Entity end (end of file) */
#define EOB 7 /* NONSGML 28 End disk buffer */
#define RS 8 /* Function 10 Line feed */
#define RE 9 /* Function 13 Carrier return */
#define SEP 10 /* SEPCHAR 09 TAB: horizontal tab */
#define NSC 12 /* NONSGML delnonch Non-SGML character prefix */
/* Symbols for SGML delimiter roles in CON and CXT.
ETI and NET must be the same in LEXCNM and LEXCON.
FRE characters are changed to FCE if an FCE entity is declared.
They are changed back to FRE when the entity is canceled.
*/
#define ERO 13 /* & Also CRO[1] */
#define NMRE 14 /* 08 Generated non-markup RE */
#define COM 15 /* - For MDO context; also SR19 and SR20. */
#undef LIT1
#define LIT1 18 /* " SR10 */
#define MDO 20 /* ! Actually MDO[2] */
#define MSC1 21 /* ] Both MSC[1] and MSC[2]; also SR26. */
#define MSO 22 /* [ For MDO context; also SR25. */
#define PIO 23 /* ? Actually PIO[2] */
#define RNI 24 /* # For CRO[2]; also SR11. */
#define TGC1 25 /* > For TAGO and MSC context; also MDC, PIC */
#define TGO1 26 /* < TAGO; also MDO[1], PIO[1] */
UNCH lexcnm[256] = { /*
000 001       bs tab lf home ff cr so si */
EE, NON, NON, NON, NON, NON, NON, NON, NMRE,SEP, RS, NON, NON, RE, NON, NON, /*
          eof esc rt left up down */
NON, NON, NON, NON, NON, NON, NON, NON, NON, NON, EE, NON, EOB, NON, NON, NSC, /*
032 ! " # $ % & ' ( ) * + , - . / */
SPC, MDO, LIT1,RNI, FRE, FRE ,ERO, FRE, FRE, FRE, FRE, FRE, FRE, COM, NMC, ETI, /*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NU , NU , NU , NU , NU , NU , NU , NU , NU , NU , FRE, FRE, TGO1,FRE, TGC1,PIO, /*
@ A B C D E F G H I J K L M N O */
FRE, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, MSO, FRE, MSC1,FRE, FRE, /*
` a b c d e f g h i j k l m n o */
FRE, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, FRE, FRE, FRE, FRE, NON,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, NON
};
/* free nu nmc nms spc non ee eob rs re sep cde nsc ero
nmre com eti lit spcr mdo msc mso net pio rni tagc tago fce */
#undef ERO
#undef NMRE
#undef COM
#undef LIT1
/* def SPCR*/
#undef MDO
#undef MSC1
#undef MSO
#undef PIO
#undef RNI
#undef TGC1
/* def TGO1*/
/* def FCE*/
/* LEXCON: Lexical table for RCDATA and CDATA content (PCBCON?),
prolog (PCBPRO), and nested declaration set (PCBMDS) parses.
Note: NMC is same as FRE; kept for consistency with LEXCNM and LEXLMS.
*/
/* Symbols for SGML character set divisions and function characters. */
/* Same as for LEXCNM. */
/* Symbols for SGML delimiter roles in CON, CXT, and DS.
ETI and NET must be the same in LEXCNM and LEXCON.
FRE characters are changed to FCE if an FCE entity is declared.
They are changed back to FRE when the entity is canceled.
*/
#define ERO 13 /* & Also CRO[1] */
#define NMRE 14 /* 08 Generated non-markup RE */
#define COM 15 /* - For MDO context. */
/*#define ETI 16 / Actually ETAGO[2] */
/*#define NET 17 / When enabled. */
#define MDO 18 /* ! Actually MDO[2] */
#define MSC2 19 /* ] Both MSC[1] and MSC[2]. */
#define MSO 20 /* [ For MDO context. */
#define PERO 21 /* % For prolog */
#define PIO 22 /* ? Actually PIO[2] */
#define RNI 23 /* # For CRO[2]. */
#define TGC2 24 /* > For TAGO and MSC context; also MDC, PIC */
UNCH lexcon[256] = { /*
000 001       bs tab lf home ff cr so si */
EE, NON, NON, NON, NON, NON, NON, NON, NMRE,SEP, RS, NON, NON, RE, NON, NON, /*
          eof esc rt left up down */
NON, NON, NON, NON, NON, NON, NON, NON, NON, NON, EE, NON, EOB, NON, NON, NSC, /*
032 ! " # $ % & ' ( ) * + , - . / */
SPC, MDO, FRE, RNI, FRE, PERO,ERO, FRE, FRE, FRE, FRE, FRE, FRE, COM, NMC, ETI, /*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NU , NU , NU , NU , NU , NU , NU , NU , NU , NU , FRE, FRE, TGO2,FRE, TGC2,PIO, /*
@ A B C D E F G H I J K L M N O */
FRE, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, MSO, FRE, MSC2,FRE, FRE, /*
` a b c d e f g h i j k l m n o */
FRE, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, FRE, FRE, FRE, FRE, NON,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, NON
};
/* free nu nmc nms spc non ee eob rs re sep cde nsc ero
nmre com eti net mdo msc mso pero pio rni tagc tago */
#undef FRE
#undef NU
#undef NMC
#undef NMS
#undef SPC
#undef NON
#undef EE
#undef EOB
#undef RS
#undef RE
#undef SEP
#undef NSC
#undef ERO
#undef NMRE
#undef COM
/* def ETI*/
/* def NET*/
#undef MDO
#undef MSC2
#undef MSO
#undef PERO
#undef PIO
#undef RNI
#undef TGC2
/* LEXGRP: Lexical table for group parses, including PCBREF.
*/
/* Symbols for SGML character set divisions. */
#define BIT 0 /* Bit combinations (not NONCHAR) not allowed in a group. */
#define NMC 1 /* NAMECHAR . - Period, underscore, and numerals */
#define NMS 2 /* NAMESTRT Lower and uppercase letters */
#define RE 3 /* Function 13 Carrier return */
#define SPC 4 /* SPACE 32 09 Space; includes TAB */
#define NON 5 /* NONCHAR 0-31 127 255 Unused, except for: */
#define EE 6 /* Function 26 00 EE: entity end (end of file) */
#define EOB 7 /* NONCHAR 28 End disk buffer. */
#define RS 8 /* Function 10 RS: record start (line feed) */
/* Symbols for SGML delimiter roles in GRP. */
#define AND1 9 /* & */
#define GRPC 10 /* ) */
#define GRPO 11 /* ( */
#undef LIT2
#define LIT2 12 /* " For datatags. */
#define LITA 13 /* ' For datatags. */
#define DTGC 14 /* ] For datatags. */
#define DTGO 15 /* [ For datatags. */
#define OPT1 16 /* ? */
#define OR1 17 /* | */
#define PERO 18 /* % */
#define PLUS 19 /* + */
#define REP1 20 /* * */
#define RNI 21 /* # For #CHARS */
#define SEQ1 22 /* , */
#define REFC 23 /* ; For references */
UNCH lexgrp[256] = { /*
000 001       bs tab lf home ff cr so si */
EE , NON, NON, NON, NON, NON, NON, NON, NON, SPC, RS, NON, NON, RE, NON, NON, /*
          eof esc rt left up down */
NON, NON, NON, NON, NON, NON, NON, NON, NON, NON, EE , NON, EOB, NON, NON, NON, /*
032 ! " # $ % & ' ( ) * + , - . / */
SPC, BIT, LIT2,RNI, BIT, PERO,AND1,LITA,GRPO,GRPC,REP1,PLUS,SEQ1,NMC, NMC, BIT, /*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NMC, NMC, NMC, NMC, NMC, NMC, NMC, NMC, NMC, NMC, BIT, REFC,BIT, BIT, BIT, OPT1,/*
@ A B C D E F G H I J K L M N O */
BIT, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, DTGO,BIT, DTGC,BIT, BIT, /*
` a b c d e f g h i j k l m n o */
BIT, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, BIT, OR1, BIT, BIT, NON,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, NON
};
/* bit nmc nms re spc non ee eob rs and grpc grpo lit lita
dtgc dtgo opt or pero plus rep rni seq refc */
#undef BIT
#undef NMC
#undef NMS
#undef RE
#undef SPC
#undef NON
#undef EE
#undef EOB
#undef RS
#undef AND1
#undef GRPC
#undef GRPO
#undef LIT2
#undef LITA
#undef DTGC
#undef DTGO
#undef OPT1
#undef OR1
#undef PERO
#undef PLUS
#undef REP1
#undef RNI
#undef SEQ1
#undef REFC
/* LEXLMS: Lexical table for literal parses and marked sections.
*/
/* Symbols for SGML character set divisions and function characters.
*/
#define FRE 0 /* Free char: not in a delimiter or minimum literal. */
#define NU 1 /* Numeral Numerals */
#undef MIN
#define MIN 2 /* Minimum literal '()+,-./:?= */
#define NMS 3 /* LC/UCNMSTRT Lower and uppercase letters */
#define SPC 4 /* SPACE 32 Space */
#define NON 5 /* NONSGML 0-31 127 255 Unused, except for: */
#define EE 6 /* NONSGML 00 26 Entity end (end of file) */
#define EOB 7 /* NONSGML 28 End disk buffer */
#define RS 8 /* Function 10 Line feed */
#define RE 9 /* Function 13 Carrier return */
#define SEP 10 /* SEPCHAR 09 TAB: horizontal tab */
/*#define CDE 11 NONSGML delcdata CDATA/SDATA delimiter */
#define NSC 12 /* NONSGML delnonch Non-SGML character prefix */
/* Symbols for SGML delimiter roles in LIT, PI, and marked sections.
Either LIT, LITA, PIC, or EE, is changed to LITC when a literal is begun.
It is changed back when the LITC occurs (i.e., when the literal ends).
*/
#define ERO 13 /* & */
#define MDO 14 /* ! Actually MDO[2] */
#define MSO 16 /* [ For MDO context. */
#define PERO 17 /* % For prolog. */
#define RNI 18 /* # For CRO[2] */
#define TGC3 19 /* > Also MDC for MSC context. */
#define TGO3 20 /* < TAGO; also MDO[1] */
/* Room has been left in the parse tables in case re-parsing of text
is eventually supported (i.e., saved parsed text is used by the
application to create a new SGML document, but CDATA and SDATA
entities in literals, and non-SGML characters, are left in their
parsed state to avoid the overhead of reconstituting the original
markup). In such a case, the two non-SGML characters DELCDATA and
DELSDATA are changed to CDE.
NOTE: The idea is a bad one, because the generated document would
be non-conforming, as it would contain non-SGML characters.
*/
UNCH lexlms[256] = { /*
000 001       bs tab lf home ff cr so si */
EE, NON, NON, NON, NON, NON, NON, NON, NON ,SEP, RS, NON, NON, RE, NON, NON, /*
          eof esc rt left up down */
NON, NON, NON, NON, NON, NON, NON, NON, NON, NON, EE, NON, EOB, NON, NON, NSC, /*
032 ! " # $ % & ' ( ) * + , - . / */
SPC, MDO, FRE, RNI, FRE, PERO,ERO, MIN, MIN, MIN, FRE, MIN, MIN, MIN, MIN, MIN, /*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NU , NU , NU , NU , NU , NU , NU , NU , NU , NU , MIN, FRE, TGO3,MIN, TGC3,MIN, /*
@ A B C D E F G H I J K L M N O */
FRE, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, MSO, FRE, MSC3,FRE, FRE, /*
` a b c d e f g h i j k l m n o */
FRE, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, FRE, FRE, FRE, FRE, NON,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE,
FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, FRE, NON
};
/* free nu min nms spc non ee eob rs re sep cde nsc ero
mdo msc mso pero rni tago tagc litc */
/* def FRE*/
#undef NU
#undef MIN
#undef NMS
#undef SPC
#undef NON
#undef EE
#undef EOB
#undef RS
#undef RE
#undef SEP
/* def CDE*/
/* def NSC*/
#undef ERO
#undef MDO
/* def MSC3*/
#undef MSO
#undef PERO
#undef RNI
#undef TGC3
#undef TGO3
/* def LITC*/
/* LEXMARK: Lexical scan table for markup: PCBMD? and PCB?TAG.
*/
/* Symbols for SGML character set divisions. */
#define BIT 0 /* Bit combinations not allowed; includes ESC SO SI */
#define NMC 1 /* NAMECHAR . _ Period and underscore */
#define NU 2 /* NUMERAL Numerals */
#define NMS 3 /* NAMESTRT Lower and uppercase letters */
#define SPC 4 /* SPACE 32 13 09 Space; includes RE TAB */
#define NON 5 /* NONCHAR 0-31 127 255 Unused, except for: */
#define EE 6 /* Function 26 00 EE: entity end (end of file) */
#define EOB 7 /* NONCHAR 28 End disk buffer. */
#define RS 8 /* Function 10 RS: record start (line feed) */
/* Symbols for SGML delimiter roles in MD and TAG. */
#define COM1 9 /* - Actually COM[1]; also COM[2], MINUS. */
#define ETIB 10 /* / ETI; actually ETAGO[2]. */
#define GRPO 11 /* ( */
#define LIT3 12 /* " */
#define LITA 13 /* ' */
#define DSO 14 /* [ */
#define DSC1 15 /* ] For data attribute specifications */
#define PERO 16 /* % */
#define PLUS 17 /* + */
#define REFC 18 /* ; For references */
#define RNI 19 /* # Also CRO[2] */
#define TGC4 20 /* > Also MDC, PIC */
#define TGO4 21 /* < TAGO; also MDO[1] */
#define VI 22 /* = */
UNCH lexmark[256] = { /*
000 001       bs tab lf home ff cr so si */
EE , NON, NON, NON, NON, NON, NON, NON, NON, SPC, RS, NON, NON, SPC, NON, NON, /*
          eof esc rt left up down */
NON, NON, NON, NON, NON, NON, NON, NON, NON, NON, EE , NON, EOB, NON, NON, NON, /*
032 ! " # $ % & ' ( ) * + , - . / */
SPC, BIT, LIT3,RNI, BIT, PERO,BIT, LITA,GRPO,BIT, BIT, PLUS,BIT, COM1,NMC ,ETIB,/*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, BIT, REFC,TGO4,VI, TGC4,BIT, /*
@ A B C D E F G H I J K L M N O */
BIT, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, DSO, BIT, DSC1, BIT, BIT, /*
` a b c d e f g h i j k l m n o */
BIT, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, BIT, BIT, BIT, BIT, NON,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT,
BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, BIT, NON
};
/* bit nmc nu nms spc non ee eob rs com eti grpo lit lita
dso pero plus refc rni tagc tago vi */
#undef BIT
#undef NMC
#undef NU
#undef NMS
#undef SPC
#undef NON
#undef EE
#undef EOB
#undef RS
#undef COM1
#undef ETIB
#undef GRPO
#undef LIT3
#undef LITA
#undef DSO
#undef DSC
#undef PERO
#undef PLUS
#undef REFC
#undef RNI
#undef TGC4
#undef TGO4
#undef VI
/* LEXSD: Lexical scan table for SGML declaration.
*/
/* Symbols for SGML character set divisions. */
#define SIG 0 /* Significant SGML characters. */
#define DAT 1 /* DATACHAR Not significant, and not non-sgml. */
#define NU 2 /* NUMERAL Numerals */
#define NMS 3 /* NAMESTRT Lower and uppercase letters */
#define SPC 4 /* SPACE 32 13 09 Space; includes RE TAB */
#define NON 5 /* NONCHAR NONSGML */
#define EE 6 /* Function 26 00 EE: entity end (end of file) */
#define EOB 7 /* NONCHAR 28 End disk buffer. */
#define RS 8 /* Function 10 RS: record start (line feed) */
/* Symbols for SGML delimiter roles in SGML declaration. */
#define COM1 9 /* - Actually COM[1]; also COM[2]. */
#define LIT3 10 /* " */
#define LITA 11 /* ' */
#define TGC4 12 /* > Also MDC, PIC */
UNCH lexsd[256] = { /*
000 001       bs tab lf home ff cr so si */
EE , NON, NON, NON, NON, NON, NON, NON, NON, SPC, RS, NON, NON, SPC, NON, NON, /*
          eof esc rt left up down */
NON, NON, NON, NON, NON, NON, NON, NON, NON, NON, EE , NON, EOB, NON, NON, NON, /*
032 ! " # $ % & ' ( ) * + , - . / */
SPC, SIG, LIT3,SIG, DAT, SIG ,SIG, LITA,SIG, SIG, SIG, SIG, SIG, COM1,SIG ,SIG,/*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, SIG, SIG, SIG, SIG, TGC4,SIG, /*
@ A B C D E F G H I J K L M N O */
SIG, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, SIG, DAT, SIG, SIG, SIG, /*
` a b c d e f g h i j k l m n o */
DAT, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, SIG, SIG, SIG, SIG, NON,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT,
DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, DAT, NON
};
#undef SIG
#undef DAT
#undef NON
#undef NU
#undef NMS
#undef SPC
#undef EE
#undef EOB
#undef RS
#undef COM1
#undef LIT3
#undef LITA
#undef TGC4
/* LEXTRAN: Translation table for SGML names.
*/
UNCH lextran[256] = { /*
000 001       bs tab lf home ff cr so si */
0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , /*
          eof esc rt left up down */
16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , /*
space! " # $ % & ' ( ) * + , - . / */
32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , /*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , /*
@ A B C D E F G H I J K L M N O */
64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , /*
` a b c d e f g h i j k l m n o */
96 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , /*
p q r s t u v w x y z { | } ~ 127 */
80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 123, 124, 125, 126, 127,
128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,
160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175,
176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207,
208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223,
224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239,
240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255
};
/* LEXTOKE: Lexical class table for tokenization scan.
*/
#include "lextoke.h" /* Symbols for tokenization lexical classes. */
UNCH lextoke[256] = { /*
000 001       bs tab lf home ff cr   */
INV, INV, INV, INV, INV, INV, INV, INV, INV, SEP, REC, INV, INV, REC, INV, INV, /*
          eof esc rt left up down */
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, EOB, INV, INV, INV, /*
space! " # $ % & ' ( ) * + , - . / */
SP , INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, NMC, NMC, INV, /*
0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
NU , NU , NU , NU , NU , NU , NU , NU , NU , NU , INV, INV, INV, INV, INV, INV, /*
@ A B C D E F G H I J K L M N O */
INV, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
P Q R S T U V W X Y Z [ \ ] ^ _ */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, INV, INV, INV, INV, INV, /*
` a b c d e f g h i j k l m n o */
INV, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, /*
p q r s t u v w x y z { | } ~ 127 */
NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, NMS, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV,
INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV, INV
};
/* This table maps ASCII to the system character set. */
int asciicharset[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
};

View File

@ -0,0 +1,184 @@
/* This file was automatically generated by genlex. Do not edit. */
#include "config.h"
#include "entity.h"
#include "sgmldecl.h"
UNCH lexcnm[] = {
6, 5, 5, 5, 5, 10, 5, 5, 5, 5, 5, 5, 5, 9, 5, 5,
5, 5, 5, 5, 5, 5, 14, 5, 5, 5, 5, 5, 7, 5, 5, 12,
0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 26, 0, 0, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 0, 0,
15, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 23,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 24, 0, 0, 0, 18,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 22, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 5,
};
UNCH lexcon[] = {
6, 5, 5, 5, 5, 10, 5, 5, 5, 5, 5, 5, 5, 9, 5, 5,
5, 5, 5, 5, 5, 5, 14, 5, 5, 5, 5, 5, 7, 5, 5, 12,
0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 25, 0, 0, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 18, 0, 0, 0, 0, 0,
15, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 24, 22,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 23, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 20, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 5,
};
UNCH lexgrp[] = {
6, 5, 5, 5, 5, 4, 5, 5, 5, 5, 5, 5, 5, 3, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 5, 5, 5,
0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 11, 19, 17,
9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 10, 23, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 22, 18, 0, 0, 16,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 13, 0, 12,
0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0,
0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0,
0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 15, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 0, 0,
0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0,
0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0,
0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 5,
};
UNCH lexlms[] = {
6, 5, 5, 5, 5, 10, 5, 5, 5, 5, 5, 5, 5, 9, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 5, 5, 12,
0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 20, 2, 2, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 0, 0, 2, 0, 0,
2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 17, 0, 19, 2,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 18, 0, 2, 2, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 16, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 5,
};
UNCH lexmark[] = {
6, 5, 5, 5, 5, 4, 5, 5, 5, 5, 5, 5, 5, 4, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 5, 5, 5,
0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 21, 11, 17, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 18, 0,
9, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 16, 0, 20, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19, 0, 13, 22, 12,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 14, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 5,
};
UNCH lexsd[] = {
6, 5, 5, 5, 5, 4, 5, 5, 5, 5, 5, 5, 5, 4, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 7, 5, 5, 5,
1, 1, 1, 1, 1, 8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6,
4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1,
9, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 12, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 11, 0, 10,
1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
1, 0, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 0, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 5,
};
UNCH lextoke[] = {
0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0,
0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0,
0, 0, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0,
0, 5, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0,
0, 0, 5, 5, 5, 5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 0, 0, 0, 0, 0, 0,
};
UNCH lextran[] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95,
96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127,
128, 193, 194, 195, 196, 197, 198, 199, 200, 201, 138, 139, 140, 141, 142, 143,
144, 209, 210, 211, 212, 213, 214, 215, 216, 217, 154, 155, 156, 157, 158, 159,
160, 161, 226, 227, 228, 229, 230, 231, 232, 233, 170, 171, 172, 173, 174, 175,
176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207,
208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223,
224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239,
240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255,
};
int asciicharset[] = {
0, 1, 2, 3, 55, 45, 46, 47, 22, 5, 37, 11, 12, 13, 14, 15,
16, 17, 18, 19, 60, 61, 50, 38, 24, 25, 63, 39, 28, 29, 30, 31,
64, 90, 127, 123, 91, 108, 80, 125, 77, 93, 92, 78, 107, 96, 75, 97,
240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 122, 94, 76, 126, 110, 111,
124, 193, 194, 195, 196, 197, 198, 199, 200, 201, 209, 210, 211, 212, 213, 214,
215, 216, 217, 226, 227, 228, 229, 230, 231, 232, 233, 173, 224, 189, 176, 109,
121, 129, 130, 131, 132, 133, 134, 135, 136, 137, 145, 146, 147, 148, 149, 150,
151, 152, 153, 162, 163, 164, 165, 166, 167, 168, 169, 192, 79, 208, 161, 7,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED, UNUSED,
};

View File

@ -0,0 +1,10 @@
/* LEXTOKE.H: Symbols for tokenization lexical classes.
*/
#define INV 0 /* Invalid Chars Not allowed in an SGML name. */
#define REC 1 /* Record Boundary RS and RE. */
#define SEP 2 /* Separator TAB. */
#define SP 3 /* SPACE */
#define NMC 4 /* NAMECHAR . _ Period, underscore (plus NMS, NUM). */
#define NMS 5 /* NAMESTRT Lower and uppercase letters */
#define NU 6 /* NUMERAL Numerals */
#define EOB 7 /* NONCHAR 28 End disk buffer. */

View File

@ -0,0 +1,653 @@
/* lineout.c -
Implements line-oriented output format.
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#include "std.h"
#include "entity.h" /* Templates for entity control blocks. */
#include "adl.h" /* Definitions for attribute list processing. */
#include "sgmlmain.h" /* Main interface to SGML services. */
#include "lineout.h"
#include "appl.h"
static VOID flush_data P((void));
static VOID define_external_entity P((PNE));
static VOID define_entity P((UNCH *));
static VOID handle_attributes P((UNCH *, struct ad *));
static VOID handle_token_list P((UNCH *, struct ad *, int));
static VOID handle_single_token P((UNCH *, struct ad *, int));
static VOID output_notation P((UNCH *, UNCH *, UNCH *));
static VOID output_internal_entity P((UNCH *, int, UNCH *));
static VOID output_external_entity P((UNCH *, int, UNIV, UNCH *, UNCH *,
UNCH *));
static VOID output_subdoc P((UNCH *, UNIV, UNCH *, UNCH *));
#ifdef SUPPORT_SUBDOC
static VOID process_subdoc P((UNCH *, UNIV));
#endif /* SUPPORT_SUBDOC */
static VOID output_record_end P((void));
static VOID output_pcdata P((UNS, UNCH *));
static VOID output_cdata P((UNS, UNCH *));
static VOID output_sdata P((UNS, UNCH *));
static VOID output_entity_reference P((UNCH *));
static VOID output_start_tag P((UNCH *));
static VOID output_end_tag P((UNCH *));
static VOID output_processing_instruction P((UNS, UNCH *));
static VOID output_implied_attribute P((UNCH *, UNCH *));
static char *attribute_type_string P((int));
static VOID output_begin_attribute P((UNCH *, UNCH *, int));
static VOID output_attribute_token P((UNS, UNCH *));
static VOID output_end_attribute P((void));
static VOID print_data P((UNS, UNCH *, int));
static VOID print_string P((UNS, UNCH *, int));
static VOID print_id P((UNIV, UNCH *, UNCH *));
static VOID print_filename P((char *));
static VOID output_location P((void));
static VOID output_appinfo P((UNS, UNCH *));
static int have_data = 0;
static char *current_filename = 0;
static unsigned long current_lineno = 0;
VOID process_document(subdocsw)
int subdocsw;
{
enum sgmlevent rc;
struct rcbtag rcbtag;
struct rcbdata rcbdaf;
while ((rc = sgmlnext(&rcbdaf, &rcbtag)) != SGMLEOD) {
#ifdef SUPPORT_SUBDOC
if (rc == SGMLDAF && !CONTERSW(rcbdaf) && NDESW(rcbdaf)
&& NEXTYPE(NEPTR(rcbdaf)) == ESNSUB) {
if (!suppsw && !sgmlment(NEENAME(NEPTR(rcbdaf))))
define_external_entity(NEPTR(rcbdaf));
process_subdoc(NEENAME(NEPTR(rcbdaf)) + 1,
NEID(NEPTR(rcbdaf)));
continue;
}
#endif /* SUPPORT_SUBDOC */
if (!suppsw)
switch (rc) {
case SGMLDAF:
if (CONTERSW(rcbdaf))
break;
if (CDESW(rcbdaf))
output_cdata(CDATALEN(rcbdaf), CDATA(rcbdaf));
else if (SDESW(rcbdaf))
output_sdata(CDATALEN(rcbdaf), CDATA(rcbdaf));
else if (NDESW(rcbdaf)) {
assert(NEXTYPE(NEPTR(rcbdaf)) != ESNSUB);
if (!sgmlment(NEENAME(NEPTR(rcbdaf))))
define_external_entity(NEPTR(rcbdaf));
output_entity_reference(NEENAME(NEPTR(rcbdaf)) + 1);
}
else
output_pcdata(CDATALEN(rcbdaf), CDATA(rcbdaf));
break;
case SGMLSTG:
if (CONTERSW(rcbtag))
break;
if (ALPTR(rcbtag))
handle_attributes((UNCH *)NULL, ALPTR(rcbtag));
output_start_tag(CURGI(rcbtag));
break;
case SGMLETG:
if (CONTERSW(rcbtag))
break;
output_end_tag(CURGI(rcbtag));
break;
case SGMLPIS:
if (CONTERSW(rcbdaf))
break;
output_processing_instruction(PDATALEN(rcbdaf),
PDATA(rcbdaf));
break;
case SGMLREF:
if (CONTERSW(rcbdaf))
break;
output_record_end();
break;
case SGMLAPP:
if (CONTERSW(rcbdaf))
break;
if (!subdocsw)
output_appinfo(ADATALEN(rcbdaf), ADATA(rcbdaf));
break;
default:
abort();
}
}
}
/* Output an indication that the document was conforming. */
VOID output_conforming()
{
if (!suppsw)
printf("%c\n", CONFORMING_CODE);
}
static VOID define_external_entity(p)
PNE p;
{
if (NEXTYPE(p) == ESNSUB)
output_subdoc(NEENAME(p) + 1, NEID(p), NEPUBID(p), NESYSID(p));
else {
if (!NEDCNMARK(p))
output_notation(NEDCN(p) + 1, NEDCNPUBID(p), NEDCNSYSID(p));
output_external_entity(NEENAME(p) + 1, NEXTYPE(p), NEID(p),
NEPUBID(p), NESYSID(p), NEDCN(p) + 1);
if (NEAL(p))
handle_attributes(NEENAME(p) + 1, NEAL(p));
}
}
static VOID define_entity(ename)
UNCH *ename;
{
int rc;
PNE np;
UNCH *tp;
if (sgmlment(ename)) /* already defined it */
return;
rc = sgmlgent(ename, &np, &tp);
switch (rc) {
case 1:
define_external_entity(np);
break;
case 2:
case 3:
output_internal_entity(ename + 1, rc == 3, tp);
break;
}
}
/* ENT is the name of the entity with which these attributes are associated;
if it's NULL, they're associated with the next start tag. */
static VOID handle_attributes(ent, al)
UNCH *ent;
struct ad *al;
{
int aln;
for (aln = 1; aln <= ADN(al); aln++) {
if (GET(ADFLAGS(al, aln), AERROR))
;
else if (GET(ADFLAGS(al, aln), AINVALID))
;
else if (ADVAL(al, aln) == NULL)
output_implied_attribute(ent, ADNAME(al, aln));
else if (ADTYPE(al, aln) >= ATKNLIST)
handle_token_list(ent, al, aln);
else
handle_single_token(ent, al, aln);
if (BITON(ADFLAGS(al, aln), AGROUP))
aln += ADNUM(al, aln);
}
}
static VOID handle_token_list(ent, al, aln)
UNCH *ent;
struct ad *al;
int aln;
{
UNCH *ptr;
int i;
if (ADTYPE(al, aln) == AENTITYS) {
ptr = ADVAL(al, aln);
for (i = 0; i < ADNUM(al, aln); i++) {
/* Temporarily make token look like normal
name with length and EOS. */
UNCH c = ptr[*ptr + 1];
ptr[*ptr + 1] = '\0';
*ptr += 2;
define_entity(ptr);
*ptr -= 2;
ptr += *ptr + 1;
*ptr = c;
}
}
output_begin_attribute(ent, ADNAME(al, aln), ADTYPE(al, aln));
ptr = ADVAL(al, aln);
for (i = 0; i < ADNUM(al, aln); i++) {
/* The first byte is a length NOT including the length
byte; the tokens are not EOS terminated. */
output_attribute_token(*ptr, ptr + 1);
ptr += *ptr + 1;
}
output_end_attribute();
}
static VOID handle_single_token(ent, al, aln)
UNCH *ent;
struct ad *al;
int aln;
{
if (ADTYPE(al, aln) == ANOTEGRP && !DCNMARK(ADDATA(al, aln).x))
output_notation(ADVAL(al, aln) + 1,
ADDATA(al, aln).x->pubid,
ADDATA(al, aln).x->sysid);
else if (ADTYPE(al, aln) == AENTITY)
define_entity(ADVAL(al, aln));
output_begin_attribute(ent, ADNAME(al, aln), ADTYPE(al, aln));
if (ADTYPE(al, aln) == ACHARS)
output_attribute_token(ustrlen(ADVAL(al, aln)), ADVAL(al, aln));
else
output_attribute_token(*ADVAL(al, aln) - 2, ADVAL(al, aln) + 1);
output_end_attribute();
}
static VOID output_notation(name, pubid, sysid)
UNCH *name;
UNCH *pubid, *sysid;
{
flush_data();
print_id((UNIV)0, pubid, sysid);
printf("%c%s\n", DEFINE_NOTATION_CODE, name);
}
static VOID output_internal_entity(ename, is_sdata, text)
UNCH *ename;
int is_sdata;
UNCH *text;
{
flush_data();
printf("%c%s %s ", DEFINE_INTERNAL_ENTITY_CODE, ename,
is_sdata ? "SDATA" : "CDATA");
print_string(text ? ustrlen(text) : 0, text, 0);
putchar('\n');
}
static VOID output_subdoc(nm, id, pubid, sysid)
UNCH *nm;
UNIV id;
UNCH *pubid, *sysid;
{
flush_data();
print_id(id, pubid, sysid);
printf("%c%s\n", DEFINE_SUBDOC_ENTITY_CODE, nm);
}
#ifdef SUPPORT_SUBDOC
static VOID process_subdoc(nm, id)
UNCH *nm;
UNIV id;
{
if (!suppsw) {
flush_data();
output_location();
printf("%c%s\n", START_SUBDOC_CODE, nm);
fflush(stdout);
}
fflush(stderr);
if (id) {
char **argv;
int ret;
argv = make_argv(id);
ret = run_process(argv);
if (ret != 0)
suberr++;
current_filename = 0;
free(argv);
if (ret == 0)
get_subcaps();
}
else {
suberr++;
appl_error(E_SUBDOC, nm);
}
if (!suppsw)
printf("%c%s\n", END_SUBDOC_CODE, nm);
}
#endif /* SUPPORT_SUBDOC */
static VOID output_external_entity(nm, xtype, id, pubid, sysid, dcn)
UNCH *nm, *dcn;
UNIV id;
UNCH *pubid, *sysid;
int xtype;
{
char *type;
flush_data();
print_id(id, pubid, sysid);
switch (xtype) {
case ESNCDATA:
type = "CDATA";
break;
case ESNNDATA:
type = "NDATA";
break;
case ESNSDATA:
type = "SDATA";
break;
default:
return;
}
printf("%c%s %s %s\n", DEFINE_EXTERNAL_ENTITY_CODE, nm, type, dcn);
}
static VOID output_record_end()
{
static UNCH re = RECHAR;
print_data(1, &re, 0);
}
static VOID output_pcdata(n, s)
UNS n;
UNCH *s;
{
print_data(n, s, 0);
}
static VOID output_cdata(n, s)
UNS n;
UNCH *s;
{
print_data(n, s, 0);
}
static VOID output_sdata(n, s)
UNS n;
UNCH *s;
{
print_data(n, s, 1);
}
static VOID output_entity_reference(s)
UNCH *s;
{
flush_data();
output_location();
printf("%c%s\n", REFERENCE_ENTITY_CODE, s);
}
static VOID output_start_tag(s)
UNCH *s;
{
flush_data();
output_location();
printf("%c%s\n", START_CODE, s);
}
static VOID output_end_tag(s)
UNCH *s;
{
flush_data();
printf("%c%s\n", END_CODE, s);
}
static VOID output_processing_instruction(n, s)
UNS n;
UNCH *s;
{
flush_data();
output_location();
putchar(PI_CODE);
print_string(n, s, 0);
putchar('\n');
}
static VOID output_appinfo(n, s)
UNS n;
UNCH *s;
{
flush_data();
output_location();
putchar(APPINFO_CODE);
print_string(n, s, 0);
putchar('\n');
}
static VOID output_implied_attribute(ent, aname)
UNCH *ent, *aname;
{
flush_data();
if (ent)
printf("%c%s %s IMPLIED\n", DATA_ATTRIBUTE_CODE, ent, aname);
else
printf("%c%s IMPLIED\n", ATTRIBUTE_CODE, aname);
}
static char *attribute_type_string(type)
int type;
{
switch (type) {
case ANMTGRP:
case ANAME:
case ANMTOKE:
case ANUTOKE:
case ANUMBER:
case ANAMES:
case ANMTOKES:
case ANUTOKES:
case ANUMBERS:
case AID:
case AIDREF:
case AIDREFS:
return "TOKEN";
case ANOTEGRP:
return "NOTATION";
case ACHARS:
return "CDATA";
case AENTITY:
case AENTITYS:
return "ENTITY";
}
#if 0
fatal("invalid attribute type %d", type);
#endif
return "INVALID";
}
static VOID output_begin_attribute(ent, aname, type)
UNCH *ent, *aname;
int type;
{
flush_data();
if (ent)
printf("%c%s %s %s", DATA_ATTRIBUTE_CODE, ent, aname,
attribute_type_string(type));
else
printf("%c%s %s", ATTRIBUTE_CODE, aname,
attribute_type_string(type));
}
static VOID output_attribute_token(vallen, val)
UNS vallen;
UNCH *val;
{
putchar(' ');
print_string(vallen, val, 0);
}
static VOID output_end_attribute()
{
putchar('\n');
}
static VOID print_data(n, s, is_sdata)
UNS n;
UNCH *s;
int is_sdata;
{
if (n > 0 || is_sdata) {
if (n == 1 && *s == RECHAR)
current_lineno++;
else
output_location();
if (!have_data)
putchar(DATA_CODE);
print_string(n, s, is_sdata);
have_data = 1;
}
}
static VOID flush_data()
{
if (have_data) {
putchar('\n');
have_data = 0;
}
}
static VOID output_location()
{
char *filename;
unsigned long lineno;
int filename_changed = 0;
if (!locsw)
return;
if (!sgmlloc(&lineno, &filename))
return;
if (!current_filename || strcmp(filename, current_filename) != 0)
filename_changed = 1;
else if (lineno == current_lineno)
return;
flush_data();
printf("%c%lu", LOCATION_CODE, lineno);
current_lineno = lineno;
if (filename_changed) {
putchar(' ');
print_filename(filename);
current_filename = filename;
}
putchar('\n');
}
static VOID print_string(slen, s, is_sdata)
UNS slen;
UNCH *s;
int is_sdata;
{
if (is_sdata)
fputs("\\|", stdout);
while (slen > 0) {
UNCH ch = *s++;
slen--;
if (ch == DELSDATA) {
if (is_sdata)
; /* I don't think this should happen */
else
fputs("\\|", stdout);
;
}
else if (ch == DELCDATA)
;
else {
if (ch == DELNONCH) {
if (!slen)
break;
ch = UNSHIFTNON(*s);
s++;
slen--;
}
switch (ch) {
case RECHAR:
fputs("\\n", stdout);
break;
case '\\':
fputs("\\\\", stdout);
break;
default:
if (ISASCII(ch) && isprint(ch))
putchar(ch);
else
printf("\\%03o", ch);
break;
}
}
}
if (is_sdata)
fputs("\\|", stdout);
}
static VOID print_id(id, pubid, sysid)
UNIV id;
UNCH *pubid;
UNCH *sysid;
{
if (pubid) {
putchar(PUBID_CODE);
print_string(ustrlen(pubid), pubid, 0);
putchar('\n');
}
if (sysid) {
putchar(SYSID_CODE);
print_string(ustrlen(sysid), sysid, 0);
putchar('\n');
}
if (id) {
char *p;
for (p = id; *p != '\0'; p++) {
putchar(FILE_CODE);
do {
switch (*p) {
case '\\':
fputs("\\\\", stdout);
break;
case '\n':
fputs("\\n", stdout);
break;
default:
if (ISASCII(*p) && isprint((UNCH)*p))
putchar(*p);
else
printf("\\%03o", (UNCH)*p);
break;
}
} while (*++p);
putchar('\n');
}
}
}
static VOID print_filename(s)
char *s;
{
for (; *s; s++)
switch (*s) {
case '\\':
fputs("\\\\", stdout);
break;
case '\n':
fputs("\\n", stdout);
break;
default:
if (ISASCII(*s) && isprint((UNCH)*s))
putchar(*s);
else
printf("\\%03o", (UNCH)*s);
break;
}
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,23 @@
/* lineout.h */
/* Output codes used by sgmls. */
#define DATA_CODE '-'
#define START_CODE '('
#define END_CODE ')'
#define ATTRIBUTE_CODE 'A'
#define DATA_ATTRIBUTE_CODE 'D'
#define REFERENCE_ENTITY_CODE '&'
#define DEFINE_NOTATION_CODE 'N'
#define DEFINE_EXTERNAL_ENTITY_CODE 'E'
#define DEFINE_INTERNAL_ENTITY_CODE 'I'
#define PI_CODE '?'
#define DEFINE_SUBDOC_ENTITY_CODE 'S'
#define START_SUBDOC_CODE '{'
#define END_SUBDOC_CODE '}'
#define LOCATION_CODE 'L'
#define APPINFO_CODE '#'
#define PUBID_CODE 'p'
#define SYSID_CODE 's'
#define FILE_CODE 'f'
#define CONFORMING_CODE 'C'

602
usr.bin/sgmls/sgmls/main.c Normal file
View File

@ -0,0 +1,602 @@
/* main.c -
Main program for sgmls.
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#include "std.h"
#include "getopt.h"
#include "entity.h" /* Templates for entity control blocks. */
#include "adl.h" /* Definitions for attribute list processing. */
#include "sgmlmain.h" /* Main interface to SGML services. */
#include "appl.h"
#define READCNT 512
/* Before using argv[0] in error messages, strip off everything up to and
including the last character in prog that occurs in PROG_PREFIX. */
#ifndef PROG_PREFIX
#define PROG_PREFIX "/"
#endif /* not PROG_PREFIX */
/* Message catalogue name. */
#define CAT_NAME "sgmls"
/* Message set to use for application error messages. */
#define APP_SET 4
#ifdef HAVE_EXTENDED_PRINTF
#define xvfprintf vfprintf
#else
extern int xvfprintf P((FILE *, char *, va_list));
#endif
static VOID usage P((void));
static VOID fatal VP((int, ...));
static VOID do_error P((int, va_list));
static VOID swinit P((struct switches *));
static VOID write_caps P((char *, struct sgmlcap *));
static UNIV make_docent P((int, char **));
static char *munge_program_name P((char *, char *));
static VOID die P((void));
#ifdef SUPPORT_SUBDOC
static VOID build_subargv P((struct switches *));
static VOID cleanup P((void));
static char *create_subcap_file P((void));
#endif /* SUPPORT_SUBDOC */
static char *errlist[] = {
0,
"Out of memory",
"Cannot open SGML document entity",
"Cannot exec `%s': %s",
"Cannot fork: %s",
"Error waiting for process: %s",
"Program %s got fatal signal %d",
"Cannot open `%s': %s",
"Subdocument capacity botch",
"Non-existent subdocument entity `%s' not processed",
};
int suppsw = 0; /* Non-zero means suppress output. */
int locsw = 0; /* Non-zero means generate location info. */
static char *prog; /* Program name (for error messages). */
static nl_catd catd; /* Message catalogue descriptor. */
static char *capfile = 0; /* File for capacity report. */
extern char *version_string;
char options[] = {
'c', ':', 'd', 'e', 'g', 'i', ':', 'l', 'o', ':', 'p', 'r', 's', 'u', 'v',
#ifdef CANT_REDIRECT_STDERR
'f', ':',
#endif /* CANT_REDIRECT_STDERR */
#ifdef TRACE
'x', ':', 'y', ':',
#endif /* TRACE */
'\0'
};
#ifdef SUPPORT_SUBDOC
int suberr = 0; /* Error in subdocument. */
static char *subargv[sizeof(options)];
static int subargc = 0;
static char nopenbuf[sizeof(long)*3 + 1];
static char sgmldecl_file[L_tmpnam];
static char subcap_file[L_tmpnam];
#endif
int main(argc, argv)
int argc;
char **argv;
{
static char stderr_buf[BUFSIZ];
int opt;
#ifdef CANT_REDIRECT_STDERR
char *errfile = 0;
#endif
struct sgmlcap cap;
struct switches sw;
int nincludes = 0; /* number of -i options */
setbuf(stderr, stderr_buf);
/* Define MAIN_HOOK in config.h if some function needs to be called here. */
#ifdef MAIN_HOOK
MAIN_HOOK(argc, argv);
#endif
#ifdef SUPPORT_SUBDOC
subargv[subargc++] = argv[0];
#endif
prog = argv[0] = munge_program_name(argv[0], "sgmls");
catd = catopen(CAT_NAME, 0);
swinit(&sw);
while ((opt = getopt(argc, argv, options)) != EOF) {
switch (opt) {
case 'l': /* Generate location information. */
locsw = 1;
break;
case 'c': /* Print capacity usage. */
capfile = optarg;
break;
case 's': /* Suppress output. */
suppsw = 1;
break;
case 'd': /* Report duplicate entity declarations. */
sw.swdupent = 1;
break;
case 'e': /* Provide entity stack trace in error msg. */
sw.swenttr = 1;
break;
#ifdef CANT_REDIRECT_STDERR
case 'f': /* Redirect errors. */
errfile = optarg;
break;
#endif /* CANT_REDIRECT_STDERR */
case 'g': /* Provide GI stack trace in error messages. */
sw.sweltr = 1;
break;
case 'p': /* Parse only the prolog. */
sw.onlypro = 1;
suppsw = 1;
break;
case 'r': /* Give warning for defaulted references. */
sw.swrefmsg = 1;
break;
case 'u':
sw.swundef = 1;
break;
#ifdef TRACE
case 'x': /* Trace options for the document body. */
sw.trace = optarg;
break;
case 'y': /* Trace options for the prolog. */
sw.ptrace = optarg;
break;
#endif /* TRACE */
case 'v': /* Print the version number. */
fprintf(stderr, "sgmls version %s\n", version_string);
fflush(stderr);
break;
case 'o':
sw.nopen = atol(optarg);
if (sw.nopen <= 0)
usage();
break;
case 'i': /* Define parameter entity as "INCLUDE". */
sw.includes = (char **)xrealloc((UNIV)sw.includes,
(nincludes + 2)*sizeof(char *));
sw.includes[nincludes++] = optarg;
sw.includes[nincludes] = 0;
break;
case '?':
usage();
default:
abort();
}
}
#ifdef CANT_REDIRECT_STDERR
if (errfile) {
FILE *fp;
errno = 0;
fp = fopen(errfile, "w");
if (!fp)
fatal(E_OPEN, errfile, strerror(errno));
fclose(fp);
errno = 0;
if (!freopen(errfile, "w", stderr)) {
/* Can't use fatal() since stderr is now closed */
printf("%s: ", prog);
printf(errlist[E_OPEN], errfile, strerror(errno));
putchar('\n');
exit(EXIT_FAILURE);
}
}
#endif /* CANT_REDIRECT_STDERR */
(void)sgmlset(&sw);
#ifdef SUPPORT_SUBDOC
build_subargv(&sw);
#endif
if (sgmlsdoc(make_docent(argc - optind, argv + optind)))
fatal(E_DOC);
process_document(sw.nopen > 0);
sgmlend(&cap);
if (capfile)
write_caps(capfile, &cap);
#ifdef SUPPORT_SUBDOC
cleanup();
if (suberr)
exit(EXIT_FAILURE);
#endif /* SUPPORT_SUBDOC */
if (sgmlgcnterr() > 0)
exit(EXIT_FAILURE);
if (!sw.nopen)
output_conforming();
exit(EXIT_SUCCESS);
}
static char *munge_program_name(arg, dflt)
char *arg, *dflt;
{
char *p;
#ifdef PROG_STRIP_EXTENSION
char *ext;
#endif
if (!arg || !*arg)
return dflt;
p = strchr(arg, '\0');
for (;;) {
if (p == arg)
break;
--p;
if (strchr(PROG_PREFIX, *p)) {
p++;
break;
}
}
arg = p;
#ifdef PROG_STRIP_EXTENSION
ext = strrchr(arg, '.');
if (ext) {
p = (char *)xmalloc(ext - arg + 1);
memcpy(p, arg, ext - arg);
p[ext - arg] = '\0';
arg = p;
}
#endif /* PROG_STRIP_EXTENSION */
#ifdef PROG_FOLD
#ifdef PROG_STRIP_EXTENSION
if (!ext) {
#endif
p = xmalloc(strlen(arg) + 1);
strcpy(p, arg);
arg = p;
#ifdef PROG_STRIP_EXTENSION
}
#endif
for (p = arg; *p; p++)
if (ISASCII((unsigned char)*p) && isupper((unsigned char)*p))
*p = tolower((unsigned char)*p);
#endif /* PROG_FOLD */
return arg;
}
static UNIV make_docent(argc, argv)
int argc;
char **argv;
{
UNS len = 1;
int i;
UNIV res;
char *ptr;
static char *stdinname = STDINNAME;
if (argc == 0) {
argv = &stdinname;
argc = 1;
}
for (i = 0; i < argc; i++)
len += strlen(argv[i]) + 1;
res = xmalloc(len);
ptr = (char *)res;
for (i = 0; i < argc; i++) {
strcpy(ptr, argv[i]);
ptr = strchr(ptr, '\0') + 1;
}
*ptr = '\0';
return res;
}
static VOID usage()
{
/* Don't mention -o since this are for internal use only. */
fprintf(stderr, "Usage: %s [-deglprsuv]%s [-c file] [-i entity]%s [filename ...]\n",
prog,
#ifdef CANT_REDIRECT_STDERR
" [-f file]",
#else /* not CANT_REDIRECT_STDERR */
"",
#endif /* not CANT_REDIRECT_STDERR */
#ifdef TRACE
" [-x flags] [-y flags]"
#else /* not TRACE */
""
#endif /* not TRACE */
);
exit(EXIT_FAILURE);
}
static VOID die()
{
#ifdef SUPPORT_SUBDOC
cleanup();
#endif /* SUPPORT_SUBDOC */
exit(EXIT_FAILURE);
}
static VOID swinit(swp)
struct switches *swp;
{
swp->swenttr = 0;
swp->sweltr = 0;
swp->swbufsz = READCNT+2;
swp->prog = prog;
swp->swdupent = 0;
swp->swrefmsg = 0;
#ifdef TRACE
swp->trace = 0;
swp->ptrace = 0;
#endif /* TRACE */
swp->catd = catd;
swp->swambig = 1; /* Always check for ambiguity. */
swp->swundef = 0;
swp->nopen = 0;
swp->onlypro = 0;
swp->includes = 0;
swp->die = die;
}
#ifdef SUPPORT_SUBDOC
static VOID build_subargv(swp)
struct switches *swp;
{
if (suppsw)
subargv[subargc++] = "-s";
if (locsw)
subargv[subargc++] = "-l";
if (swp->swdupent)
subargv[subargc++] = "-d";
if (swp->swenttr)
subargv[subargc++] = "-e";
if (swp->sweltr)
subargv[subargc++] = "-g";
if (swp->swrefmsg)
subargv[subargc++] = "-r";
#ifdef TRACE
if (swp->trace) {
subargv[subargc++] = "-x";
subargv[subargc++] = swp->trace;
}
if (swp->ptrace) {
subargv[subargc++] = "-y";
subargv[subargc++] = swp->ptrace;
}
#endif /* TRACE */
subargv[subargc++] = "-o";
sprintf(nopenbuf, "%ld", swp->nopen + 1);
subargv[subargc++] = nopenbuf;
}
static
VOID handler(sig)
int sig;
{
signal(sig, SIG_DFL);
cleanup();
raise(sig);
}
static
VOID cleanup()
{
if (sgmldecl_file[0]) {
(void)remove(sgmldecl_file);
sgmldecl_file[0] = '\0';
}
if (subcap_file[0]) {
(void)remove(subcap_file);
subcap_file[0] = '\0';
}
}
static
char *store_sgmldecl()
{
if (!sgmldecl_file[0]) {
FILE *fp;
if (signal(SIGINT, SIG_IGN) != SIG_IGN)
signal(SIGINT, handler);
#ifdef SIGTERM
if (signal(SIGTERM, SIG_IGN) != SIG_IGN)
signal(SIGTERM, handler);
#endif /* SIGTERM */
#ifdef SIGPIPE
if (signal(SIGPIPE, SIG_IGN) != SIG_IGN)
signal(SIGPIPE, handler);
#endif
#ifdef SIGHUP
if (signal(SIGHUP, SIG_IGN) != SIG_IGN)
signal(SIGHUP, handler);
#endif
tmpnam(sgmldecl_file);
errno = 0;
fp = fopen(sgmldecl_file, "w");
if (!fp)
fatal(E_OPEN, sgmldecl_file, strerror(errno));
sgmlwrsd(fp);
fclose(fp);
}
return sgmldecl_file;
}
static
char *create_subcap_file()
{
if (subcap_file[0] == '\0') {
FILE *fp;
tmpnam(subcap_file);
fp = fopen(subcap_file, "w");
if (!fp)
fatal(E_OPEN, subcap_file, strerror(errno));
fclose(fp);
}
return subcap_file;
}
char **make_argv(id)
UNIV id;
{
int nfiles;
char *p;
char **argv;
int i;
for (p = (char *)id, nfiles = 0; *p; p = strchr(p, '\0') + 1)
nfiles++;
argv = (char **)xmalloc((subargc + 2 + 1 + nfiles + 1)*sizeof(char *));
memcpy((UNIV)argv, (UNIV)subargv, subargc*sizeof(char *));
i = subargc;
argv[i++] = "-c";
argv[i++] = create_subcap_file();
argv[i++] = store_sgmldecl();
for (p = (char *)id; *p; p = strchr(p, '\0') + 1)
argv[i++] = p;
argv[i] = 0;
return argv;
}
VOID get_subcaps()
{
long cap[NCAPACITY];
FILE *fp;
int i;
if (!subcap_file[0])
return;
errno = 0;
fp = fopen(subcap_file, "r");
if (!fp)
fatal(E_OPEN, subcap_file, strerror(errno));
for (i = 0; i < NCAPACITY; i++)
if (fscanf(fp, "%*s %ld", cap + i) != 1)
fatal(E_CAPBOTCH);
fclose(fp);
sgmlsubcap(cap);
}
#endif /* SUPPORT_SUBDOC */
/* Print capacity statistics.*/
static VOID write_caps(name, p)
char *name;
struct sgmlcap *p;
{
FILE *fp;
int i;
fp = fopen(name, "w");
if (!fp)
fatal(E_OPEN, name, strerror(errno));
/* This is in RACT format. */
for (i = 0; i < NCAPACITY; i++)
fprintf(fp, "%s %ld\n", p->name[i], p->number[i]*p->points[i]);
fclose(fp);
}
UNIV xmalloc(n)
UNS n;
{
UNIV p = malloc(n);
if (!p)
fatal(E_NOMEM);
return p;
}
UNIV xrealloc(s, n)
UNIV s;
UNS n;
{
s = s ? realloc(s, n) : malloc(n);
if (!s)
fatal(E_NOMEM);
return s;
}
static
#ifdef VARARGS
VOID fatal(va_alist) va_dcl
#else
VOID fatal(int errnum,...)
#endif
{
#ifdef VARARGS
int errnum;
#endif
va_list ap;
#ifdef VARARGS
va_start(ap);
errnum = va_arg(ap, int);
#else
va_start(ap, errnum);
#endif
do_error(errnum, ap);
va_end(ap);
exit(EXIT_FAILURE);
}
#ifdef VARARGS
VOID appl_error(va_alist) va_dcl
#else
VOID appl_error(int errnum,...)
#endif
{
#ifdef VARARGS
int errnum;
#endif
va_list ap;
#ifdef VARARGS
va_start(ap);
errnum = va_arg(ap, int);
#else
va_start(ap, errnum);
#endif
do_error(errnum, ap);
va_end(ap);
}
static
VOID do_error(errnum, ap)
int errnum;
va_list ap;
{
char *text;
fprintf(stderr, "%s: ", prog);
assert(errnum > 0);
assert(errnum < sizeof(errlist)/sizeof(errlist[0]));
text = catgets(catd, APP_SET, errnum, errlist[errnum]);
assert(text != 0);
xvfprintf(stderr, text, ap);
fputc('\n', stderr);
fflush(stderr);
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

862
usr.bin/sgmls/sgmls/md1.c Normal file
View File

@ -0,0 +1,862 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
/* MDADL: Process ATTLIST declaration.
*/
VOID mdadl(tbuf)
UNCH *tbuf; /* Work area for tokenization (tbuf). */
{
int i; /* Loop counter; temporary variable. */
int adlim; /* Number of unused ad slots in al. */
struct ad *alperm = 0; /* Attribute definition list. */
int stored = 0;
mdname = key[KATTLIST]; /* Identify declaration for messages. */
subdcl = 0; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es level for entity nesting check. */
reqadn = noteadn = 0; /* No required attributes yet. */
idadn = conradn = 0; /* No special atts yet.*/
AN(al) = 0; /* Number of attributes defined. */
ADN(al) = 0; /* Number of ad's in al (atts + name vals).*/
/* PARAMETER 1: Element name or a group of them.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: element name or group");
switch (pcbmd.action) {
case NAS:
nmgrp[0] = etddef(tbuf);
nmgrp[1] = 0;
break;
case GRPS:
parsegrp(nmgrp, &pcbgrnm, tbuf);
break;
case RNS: /* Reserved name started. */
if (ustrcmp(tbuf+1, key[KNOTATION])) {
mderr(118, tbuf+1, key[KNOTATION]);
return;
}
mdnadl(tbuf);
return;
default:
mderr(121, (UNCH *)0, (UNCH *)0);
return;
}
/* Save first GI for error msgs. */
if (nmgrp[0])
subdcl = nmgrp[0]->etdgi+1;
/* PARAMETER 2: Attribute definition list.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2: attribute list");
if (pcbmd.action!=NAS) {
mderr(120, (UNCH *)0, (UNCH *)0);
return;
}
while (pcbmd.action==NAS) {
al[ADN(al)+1].adname = savenm(tbuf);
if ((adlim = ATTCNT-((int)++ADN(al)))<0) {
mderr(111, (UNCH *)0, (UNCH *)0);
adlfree(al, 1);
return;
}
++AN(al);
if (mdattdef(adlim, 0)) {
adlfree(al, 1);
return;
}
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
}
if (AN(al)>0) { /* Save list only if 1 or more good atts. */
if (reqadn) SET(ADLF(al), ADLREQ); /* Element must have start-tag. */
if (noteadn) SET(ADLF(al), ADLNOTE); /* Element cannot be EMPTY. */
if (conradn) SET(ADLF(al), ADLCONR); /* Element cannot be EMPTY. */
alperm = (struct ad *)rmalloc((1+ADN(al))*ADSZ);
memcpy((UNIV)alperm, (UNIV)al, (1+ADN(al))*ADSZ );
ds.attcnt += AN(al); /* Number of attributes defined. */
ds.attgcnt += ADN(al) - AN(al); /* Number of att grp members. */
TRACEADL(alperm);
}
/* Clear attribute list for next declaration. */
MEMZERO((UNIV)al, (1+ADN(al))*ADSZ);
/* PARAMETER 3: End of declaration.
*/
/* Next pcb.action was set during attribute definition loop. */
TRACEMD(emd);
if (pcbmd.action!=EMD) {mderr(126, (UNCH *)0, (UNCH *)0); return;}
if (es!=mdessv) synerr(37, &pcbmd);
/* EXECUTE: Store the definition for each element name specified.
*/
TRACEGRP(nmgrp);
for (i = 0; nmgrp[i]; i++) {
if (nmgrp[i]->adl) { /* Error if an ADL exists. */
mderr(112, (UNCH *)0, (UNCH *)0);
continue;
}
nmgrp[i]->adl = alperm; /* If virgin, store the adl ptr. */
stored = 1;
if (alperm && nmgrp[i]->etdmod)
etdadl(nmgrp[i]); /* Check for conflicts with ETD. */
}
if (!stored && alperm) {
adlfree(alperm, 1);
frem((UNIV)alperm);
}
}
/* ETDADL: Check compatibility between ETD and ADL.
*/
VOID etdadl(p)
struct etd *p; /* Pointer to element type definition. */
{
parmno = 0;
/* Minimizable element cannot have required attribute. */
if (GET(p->etdmin, SMO) && GET(p->adl[0].adflags, ADLREQ)) {
mderr(40, (UNCH *)0, (UNCH *)0);
RESET(p->etdmin, SMO);
}
/* Empty element cannot have NOTATION attribute.
Attribute is not removed (too much trouble), but we trap
attempts to specify it on the start-tag in adlval().
*/
if (GET(p->etdmod->ttype, MNONE)) {
if (GET(p->adl[0].adflags, ADLNOTE))
mderr(83, (UNCH *)0, (UNCH *)0);
/* Empty element cannot have CONREF attribute.
Attribute is not removed because it just acts
like IMPLIED anyway.
*/
if (GET(p->adl[0].adflags, ADLCONR))
mderr(85, (UNCH *)0, (UNCH *)0);
}
/* "-" should not be specified for the end-tag minimization if
the element has a content reference attribute. */
if (GET(p->adl[0].adflags, ADLCONR) && BITON(p->etdmin, EMM))
mderr(153, (UNCH *)0, (UNCH *)0);
}
/* MDNADL: Process ATTLIST declaration for notation.
TO DO: Pass deftab and dvtab as parameters so
that prohibited types can be handled by leaving
them out of the tables.
*/
VOID mdnadl(tbuf)
UNCH *tbuf; /* Work area for tokenization (tbuf). */
{
int i; /* Loop counter; temporary variable. */
int adlim; /* Number of unused ad slots in al. */
struct ad *alperm = 0; /* Attribute definition list. */
int stored = 0;
/* PARAMETER 1: Notation name or a group of them.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: notation name or group");
switch (pcbmd.action) {
case NAS:
nnmgrp[0] = dcndef(tbuf);
nnmgrp[1] = 0;
break;
case GRPS:
parsngrp(nnmgrp, &pcbgrnm, tbuf);
break;
default:
mderr(121, (UNCH *)0, (UNCH *)0);
return;
}
subdcl = nnmgrp[0]->ename+1; /* Save first name for error msgs. */
/* PARAMETER 2: Attribute definition list.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2: attribute list");
if (pcbmd.action!=NAS) {
mderr(120, (UNCH *)0, (UNCH *)0);
return;
}
while (pcbmd.action==NAS) {
al[ADN(al)+1].adname = savenm(tbuf);
if ((adlim = ATTCNT-((int)ADN(al)++))<0) {
mderr(111, (UNCH *)0, (UNCH *)0);
adlfree(al, 1);
return;
}
++AN(al);
if (mdattdef(adlim, 1)) {
adlfree(al, 1);
return;
}
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
}
if (AN(al)>0) { /* Save list only if 1 or more good atts. */
alperm = (struct ad *)rmalloc((1+ADN(al))*ADSZ);
memcpy((UNIV)alperm, (UNIV)al, (1+ADN(al))*ADSZ );
ds.attcnt += AN(al); /* Number of attributes defined. */
ds.attgcnt += ADN(al) - AN(al); /* Number of att grp members. */
TRACEADL(alperm);
}
/* Clear attribute list for next declaration. */
MEMZERO((UNIV)al, (1+ADN(al))*ADSZ);
/* PARAMETER 3: End of declaration.
*/
/* Next pcb.action was set during attribute definition loop. */
TRACEMD(emd);
if (pcbmd.action!=EMD) {mderr(126, (UNCH *)0, (UNCH *)0); return;}
if (es!=mdessv) synerr(37, &pcbmd);
/* EXECUTE: Store the definition for each notation name specified.
*/
TRACENGR(nnmgrp);
for (i = 0; nnmgrp[i]; i++) {
if (nnmgrp[i]->adl) { /* Error if an ADL exists. */
mderr(112, (UNCH *)0, (UNCH *)0);
continue;
}
nnmgrp[i]->adl = alperm; /* If virgin, store the adl ptr. */
if (nnmgrp[i]->entsw)
fixdatt(nnmgrp[i]);
stored = 1;
TRACEDCN(nnmgrp[i]);
}
if (!stored && alperm) {
adlfree(alperm, 1);
frem((UNIV)alperm);
}
}
/* Data attributes have been specified for notation p, but entities
have already been declared with notation p. Fix up the definitions of
all entities with notation p. Generate an error for any data
attribute that was required. */
VOID fixdatt(p)
struct dcncb *p;
{
int i;
for (i = 0; i < ENTHASH; i++) {
struct entity *ep;
for (ep = etab[i]; ep; ep = ep->enext)
if (ep->estore == ESN && ep->etx.n && ep->etx.n->nedcn == p) {
int adn;
initatt(p->adl);
/* Don't use adlval because if there were required
attributes the error message wouldn't say what
entity was involved. */
for (adn = 1; adn <= ADN(al); adn++) {
if (GET(ADFLAGS(al,adn), AREQ)) {
sgmlerr(218, &pcbstag, ADNAME(al,adn),
ep->ename + 1);
SET(ADFLAGS(al,adn), AINVALID);
}
if (BITON(ADFLAGS(al, adn), AGROUP))
adn += ADNUM(al, adn);
}
storedatt(ep->etx.n);
}
}
}
/* MDATTDEF: Process an individual attribute definition.
The attribute name is parsed by the caller.
Duplicate attributes are parsed, but removed from list.
Returns 0 if successful, otherwise returns 1.
*/
int mdattdef(adlim, datt)
int adlim; /* Remaining capacity of al (in tokens).*/
int datt; /* Non-zero if a data attribute. */
{
int deftype; /* Default value type: 0=not keyword. */
int errsw = 0; /* 1=semantic error; ignore att. */
int novalsw = 0; /* 1=semantic error; treat as IMPLIED. */
int attadn = (int)ADN(al); /* Save ad number of this attribute. */
struct parse *grppcb = NULL; /* PCB for name/token grp parse. */
int errcode; /* Error type returned by PARSEVAL, ANMTGRP. */
UNCH *advalsv; /* Save area for permanent value ptr. */
/* PARAMETER 1: Attribute name (parsed by caller).
*/
TRACEMD("1: attribute name");
if (anmget((int)ADN(al)-1, al[attadn].adname)) {
errsw = 1;
mderr(99, ADNAME(al,attadn), (UNCH *)0);
}
ADNUM(al,attadn) = ADFLAGS(al,attadn) = ADLEN(al,attadn) = 0;
ADVAL(al,attadn) = 0; ADDATA(al,attadn).x = 0; ADTYPE(al,attadn) = ANMTGRP;
/* PARAMETER 2: Declared value.
*/
parsemd(lbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2: declared value");
switch (pcbmd.action) {
case NAS: /* Keyword for value type. */
switch (ADTYPE(al,attadn) = (UNCH)mapsrch(dvtab, lbuf+1)) {
case 0:
mderr(100, ADNAME(al,attadn), lbuf+1);
return 1;
case ANOTEGRP:
if (datt) {
errsw = 1;
mderr(156, (UNCH *)0, (UNCH *)0);
}
else if (!noteadn) noteadn = ADN(al);
else {
errsw = 1;
mderr(101, ADNAME(al,attadn), (UNCH *)0);
}
grppcb = &pcbgrnm; /* NOTATION requires name grp. */
parsemd(lbuf, NAMECASE, &pcblitp, NAMELEN);/* Get GRPO*/
break;
case AID:
if (datt) {
errsw = 1;
mderr(144, (UNCH *)0, (UNCH *)0);
}
else if (!idadn)
idadn = attadn;
else {
errsw = 1;
mderr(102, ADNAME(al,attadn), (UNCH *)0);
}
break;
case AIDREF:
case AIDREFS:
if (datt) {
errsw = 1;
mderr(155, (UNCH *)0, (UNCH *)0);
}
break;
case AENTITY:
case AENTITYS:
if (datt) {
errsw = 1;
mderr(154, (UNCH *)0, (UNCH *)0);
}
break;
}
break;
case GRPS:
grppcb = &pcbgrnt; /* Normal grp is name token grp. */
break;
case EMD:
mderr(103, ADNAME(al,attadn), (UNCH *)0);
return 1;
default:
mderr(104, ADNAME(al,attadn), (UNCH *)0);
return 1;
}
/* PARAMETER 2A: Name token group.
*/
if (grppcb != NULL) {
TRACEMD("2A: name group");
switch (pcbmd.action) {
case GRPS: /* Name token list. */
SET(ADFLAGS(al,attadn), AGROUP);
/* Call routine to parse group, create ad entries in adl. */
errcode = anmtgrp(grppcb, al+attadn,
(GRPCNT<adlim ? GRPCNT+1 : adlim+1),
&al[attadn].adnum, ADN(al));
if (errcode<=0) {
if (adlim < GRPCNT)
mderr(111, (UNCH *)0, (UNCH *)0);
else
mderr(105, ADNAME(al,attadn), (UNCH *)0);
return 1;
}
ADN(al) += ADNUM(al,attadn); /* Add grp size to total ad cnt.*/
break;
default:
mderr(106, ADNAME(al,attadn), (UNCH *)0);
return 1;
}
}
/* PARAMETER 3: Default value keyword.
*/
parsemd(lbuf, AVALCASE,
(ADTYPE(al,attadn)==ACHARS) ? &pcblitr : &pcblitt, LITLEN);
TRACEMD("3: default keyword");
switch (pcbmd.action) {
case RNS: /* Keyword. */
deftype = mapsrch(deftab, lbuf+1);
switch (deftype) {
case DFIXED: /* FIXED */
SET(ADFLAGS(al,attadn), AFIXED);
parsemd(lbuf, AVALCASE,
(ADTYPE(al,attadn)==ACHARS) ? &pcblitr : &pcblitt,
LITLEN); /* Real default. */
goto parm3x; /* Go process specified value. */
case DCURR: /* CURRENT: If ID, treat as IMPLIED. */
if (ADTYPE(al,attadn)==AID) {
mderr(80, ADNAME(al,attadn), (UNCH *)0);
break;
}
if (datt) {
mderr(157, (UNCH *)0, (UNCH *)0);
break;
}
SET(ADFLAGS(al,attadn), ACURRENT);
break;
case DREQ: /* REQUIRED */
SET(ADFLAGS(al,attadn), AREQ); ++reqadn;
break;
case DCONR: /* CONREF */
if (ADTYPE(al,attadn)==AID) {
mderr(107, ADNAME(al,attadn), (UNCH *)0);
break;
}
if (datt) {
mderr(158, (UNCH *)0, (UNCH *)0);
break;
}
SET(ADFLAGS(al,attadn), ACONREF); conradn = 1;
case DNULL: /* IMPLIED */
break;
default: /* Unknown keyword is an error. */
mderr(108, ADNAME(al,attadn), lbuf+1);
errsw = 1;
}
if (errsw) {
/* Ignore erroneous att. */
adlfree(al, attadn);
--AN(al);
ADN(al) = (UNCH)attadn-1;
}
return(0);
default:
break;
}
/* PARAMETER 3x: Default value (non-keyword).
*/
parm3x:
TRACEMD("3x: default (non-keyword)");
if (ADTYPE(al,attadn)==AID) { /* If ID, treat as IMPLIED. */
mderr(81, ADNAME(al,attadn), (UNCH *)0);
novalsw = 1; /* Keep parsing to keep things straight. */
}
switch (pcbmd.action) {
case LIT: /* Literal. */
case LITE: /* Literal. */
/* Null string (except CDATA) is error: msg and treat as IMPLIED. */
if (*lbuf == '\0' && ADTYPE(al,attadn)!=ACHARS) {
mderr(82, ADNAME(al,attadn), (UNCH *)0);
novalsw = 1;
}
break;
case NAS: /* Name character string. */
case NMT: /* Name character string. */
case NUM: /* Number or number token string. */
/* The name won't have a length byte because AVALCASE was specified. */
break;
case CDR:
parsetkn(lbuf, NMC, LITLEN);
break;
case EMD:
mderr(109, ADNAME(al,attadn), (UNCH *)0);
return 1;
default:
mderr(110, ADNAME(al,attadn), (UNCH *)0);
return 1;
}
if (errsw) {
/* Ignore erroneous att. */
adlfree(al, attadn);
--AN(al);
ADN(al) = (UNCH)attadn-1;
return(0);
}
if (novalsw) return(0);
/* PARAMETER 3y: Validate and store default value.
*/
if (ADTYPE(al,attadn)==ACHARS) {
UNS len = vallen(ACHARS, 0, lbuf);
if (len > LITLEN) {
/* Treat as implied. */
sgmlerr(224, &pcbmd, ADNAME(al,attadn), (UNCH *)0);
return 0;
}
/* No more checking for CDATA value. */
ADNUM(al,attadn) = 0; /* CDATA is 0 tokens. */
ADVAL(al,attadn) = savestr(lbuf);/* Store default; save ptr. */
ADLEN(al,attadn) = len;
ds.attdef += len;
return 0;
}
/* Parse value and save token count (GROUP implies 1 token). */
advalsv = (UNCH *)rmalloc(ustrlen(lbuf)+2); /* Storage for tokenized value. */
errcode = parseval(lbuf, (UNS)ADTYPE(al,attadn), advalsv);
if (BITOFF(ADFLAGS(al,attadn), AGROUP)) ADNUM(al,attadn) = (UNCH)tokencnt;
/* If value was invalid, or was a group member that was not in the group,
issue an appropriate message and set the error switch. */
if (errcode)
{sgmlerr((UNS)errcode, &pcbmd, ADNAME(al,attadn), lbuf); errsw = 1;}
else if ( BITON(ADFLAGS(al,attadn), AGROUP)
&& !amemget(&al[attadn], (int)ADNUM(al,attadn), advalsv) ) {
sgmlerr(79, &pcbmd, ADNAME(al,attadn), advalsv+1);
errsw = 1;
}
ADLEN(al,attadn) = vallen(ADTYPE(al,attadn), ADNUM(al,attadn), advalsv);
if (ADLEN(al,attadn) > LITLEN) {
sgmlerr(224, &pcbmd, ADNAME(al,attadn), (UNCH *)0);
ADLEN(al,attadn) = 0;
errsw = 1;
}
/* For valid tokenized value, save it and update statistics. */
if (!errsw) {
ADVAL(al,attadn) = advalsv;
ds.attdef += ADLEN(al,attadn);
return 0;
}
/* If value was bad, free the value's storage and treat as
IMPLIED or REQUIRED. */
frem((UNIV)advalsv); /* Release storage for value. */
ADVAL(al,attadn) = NULL; /* And make value NULL. */
return 0;
}
/* ANMTGRP: Parse a name or name token group, create attribute descriptors
for its members, and add them to the attribute descriptor list.
The parse either terminates or returns a good token, so no
switch is needed.
*/
int anmtgrp(pcb, nt, grplim, adn, adsz)
struct parse *pcb; /* PCB for name or name token grp. */
struct ad nt[]; /* Buffer for creating name token list. */
int grplim; /* Maximum size of list (plus 1). */
UNS *adn; /* Ptr to number of names or tokens in grp. */
int adsz; /* Size of att def list. */
{
UNCH adtype = (UNCH)(pcb==&pcbgrnt ? ANMTGRP:ANOTEGRP);/*Attribute type.*/
int essv = es; /* Entity stack level when grp started. */
*adn = 0; /* Group is empty to start. */
while (parse(pcb)!=GRPE && *adn<grplim) {
switch (pcb->action) {
case NAS_: /* Name or name token (depending on pcb). */
case NMT_:
parsenm(lbuf, NAMECASE);
nt[*adn+1].adname = savenm(lbuf);
if (antvget((int)(adsz+*adn), nt[*adn+1].adname, (UNCH **)0))
mderr(98, ntoa((int)*adn+1), nt[*adn+1].adname+1);
nt[++*adn].adtype = adtype;
nt[*adn].addef = NULL;
continue;
case EE_: /* Entity ended (correctly or incorrectly). */
if (es<essv) {synerr(37, pcb); essv = es;}
continue;
case PIE_: /* PI entity reference (invalid). */
entpisw = 0; /* Reset PI entity indicator. */
synerr(59, pcb);
continue;
default:
break;
}
break;
}
if (es!=essv) synerr(37, pcb);
if (*adn==grplim) return -1;
else return *adn; /* Return number of tokens. */
}
/* MDDTDS: Process start of DOCTYPE declaration (through MSO).
*/
VOID mddtds(tbuf)
UNCH *tbuf; /* Work area for tokenization[LITLEN+2]. */
{
struct fpi fpicb; /* Formal public identifier structure. */
union etext etx; /* Ptr to entity text. */
UNCH estore = ESD; /* Entity storage class. */
int emdsw = 0; /* 1=end of declaration found; 0=not yet. */
mdname = key[KDOCTYPE]; /* Identify declaration for messages. */
subdcl = NULL; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es for checking entity nesting. */
dtdrefsw = 0; /* No external DTD entity as yet. */
/* PARAMETER 1: Document type name.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: doc type name");
if (pcbmd.action!=NAS) {mderr(120, (UNCH *)0, (UNCH *)0); return;}
dtype = savenm(tbuf);
subdcl = dtype+1; /* Subject of declaration for error msgs. */
/* PARAMETER 2: External identifier keyword or MDS.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2: extid or MDS");
switch (pcbmd.action) {
case NAS:
if (mdextid(tbuf, &fpicb, dtype+1, &estore, (PNE)0)==0) return;
if ((etx.x = entgen(&fpicb))==0)
mderr(146, dtype+1, (UNCH *)0);
else
dtdrefsw = 1; /* Signal external DTD entity. */
break;
case MDS:
goto execute;
default:
mderr(128, (UNCH *)0, (UNCH *)0);
return;
}
/* PARAMETER 3: MDS or end of declaration.
*/
TRACEMD("3: MDS or EMD");
switch (pcbmd.action) {
default: /* Treat as end of declaration. */
mderr(126, (UNCH *)0, (UNCH *)0);
case EMD:
emdsw = 1;
case MDS:
break;
}
/* EXECUTE: Store entity definition if an external ID was specified.
*/
execute:
if (es!=mdessv) synerr(37, &pcbmd);
propcb = &pcbmds; /* Prepare to parse doc type definition (MDS). */
if (dtdrefsw) {
/* TO DO: If concurrent DTD's supported, free existing
etext for all but first DTD (or reuse it). */
entdef(indtdent, estore, &etx);
++ds.ecbcnt; ds.ecbtext += entlen;
if (emdsw) {
REPEATCC; /* Push back the MDC. */
*FPOS = lex.d.msc; /* Simulate end of DTD subset. */
REPEATCC; /* Back up to read MSC next. */
delmscsw = 1; /* Insert MSC after referenced DTD. */
}
}
indtdsw = 1; /* Allow "DTD only" parameters. */
return;
}
/* MDDTDE: Process DOCTYPE declaration end.
*/
VOID mddtde(tbuf)
UNCH *tbuf; /* Work area for tokenization. */
{
mdessv = es; /* Save es for checking entity nesting. */
propcb = &pcbpro; /* Restore normal prolog parse. */
indtdsw = 0; /* Prohibit "DTD only" parameters. */
mdname = key[KDOCTYPE]; /* Identify declaration for messages. */
subdcl = dtype+1; /* Subject of declaration for error msgs. */
parmno = 0; /* No parameters as yet. */
/* PARAMETER 4: End of declaration.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
TRACEMD(emd);
if (pcbmd.action!=EMD) mderr(126, (UNCH *)0, (UNCH *)0);
if (es!=mdessv) synerr(37, &pcbmd);
}
/* MDELEM: Process ELEMENT declaration.
*/
VOID mdelem(tbuf)
UNCH *tbuf; /* Work area for tokenization (tbuf). */
{
UNCH *ranksuff = lbuf; /* Rank suffix. */
UNS dctype = 0; /* Declared content type (from dctab). */
UNCH fmin = 0; /* Minimization bit flags. */
int i; /* Loop counter. */
UNS u; /* Temporary variable. */
struct etd **mexgrp, **pexgrp; /* Ptr to model exceptions array. */
struct thdr *cmod, *cmodsv; /* Ptr to content model. */
UNCH *etdgi; /* GI of current etd (when going through group).*/
int minomitted = 0; /* Tag minimization parameters omitted. */
mdname = key[KELEMENT]; /* Identify declaration for messages. */
subdcl = NULL; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es level for entity nesting check. */
ranksuff[0] = 0;
mexgrp = pexgrp = 0;
/* PARAMETER 1: Element name or a group of them.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: element name or grp");
switch (pcbmd.action) {
case NAS:
nmgrp[0] = etddef(tbuf);
nmgrp[1] = 0;
break;
case GRPS:
parsegrp(nmgrp, &pcbgrnm, tbuf);
break;
default:
mderr(121, (UNCH *)0, (UNCH *)0);
return;
}
/* Save first GI for trace and error messages. */
if (nmgrp[0])
subdcl = nmgrp[0]->etdgi+1;
/* PARAMETER 1A: Rank suffix (optional).
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1A: rank suffix");
switch (pcbmd.action) {
case NUM:
ustrcpy(ranksuff, tbuf);
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
default:
break;
}
/* PARAMETER 2A: Start-tag minimization.
*/
TRACEMD("2A: start min");
switch (pcbmd.action) {
case CDR:
break;
case NAS:
if (!ustrcmp(tbuf+1, key[KO])) {
if (OMITTAG==YES) SET(fmin, SMO);
break;
}
/* fall through */
default:
if (OMITTAG==NO) {minomitted=1; break;}
mderr(129, tbuf+1, (UNCH *)0);
return;
}
/* Must omit omitted end-tag minimization, if omitted
start-tag minimization was omitted (because OMITTAG == NO). */
if (!minomitted) {
/* PARAMETER 2B: End-tag minimization.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2B: end min");
switch (pcbmd.action) {
case NAS:
if (ustrcmp(tbuf+1, key[KO])) {mderr(129, tbuf+1, (UNCH *)0); return;}
if (OMITTAG==YES) SET(fmin, EMO);
break;
case CDR:
SET(fmin, EMM);
break;
default:
mderr(129, tbuf+1, (UNCH *)0);
return;
}
/* PARAMETER 3: Declared content.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
}
TRACEMD("3: declared content");
switch (pcbmd.action) {
case NAS:
dctype = mapsrch(dctab, tbuf+1);
if (!dctype) {mderr(24, tbuf+1, (UNCH *)0); return;}
/* Eliminate incompatibilities among parameters. */
if (GET(fmin, SMO) && GET(dctype, MNONE+MCDATA+MRCDATA)) {
mderr(58, (UNCH *)0, (UNCH *)0);
RESET(fmin, SMO);
}
if (GET(dctype, MNONE) && BITON(fmin, EMM)) {
mderr(87, (UNCH *)0, (UNCH *)0);
SET(fmin, EMO);
}
/* If valid, process like a content model. */
case GRPS:
cmodsv = parsemod((int)(pcbmd.action==GRPS ? 0 : dctype));
if (cmodsv==0) return;
u = (dctype ? 1 : cmodsv->tu.tnum+2) * THSZ;
cmod = (struct thdr *)rmalloc(u);
memcpy((UNIV)cmod , (UNIV)cmodsv, u );
ds.modcnt += cmod->tu.tnum;
TRACEMOD(cmod);
break;
default:
mderr(130, (UNCH *)0, (UNCH *)0);
return;
}
/* PARAMETERS 3A, 3B: Exceptions or end.
*/
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
if (BITOFF(cmod->ttype, MCDATA+MRCDATA+MNONE)) {
/* PARAMETER 3A: Minus exceptions.
*/
TRACEMD("3A: -grp");
switch (pcbmd.action) {
case MGRP:
/* We cheat and use nnmgrp for this. */
mexgrp = copygrp((PETD *)nnmgrp,
u = parsegrp((PETD *)nnmgrp, &pcbgrnm, tbuf));
++ds.pmexgcnt; ds.pmexcnt += u-1;
TRACEGRP(mexgrp);
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
default:
break;
}
/* PARAMETER 3B: Plus exceptions.
*/
TRACEMD("3B: +grp");
switch (pcbmd.action) {
case PGRP:
pexgrp = copygrp((PETD *)nnmgrp,
u = parsegrp((PETD *)nnmgrp, &pcbgrnm, tbuf));
++ds.pmexgcnt; ds.pmexcnt += u-1;
TRACEGRP(pexgrp);
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
default:
break;
}
}
/* PARAMETER 4: End of declaration.
*/
TRACEMD(emd);
if (pcbmd.action!=EMD) mderr(126, (UNCH *)0, (UNCH *)0);
if (es!=mdessv) synerr(37, &pcbmd);
/* EXECUTE: Store the definition for each element name specified.
*/
TRACEGRP(nmgrp);
for (i = -1; nmgrp[++i];) {
etdgi = nmgrp[i]->etdgi;
if (*ranksuff) {
if ((tbuf[0] = *etdgi + ustrlen(ranksuff)) - 2 > NAMELEN) {
mderr(131, etdgi+1, ranksuff);
continue;
}
memcpy(tbuf+1, etdgi+1, *etdgi-1);
ustrcpy(tbuf+*etdgi-1, ranksuff);
etdcan(etdgi);
nmgrp[i] = etddef(tbuf);
}
if (nmgrp[i]->etdmod) {mderr(56, etdgi+1, (UNCH *)0); continue;}
etdset(nmgrp[i], fmin+ETDDCL, cmod, mexgrp, pexgrp, nmgrp[i]->etdsrm);
++ds.etdcnt;
if (nmgrp[i]->adl) etdadl(nmgrp[i]); /* Check ETD conflicts. */
TRACEETD(nmgrp[i]);
}
}
VOID adlfree(al, aln)
struct ad *al;
int aln;
{
for (; aln <= ADN(al); aln++) {
frem((UNIV)al[aln].adname);
if (ADVAL(al, aln))
frem((UNIV)ADVAL(al, aln));
if (BITON(ADFLAGS(al, aln), AGROUP)) {
int i;
for (i = 0; i < ADNUM(al, aln); i++)
frem((UNIV)al[aln + i + 1].adname);
aln += ADNUM(al, aln);
}
}
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

801
usr.bin/sgmls/sgmls/md2.c Normal file
View File

@ -0,0 +1,801 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
/* MDENTITY: Process ENTITY declaration.
*/
VOID mdentity(tbuf)
UNCH *tbuf; /* Work area for tokenization[LITLEN+2]. */
{
struct fpi fpicb; /* Formal public identifier structure. */
struct fpi *fpis = &fpicb; /* Ptr to current or #DEFAULT fpi. */
union etext etx; /* Ptr to entity text. */
UNCH estore = ESM; /* Entity storage class. */
struct entity *ecb; /* Ptr to entity control block. */
int parmsw = 0; /* 1=parameter entity declaration; 0 = not. */
int defltsw = 0; /* 1=#DEFAULT declaration; 0=not. */
PNE pne = 0; /* Ptr to N/C/SDATA entity control block. */
mdname = key[KENTITY]; /* Declaration name for messages. */
subdcl = NULL; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es for checking entity nesting. */
/* PARAMETER 1: Entity name.
*/
pcbmd.newstate = 0;
parsemd(nmbuf, ENTCASE, &pcblitp, NAMELEN);
TRACEMD("1: entity nm");
switch (pcbmd.action) {
case PEN:
parsemd(nmbuf + 1, ENTCASE, &pcblitp, NAMELEN);
if (pcbmd.action!=NAS) {mderr(120, (UNCH *)0, (UNCH *)0); return;}
if (nmbuf[1] == NAMELEN + 2) {
/* It was too long. */
nmbuf[0] = NAMELEN + 2;
nmbuf[NAMELEN + 1] = '\0';
mderr(65, (UNCH *)0, (UNCH *)0);
}
else
nmbuf[0] = nmbuf[1] + 1; /* Increment length for PERO. */
nmbuf[1] = lex.d.pero; /* Prefix PERO to name. */
parmsw = 1; /* Indicate parameter entity. */
case NAS:
break;
case RNS: /* Reserved name started. */
if (ustrcmp(nmbuf+1, key[KDEFAULT])) {
mderr(118, nmbuf+1, key[KDEFAULT]);
return;
}
memcpy(nmbuf, indefent, *indefent);/* Copy #DEFAULT to name buffer. */
fpis = &fpidf; /* Use #DEFAULT fpi if external. */
defltsw = 1; /* Indicate #DEFAULT is being defined.*/
break;
default:
mderr(122, (UNCH *)0, (UNCH *)0);
return;
}
subdcl = nmbuf+1; /* Subject name for error messages. */
/* PARAMETER 2: Entity text keyword (optional).
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
TRACEMD("2: keyword");
switch (pcbmd.action) {
case NAS:
if ((estore = (UNCH)mapsrch(enttab, tbuf+1))==0) {
estore = parmsw ? ESP : ESF;
pne = (PNE)rmalloc(NESZ);
if (mdextid(tbuf, fpis, nmbuf+1+parmsw, &estore, pne)==0)
return;
if (defltsw) etx.x = NULL;
else if ((etx.x = entgen(&fpicb))==0) {
if (parmsw)
mderr(148, nmbuf+2, (UNCH *)0);
else
mderr(147, nmbuf+1, (UNCH *)0);
}
goto parm4;
}
if (parmsw && (estore==ESX || estore==ESC)) {
mderr(38, tbuf+1, (UNCH *)0);
estore = ESM;
}
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
break;
default:
estore = ESM;
break;
}
/* PARAMETER 3: Parameter literal.
*/
TRACEMD("3: literal");
switch (pcbmd.action) {
case LITE:
case LIT:
switch (estore) {
case ESM: /* LITERAL: parameter literal required. */
case ESC: /* CDATA: parameter literal required. */
case ESX: /* SDATA: parameter literal required. */
case ESI: /* PI: parameter literal required. */
etx.c = savestr(tbuf);
break;
case ESMD: /* MD: parameter literal required. */
etx.c = sandwich(tbuf, lex.m.mdo, lex.m.mdc);
goto bcheck;
case ESMS: /* MS: parameter literal required. */
etx.c = sandwich(tbuf, lex.m.mss, lex.m.mse);
goto bcheck;
case ESS: /* STARTTAG: parameter literal required. */
etx.c = sandwich(tbuf, lex.m.stag, lex.m.tagc);
goto bcheck;
case ESE: /* ENDTAG: parameter literal required. */
etx.c = sandwich(tbuf, lex.m.etag, lex.m.tagc);
bcheck:
if (etx.c == 0) {
mderr(225, (UNCH *)0, (UNCH *)0);
return;
}
break;
}
break;
default:
mderr(123, (UNCH *)0, (UNCH *)0);
return;
}
/* PARAMETER 4: End of declaration.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
parm4:
TRACEMD(emd);
if (pcbmd.action!=EMD) mderr(126, (UNCH *)0, (UNCH *)0);
if (es!=mdessv) synerr(37, &pcbmd);
/* EXECUTE: If the entity already exists, ignore the new definition.
If it is a new entity, store the definition.
*/
if ((ecb = entfind(nmbuf))!=0 && ecb->estore) {
if (ecb->dflt) {
mderr(228, nmbuf + 1, (UNCH *)0);
hout((THASH)etab, nmbuf, hash(nmbuf, ENTHASH));
if (ecb->estore == ESN) {
frem((UNIV)NEID(ecb->etx.n));
frem((UNIV)ecb->etx.n);
}
else if (ecb->estore >= ESFM)
frem((UNIV)ecb->etx.x);
frem((UNIV)ecb);
}
else {
/* Duplicate definition: not an error. */
if (sw.swdupent) mderr(68, nmbuf+1, (UNCH *)0);
if (estore<ESFM) frem((UNIV)etx.c);
return;
}
}
++ds.ecbcnt; /* Do capacity before NOTATION. */
ds.ecbtext += estore<ESFM ? ustrlen(etx.c) : entlen;
ecb = entdef(nmbuf, estore, &etx); /* Define the entity. */
if (estore==ESN) { /* If entity is external: */
NEENAME(pne) = ecb->ename; /* Store entity name in ne. */
NEID(pne) = etx.x; /* Store system fileid in ne. */
NESYSID(pne) = fpis->fpisysis ? savestr(fpis->fpisysis) : 0;
NEPUBID(pne) = fpis->fpipubis ? savestr(fpis->fpipubis) : 0;
ecb->etx.n = pne; /* Store ne control block in etx. */
TRACEESN(pne);
}
else if (pne)
frem((UNIV)pne);
if (defltsw) {
ecbdeflt = ecb; /* If #DEFAULT save ecb. */
if (fpidf.fpipubis)
fpidf.fpipubis = savestr(fpidf.fpipubis);
if (fpidf.fpisysis)
fpidf.fpisysis = savestr(fpidf.fpisysis);
}
}
/* SANDWICH: Catenate a prefix and suffix to a string.
The result has an EOS but no length.
Return 0 if the result if longer than LITLEN.
*/
UNCH *sandwich(s, pref, suff)
UNCH *s; /* String, with EOS. */
UNCH *pref; /* Prefix, with length and EOS. */
UNCH *suff; /* Suffix, with length and EOS. */
{
UNCH *pt;
UNS slen, tlen;
slen = ustrlen(s);
tlen = slen + (*pref - 2) + (*suff - 2);
if (tlen > LITLEN)
return 0;
pt = (UNCH *)rmalloc(tlen + 1);
memcpy(pt, pref + 1, *pref - 2);
memcpy(pt + (*pref - 2), s, slen);
memcpy(pt + (*pref - 2) + slen, suff + 1, *suff - 1);
return pt;
}
/* MDEXTID: Process external identifier parameter of a markup declaration.
On entry, tbuf contains SYSTEM or PUBLIC if all is well.
NULL is returned if an error, otherwise fpis. If it is a
valid external data entity, the caller's estore is set to ESN
and its nxetype is set to the code for the external entity type.
The event that terminated the parse is preserved in pcb.action,
so the caller should process it before further parsing.
*/
struct fpi *mdextid(tbuf, fpis, ename, estore, pne)
UNCH *tbuf; /* Work area for tokenization[2*(LITLEN+2)]. */
struct fpi *fpis; /* FPI structure. */
UNCH *ename; /* Entity or notation name, with EOS, no length.*/
/* NOTE: No PERO on parameter entity name. */
UNCH *estore; /* DTD, general or parameter entity, DCN. */
PNE pne; /* Caller's external entity ptr. */
{
PDCB dcb; /* Ptr to DCN control block. */
int exidtype; /* External ID type: 0=none 1=system 2=public. */
int exetype; /* External entity type. */
MEMZERO((UNIV)fpis, (UNS)FPISZ); /* Initialize fpi structure. */
/* Move entity name into fpi (any PERO was stripped by caller). */
fpis->fpinm = ename;
entlen = 0; /* Initialize external ID length. */
/* PARAMETER 1: External identifier keyword or error.
*/
TRACEMD("1: extid keyword");
if ((exidtype = mapsrch(exttab, tbuf+1))==0) {
mderr(29, (UNCH *)0, (UNCH *)0);
return (struct fpi *)0;
}
if (exidtype==EDSYSTEM) goto parm3;
/* PARAMETER 2: Public ID literal.
*/
pcbmd.newstate = 0;
/* The length of a minimum literal cannot exceed the value of LITLEN
in the reference quantity set. */
parsemd(pubibuf, NAMECASE, &pcblitv, REFLITLEN);
TRACEMD("2: pub ID literal");
switch (pcbmd.action) {
case LITE: /* Use alternative literal delimiter. */
case LIT: /* Save literal as public ID string. */
entlen = ustrlen(pubibuf);
fpis->fpipubis = pubibuf;
break;
default:
mderr(117, (UNCH *)0, (UNCH *)0);
return (struct fpi *)0; /* Signal error to caller. */
}
/* PARAMETER 3: System ID literal.
*/
parm3:
pcbmd.newstate = 0;
parsemd(sysibuf, NAMECASE, &pcblitc, LITLEN);
TRACEMD("3: sys ID literal");
if (pcbmd.action==LIT || pcbmd.action==LITE) {
entlen += ustrlen(sysibuf);
fpis->fpisysis = sysibuf;
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
}
else memcpy(tbuf, sysibuf, *sysibuf);
if (*estore!=ESF || pcbmd.action!=NAS) goto genfpi;
/* PARAMETER 4: Entity type keyword.
*/
TRACEMD("4: Entity type");
if ((exetype = mapsrch(extettab, tbuf+1))==0) {
mderr(24, tbuf+1, (UNCH *)0);
return (struct fpi *)0;
}
if (exetype==ESNSUB && SUBDOC == NO) {
mderr(90, tbuf+1, (UNCH *)0);
return (struct fpi *)0;
}
NEXTYPE(pne) = (UNCH)exetype; /* Save entity type in caller's ne. */
*estore = ESN; /* Signal that entity is a data entity. */
if (exetype==ESNSUB) {
pne->nedcn = 0;
pcbmd.newstate = 0; /* Parse next token for caller. */
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
goto genfpi;
}
/* PARAMETER 5: Notation name.
*/
pcbmd.newstate = 0;
parsemd(lbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("5: notation");
if (pcbmd.action!=NAS) {mderr(119, tbuf+1, (UNCH *)0); return (struct fpi *)0;}
/* Locate the data content notation. */
pne->nedcn = dcb = dcndef(lbuf);
/* Note that we have defined an entity with this notation.
If attributes are later defined for this notation, we'll
have to fix up this entity. */
dcb->entsw = 1;
/* PARAMETER 6: Data attribute specification.
*/
pcbmd.newstate = 0;
parsemd(lbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("6: [att list]");
if (pcbmd.action!=MDS) { /* No attributes specified. */
if (dcb->adl == 0)
NEAL(pne) = 0;
else {
initatt(dcb->adl);
adlval((int)ADN(al), (struct etd *)0);
storedatt(pne);
}
goto genfpi;
}
if (dcb->adl==0) { /* Atts specified, but none defined. */
mderr(22, (UNCH *)0, (UNCH *)0);
return (struct fpi *)0;
}
pcbstag.newstate = pcbstan; /* First separator is optional. */
if ((parseatt(dcb->adl, tbuf))==0)/* Empty list. */
mderr(91, (UNCH *)0, (UNCH *)0);
else {
adlval((int)ADN(al), (struct etd *)0);
storedatt(pne);
}
parse(&pcbeal); /* Parse the list ending. */
pcbmd.newstate = 0; /* Parse next token for caller. */
parsemd(tbuf, NAMECASE, &pcblitp, LITLEN);
/* GENFPI: Builds a formal public identifier structure, including the
entity name, offsets of the components of the public ID, and
other data a system might use to identify the actual file.
*/
genfpi:
TRACEMD("7: generate fpi");
fpis->fpistore = *estore - ESFM + 1; /* External entity type: 1-6. */
if (*estore == ESN) {
if (NEXTYPE(pne) == ESNSUB)
fpis->fpinedcn = 0;
else
fpis->fpinedcn = NEDCN(pne) + 1;
}
/* Analyze public ID and make structure entries. */
if (exidtype==EDPUBLIC) {
if (FORMAL==NO)
fpis->fpiversw = -1;
else if (parsefpi(fpis)>0) {
mderr(88, fpis->fpipubis, (UNCH *)0);
fpis->fpiversw = -1; /* Signal bad formal public ID. */
}
}
return fpis;
}
/* Store a data attribute. */
VOID storedatt(pne)
PNE pne;
{
int i;
NEAL(pne) = (struct ad *)rmalloc((1+ADN(al))*ADSZ);
memcpy((UNIV)NEAL(pne), (UNIV)al, (1+ADN(al))*ADSZ);
for (i = 1; i <= (int)ADN(al); i++) {
if (GET(ADFLAGS(al, i), ASPEC))
ds.attdef += ADLEN(al, i);
if (NEAL(pne)[i].addef != 0)
NEAL(pne)[i].addef = savestr(NEAL(pne)[i].addef);
}
ds.attcnt += AN(al); /* Number of attributes defined. */
#if 0
/* I can't see any reason to increase AVGRPCNT here. */
ds.attgcnt += ADN(al) - AN(al); /* Number of att grp members. */
#endif
}
/* PARSEFPI: Parses a formal public identifier and builds a control block.
PARSEFPI returns a positive error code (1-10), or 0 if no errors.
It set fpiversw if no version was specified in the ID and the
public text is in a class that permits display versions.
Note: An empty version ("//") can be specified (usually it is
the non-device-specific form, such as a definitional entity set).
*/
int parsefpi(f)
PFPI f; /* Ptr to formal public identifier structure. */
{
UNCH *l; /* Pointer to EOS of public identifier. */
UNCH *p, *q; /* Ptrs to current field in public identifier. */
UNS len; /* Field length */
p = f->fpipubis; /* Point to start of identifier. */
l = p + ustrlen(p); /* Point to EOS of identifier. */
if (*p=='+' || *p=='-') { /* If owner registered, unregistered. */
f->fpiot = *p; /* Save owner type. */
if ((p += 3)>=l) return 1; /* Get to owner ID field. */
}
else f->fpiot = '!'; /* Indicate ISO owner identifier. */
if ((q = pubfield(p, l, '/', &len))==0) /* Find end of owner ID field. */
return 2;
f->fpiol = len; /* Save owner ID length. */
f->fpio = p - f->fpipubis; /* Save offset in pubis to owner ID. */
if ((p = pubfield(q, l, ' ', &len))==0) /* Find end of text class field. */
return 3;
*(--p) = EOS; /* Temporarily make class a string. */
f->fpic = mapsrch(pubcltab, q); /* Check for valid uc class name.*/
*p++ = ' '; /* Restore the SPACE delimiter. */
if (f->fpic==0) return 4; /* Error if not valid uc class name.*/
/* The public text class in a notation identifier must be NOTATION. */
if (f->fpistore == ESK - ESFM + 1 && f->fpic != FPINOT) return 10;
if (*p=='-') { /* If text is unavailable public text.*/
f->fpitt = *p; /* Save text type. */
if ((p += 3)>=l) return 5; /* Get to text description field. */
}
else f->fpitt = '+'; /* Indicate available public text. */
if ((q = pubfield(p, l, '/', &len))==0) /* Find end of text description. */
return 6;
f->fpitl = len; /* Save text description length. */
f->fpit = p - f->fpipubis; /* Save ptr to description.*/
p = pubfield(q, l, '/', &len); /* Bound language field. */
if (f->fpic != FPICHARS) {
int i;
/* Language must be all upper-case letters. */
/* The standard only says that it *should* be two letters, so
don't enforce that. */
for (i = 0; i < len; i++) {
/* Don't assume ASCII. */
if (!strchr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", q[i]))
return 7;
}
}
f->fpill = len;
f->fpil = q - f->fpipubis;
if (p!=0) { /* If there is a version field: */
if (f->fpic<FPICMINV) /* Error if class prohibits versions. */
return 8;
if ((pubfield(p, l, '/', &len))!=0) /* Bound version field. */
return 9; /* Error if yet another field. */
f->fpivl = len; /* Save version length. */
f->fpiv = p - f->fpipubis; /* Save ptr (in pubis) to version. */
}
else if (f->fpic>=FPICMINV) f->fpiversw = 1;/* No version: get the best. */
return(0);
}
/* PUBFIELD: Returns ptr to next field, or NULL if ID has ended.
*/
#ifdef USE_PROTOTYPES
UNCH *pubfield(UNCH *p, UNCH *l, UNCH d, UNS *lenp)
#else
UNCH *pubfield(p, l, d, lenp)
UNCH *p; /* Public identifier field (no length or EOS). */
UNCH *l; /* Pointer to EOS of public identifier. */
UNCH d; /* Field delimiter: ' ' or '/'. */
UNS *lenp; /* Gets field length */
#endif
{
UNCH *psv = p+1; /* Save starting value of p. */
while (p<l) {
if (*p++==d) { /* Test for delimiter character. */
*lenp = p - psv; /* Save field length (no len or EOS). */
if (d=='/' && *p++!=d) /* Solidus requires a second one. */
continue;
return(p); /* Return ptr to next field. */
}
}
*lenp = p - --psv; /* Save field length (no len or EOS). */
return NULL;
}
/* MDMS: Process marked section start.
If already in special parse, bump the level counters and return
without parsing the declaration.
*/
struct parse *mdms(tbuf, pcb)
UNCH *tbuf; /* Work area for tokenization [NAMELEN+2]. */
struct parse *pcb; /* Parse control block for this parse. */
{
int key; /* Index of keyword in mslist. */
int ptype; /* Parameter token type. */
int pcbcode = 0; /* Parse code: 0=same; 2-4 per defines. */
if (++mslevel>TAGLVL) {
--mslevel;
sgmlerr(27, (struct parse *)0, ntoa(TAGLVL), (UNCH *)0);
}
/* If already in IGNORE mode, return without parsing parameters. */
if (msplevel) {++msplevel; return(pcb);}
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es for checking entity nesting. */
pcbmd.newstate = pcbmdtk; /* First separator is optional. */
/* PARAMETERS: TEMP, RCDATA, CDATA, IGNORE, INCLUDE, or MDS. */
while ((ptype = parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN))==NAS){
if ((key = mapsrch(mstab, tbuf+1))==0) {
sgmlerr(64, (struct parse *)0, ntoa(parmno), tbuf+1);
continue;
}
if (key==MSTEMP) continue; /* TEMP: for documentation. */
msplevel = 1; /* Special parse required. */
if (key>pcbcode) pcbcode = key; /* Update if higher priority. */
}
if (ptype!=MDS) {
NEWCC; /* Syntax error did REPEATCC. */
sgmlerr(97, (struct parse *)0, lex.m.dso, (UNCH *)0);
REPEATCC; /* 1st char of marked section. */
}
if (es!=mdessv) synerr(37, pcb);
TRACEMS(1, pcbcode, mslevel, msplevel);
if (pcbcode==MSIGNORE) pcb = &pcbmsi;
else if (pcbcode) {
pcb = pcbcode==MSCDATA ? &pcbmsc : (rcessv = es, &pcbmsrc);
}
return(pcb); /* Tell caller whether to change the parse. */
}
/* MDMSE: Process marked section end.
Issue an error if no marked section had started.
*/
int mdmse()
{
int retcode = 0; /* Return code: 0=same parse; 1=cancel special. */
if (mslevel) --mslevel;
else sgmlerr(26, (struct parse *)0, (UNCH *)0, (UNCH *)0);
if (msplevel) if (--msplevel==0) retcode = 1;
TRACEMS(0, retcode, mslevel, msplevel);
return retcode;
}
/* MDNOT: Process NOTATION declaration.
*/
VOID mdnot(tbuf)
UNCH *tbuf; /* Work area for tokenization[LITLEN+2]. */
{
struct fpi fpicb; /* Formal public identifier structure. */
PDCB dcb; /* Ptr to notation entity in dcntab. */
UNCH estore = ESK; /* Entity storage class. */
mdname = key[KNOTATION]; /* Identify declaration for messages. */
subdcl = NULL; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es for checking entity nesting. */
/* PARAMETER 1: Notation name.
*/
pcbmd.newstate = 0;
parsemd(lbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: name");
if (pcbmd.action!=NAS) {mderr(120, (UNCH *)0, (UNCH *)0); return;}
subdcl = lbuf+1; /* Save notation name for error msgs. */
/* PARAMETER 2: External identifier keyword.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2: extid");
if (pcbmd.action!=NAS) {mderr(29, (UNCH *)0, (UNCH *)0); return;}
if (mdextid(tbuf, &fpicb, lbuf+1, &estore, (PNE)0)==0) return;
/* PARAMETER 3: End of declaration.
Token was parsed by MDEXTID.
*/
TRACEMD(emd);
if (pcbmd.action!=EMD) mderr(126, (UNCH *)0, (UNCH *)0);
if (es!=mdessv) synerr(37, &pcbmd);
/* EXECUTE: Store notation name.
*/
if ((dcb = dcnfind(lbuf)) != 0 && dcb->defined) {
mderr(56, lbuf+1, (UNCH *)0);
return;
}
/* else */
dcb = dcndef(lbuf);
dcb->defined = 1;
dcb->sysid = fpicb.fpisysis ? savestr(fpicb.fpisysis) : 0;
dcb->pubid = fpicb.fpipubis ? savestr(fpicb.fpipubis) : 0;
++ds.dcncnt;
ds.dcntext += entlen;
TRACEDCN(dcb);
return;
}
/* DCNDEF: Define a notation and return its DCNCB.
If caller does not care if it already exists,
he should specify NULL for the notation text
so we don't clobber the existing text (if any).
*/
struct dcncb *dcndef(nname)
UNCH *nname; /* Notation name (with length and EOS). */
{
return((PDCB)hin((THASH)dcntab, nname, 0, DCBSZ));
}
/* DCNFIND: If a notation was declared, return its DCNCB.
Return NULL if it is not defined.
*/
struct dcncb *dcnfind(nname)
UNCH *nname; /* Notation name (with length and EOS). */
{
return((PDCB)hfind((THASH)dcntab, nname, 0));
}
#define SRM(i) (srhptr->srhsrm[i]) /* Current entry in SHORTREF map. */
/* MDSRMDEF: Process short reference mapping declaration.
*/
VOID mdsrmdef(tbuf)
UNCH *tbuf; /* Work area for tokenization[LITLEN+2]. */
{
struct entity *entcb; /* Ptr to defined entity. */
PSRH srhptr; /* Ptr to short reference map hdr (in srhtab).*/
int srn; /* Short reference delimiter number in srdeltab.*/
int mapused = 0; /* Has map already been used? */
mdname = key[KSHORTREF]; /* Identify declaration for messages. */
subdcl = NULL; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
if (!sd.shortref) {mderr(198, (UNCH *)0, (UNCH *)0); return;}
mdessv = es; /* Save es for checking entity nesting. */
/* PARAMETER 1: SHORTREF map name.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: map nm");
if (pcbmd.action!=NAS) {mderr(120, (UNCH *)0, (UNCH *)0); return;}
if ((srhptr = srhfind(tbuf))!=0) {
mapused = 1;
/* Error if map was declared (not just used). */
if (SRM(0)) {mderr(56, tbuf+1, (UNCH *)0); return;}
}
else srhptr = srhdef(tbuf); /* Create map with SRs mapped to NULL.*/
SRM(0) = (PECB)srhptr; /* Indicate map was actually declared.*/
subdcl = srhptr->ename+1; /* Save map name for error msgs. */
while ( pcbmd.newstate = 0,
parsemd(tbuf, NAMECASE, &pcblitp, SRMAXLEN)==LIT
|| pcbmd.action==LITE ) {
/* PARAMETER 2: Delimiter string.
*/
TRACEMD("2: SR string");
if ((srn = mapsrch(lex.s.dtb, tbuf))==0) {
mderr(124, tbuf, (UNCH *)0);
goto cleanup;
}
/* PARAMETER 3: Entity name.
*/
pcbmd.newstate = 0;
parsemd(tbuf, ENTCASE, &pcblitp, NAMELEN);
TRACEMD("3: entity");
if (pcbmd.action!=NAS) {mderr(120, (UNCH *)0, (UNCH *)0); goto cleanup;}
if ((entcb = entfind(tbuf))==0) {
union etext etx;
etx.x = 0;
entcb = entdef(tbuf, '\0', &etx);
}
if (SRM(srn)) {
mderr(56, (srn<lex.s.prtmin ? (UNCH *)lex.s.pdtb[srn]
: lex.s.dtb[srn].mapnm), (UNCH *)0);
continue;
}
SRM(srn) = entcb;
if (srn>=lex.s.fce && srn!=lex.s.hyp && srn!=lex.s.hyp2
&& srn!=lex.s.lbr && srn!=lex.s.rbr)
lexcnm[*lex.s.dtb[srn].mapnm] = lex.l.fce;
else if (srn==lex.s.spc) lexcnm[' '] = lex.l.spcr;
}
/* PARAMETER 4: End of declaration.
*/
TRACEMD(emd);
if (parmno==2)
{mderr((UNS)(pcbmd.action==EMD ? 28:123), (UNCH *)0, (UNCH *)0); goto cleanup;}
if (pcbmd.action!=EMD) mderr(126, (UNCH *)0, (UNCH *)0);
if (es!=mdessv) synerr(37, &pcbmd);
++ds.srcnt;
TRACESRM("SHORTREF", srhptr->srhsrm, (UNCH *)0);
return;
cleanup:
/* Don't free the map if the map was in use (because of a USEMAP
declaration) before this declaration. */
if (mapused)
MEMZERO((UNIV)srhptr->srhsrm, sizeof(PECB)*(lex.s.dtb[0].mapdata+1));
else {
frem((UNIV)srhptr->srhsrm);
hout((THASH)srhtab, srhptr->ename, 0);
frem((UNIV)srhptr);
}
}
/* MDSRMUSE: Activate a short reference map.
*/
VOID mdsrmuse(tbuf)
UNCH *tbuf; /* Work area for tokenization[LITLEN+2]. */
{
PSRH srhptr; /* Ptr to short reference map hdr (in srhtab).*/
TECB srmptr; /* Ptr to short reference map (in header). */
int i; /* Loop counter; temporary variable. */
mdname = key[KUSEMAP]; /* Identify declaration for messages. */
subdcl = NULL; /* No subject as yet. */
parmno = 0; /* No parameters as yet. */
mdessv = es; /* Save es for checking entity nesting. */
/* PARAMETER 1: SHORTREF map name or "#EMPTY".
*/
pcbmd.newstate = 0;
parsemd(lbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("1: map nm");
subdcl = lbuf+1; /* Subject name for error messages. */
switch (pcbmd.action) {
case RNS: /* Empty SHORTREF map requested. */
if (ustrcmp(lbuf+1, key[KEMPTY])) {
mderr(118, lbuf+1, key[KEMPTY]);
return;
}
srmptr = SRMNULL;
break;
case NAS: /* Map name specified; save if undefined. */
if ((srhptr = srhfind(lbuf))==0) {
if (!indtdsw) {mderr(125, (UNCH *)0, (UNCH *)0); return;}
srmptr = NULL;
}
else
srmptr = srhptr->srhsrm;
break;
default:
mderr(120, (UNCH *)0, (UNCH *)0);
return;
}
/* PARAMETER 2: Element name or a group of them. (In DTD only.)
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD("2: GI or grp");
switch (pcbmd.action) {
case NAS:
if (!indtdsw) {mderr(142, (UNCH *)0, (UNCH *)0); return;}
nmgrp[0] = etddef(tbuf);
nmgrp[1] = (PETD)NULL;
break;
case GRPS:
if (!indtdsw) {mderr(142, (UNCH *)0, (UNCH *)0); return;}
parsegrp(nmgrp, &pcbgrnm, tbuf);
break;
case EMD:
if (indtdsw) {mderr(28, (UNCH *)0, (UNCH *)0); return;}
tags[ts].tsrm = srmptr;
TRACESRM("USEMAP", tags[ts].tsrm, tags[ts].tetd->etdgi+1);
goto realemd;
default:
mderr(indtdsw ? 121 : 126, (UNCH *)0, (UNCH *)0);
return;
}
/* PARAMETER 3: End of declaration.
*/
pcbmd.newstate = 0;
parsemd(tbuf, NAMECASE, &pcblitp, NAMELEN);
TRACEMD(emd);
if (pcbmd.action!=EMD) mderr(126, (UNCH *)0, (UNCH *)0);
/* If map has not yet been defined, do it and get map pointer. */
if (!srmptr) srmptr = (srhdef(lbuf))->srhsrm;
/* Store the map pointer for each element name specified.
*/
TRACEGRP(nmgrp);
for (i = -1; nmgrp[++i];) {
if (!nmgrp[i]->etdsrm) nmgrp[i]->etdsrm = srmptr;
else if (sw.swdupent) mderr(68, nmgrp[i]->etdgi+1, (UNCH *)0);
}
realemd:
if (es!=mdessv) synerr(37, &pcbmd);
}
/* SRHDEF: Define a SHORTREF map and return ptr to its header.
All entries in map are mapped to NULL.
Caller must determine whether it already exists.
*/
PSRH srhdef(sname)
UNCH *sname; /* SHORTREF map name (with length and EOS). */
{
PSRH srh; /* Ptr to SHORTREF map hdr in srhtab. */
(srh = (PSRH)hin((THASH)srhtab, sname, 0, SRHSZ))->srhsrm =
(TECB)rmalloc((UNS)(lex.s.dtb[0].mapdata+1)*sizeof(PECB));
return(srh);
}
/* SRHFIND: If a SHORTREF map was declared, return the ptr to its header.
Return NULL if it is not defined.
*/
PSRH srhfind(sname)
UNCH *sname; /* SHORTREF map name (with length and EOS). */
{
return((PSRH)hfind((THASH)srhtab, sname, 0));
}
#undef SRM
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

252
usr.bin/sgmls/sgmls/msg.h Normal file
View File

@ -0,0 +1,252 @@
/*
Severity codes:
I information (not an SGML error at all)
W warning (an SGML markup error but it knows what you mean)
E error
C critical (fatal)
Type codes:
R resource
C ?context/content
M minimization
Q quantity
S syntax
D declaration
U unsupported feature
*/
struct {
char *text;
char severity;
char type;
} messages[] = {
/* 0 */ {0},
/* 1 */ {"%s element not allowed at this point in %s element", 'E', 'C'},
/* 2 */ {"%s markup declaration not permitted here; declaration ended", 'E', 'D'},
/* 3 */ {"Length of name, number, or token exceeded NAMELEN or LITLEN limit", 'E', 'Q'},
/* 4 */ {"Non-SGML character ignored", 'E', 'S'},
/* 5 */ {"%s end-tag ignored: doesn't end any open element (current is %s)", 'E', 'C'},
/* 6 */ {"%s start-tag exceeds open element limit; possible lies from %s on", 'E', 'Q'},
/* 7 */ {"Start-tag omitted from %s with empty content", 'E', 'M'},
/* 8 */ {"Illegal entity end in markup or delimited text", 'E', 'S'},
/* 9 */ {"Incorrect character in markup; markup terminated", 'E', 'S'},
/* 10 */ {"Data not allowed at this point in %s element", 'E', 'C'},
/* 11 */ {"No element declaration for %s end-tag GI; end-tag ignored", 'E', 'C'},
/* 12 */ {"%s name ignored: not a syntactically valid SGML name", 'E', 'S'},
/* 13 */ {"%s = \"%s\" attribute ignored: not defined for this element", 'E', 'C'},
/* 14 */ {"%s = \"%s\" attribute value defaulted: invalid character", 'E', 'S'},
/* 15 */ {"%s = \"%s\" attribute value defaulted: token too long", 'E', 'Q'},
/* 16 */ {"%s = \"%s\" attribute value defaulted: too many tokens", 'E', 'C'},
/* 17 */ {"%s = \"%s\" attribute value defaulted: wrong token type", 'E', 'C'},
/* 18 */ {"%s = \"%s\" attribute value defaulted: token not in group", 'E', 'C'},
/* 19 */ {"Required %s attribute was not specified; may affect processing", 'E', 'C'},
/* 20 */ {"%s end-tag implied by %s end-tag; not minimizable", 'E', 'M'},
/* 21 */ {"%s start-tag implied by %s start-tag; not minimizable", 'W', 'M'},
/* 22 */ {"Possible attributes treated as data because none were defined", 'E', 'C'},
/* 23 */ {"Duplicate specification occurred for \"%s\"; may affect processing", 'E', 'D'},
/* 24 */ {"\"%s\" keyword invalid; declaration terminated", 'E', 'D'},
/* 25 */ {"%s = \"%s\" attribute defaulted: empty string not allowed for token", 'E', 'C'},
/* 26 */ {"Marked section end ignored; not in a marked section", 'E', 'S'},
/* 27 */ {"Marked section start ignored; %s marked sections open already", 'E', 'Q'},
/* 28 */ {"One or more parameters missing; declaration ignored", 'E', 'D'},
/* 29 */ {"\"PUBLIC\" or \"SYSTEM\" required; declaration terminated", 'E', 'D'},
/* 30 */ {"%s element ended prematurely; required %s omitted", 'E', 'C'},
/* 31 */ {"Entity \"%s\" terminated: could not read file", 'E', 'R'},
/* 32 */ {"Could not open file for entity \"%s\"; entity reference ignored", 'E', 'R'},
/* 33 */ {"Insufficient main memory; unable to continue parsing", 'C', 'R'},
/* 34 */ {"%s entity reference ignored; exceeded open entity limit (%s)", 'E', 'Q'},
/* 35 */ {"No declaration for entity \"%s\"; reference ignored", 'E', 'C'},
/* 36 */ {"%s entity reference occurred within own text; reference ignored", 'E', 'C'},
/* 37 */ {"Entity nesting level out of sync", 'E', 'S'},
/* 38 */ {"Parameter entity text cannot have %s keyword; keyword ignored", 'E', 'D'},
/* 39 */ {"%s end-tag implied by %s start-tag; not minimizable", 'W', 'M'},
/* 40 */ {"Start-tag minimization ignored; element has required attribute", 'E', 'D'},
/* 41 */ {"Required %s element cannot be excluded from %s element", 'E', 'C'},
/* 42 */ {"No DOCTYPE declaration; document type is unknown", 'E', 'C'},
/* 43 */ {"Undefined %1$s start-tag GI was used in DTD; \"%1$s O O ANY\" assumed", 'E', 'C'},
/* 44 */ {"Invalid character(s) ignored; attempting to resume DOCTYPE subset", 'E', 'S'},
/* 45 */ {"No declaration for entity \"%s\"; default definition used", 'I', 'C'},
/* 46 */ {"%s end-tag implied by NET delimiter; not minimizable", 'W', 'M'},
/* 47 */ {"%s end-tag implied by data; not minimizable", 'W', 'M'},
/* 48 */ {"%s end-tag implied by short start-tag (no GI); not minimizable", 'W', 'M'},
/* 49 */ {"%s start-tag implied by data; not minimizable", 'W', 'M'},
/* 50 */ {"%s start-tag implied by short start-tag (no GI); not minimizable", 'W', 'M'},
/* 51 */ {"Short end-tag (no GI) ignored: no open elements", 'E', 'C'},
/* 52 */ {"No definition for %1$s document type; \"%1$s O O ANY\" assumed", 'E', 'C'},
/* 53 */ {"No definition for %1$s implied start-tag; \"%1$s O O ANY\" assumed", 'E', 'C'},
/* 54 */ {"%s element ended prematurely; required subelement omitted", 'E', 'C'},
/* 55 */ {"Content model token %s: connectors conflict; first was used", 'E', 'D'},
/* 56 */ {"Duplicate specification occurred for \"%s\"; duplicate ignored", 'E', 'D'},
/* 57 */ {"Bad end-tag in R/CDATA element; treated as short (no GI) end-tag", 'E', 'S'},
/* 58 */ {"Start-tag minimization should be \"-\" for element with declared content", 'I', 'D'},
/* 59 */ {"Reference to PI entity not permitted here; reference ignored", 'E', 'S'},
/* 60 */ {"Non-SGML character found; should have been character reference", 'W', 'S'},
/* 61 */ {"Numeric character reference exceeds 255; reference ignored", 'E', 'S'},
/* 62 */ {"Invalid alphabetic character reference ignored", 'E', 'S'},
/* 63 */ {"Invalid character in minimum literal; character ignored", 'E', 'S'},
/* 64 */ {"Keyword %s ignored; \"%s\" is not a valid marked section keyword", 'E', 'D'},
/* 65 */ {"Parameter entity name longer than (NAMELEN-1); truncated", 'E', 'Q'},
/* 66 */ {"Start-tag length exceeds TAGLEN limit; parsed correctly", 'W', 'Q'},
/* 67 */ {"%s attribute defaulted: FIXED attribute must equal default", 'W', 'C'},
/* 68 */ {"Duplicate specification occurred for \"%s\"; duplicate ignored", 'I', 'D'},
/* 69 */ {"%s = \"%s\" IDREF attribute ignored: referenced ID does not exist", 'E', 'C'},
/* 70 */ {"%s = \"%s\" IDREF attribute ignored: number of IDs in list exceeds GRPCNT limit", 'E', 'Q'},
/* 71 */ {"%s = \"%s\" ID attribute ignored: ID in use for another element", 'E', 'C'},
/* 72 */ {"%s = \"%s\" ENTITY attribute not general entity; may affect processing", 'E', 'C'},
/* 73 */ {"%s = \"%s\" attribute ignored: previously specified in same list", 'W', 'C'},
/* 74 */ {"\"?\" = \"%s\" name token ignored: not in any group in this list", 'E', 'C'},
/* 75 */ {"Normalized attribute specification length over ATTSPLEN limit", 'E', 'Q'},
/* 76 */ {"%s = \"%s\" NOTATION ignored: element content is empty", 'E', 'C'},
/* 77 */ {"%s = \"%s\" NOTATION undefined: may affect processing", 'E', 'C'},
/* 78 */ {"Entity \"%2$s\" has undefined notation \"%1$s\"", 'E', 'C'},
/* 79 */ {"%s = \"%s\" default attribute value not in group; #IMPLIED used", 'E', 'C'},
/* 80 */ {"#CURRENT default value treated as #IMPLIED for %s ID attribute", 'E', 'D'},
/* 81 */ {"ID attribute %s cannot have a default value; treated as #IMPLIED", 'E', 'D'},
/* 82 */ {"%s attribute must be token, not empty string; treated as #IMPLIED", 'E', 'D'},
/* 83 */ {"NOTATION attribute ignored for EMPTY element", 'E', 'D'},
/* 84 */ {"%s = \"%s\" NOTATION ignored: content reference specified", 'E', 'C'},
/* 85 */ {"#CONREF default value treated as #IMPLIED for EMPTY element", 'W', 'D'},
/* 86 */ {"%s = \"%s\" entity not data entity; may affect processing", 'E', 'C'},
/* 87 */ {"End-tag minimization should be \"O\" for EMPTY element", 'I', 'D'},
/* 88 */ {"Formal public identifier \"%s\" invalid; treated as informal", 'E', 'S'},
/* 89 */ {"Out-of-context %2$s start-tag ended %1$s document element (and parse)", 'E', 'C'},
/* 90 */ {"\"%s\" keyword is for unsupported feature; declaration terminated", 'E', 'D'},
/* 91 */ {"Attribute specification list in prolog cannot be empty", 'E', 'D'},
/* 92 */ {"Document ended invalidly within a literal; parsing ended", 'C', 'S'},
/* 93 */ {"Short ref in map \"%2$s\" to undeclared entity \"%1$s\" treated as data", 'E', 'C'},
/* 94 */ {"Could not reopen file to continue entity \"%s\"; entity terminated", 'E', 'R'},
/* 95 */ {"Out-of-context data ended %s document element (and parse)", 'E', 'C'},
/* 96 */ {"Short start-tag (no GI) ended %s document element (and parse)", 'E', 'C'},
/* 97 */ {"DSO delimiter (%s) omitted from marked section declaration", 'E', 'D'},
/* 98 */ {"Group token %s: duplicate name or name token \"%s\" ignored", 'E', 'D'},
/* 99 */ {"Attempt to redefine %s attribute ignored", 'E', 'D'},
/* 100 */ {"%s definition ignored: %s is not a valid declared value keyword", 'E', 'D'},
/* 101 */ {"%s definition ignored: NOTATION attribute already defined", 'E', 'D'},
/* 102 */ {"%s definition ignored: ID attribute already defined", 'E', 'D'},
/* 103 */ {"%s definition ignored: no declared value specified", 'E', 'D'},
/* 104 */ {"%s definition ignored: invalid declared value specified", 'E', 'D'},
/* 105 */ {"%s definition ignored: number of names or name tokens in group exceeded GRPCNT limit", 'E', 'D'},
/* 106 */ {"%s definition ignored: name group omitted for NOTATION attribute", 'E', 'D'},
/* 107 */ {"#CONREF default value treated as #IMPLIED for %s ID attribute", 'E', 'D'},
/* 108 */ {"%s definition ignored: %s is not a valid default value keyword", 'E', 'D'},
/* 109 */ {"%s definition ignored: no default value specified", 'E', 'D'},
/* 110 */ {"%s definition ignored: invalid default value specified", 'E', 'D'},
/* 111 */ {"More than ATTCNT attribute names and/or name (token) values; terminated", 'E', 'D'},
/* 112 */ {"Attempted redefinition of attribute definition list ignored", 'E', 'D'},
/* 113 */ {"Content model token %s: more than GRPCNT model group tokens; terminated", 'E', 'Q'},
/* 114 */ {"Content model token %s: more than GRPGTCNT content model tokens; terminated", 'E', 'Q'},
/* 115 */ {"Content model token %s: more than GRPLVL nested model groups; terminated", 'E', 'Q'},
/* 116 */ {"Content model token %s: %s invalid; declaration terminated", 'E', 'D'},
/* 117 */ {"\"PUBLIC\" specified without public ID; declaration terminated", 'E', 'D'},
/* 118 */ {"\"%s\" keyword invalid (only %s permitted); declaration terminated", 'E', 'D'},
/* 119 */ {"\"%s\" specified without notation name; declaration terminated", 'E', 'D'},
/* 120 */ {"Parameter must be a name; declaration terminated", 'E', 'D'},
/* 121 */ {"Parameter must be a GI or a group of them; declaration terminated", 'E', 'D'},
/* 122 */ {"Parameter must be a name or PERO (%%); declaration terminated", 'E', 'D'},
/* 123 */ {"Parameter must be a literal; declaration terminated", 'E', 'D'},
/* 124 */ {"\"%s\" not valid short reference delimiter; declaration terminated", 'E', 'D'},
/* 125 */ {"Map does not exist; declaration ignored", 'E', 'C'},
/* 126 */ {"MDC delimiter (>) expected; following text may be misinterpreted", 'E', 'D'},
/* 127 */ {"Document ended invalidly within prolog; parsing ended", 'C', 'S'},
/* 128 */ {"\"PUBLIC\" or \"SYSTEM\" or DSO ([) required; declaration terminated", 'E', 'D'},
/* 129 */ {"Minimization must be \"-\" or \"O\" (not \"%s\"); declaration terminated", 'E', 'D'},
/* 130 */ {"Content model or keyword expected; declaration terminated", 'E', 'D'},
/* 131 */ {"Rank stem \"%s\" + suffix \"%s\" more than NAMELEN characters; not defined", 'E', 'D'},
/* 132 */ {"Undefined %s start-tag GI ignored; not used in DTD", 'E', 'C'},
/* 133 */ {"Document ended invalidly within a markup declaration; parsing ended", 'C', 'S'},
/* 134 */ {"Normalized length of literal exceeded %s; markup terminated", 'E', 'Q'},
/* 135 */ {"R/CDATA marked section in declaration subset; prolog terminated", 'E', 'D'},
/* 136 */ {"%s = \"%s\" ENTITIES attribute ignored: more than GRPCNT in list", 'E', 'Q'},
/* 137 */ {"Content model is ambiguous", 'W', 'D'},
/* 138 */ {"Invalid parameter entity name \"%s\"", 'E', 'S'},
/* 139 */ {"Document ended invalidly within a marked section; parsing ended", 'C', 'S'},
/* 140 */ {"Element \"%s\" used in DTD but not defined", 'I', 'D'},
/* 141 */ {"Invalid NDATA or SUBDOC entity reference occurred; ignored", 'E', 'S'},
/* 142 */ {"Associated element type not allowed in document instance", 'E', 'C'},
/* 143 */ {"Illegal DSC character; in different entity from DSO", 'E', 'C'},
/* 144 */ {"Declared value of data attribute cannot be ID", 'E', 'D' },
/* 145 */ {"Invalid reference to external CDATA or SDATA entity; ignored", 'E', 'S'},
/* 146 */ {"Could not find external document type \"%s\"", 'E', 'R'},
/* 147 */ {"Could not find external general entity \"%s\"", 'I', 'R'},
/* 148 */ {"Could not find external parameter entity \"%s\"", 'I', 'R'},
/* 149 */ {"Reference to non-existent general entity \"%s\" ignored", 'E', 'R'},
/* 150 */ {"Could not find entity \"%s\" using default declaration", 'E', 'R'},
/* 151 */ {"Could not find entity \"%2$s\" in attribute %1$s using default declaration", 'E', 'R'},
/* 152 */ {"Short reference map \"%s\" used in DTD but not defined", 'I', 'D'},
/* 153 */ {"End-tag minimization should be \"O\" for element with CONREF attribute", 'I', 'D'},
/* 154 */ {"Declared value of data attribute cannot be ENTITY or ENTITIES", 'E', 'D' },
/* 155 */ {"Declared value of data attribute cannot be IDREF or IDREFS", 'E', 'D' },
/* 156 */ {"Declared value of data attribute cannot be NOTATION", 'E', 'D' },
/* 157 */ {"CURRENT cannot be specified for a data attribute", 'E', 'D' },
/* 158 */ {"CONREF cannot be specified for a data attribute", 'E', 'D' },
/* 159 */ {"Short reference map for element \"%s\" not defined; ignored", 'E', 'C'},
/* 160 */ {"Cannot create temporary file", 'C', 'R'},
/* 161 */ {"Document ended invalidly within SGML declaration", 'C', 'D'},
/* 162 */ {"Capacity limit %s exceeded by %s points", 'W', 'Q'},
/* 163 */ {"Amendment 1 requires \"ISO 8879:1986\" instead of \"ISO 8879-1986\"", 'W', 'D'},
/* 164 */ {"Non-markup, non-minimum data character in SGML declaration", 'E', 'D'},
/* 165 */ {"Parameter cannot be a literal", 'E', 'D'},
/* 166 */ {"Invalid concrete syntax scope \"%s\"", 'E', 'D'},
/* 167 */ {"Parameter must be a number", 'E', 'D'},
/* 168 */ {"\"%s\" should have been \"%s\"", 'E', 'D'},
/* 169 */ {"Character number %s is not supported as an additional name character", 'E', 'U'},
/* 170 */ {"Parameter must be a literal or \"%s\"", 'E', 'D'},
/* 171 */ {"Bad character description for character %s", 'E', 'D'},
/* 172 */ {"Character number %s is described more than once", 'W', 'D'},
/* 173 */ {"Character number plus number of characters exceeds 256", 'E', 'D'},
/* 174 */ {"No description for upper half of character set: assuming \"128 128 UNUSED\"", 'W', 'D'},
/* 175 */ {"Character number %s was not described; assuming UNUSED", 'E', 'D'},
/* 176 */ {"Non-significant shunned character number %s not declared UNUSED", 'E', 'D'},
/* 177 */ {"Significant character \"%s\" cannot be non-SGML", 'E', 'D'},
/* 178 */ {"Unknown capacity set \"%s\"", 'E', 'U'},
/* 179 */ {"No capacities specified." , 'E', 'D'},
/* 180 */ {"Unknown concrete syntax \"%s\"", 'E', 'U'},
/* 181 */ {"Character number exceeds 255", 'E', 'D'},
/* 182 */ {"Concrete syntax SWITCHES not supported", 'E', 'U'},
/* 183 */ {"\"INSTANCE\" scope not supported", 'E', 'U'},
/* 184 */ {"Value of \"%s\" feature must be one or more", 'E', 'D'},
/* 185 */ {"\"%s\" invalid; must be \"YES\" or \"NO\"", 'E', 'D'},
/* 186 */ {"\"%s\" invalid; must be \"PUBLIC\" or \"SGMLREF\"", 'E', 'D'},
/* 187 */ {"Feature \"%s\" is not supported", 'E', 'U'},
/* 188 */ {"Too many open subdocument entities", 'E', 'Q'},
/* 189 */ {"Invalid formal public identifier", 'I', 'D'},
/* 190 */ {"Public text class should have been \"%s\"", 'I', 'D'},
/* 191 */ {"Character number %s must be non-SGML", 'W', 'D'},
/* 192 */ {"Notation \"%s\" not defined in DTD", 'W', 'D'},
/* 193 */ {"Unclosed start or end tag requires \"SHORTTAG YES\"", 'W', 'M'},
/* 194 */ {"Net-enabling start tag requires \"SHORTTAG YES\"", 'W', 'M'},
/* 195 */ {"Attribute name omission requires \"SHORTTAG YES\"", 'W', 'M'},
/* 196 */ {"Undelimited attribute value requires \"SHORTTAG YES\"", 'W', 'M'},
/* 197 */ {"Attribute specification omitted for \"%s\": requires markup minimization", 'W', 'M'},
/* 198 */ {"Concrete syntax does not have any short reference delimiters", 'E', 'D'},
/* 199 */ {"Character number %s does not exist in the base character set", 'E', 'D'},
/* 200 */ {"Character number %s is UNUSED in the syntax reference character set", 'E', 'D'},
/* 201 */ {"Character number %s was not described in the syntax reference character set", 'E', 'D'},
/* 202 */ {"Character number %s in the syntax reference character set has no corresponding character in the system character set", 'E', 'D'},
/* 203 */ {"Character number %s was described using an unknown base set", 'E', 'D'},
/* 204 */ {"Duplication specification for added funtion \"%s\"", 'E', 'D'},
/* 205 */ {"Added function character cannot be \"%s\"", 'E', 'D'},
/* 206 */ {"Only reference concrete syntax function characters supported", 'E', 'U'},
/* 207 */ {"Only reference concrete syntax general delimiters supported", 'E', 'U'},
/* 208 */ {"Only reference concrete syntax short reference delimiters supported", 'E', 'U'},
/* 209 */ {"Unrecognized keyword \"%s\"", 'E', 'D'},
/* 210 */ {"Unrecognized quantity name \"%s\"", 'E', 'D'},
/* 211 */ {"Interpretation of \"%s\" is not a valid name in the declared concrete syntax", 'E', 'D'},
/* 212 */ {"Replacement reserved name \"%s\" cannot be reference reserved name", 'E', 'D'},
/* 213 */ {"Duplicate replacement reserved name \"%s\"", 'E', 'D'},
/* 214 */ {"Quantity \"%s\" must not be less than %s", 'E', 'D'},
/* 215 */ {"Only values up to %2$s are supported for quantity \"%1$s\"", 'E', 'U'},
/* 216 */ {"Exclusions attempt to change required status of group in \"%s\"", 'E', 'C'},
/* 217 */ {"Exclusion cannot apply to token \"%s\" in content model for \"%s\"", 'E', 'C'},
/* 218 */ {"Required %s attribute was not specified for entity %s", 'E', 'C'},
/* 219 */ {"UCNMSTRT must have the same number of characters as LCNMSTRT", 'E', 'D'},
/* 220 */ {"UCNMCHAR must have the same number of characters as LCNMCHAR", 'E', 'D'},
/* 221 */ {"Character number %s assigned to both LCNMSTRT or UCNMSTRT and LCNMCHAR or UCNMCHAR", 'E', 'D'},
/* 222 */ {"Character number %s cannot be an additional name character", 'E', 'D'},
/* 223 */ {"It is unsupported for \"-\" not to be assigned to UCNMCHAR or LCNMCHAR", 'E', 'U'},
/* 224 */ {"Normalized length of value of attribute \"%s\" exceeded LITLEN", 'E', 'Q'},
/* 225 */ {"Length of interpreted parameter literal exceeds LITLEN less the length of the bracketing delimiters", 'E', 'Q'},
/* 226 */ {"Start tag of document element omitted; not minimizable", 'W', 'M'},
/* 227 */ {"Unrecognized designating escape sequence \"%s\"", 'I', 'U'},
/* 228 */ {"Earlier reference to entity \"%s\" used default entity", 'I', 'D'},
/* 229 */ {"Reference to non-existent parameter entity \"%s\" ignored", 'E', 'R'},
};

View File

@ -0,0 +1,833 @@
/* msgcat.c -
X/Open message catalogue functions and gencat utility.
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifndef HAVE_CAT
/* In this implementation the message catalogue format is the same as the
message text source file format (see pp 42-43 of the X/Open
Portability Guide, Issue 3, Volume 3.) This means that you don't have
to use the gencat utility, but it is still useful for checking and
merging catalogues. */
/* Compile this with -DGENCAT to get the gencat utility. */
#include "std.h"
#include "msgcat.h"
#ifdef USE_PROTOTYPES
#define P(parms) parms
#else
#define P(parms) ()
#endif
/* Default message set. */
#define NL_SETD 1
#ifndef PATH_FILE_SEP
#define PATH_FILE_SEP ':'
#endif
#ifndef DEFAULT_NLSPATH
#define DEFAULT_NLSPATH ""
#endif
#ifndef DEFAULT_LANG
#define DEFAULT_LANG "default"
#endif
#define HASH_TAB_SIZE 251
struct message {
struct message *next;
unsigned msgnum;
unsigned setnum;
char *text;
};
struct cat {
char *name;
int loaded;
int bad;
struct message *table[HASH_TAB_SIZE];
};
static char *read_buf = 0;
static unsigned read_buf_len = 0;
/* Errors that can be generated by read_catalog. */
enum cat_err {
E_ZERO, /* not an error */
E_BADARG,
E_NOMEM,
E_NOSUCHCOMMAND,
E_INPUT,
E_EOF,
E_BADSEP,
E_BADLINE
};
#ifdef GENCAT
/* These must match enum cat_err. */
static char *cat_errlist[] = {
"Error 0",
"Invalid argument to command",
"Out of memory",
"Unrecognized command",
"Input error",
"Unexpected end of file",
"Space or tab expected after message number",
"Invalid line",
};
#endif /* GENCAT */
#ifndef GENCAT
/* The value of NLSPATH. */
static char *nlspath = 0;
/* The value of LANG. */
static char *lang = 0;
#endif /* not GENCAT */
static int current_lineno = -1;
static enum cat_err cat_errno = E_ZERO;
#ifndef GENCAT
static void load_catalog P((struct cat *));
static FILE *find_catalog P((char *, char **));
#endif
static int read_catalog P((FILE *, struct message **));
static void delete_set P((struct message **, unsigned));
static void delete_message P((struct message **, unsigned, unsigned));
static int hash P((unsigned setnum, unsigned msgnum));
static char *parse_text P((FILE *, int));
#ifndef GENCAT
nl_catd catopen(name, oflag)
char *name;
int oflag;
{
struct cat *catp;
int i;
if (!name)
return 0;
catp = (struct cat *)malloc(sizeof *catp);
if (!catp)
return 0;
for (i = 0; i < HASH_TAB_SIZE; i++)
catp->table[i] = 0;
catp->name = malloc(strlen(name) + 1);
catp->loaded = 0;
catp->bad = 0;
strcpy(catp->name, name);
return (nl_catd)catp;
}
int catclose(catd)
nl_catd catd;
{
int i;
struct cat *catp = (struct cat *)catd;
if (!catp)
return 0;
for (i = 0; i < HASH_TAB_SIZE; i++) {
struct message *p, *nextp;
for (p = catp->table[i]; p; p = nextp) {
nextp = p->next;
free(p->text);
free((char *)p);
}
}
if (catp->name)
free(catp->name);
free((char *)catp);
return 0;
}
char *catgets(catd, setnum, msgnum, dflt)
nl_catd catd;
int setnum, msgnum;
char *dflt;
{
struct message *p;
struct cat *catp;
/* setnum and msgnum are required to be >= 1. */
if (!catd || setnum <= 0 || msgnum <= 0)
return dflt;
catp = (struct cat *)catd;
if (!catp->loaded)
load_catalog(catp);
if (catp->bad)
return dflt;
for (p = catp->table[hash(setnum, msgnum)]; p; p = p->next)
if (p->msgnum == msgnum && p->setnum == setnum)
break;
if (!p)
return dflt;
return p->text;
}
static
VOID load_catalog(catp)
struct cat *catp;
{
FILE *fp;
char *path;
catp->loaded = 1;
fp = find_catalog(catp->name, &path);
if (!fp) {
catp->bad = 1;
return;
}
current_lineno = 0;
if (read_catalog(fp, catp->table) < 0)
catp->bad = 1;
fclose(fp);
if (read_buf) {
free(read_buf);
read_buf = 0;
}
read_buf_len = 0;
free(path);
}
static
FILE *find_catalog(name, pathp)
char *name;
char **pathp;
{
char *path;
if (!name)
return 0;
if (!nlspath) {
nlspath = getenv("NLSPATH");
if (!nlspath)
nlspath = DEFAULT_NLSPATH;
}
if (!lang) {
lang = getenv("LANG");
if (!lang)
lang = DEFAULT_LANG;
}
path = nlspath;
for (;;) {
char *p;
unsigned len = 0;
for (p = path; *p != '\0' && *p != PATH_FILE_SEP; p++) {
if (*p == '%') {
if (p[1] == 'N') {
p++;
len += strlen(name);
}
else if (p[1] == 'L') {
p++;
len += strlen(lang);
}
else if (p[1] == '%') {
p++;
len++;
}
else
len++;
}
else
len++;
}
if (len > 0) {
char *s, *try;
FILE *fp;
s = try = malloc(len + 1);
if (!s)
return 0;
for (p = path; *p != '\0' && *p != PATH_FILE_SEP; p++) {
if (*p == '%') {
if (p[1] == 'N') {
p++;
strcpy(s, name);
s += strlen(name);
}
else if (p[1] == 'L') {
p++;
strcpy(s, lang);
s += strlen(lang);
}
else if (p[1] == '%') {
p++;
*s++ = '%';
}
else
*s++ = *p;
}
else
*s++ = *p;
}
*s++ = '\0';
fp = fopen(try, "r");
if (fp) {
*pathp = try;
return fp;
}
free(try);
}
if (*p == '\0')
break;
path = ++p;
}
return 0;
}
#endif /* not GENCAT */
/* 0 success, -1 error */
static
int parse_message(c, fp, table, setnum, quote)
int c;
FILE *fp;
struct message **table;
unsigned setnum;
int quote;
{
unsigned msgnum;
struct message *msgp;
char *text;
int hc;
msgnum = c - '0';
for (;;) {
c = getc(fp);
if (!isdigit(c))
break;
msgnum = msgnum*10 + (c - '0');
}
if (c == '\n') {
delete_message(table, setnum, msgnum);
return 0;
}
if (c != ' ' && c != '\t') {
cat_errno = E_BADSEP;
return -1;
}
text = parse_text(fp, quote);
if (!text)
return -1;
hc = hash(setnum, msgnum);
for (msgp = table[hc]; msgp; msgp = msgp->next)
if (msgp->setnum == setnum && msgp->msgnum == msgnum)
break;
if (msgp)
free(msgp->text);
else {
msgp = (struct message *)malloc(sizeof *msgp);
if (!msgp) {
cat_errno = E_NOMEM;
return -1;
}
msgp->next = table[hc];
table[hc] = msgp;
msgp->msgnum = msgnum;
msgp->setnum = setnum;
}
msgp->text = text;
return 0;
}
static
char *parse_text(fp, quote)
FILE *fp;
int quote;
{
unsigned i = 0;
char *p;
int c;
int quoted;
c = getc(fp);
if (c == quote) {
quoted = 1;
c = getc(fp);
}
else
quoted = 0;
for (;; c = getc(fp)) {
if (c == EOF) {
if (ferror(fp)) {
cat_errno = E_INPUT;
return 0;
}
break;
}
if (c == '\n')
break;
/* XXX
Can quotes be used in quoted message text if protected by \ ?
Is it illegal to omit the closing quote if there's an opening
quote?
Is it illegal to have anything after a closing quote?
*/
if (quoted && c == quote) {
/* Skip the rest of the line. */
while ((c = getc(fp)) != '\n')
if (c == EOF) {
if (ferror(fp)) {
cat_errno = E_INPUT;
return 0;
}
break;
}
break;
}
if (c == '\\') {
int d;
c = getc(fp);
if (c == EOF)
break;
switch (c) {
case '\n':
current_lineno++;
continue;
case 'n':
c = '\n';
break;
case 'b':
c = '\b';
break;
case 'f':
c = '\f';
break;
case 't':
c = '\t';
break;
case 'v':
c = '\v';
break;
case 'r':
c = '\r';
break;
case '\\':
c = '\\';
break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
c -= '0';
d = getc(fp);
if (d >= '0' && d <= '7') {
c = c*8 + d - '0';
d = getc(fp);
if (d >= '0' && d <= '7')
c = c*8 + d - '0';
else if (d != EOF)
ungetc(d,fp);
}
else if (d != EOF)
ungetc(d, fp);
if (c == '\0')
continue; /* XXX */
break;
default:
/* Ignore the quote. */
break;
}
}
if (i >= read_buf_len) {
if (!read_buf)
read_buf = malloc(read_buf_len = 40);
else
read_buf = realloc(read_buf, read_buf_len *= 2);
if (!read_buf) {
cat_errno = E_NOMEM;
return 0;
}
}
read_buf[i++] = c;
}
p = malloc(i + 1);
if (!p) {
cat_errno = E_NOMEM;
return 0;
}
memcpy(p, read_buf, i);
p[i] = '\0';
return p;
}
/* 0 success, -1 error */
static
int parse_command(fp, table, setnump, quotep)
FILE *fp;
struct message **table;
unsigned *setnump;
int *quotep;
{
char buf[128];
if (fgets(buf, 128, fp) == NULL) {
cat_errno = ferror(fp) ? E_INPUT : E_EOF;
return -1;
}
if (buf[0] == ' ' || buf[0] == '\t' || buf[0] == '\n')
/* a comment */;
else if (strncmp(buf, "set", 3) == 0) {
if (sscanf(buf + 3, "%u", setnump) != 1) {
cat_errno = E_BADARG;
return -1;
}
}
else if (strncmp(buf, "delset", 6) == 0) {
unsigned num;
if (sscanf(buf + 6, "%u", &num) != 1) {
cat_errno = E_BADARG;
return -1;
}
delete_set(table, num);
*setnump = NL_SETD;
}
else if (strncmp(buf, "quote", 5) == 0) {
char *p = buf + 5;
while (*p == ' ' || *p == '\t')
p++;
/* XXX should \ be allowed as the quote character? */
if (*p == '\0' || *p == '\n')
*quotep = -1;
else
*quotep = *p;
}
else {
cat_errno = E_NOSUCHCOMMAND;
return -1;
}
if (strchr(buf, '\n') == 0) {
int c;
while ((c = getc(fp)) != '\n' && c != EOF)
;
}
return 0;
}
static
VOID delete_set(table, setnum)
struct message **table;
unsigned setnum;
{
int i;
for (i = 0; i < HASH_TAB_SIZE; i++) {
struct message *p, *nextp;
for (p = table[i], table[i] = 0; p; p = nextp) {
nextp = p->next;
if (p->setnum == setnum)
free((char *)p);
else {
p->next = table[i];
table[i] = p;
}
}
}
}
static
VOID delete_message(table, setnum, msgnum)
struct message **table;
unsigned setnum, msgnum;
{
struct message **pp;
for (pp = &table[hash(setnum, msgnum)]; *pp; pp = &(*pp)->next)
if ((*pp)->setnum == setnum && (*pp)->msgnum == msgnum) {
struct message *p = *pp;
*pp = p->next;
free(p->text);
free((char *)p);
break;
}
}
/* 0 success, -1 error. On error cat_errno is set to the error number. */
static
int read_catalog(fp, table)
FILE *fp;
struct message **table;
{
int c;
unsigned setnum = NL_SETD;
int quote_char = -1;
for (;;) {
/* start of line */
c = getc(fp);
if (c == EOF)
break;
++current_lineno;
if (isdigit(c)) {
if (parse_message(c, fp, table, setnum, quote_char) < 0)
return -1;
}
else if (c == '$') {
if (parse_command(fp, table, &setnum, &quote_char) < 0)
return -1;
}
else if (c != '\n') {
while ((c = getc(fp)) != '\n' && c != EOF)
if (c != ' ' && c != '\t') {
cat_errno = E_BADLINE;
return -1;
}
if (c == EOF)
break;
}
}
return 0;
}
static
int hash(setnum, msgnum)
unsigned setnum, msgnum;
{
return ((setnum << 8) + msgnum) % HASH_TAB_SIZE;
}
#ifdef GENCAT
static char *program_name;
static int message_compare P((UNIV, UNIV));
static void print_text P((char *, FILE *));
static void usage P((void));
#ifdef VARARGS
static void fatal();
#else
static void fatal P((char *,...));
#endif
int main(argc, argv)
int argc;
char **argv;
{
FILE *fp;
int i, j, nmessages;
struct message **list;
unsigned setnum;
struct message *table[HASH_TAB_SIZE];
program_name = argv[0];
if (argc < 3)
usage();
for (i = 0; i < HASH_TAB_SIZE; i++)
table[i] = 0;
for (i = 1; i < argc; i++) {
errno = 0;
fp = fopen(argv[i], "r");
if (!fp) {
if (i > 1)
fatal("can't open `%s': %s", argv[i], strerror(errno));
}
else {
current_lineno = 0;
cat_errno = E_ZERO;
if (read_catalog(fp, table) < 0) {
assert(cat_errno != E_ZERO);
assert(cat_errno
< sizeof(cat_errlist)/sizeof(cat_errlist[0]));
fatal("%s:%d: %s", argv[i], current_lineno,
cat_errlist[cat_errno]);
}
fclose(fp);
}
}
errno = 0;
fp = fopen(argv[1], "w");
if (!fp)
fatal("can't open `%s' for output: %s", argv[1], strerror(errno));
nmessages = 0;
for (i = 0; i < HASH_TAB_SIZE; i++) {
struct message *p;
for (p = table[i]; p; p = p->next)
nmessages++;
}
list = (struct message **)malloc(nmessages*sizeof(struct message *));
if (!list)
fatal("out of memory");
j = 0;
for (i = 0; i < HASH_TAB_SIZE; i++) {
struct message *p;
for (p = table[i]; p; p = p->next)
list[j++] = p;
}
assert(j == nmessages);
qsort((UNIV)list, nmessages, sizeof(struct message *), message_compare);
setnum = NL_SETD;
for (i = 0; i < nmessages; i++) {
struct message *p = list[i];
if (p->setnum != setnum) {
setnum = p->setnum;
fprintf(fp, "$set %u\n", setnum);
}
fprintf(fp, "%u ", p->msgnum);
print_text(p->text, fp);
putc('\n', fp);
}
if (fclose(fp) == EOF)
fatal("error closing `%s'", argv[1]);
return 0;
}
static
VOID usage()
{
fprintf(stderr, "usage: %s catfile msgfile...\n", program_name);
exit(1);
}
static
#ifdef VARARGS
VOID fatal(va_alist) va_dcl
#else /* not VARARGS */
VOID fatal(char *message,...)
#endif /* not VARARGS */
{
va_list ap;
#ifdef VARARGS
char *message;
va_start(ap);
message = va_arg(ap, char *);
#else /* not VARARGS */
va_start(ap, message);
#endif /* not VARARGS */
fprintf(stderr, "%s: ", program_name);
vfprintf(stderr, message, ap);
putc('\n', stderr);
va_end(ap);
exit(1);
}
static
int message_compare(p1, p2)
UNIV p1, UNIV p2;
{
struct message *m1 = *(struct message **)p1;
struct message *m2 = *(struct message **)p2;
if (m1->setnum < m2->setnum)
return -1;
if (m1->setnum > m2->setnum)
return 1;
if (m1->msgnum < m2->msgnum)
return -1;
if (m1->msgnum > m2->msgnum)
return 1;
return 0;
}
static
VOID print_text(s, fp)
char *s;
FILE *fp;
{
for (; *s; s++) {
if (*s == '\\')
fputs("\\\\", fp);
else if (ISASCII(*s) && isprint((UNCH)*s))
putc(*s, fp);
else {
switch (*s) {
case '\n':
fputs("\\n", fp);
break;
case '\b':
fputs("\\b", fp);
break;
case '\f':
fputs("\\f", fp);
break;
case '\t':
fputs("\\t", fp);
break;
case '\v':
fputs("\\v", fp);
break;
case '\r':
fputs("\\r", fp);
break;
default:
fprintf(fp, "\\%03o", (unsigned char)*s);
break;
}
}
}
}
#endif /* GENCAT */
#ifdef TEST
int main(argc, argv)
int argc;
char **argv;
{
nl_catd catd;
int msgnum, setnum;
if (argc != 2) {
fprintf(stderr, "usage: %s catalogue\n", argv[0]);
exit(1);
}
catd = catopen(argv[1], 0);
fprintf(stderr, "Enter set number, message number pairs:\n");
fflush(stderr);
while (scanf("%d %d", &setnum, &msgnum) == 2) {
char *msg = catgets(catd, setnum, msgnum, "<default>");
fprintf(stderr, "Returned \"%s\"\n", msg);
fflush(stderr);
}
return 0;
}
#endif /* TEST */
#endif /* not HAVE_CAT */
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,13 @@
#ifdef HAVE_CAT
#include <nl_types.h>
#else
typedef UNIV nl_catd;
#endif
/* Don't use prototypes here in case nl_types.h declares a conflicting
prototype. */
nl_catd catopen();
int catclose();
char *catgets();

958
usr.bin/sgmls/sgmls/pars1.c Normal file
View File

@ -0,0 +1,958 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
#define GI (tags[ts].tetd->etdgi+1) /* GI of current element. */
#define NEWGI (newetd->etdgi+1) /* GI of new tag. */
static VOID doincludes P((void));
static int pentname P((char *));
static struct mpos *newmpos P((void));
static VOID commbufs P((void));
static VOID checkdtd P((void));
/* PARSECON: Parse content of an element.
*/
int parsecon(tbuf, pcb)
UNCH *tbuf; /* Work area for tokenization. */
struct parse *pcb; /* Parse control block for this parse. */
{
int srn; /* SHORTREF delimiter number (1-32). */
int refrc; /* Return code from sentref, stagetd, etc. */
TRACECON(etagimct, dostag, datarc, pcb, conrefsw, didreq);
if (eodsw) return(EOD_);
if (didreq && (conrefsw & TAGREF)) {didreq = 0; goto conr;}
if (etagimct>0) {etagimsw = --etagimct ? 1 : 0; destack(); return(ETG_);}
if (dostag) {
conrefsw = conrefsv;
etisw = etiswsv;
if (charmode) {dostag = 0; return datarc;}
return stag(datarc);
}
if (conrefsw) {
conr:
destack();
conrefsw = 0;
return ETG_;
}
else if (eofsw) return(EOD_);
datarc = 0;
while (1) {
parse(pcb);
srn = (int)pcb->action - SRMIN; /* Just in case it's a SHORTREF. */
switch (pcb->action) {
case DCE_: /* Data character in element content. */
/* The data character might be a non-SGML character so
reprocess it using pcbconm. */
REPEATCC;
pcb = conpcb = &pcbconm;
pcb->newstate = pcbcnet;
continue;
case DAS_: /* Current character begins data. */
data = FPOS;
continue;
case NLF_: /* NET or SR returns data in lookahead buffer. */
datalen = (UNS)(ptcon - data); REPEATCC;
goto rcc;
case LAF_: /* Return data in lookahead buffer: mixed. */
datalen = (UNS)(ptcon+1 - data);
goto rcc;
case NON_: /* Single nonchar in nonchbuf. */
datalen = 2; data = nonchbuf;
goto nrcc;
case DAR_: /* Return data except for last char. */
REPEATCC;
case DAF_: /* Return data in source entity buffer. */
datalen = (UNS)(FPOS - data);
rcc:
REPEATCC;
case DEF_: /* Return data in data entity. */
nrcc:
datarc = DAF_;
if (pcb==&pcbcone) {
pcbconm.newstate = pcbcnet;
conpcb = &pcbconm;
}
if (charmode) return(datarc);
stagmin = MINNONE; stagreal = newetd = ETDCDATA;
return(stag(datarc));
case LAS_: /* Start lookahead buffer with current char. */
*(ptcon = data = tbuf+1) = *FPOS;
continue;
case LAM_: /* Move character to lookahead buffer. */
*++ptcon = *FPOS;
continue;
case STG_: /* Process non-null start-tag. */
CTRSET(tagctr); /* Start counting tag length. */
parsenm(tbuf, NAMECASE); /* Get the GI. */
newetd = etdref(tbuf);
if (newetd && newetd->adl) {
parseatt(newetd->adl, tbuf);
adlval((int)ADN(al), newetd);
}
parsetag(&pcbstag); /* Parse the tag ending. */
if ((CTRGET(tagctr)-tagdelsw)>=TAGLEN)
sgmlerr(66, &pcbstag, (UNCH *)0, (UNCH *)0);
if (!newetd) {
sgmlerr(132, pcb, tbuf+1, (UNCH *)0);
continue;
}
return(stagetd(&pcbstag));
case NST_: /* Process null start-tag. */
return nstetd();
case ETC_: /* End-tag in CDATA or RCDATA. */
case ETG_: /* Process non-null end-tag. */
newetd = etdref(parsenm(tbuf, NAMECASE)); /* Get the GI. */
parsetag(&pcbetag); /* Parse tag end. */
if (!newetd) /* Error: undefined.*/
sgmlerr(11, &pcbetag, tbuf+1, (UNCH *)0);
else if (etagetd(&pcbetag)>=0) return ETG_;/* Open element. */
if (pcb->action!=ETC_) continue;
/* Tag is undefined or not for an open element and we are in
a CDATA or RCDATA element; issue message and treat as
null end-tag (</>).
*/
sgmlerr(57, &pcbetag, (UNCH *)0, (UNCH *)0);
case NET_: /* Process null end-tag. */
if ((refrc = netetd(conpcb))!=0) return ETG_;
continue;
case NED_: /* Process null end-tag delimiter. */
etagmin = MINNET;
newetd = etagreal = ETDNET;
etagimct = etag();
etagimsw = etagimct ? 1 : 0; destack();
return ETG_;
case GTR_:
if (entget()!=-1) {
data = FPOS;
continue;
}
/* fall through */
case EOD_: /* End of primary file. */
if (ts<1) return(EOD_); /* Normal end: stack is empty. */
etagimct = ts-1; /* Treat as end-tag for top tag on stack. */
etagmin = MINETAG; etagreal = tags[0].tetd;
destack();
eofsw = 1; /* Return EOD_ after destacking all. */
return ETG_;
/* Short references ending with blanks:
If the blank sequence is followed by RE, go do SR7 or SR6.
If the entity is undefined and we are in mixed content,
the blanks must be returned as data. If not, they
can be ignored.
*/
case SR9_: /* Process SR9 (two or more blanks). */
REPEATCC; /* Make first blank the CC. */
case SR4_: /* Process SR4 (RS, blanks). */
parseseq(tbuf, BSEQLEN); /* Squeeze out all blanks. */
if (*FPOS=='\r') {srn = (srn==9) ? 7 : 6; data = tbuf; goto sr6;}
else REPEATCC;
if ((refrc = shortref(srn, pcb))==DEF_) goto nrcc;
if (refrc>0) return refrc;
if (refrc==ENTUNDEF && pcb==&pcbconm)
{data = tbuf; goto nrcc;}
continue;
/* Short references ending with RE:
If the reference is defined, the RE is ignored.
For RE and RS RE,
no special action is needed if the reference is undefined,
as the RE will be processed immediately as the current character.
For B RE and RS B RE,
the input is primed with a special character that will
be treated as an RE that cannot be a short reference.
*/
case SR7_: /* Process SR7 (blanks, RE). */
datalen = (UNS)(FPOS - data);
case SR2_: /* Process SR2 (RE). */
case SR5_: /* Process SR5 (RS, RE). */
sr6: /* Process SR6 (RS, blanks, RE). */
if ((refrc = shortref(srn, pcb))!=ENTUNDEF) {
if (refrc==DEF_) goto nrcc; /* Defined: data entity. */
if (refrc>0) return refrc; /* Defined: tag entity. */
continue; /* Defined: not tag. */
}
if (pcb!=&pcbconm) continue; /* Not mixed; ignore chars. */
if (srn>=6) /* Return blanks as data. */
{*FPOS = lex.d.genre; REPEATCC; goto nrcc;}
case REF_: /* Undefined SR with RE; return record end. */
datarc = REF_;
if (charmode) return(datarc);
#if 0
/* The standard says this situation can force a tag.
See 323:3-6, 412:1-7. */
/* If RE would be ignored, don't treat it as start-tag
because it could force a required tag; but do change
state to show that an RE was ignored.
*/
if (scbsgml[pss].snext==scbsgmst) {
scbsgml[pss].snext = scbsgmnr;
TRACEGML(scbsgml, pss, conactsw, conact);
continue;
}
#endif
stagmin = MINNONE; stagreal = newetd = ETDCDATA;
return(stag(datarc));
case SR3_: /* Process SR3 (RS). */
REPEATCC;
if ((refrc = shortref(srn, pcb))==DEF_) goto nrcc;
if (refrc>0) return refrc;
continue;
case RBR_: /* Two right brackets */
srn = 26;
REPEATCC;
/* fall through */
case SR1_: /* Process SR1 (TAB). */
case SR8_: /* Process SR8 (space). */
case SR19: /* Process SR19 (-). */
case SR26: /* Process SR26 (]). */
REPEATCC;
goto srproc;
case FCE_: /* Process free character (SR11-18, SR21-32). */
fce[0] = *FPOS;
srn = mapsrch(&lex.s.dtb[lex.s.fce], fce);
case SR10: /* Process SR10 ("). */
case SR11: /* Process SR11 (#). */
case SR20: /* Process SR20 (-). */
case SR25: /* Process SR25 ([). */
srproc:
if ((refrc = shortref(srn, pcb))==DEF_) goto nrcc;
if (refrc>0) return refrc;
if (refrc==ENTUNDEF) { /* Treat the SR as data. */
data = FPOS - (srn==lex.s.hyp2);/* Two data chars if SR20.*/
if (pcb!=&pcbconm) { /* If not in mixed content: */
if (srn>=lex.s.data) { /* Change PCB. */
pcb = conpcb = &pcbconm;
pcb->newstate = pcbcnda;
}
}
else pcb->newstate = pcbcnda;/* Now in data found state. */
}
continue;
case ERX_: /* Entity ref in RCDATA: cancel ending delims.*/
lexcon[lex.d.tago] = lex.l.fre;
lexcon[lex.d.net] = lex.l.nonet;
lexlms[lex.d.msc] = lex.l.fre;
continue;
case EE_: /* Entity end in RCDATA: check nesting. */
if (es<rcessv) {synerr(37, pcb); rcessv = es;}
/* If back at top level, re-enable the ending delimiters. */
if (es==rcessv) {
lexcon[lex.d.tago] = lex.l.tago;
lexcon[lex.d.net] = etictr ? lex.l.net : lex.l.nonet;
lexlms[lex.d.msc] = lex.l.msc;
}
continue;
case PIE_: /* PI entity: same as PIS_. */
return PIS_;
case RSR_: /* Record start: ccnt=0; ++rcnt.*/
++RCNT; CTRSET(RSCC);
default:
return (int)pcb->action; /* Default (MD_ MDC_ MSS_ MSE_ PIS_). */
}
}
}
/* STAGETD: Process start-tag etd.
*/
int stagetd(pcb)
struct parse *pcb; /* Parse control block for this parse. */
{
if (!newetd->etdmod) {
sgmlerr(43, pcb, newetd->etdgi+1, (UNCH *)0);
++ds.etdercnt;
etdset(newetd, (UNCH)SMO+EMO+ETDOCC, &undechdr,
(PETD *)0, (PETD *)0, (PECB *)0);
TRACEETD(newetd);
}
stagmin = MINNONE; stagreal = newetd;
return stag(0);
}
/* NSTETD: Process null start-tag etd.
*/
int nstetd()
{
newetd = ts>0 ? tags[ts].tetd
: tags[0].tetd->etdmod[2].tu.thetd;
stagmin = MINNULL; stagreal = ETDNULL;
etisw = 0;
return stag(0);
}
/* ETAGETD: Process end-tag etd.
*/
int etagetd(pcb)
struct parse *pcb; /* Parse control block for this parse. */
{
etagmin = MINNONE; etagreal = newetd;
if ((etagimct = etag())<0) {
sgmlerr(E_ETAG, pcb, NEWGI, tags[ts].tetd->etdgi+1);
return etagimct;
}
etagimsw = etagimct ? 1 : 0; destack();
return ETG_;
}
/* NETETD: Process null end-tag etd.
*/
int netetd(pcb)
struct parse *pcb; /* Parse control block for this parse. */
{
if (ts<1) {
sgmlerr(51, pcb, (UNCH *)0, (UNCH *)0);
return 0;
}
etagmin = MINNULL; etagreal = ETDNULL;
etagimsw = 0; destack();
return ETG_;
}
/* SHORTREF: Process a short (alternative) reference to an entity.
Returns ENTUNDEF if entity is not defined, otherwise returns
the return code from stagetd or etagetd if the entity was
a tag, or zero if an error occurred somewhere.
*/
int shortref(srn, pcb)
int srn; /* Short reference number. */
struct parse *pcb; /* Parse control block for this parse. */
{
int rc; /* Return code from entopen. */
if (tags[ts].tsrm==SRMNULL || !tags[ts].tsrm[srn]) return ENTUNDEF;
if (!tags[ts].tsrm[srn]->estore) {
sgmlerr(93, pcb, tags[ts].tsrm[srn]->ename+1,
tags[ts].tsrm[0]->ename+1);
return(ENTUNDEF);
}
rc = entopen(tags[ts].tsrm[srn]);
if (rc==ENTDATA) return DEF_;
if (rc==ENTPI) return PIS_;
return(0);
}
/* PARSEPRO: Parse prolog.
Note: ptpro cannot overrun tbuf (and therefore needn't be
tested), as long as the buffer exceeds the longest
lookahead sequence in the content parse tables.
*/
int parsepro()
{
struct parse *oldpcb;
while (1) {
int rc; /* Return code: DAF MSS DCE */
switch (parse(propcb)) {
case LAS_: /* Start lookahead buffer with current char. */
*(ptpro = data = tbuf+1) = *FPOS;
continue;
case LAM_: /* Move character to lookahead buffer. */
*++ptpro = *FPOS;
continue;
case LAF_: /* Return data in lookahead buffer. */
datalen = (UNS)(ptpro+1 - data);
REPEATCC;
rc = DAF_;
break; /* Prolog ended; data pending. */
case DTD_: /* Process document type declaration. */
parsenm(tbuf, NAMECASE); /* Get declaration name. */
if (!ustrcmp(tbuf+1, sgmlkey)
&& !dtdsw && !sgmlsw++) {
#if 0
parse(&pcbmdi);
#endif
/* If we got some appinfo, return. */
if (sgmldecl())
return APP_;
continue;
}
if (!ustrcmp(tbuf+1, key[KDOCTYPE]) && !dtdsw++) {
startdtd();
mddtds(tbuf);
continue;
}
sgmlerr(E_MDNAME, propcb, tbuf+1, (UNCH *)0);
continue;
case DTE_: /* DOCTYPE declaration (and prolog) ended. */
REPEATCC; /* Put back char that followed MSC. */
if (es != 0)
sgmlerr(143, propcb, (UNCH *)0, (UNCH *)0);
else if (dtdrefsw) {/* Process referenced DTD before real DTE. */
dtdrefsw = 0; /* Keep us from coming through here again. */
REPEATCC; /* Put back MSC so it follows referenced DTD. */
entref(indtdent);
}
else mddtde(tbuf);
continue;
case MD_:
/* Process markup declaration within DTD or LPD. */
parsenm(tbuf, NAMECASE); /* Get declaration name. */
if (!ustrcmp(tbuf+1, key[KENTITY]))
mdentity(tbuf);
else if (!ustrcmp(tbuf+1, key[KUSEMAP]))
mdsrmuse(tbuf);
else if (!ustrcmp(tbuf+1, key[KATTLIST]))
mdadl(tbuf);
else if (!ustrcmp(tbuf+1, key[KSHORTREF]))
mdsrmdef(tbuf);
else if (!ustrcmp(tbuf+1, key[KELEMENT]))
mdelem(tbuf);
else if (!ustrcmp(tbuf+1, key[KNOTATION]))
mdnot(tbuf);
else
sgmlerr(E_MDNAME, propcb, tbuf+1, (UNCH *)0);
continue;
case MDC_: /* Process markup declaration comment. */
sgmlsw++; /* SGML declaration not allowed after comment */
parsemd(tbuf, NAMECASE, (struct parse *)0, NAMELEN);
continue;
case MSS_: /* Process marked section start. */
oldpcb = propcb;
propcb = mdms(tbuf, propcb);
if (propcb==&pcbmsc || propcb==&pcbmsrc) {
if (oldpcb == &pcbmds)
sgmlerr(135, oldpcb, (UNCH *)0, (UNCH *)0);
conpcb = propcb;
rc = DCE_;
break;
}
continue;
case MSE_: /* Process marked section end. */
if (mdmse()) propcb = &pcbmds;
continue;
case MSP_: /* Marked section start in prolog outside DTD */
rc = MSS_;
break;
case PIE_: /* PI entity: same as PIS_. */
return(PIS_);
case EOD_: /* Return end of primary entity. */
if (!sw.onlypro || propcb != &pcbpro || !dtdsw)
sgmlerr(127, propcb, (UNCH *)0, (UNCH *)0);
else {
setdtype();
checkdtd();
}
return propcb->action;
case PIS_: /* Return processing instruction (string). */
sgmlsw++; /* SGML declaration not allowed after PI */
return((int)propcb->action); /* Prolog will continue later. */
case CIR_: /* Chars ignored; trying to resume parse. */
synerr(E_RESTART, propcb);
REPEATCC;
continue;
case STE_: /* Start tag ended prolog */
REPEATCC;
REPEATCC;
rc = STE_;
break;
case PEP_: /* Previous character ended prolog. */
REPEATCC;
case DCE_: /* Data character ended prolog. */
REPEATCC;
rc = DCE_;
break;
case EE_: /* Illegal entity end in ignored marked section. */
/* An error message has already been given. */
continue;
default:
abort();
} /* switch */
setdtype(); /* First pass only: set document type. */
checkdtd();
if (sw.onlypro)
return EOD_;
TRACESET(); /* Set trace switches. */
endprolog();
/* *DOC is first element; stack it at level 0. */
stack(newetd = nextetd = stagreal = etagreal = docetd);
return(rc);
} /* while */
}
/* Allocate buffers that are used in the DTD. */
VOID startdtd()
{
nmgrp = (struct etd **)rmalloc((GRPCNT+1)*sizeof(struct etd *));
nnmgrp = (PDCB *)rmalloc((GRPCNT+1)*sizeof(PDCB));
gbuf = (struct thdr *)rmalloc((GRPGTCNT+3)*sizeof(struct thdr));
/* The extra 1 is for parsing the name of a parameter entity in
mdentity(). */
nmbuf = (UNCH *)rmalloc(NAMELEN+3);
pubibuf = (UNCH *)rmalloc(LITLEN+1);
sysibuf = (UNCH *)rmalloc(LITLEN+1);
commbufs();
doincludes();
}
static
VOID checkdtd()
{
struct dcncb *np;
if (sw.swundef) {
int i;
struct etd *ep;
struct srh *sp;
for (i = 0; i < ETDHASH; i++)
for (ep = etdtab[i]; ep; ep = ep->etdnext)
if (!ep->etdmod)
sgmlerr(140, (struct parse *)0, ep->etdgi + 1,
(UNCH *)0);
for (sp = srhtab[0]; sp; sp = sp->enext)
if (sp->srhsrm[0] == 0)
sgmlerr(152, (struct parse *)0, sp->ename + 1,
(UNCH *)0);
}
for (np = dcntab[0]; np; np = np->enext)
if (!np->defined)
sgmlerr(192, (struct parse *)0, np->ename + 1, (UNCH *)0);
}
/* Return non-zero if s is a valid parameter entity name.
If so put a transformed name in entbuf. */
static
int pentname(s)
char *s;
{
int i;
if (lextoke[(UNCH)*s] != NMS)
return 0;
entbuf[2] = ENTCASE ? lextran[(UNCH)*s] : (UNCH)*s;
for (i = 1; s[i]; i++) {
if (i > NAMELEN - 1)
return 0;
if (lextoke[(UNCH)s[i]] < NMC || s[i] == EOBCHAR)
return 0;
entbuf[i + 2] = ENTCASE ? lextran[(UNCH)s[i]] : (UNCH)s[i];
}
entbuf[1] = lex.d.pero;
entbuf[i + 2] = '\0';
entbuf[0] = (UNCH)(i + 3); /* length byte, PERO and '\0' */
return 1;
}
/* Handle sw.includes. */
static
VOID doincludes()
{
char **p;
if (!sw.includes)
return;
for (p = sw.includes; *p; p++) {
if (pentname(*p)) {
if (!entfind(entbuf)) {
union etext etx;
etx.c = savestr(key[KINCLUDE]);
entdef(entbuf, ESM, &etx);
++ds.ecbcnt;
ds.ecbtext += ustrlen(key[KINCLUDE]);
}
}
else
sgmlerr(138, (struct parse *)0, (UNCH *)*p, (UNCH *)0);
}
}
/* Allocate buffers that are use both in the DTD and the instance. */
static
VOID commbufs()
{
al = (struct ad *)rmalloc((ATTCNT+2)*sizeof(struct ad));
lbuf = (UNCH *)rmalloc(LITLEN + 1);
}
static
struct mpos *newmpos()
{
int j;
unsigned long *h;
struct mpos *p = (struct mpos *)rmalloc((GRPLVL+2)*sizeof(struct mpos));
assert(grplongs > 0);
h = (unsigned long *)rmalloc((GRPLVL+2)*grplongs*sizeof(unsigned long));
for (j = 0; j < GRPLVL+2; j++) {
p[j].h = h;
h += grplongs;
}
return p;
}
/* Perform end of prolog buffer allocation. */
VOID endprolog()
{
int i;
ambigfree();
if (dtdsw) {
frem((UNIV)nmgrp);
frem((UNIV)nnmgrp);
frem((UNIV)gbuf);
frem((UNIV)nmbuf);
frem((UNIV)sysibuf);
frem((UNIV)pubibuf);
}
else {
commbufs();
doincludes();
}
scbsgml = (struct restate *)rmalloc((TAGLVL+1)*sizeof(struct restate));
tags = (struct tag *)rmalloc((TAGLVL+1)*sizeof(struct tag));
grplongs = (GRPCNT + LONGBITS - 1)/LONGBITS;
for (i = 0; i < TAGLVL+1; i++)
tags[i].tpos = newmpos();
savedpos = newmpos();
}
/* SETDTYPE: Establish specified or default document type.
*/
VOID setdtype()
{
/* Initialize default model hdr for declared content. */
undechdr.ttype = MANY+MCHARS+MGI; /* Declared content is ANY. */
undechdr.tu.tnum = 0; /* No content model. */
/* Initialize content model and etd for *DOC. */
prcon[0].ttype = MGI; /* Model is an element model. */
prcon[0].tu.tnum = 2; /* A single group with a single GI in it. */
prcon[1].ttype = TTSEQ; /* Non-repeatable SEQ group. */
prcon[1].tu.tnum = 1; /* Only one token in group. */
prcon[2].ttype = TTETD; /* Token is an etd. */
docetd = etddef(indocetd); /* etd for document as a whole. */
etdset(docetd, ETDOCC, prcon, (PETD *)0, (PETD *)0, SRMNULL);
/* Put specified or default document type etd in *DOC model. */
if (!dtype) {
sgmlerr(E_DOCTYPE, propcb, (UNCH *)0, (UNCH *)0);
dtype = indefetd;
}
prcon[2].tu.thetd = etddef(dtype);
if (!prcon[2].tu.thetd->etdmod) {
if (dtype != indefetd)
sgmlerr(52, propcb, dtype+1, (UNCH *)0);
++ds.etdercnt;
etdset(prcon[2].tu.thetd, (UNCH)SMO+EMO+ETDUSED+ETDOCC, &undechdr,
(PETD *)0, (PETD *)0, (PECB *)0);
}
TRACEETD(docetd);
TRACEMOD(prcon);
TRACEETD(prcon[2].tu.thetd);
return;
}
/* PARSETAG: Tag end parser for SGML documents.
For start-tags, it
sets etisw to TAGNET if tag ended with ETI; otherwise to 0.
*/
VOID parsetag(pcb)
struct parse *pcb; /* Parse control block: pcbstag or pcbetag. */
{
tagdelsw = 1; /* Assume tag had an ETI or TAGC. */
switch (parse(pcb)) {
case ETIC: /* Tag closed with ETI. */
if (!sd.shorttag) synerr(194, pcb);
etisw = TAGNET; /* Set switch for stack entry flag. */
return;
case DSC:
synerr(9, pcb);
REPEATCC;
etisw = 0;
return;
case NVS: /* Att name or value token found. */
case NTV: /* Name token value found. */
synerr(E_POSSATT, pcb);
pcb->newstate = 0; /* Reset parse state. */
REPEATCC; /* Put it back for next read. */
tagdelsw = 0; /* Tag had no closing delimiter. */
etisw = 0; /* Don't flag stack entry. */
return;
case TAGO: /* Tag closing implied by TAGO. */
if (!sd.shorttag) synerr(193, pcb);
REPEATCC; /* Put it back for next read. */
tagdelsw = 0; /* Tag had no closing delimiter. */
case TAGC: /* Normal close. */
default: /* Invalid character (msg was sent). */
etisw = 0; /* Don't flag stack entry. */
return;
}
}
/* STAG: Check whether a start-tag is valid at this point in the document
structure, or whether other tags must precede it.
Special case processing is done for the fake tag, #CDATA, as
it is never stacked.
*/
int stag(dataret)
int dataret; /* Data pending: DAF_ REF_ 0=not #PCDATA. */
{
int rc, realrc; /* Return code from context or other test. */
int mexts = 0; /* >0=stack level of minus grp; -1=plus; 0=none.*/
badresw = pexsw = 0;
/* If real element (i.e., not #PCDATA) set mexts and test if empty. */
if (dataret==0) {
mexts = pexmex(newetd);
/* If element is declared empty, it is same as a conref. */
if (GET(newetd->etdmod->ttype, MNONE)) conrefsw = TAGREF;
}
if (GET(tags[ts].tetd->etdmod->ttype, MANY))
rc = mexts>0 ? RCMEX : RCHIT;
else rc = context(newetd, tags[ts].tetd->etdmod, tags[ts].tpos,
&tags[ts].status, mexts);
TRACESTG(newetd, dataret, rc, nextetd, mexts);
switch (rc) {
case RCEND: /* End current element, then retry start-tag. */
if (ts<1) realrc = RCMISS;
else realrc = RCEND;
break;
case RCREQ: /* Stack compulsory GI, then retry start-tag. */
realrc = RCREQ;
break;
case RCMISS: /* Start-tag invalid (#PCDATA or real). */
if (ts>0 && GET(tags[ts].tetd->etdmod->ttype, MANY))
realrc = RCEND;
else realrc = RCMISS;
break;
case RCMEX: /* Start-tag invalid (minus exception). */
etagimct = ts - mexts;
realrc = RCEND;
break;
case RCHITMEX: /* Invalid minus exclusion for required element. */
#if 0 /* This will have been detected by exclude.c. */
sgmlerr(E_MEXERR, &pcbstag, NEWGI, tags[mexts].tetd->etdgi+1);
#endif
case RCHIT: /* Start-tag was valid. */
realrc = RCHIT;
break;
case RCPEX: /* Start-tag valid only because of plus exception. */
pexsw = TAGPEX;
realrc = RCHIT;
break;
default:
abort();
}
switch (realrc) {
case RCEND: /* End current element, then retry start-tag. */
if (didreq) sgmlerr(07, &pcbstag, nextetd->etdgi+1, (UNCH *)0);
didreq = 0; /* No required start-tag done. */
dostag = 1; etiswsv = etisw; /* Save real start-tag status. */
conrefsv = conrefsw; /* Save real start-tag conref. */
conrefsw = 0; /* Current element is not empty. */
etagmin = MINSTAG; destack(); /* Process omitted end-tag. */
return ETG_;
case RCREQ: /* Stack compulsory GI, then retry start-tag. */
if (!BADPTR(nextetd)) {
#if 0 /* This will have been detected in exclude.c. */
if ((mexts = pexmex(nextetd))>0)
sgmlerr(E_MEXERR, &pcbstag, nextetd->etdgi+1,
tags[mexts].tetd->etdgi+1);
#endif
if (!nextetd->etdmod) {
sgmlerr(53, &pcbstag, nextetd->etdgi+1, (UNCH *)0);
etdset(nextetd, (UNCH)SMO+EMO+ETDOCC, &undechdr,
(PETD *)0, (PETD *)0, (PECB *)0);
++ds.etdercnt;
TRACEETD(nextetd);
}
}
if (BITOFF(nextetd->etdmin, SMO)) {
if (!BADPTR(stagreal))
sgmlerr(21, &pcbstag, nextetd->etdgi+1, stagreal->etdgi+1);
else if (stagreal==ETDCDATA)
sgmlerr(49, &pcbstag, nextetd->etdgi+1, (UNCH *)0);
else sgmlerr(50, &pcbstag, nextetd->etdgi+1, (UNCH *)0);
}
didreq = 1; /* Required start-tag done. */
dostag = 1; etiswsv = etisw; /* Save real start-tag status. */
etisw = 0; conrefsv = conrefsw; /* Save real start-tag conref. */
/* If element is declared empty, it is same as a conref. */
conrefsw = (GET(nextetd->etdmod->ttype, MNONE)) ? TAGREF : 0;
stack(nextetd); /* Process omitted start-tag. */
return STG_;
case RCMISS: /* Start-tag invalid (#PCDATA or actual). */
dostag = 0; contersw |= 1; didreq = 0;
if (dataret) {
if (dataret==REF_) badresw = 1;
else sgmlerr(E_CHARS, conpcb, tags[ts].tetd->etdgi+1, (UNCH *)0);
return dataret;
}
sgmlerr(E_CONTEXT, &pcbstag, NEWGI, tags[ts].tetd->etdgi+1);
if (stagmin!=MINNULL) stagmin = MINNONE; stack(newetd);
return STG_;
case RCHIT: /* Start-tag was valid. */
dostag = 0; didreq = 0;
if (dataret) return dataret;
stack(newetd);
return STG_;
}
return NOP_; /* To avoid Borland C++ warning */
}
/* PEXMEX: See if a GI is in a plus or minus exception group on the stack.
If in a minus, returns stack level of minus group; otherwise,
returns -1 if in a plus and not a minus, and zero if in neither.
*/
int pexmex(curetd)
struct etd *curetd; /* The etd for this GI. */
{
int tsl; /* Temporary stack level for looping. */
int pex = 0; /* 1=found in plus grp; 0=not. */
for (tsl = ts; tsl>0; --tsl) {
if (tags[tsl].tetd->etdmex && ingrp(tags[tsl].tetd->etdmex, curetd))
return(tsl);
if (tags[tsl].tetd->etdpex && ingrp(tags[tsl].tetd->etdpex, curetd))
pex = -1;
}
return(pex);
}
/* STACK: Add a new entry to the tag stack.
If there is no room, issue a message and reuse last position.
*/
VOID stack(curetd)
struct etd *curetd; /* The etd for this entry. */
{
/* Stack the new element type definition (error if no room). */
if (++ts>TAGLVL)
sgmlerr(E_STAGMAX, conpcb, curetd->etdgi+1, tags[--ts].tetd->etdgi+1);
tags[ts].tetd = curetd;
/* Set flags: plus exception + tag had ETI + context error + empty. */
tags[ts].tflags = (UNCH)pexsw + etisw + contersw + conrefsw; contersw = 0;
/* If tag had ETI, update ETI counter and enable NET if first ETI. */
if (etisw && ++etictr==1) lexcon[lex.d.net] = lexcnm[lex.d.net] = lex.l.net;
/* If etd has ALT table, use it; otherwise, use last element's ALT. */
if (curetd->etdsrm) {
if (curetd->etdsrm != SRMNULL && curetd->etdsrm[0] == NULL) {
/* Map hasn't been defined. Ignore it. */
sgmlerr(159, &pcbstag, curetd->etdgi + 1, (UNCH *)0);
curetd->etdsrm = 0;
tags[ts].tsrm = tags[ts-1].tsrm;
}
else
tags[ts].tsrm = curetd->etdsrm;
}
else
tags[ts].tsrm = tags[ts-1].tsrm;
/* Initialize rest of stack entry. */
tags[ts].status = 0;
tags[ts].tpos[0].g = 1; /* M: Index in model of next token to test.*/
tags[ts].tpos[0].t = 1; /* P: Index in tpos of current group. */
HITCLEAR(tags[ts].tpos[0].h);
tags[ts].tpos[1].g = 1; /* Index of group in model (dummy grp). */
tags[ts].tpos[1].t = 1; /* 1st token is next in grp to be tested. */
HITCLEAR(tags[ts].tpos[1].h); /* No hits yet as yet. */
TRACESTK(&tags[ts], ts, etictr);
exclude();
return;
}
/* ETAG: Check validity of an end-tag by seeing if it matches any tag
on the stack. If so, return the offset of the match from the
current entry (0=current). If there is no match, issue a message
and return an error code (-1).
If the newetd is ETDNET, a NET delimiter was found, so check for
a tag that ended with ETI instead of a matching GI.
*/
int etag()
{
int tsl = ts+1; /* Temporary stack level for looping. */
/* See if end-tag is anywhere on stack, starting at current entry. */
while (--tsl) {
if (newetd!=ETDNET ? newetd==tags[tsl].tetd : tags[tsl].tflags) {
TRACEETG(&tags[ts], newetd, tsl, ts-tsl);
return(ts-tsl);
}
}
return (-1); /* End-tag didn't match any start-tag. */
}
/* DESTACK:
Call ECONTEXT to see if element can be ended at this point.
and issue message if there are required tags left.
Remove the current entry from the tag stack.
Issue an error if the destacked element was not minimizable
and its end-tag was omitted.
*/
VOID destack()
{
register int ecode = 0; /* Error code (0=o.k.). */
UNCH *eparm2 = NULL; /* Second parameter of error message. */
register int minmsgsw; /* 1=message if tag omitted; 0=no message. */
/* If element has a content model (i.e., not a keyword) and there
are required tags left, and no CONREF attribute was specified,
issue an error message.
*/
if (!GET(tags[ts].tetd->etdmod->ttype, MKEYWORD)
&& !conrefsw
&& !econtext(tags[ts].tetd->etdmod, tags[ts].tpos, &tags[ts].status)) {
if (BADPTR(nextetd))
sgmlerr(54, conpcb, tags[ts].tetd->etdgi+1, (UNCH *)0);
else
sgmlerr(30, conpcb, tags[ts].tetd->etdgi+1, nextetd->etdgi+1);
}
/* If the current tag ended with ETI, decrement the etictr.
If etictr is now zero, disable the NET delimiter.
*/
if (GET(tags[ts--].tflags, TAGNET) && --etictr==0)
lexcon[lex.d.net] = lexcnm[lex.d.net] = lex.l.nonet;
minmsgsw = BITOFF(tags[ts+1].tetd->etdmin, EMO);
if (!conrefsw && minmsgsw && (etagimsw || etagmin==MINETAG)) {
/* Minimization caused by NET delimiter. */
if (BADPTR(etagreal)) ecode = 46;
/* Minimization caused by a containing end-tag. */
else {ecode = 20; eparm2 = etagreal->etdgi+1;}
}
else if (!conrefsw && etagmin==MINSTAG && (minmsgsw || ts<=0)) {
/* Minimization caused by out-of-context start-tag. */
if (!BADPTR(stagreal)) {
ecode = ts>0 ? 39 : 89;
eparm2 = stagreal->etdgi+1;
}
/* Minimization caused by out-of-context data. */
else if (stagreal==ETDCDATA) ecode = ts>0 ? 47 : 95;
/* Minimization caused by out-of-context short start-tag. */
else ecode = ts>0 ? 48 : 96;
if (ts<=0 && ecode) eodsw = 1;
}
if (ecode) sgmlerr((UNS)ecode, conpcb, tags[ts+1].tetd->etdgi+1, eparm2);
/* TEMP: See if parser bug caused stack to go below zero. */
else if (ts<0) {sgmlerr(64, conpcb, (UNCH *)0, (UNCH *)0); ts = 0;}
TRACEDSK(&tags[ts], &tags[ts+1], ts, etictr);
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

1308
usr.bin/sgmls/sgmls/pars2.c Normal file

File diff suppressed because it is too large Load Diff

1344
usr.bin/sgmls/sgmls/pcbrf.c Normal file

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,104 @@
/* portproc.c -
Semi-portable implementation of run_process().
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifdef SUPPORT_SUBDOC
#include "std.h"
#include "entity.h"
#include "appl.h"
/* This code shows how you might use system() to implement run_process().
ANSI C says very little about the behaviour of system(), and so this
is necessarily system dependent. */
/* Characters that are significant to the shell and so need quoting. */
#define SHELL_MAGIC "$\\\"';&()|<>^ \t\n"
/* Character with which to quote shell arguments. */
#define SHELL_QUOTE_CHAR '\''
/* String that can be used to get SHELL_QUOTE_CHAR into a quoted argument. */
#define SHELL_ESCAPE_QUOTE "'\\''"
/* Character that can be used to separate arguments to the shell. */
#define SHELL_ARG_SEP ' '
static UNS shell_quote P((char *, char *));
int run_process(argv)
char **argv;
{
char **p;
char *s, *command;
int ret;
UNS len = 0;
for (p = argv; *p; p++)
len += shell_quote(*p, (char *)0);
len += p - argv;
s = command = xmalloc(len);
for (p = argv; *p; ++p) {
if (s > command)
*s++ = SHELL_ARG_SEP;
s += shell_quote(*p, s);
}
*s++ = '\0';
errno = 0;
ret = system(command);
if (ret < 0)
appl_error(E_EXEC, argv[0], strerror(errno));
free(command);
return ret;
}
/* Quote a string so that it appears as a single argument to the
shell (as used for system()). Store the quoted argument in result, if
result is not NULL. Return the length. */
static
UNS shell_quote(s, result)
char *s, *result;
{
UNS len = 0;
int quoted = 0;
if (strpbrk(s, SHELL_MAGIC)) {
quoted = 1;
len++;
if (result)
result[0] = SHELL_QUOTE_CHAR;
}
for (; *s; s++) {
if (*s == SHELL_QUOTE_CHAR) {
if (result)
strcpy(result + len, SHELL_ESCAPE_QUOTE);
len += strlen(SHELL_ESCAPE_QUOTE);
}
else {
if (result)
result[len] = *s;
len++;
}
}
if (quoted) {
if (result)
result[len] = SHELL_QUOTE_CHAR;
len++;
}
return len;
}
#endif /* SUPPORT_SUBDOC */
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

299
usr.bin/sgmls/sgmls/serv.c Normal file
View File

@ -0,0 +1,299 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
/* ETDDEF: Define an element type definition.
Use an existing one if there is one; otherwise create one, which
rmalloc initializes to zero which shows it is a virgin etd.
*/
PETD etddef(ename)
UNCH *ename; /* Element name (GI) with length byte. */
{
PETD p; /* Pointer to an etd. */
int hnum; /* Hash number for ename. */
if ((p = (PETD)hfind((THASH)etdtab,ename,hnum = hash(ename, ETDHASH)))==0){
p = (PETD)hin((THASH)etdtab, ename, hnum, ETDSZ);
}
return p;
}
/* ETDSET: Store data in an element type definition.
The etd must be valid and virgin (except for adl and etdmin).
As an etd cannot be modified, there is no checking for existing
pointers and no freeing of their storage.
*/
#ifdef USE_PROTOTYPES
PETD etdset(PETD p, UNCH fmin, struct thdr *cmod, PETD *mexgrp, PETD *pexgrp,
struct entity **srm)
#else
PETD etdset(p, fmin, cmod, mexgrp, pexgrp, srm)
PETD p; /* Pointer to an etd. */
UNCH fmin; /* Minimization bit flags. */
struct thdr *cmod; /* Pointer to content model. */
PETD *mexgrp; /* Pointers to minus and plus exception lists. */
PETD *pexgrp; /* Pointers to minus and plus exception lists. */
struct entity **srm; /* Short reference map. */
#endif
{
p->etdmin |= fmin;
p->etdmod = cmod;
p->etdmex = mexgrp;
p->etdpex = pexgrp;
p->etdsrm = srm;
return p;
}
/* ETDREF: Retrieve the pointer to an element type definition.
*/
PETD etdref(ename)
UNCH *ename; /* Element name (GI) with length byte.. */
{
return (PETD)hfind((THASH)etdtab, ename, hash(ename, ETDHASH));
}
/* ETDCAN: Cancel an element definition. The etd is freed and is removed
from the hash table, but its model and other pointers are not freed.
*/
VOID etdcan(ename)
UNCH *ename; /* GI name (with length and EOS). */
{
PETD p;
if ((p = (PETD)hout((THASH)etdtab, ename, hash(ename, ETDHASH)))!=0)
frem((UNIV)p);
}
/* SYMBOL TABLE FUNCTIONS: These functions manage hash tables that are used
for entities, element type definitions, IDs, and other purposes. The
interface will be expanded in the future to include multiple environments,
probably by creating arrays of the present hash tables with each table
in the array corresponding to an environment level.
*/
/* HASH: Form hash value for a string.
From the Dragon Book, p436.
*/
int hash(s, hashsize)
UNCH *s; /* String to be hashed. */
int hashsize; /* Size of hash table array. */
{
unsigned long h = 0, g;
while (*s != 0) {
h <<= 4;
h += *s++;
if ((g = h & 0xf0000000) != 0) {
h ^= g >> 24;
h ^= g;
}
}
return (int)(h % hashsize);
}
/* HFIND: Look for a name in a hash table.
*/
struct hash *hfind(htab, s, h)
struct hash *htab[]; /* Hash table. */
UNCH *s; /* Entity name. */
int h; /* Hash value for entity name. */
{
struct hash *np;
for (np = htab[h]; np != 0; np = np->enext)
if (ustrcmp(s, np->ename) == 0) return np; /* Found it. */
return (struct hash *)0; /* Not found. */
}
/* HIN: Locates an entry in a hash table, or allocates a new one.
Returns a pointer to a structure containing a name
and a pointer to the next entry. Other data in the
structure must be maintained by the caller.
*/
struct hash *hin(htab, name, h, size)
struct hash *htab[]; /* Hash table. */
UNCH *name; /* Entity name. */
int h; /* Hash value for entity name. */
UNS size; /* Size of structures pointed to by table. */
{
struct hash *np;
if ((np = hfind(htab, name, h))!=0) return np; /* Return if name found. */
/* Allocate space for structure and name. */
np = (struct hash *)rmalloc(size + name[0]);
np->ename = (UNCH *)np + size;
memcpy(np->ename, name, name[0]); /* Store name in it. */
np->enext = htab[h]; /* 1st entry is now 2nd.*/
htab[h] = np; /* New entry is now 1st.*/
return np; /* Return new entry ptr. */
}
/* HOUT: Remove an entry from a hash table and return its pointer.
The caller must free any pointers in the entry and then
free the entry itself if that is what is desired; this
routine does not free any storage.
*/
struct hash *hout(htab, s, h)
struct hash *htab[]; /* Hash table. */
UNCH *s; /* Search argument entry name. */
int h; /* Hash value for search entry name. */
{
struct hash **pp;
for (pp = &htab[h]; *pp != 0; pp = &(*pp)->enext)
if (ustrcmp(s, (*pp)->ename) == 0) { /* Found it. */
struct hash *tem = *pp;
*pp = (*pp)->enext;
return tem;
}
return 0; /* NULL if not found; else ptr. */
}
/* SAVESTR: Save a null-terminated string
*/
UNCH *savestr(s)
UNCH *s;
{
UNCH *rp;
rp = (UNCH *)rmalloc(ustrlen(s) + 1);
ustrcpy(rp, s);
return rp;
}
/* SAVENM: Save a name (with length and EOS)
*/
UNCH *savenm(s)
UNCH *s;
{
UNCH *p;
p = (UNCH *)rmalloc(*s);
memcpy(p, s, *s);
return p;
}
/* REPLACE: Free the storage for the old string (p) and store the new (s).
If the specified ptr is NULL, don't free it.
*/
UNCH *replace(p, s)
UNCH *p;
UNCH *s;
{
if (p) frem((UNIV)p); /* Free old storage (if any). */
if (!s) return(s); /* Return NULL if new string is NULL. */
return savestr(s);
}
/* RMALLOC: Interface to memory allocation with error handling.
If storage is not available, fatal error message is issued.
Storage is initialized to zeros.
*/
UNIV rmalloc(size)
unsigned size; /* Number of bytes of initialized storage. */
{
UNIV p = (UNIV)calloc(size, 1);
if (!p) exiterr(33, (struct parse *)0);
return p;
}
UNIV rrealloc(p, n)
UNIV p;
UNS n;
{
UNIV r = realloc(p, n);
if (!r)
exiterr(33, (struct parse *)0);
return r;
}
UNCH *pt;
/* FREM: Free specified memory area gotten with rmalloc().
*/
VOID frem(ptr)
UNIV ptr; /* Memory area to be freed. */
{
free(ptr);
}
/* MAPSRCH: Find a string in a table and return its associated value.
The last entry must be a dummy consisting of a NULL pointer for
the string and whatever return code is desired if the
string is not found in the table.
*/
int mapsrch(maptab, name)
struct map maptab[];
UNCH *name;
{
int i = 0;
do {
UNCH *mapnm, *nm;
for (mapnm = maptab[i].mapnm, nm=name; *nm==*mapnm; mapnm++) {
if (!*nm++) return maptab[i].mapdata;
}
} while (maptab[++i].mapnm);
return maptab[i].mapdata;
}
/* IDDEF: Define an ID control block; return -1 if it already exists.
*/
int iddef(iname)
UNCH *iname; /* ID name (with length and EOS). */
{
PID p;
struct fwdref *r;
p = (PID)hin((THASH)itab, iname, hash(iname, IDHASH), IDSZ);
if (p->iddefed) return(-1);
p->iddefed = 1;
TRACEID("IDDEF", p);
/* Delete any forward references. */
r = p->idrl;
p->idrl = 0;
while (r) {
struct fwdref *tem = r->next;
if (r->msg)
msgsfree(r->msg);
frem((UNIV)r);
r = tem;
}
return(0);
}
/* IDREF: Store a reference to an ID and define the ID if it doesn't yet exist.
Return 0 if already defined, otherwise pointer to a fwdref.
*/
struct fwdref *idref(iname)
UNCH *iname; /* ID name (with length and EOS). */
{
PID p;
int hnum;
struct fwdref *rp;
if ((p = (PID)hfind((THASH)itab, iname, (hnum = hash(iname, IDHASH))))==0)
p = (PID)hin((THASH)itab, iname, hnum, IDSZ);
if (p->iddefed)
return 0;
rp = (struct fwdref *)rmalloc(FWDREFSZ);
rp->next = p->idrl;
p->idrl = rp;
rp->msg = 0;
TRACEID("IDREF", p);
return rp;
}
/* IDRCK: Check idrefs.
*/
VOID idrck()
{
int i;
PID p;
struct fwdref *r;
for (i = 0; i < IDHASH; i++)
for (p = itab[i]; p; p = p->idnext)
if (!p->iddefed)
for (r = p->idrl; r; r = r->next)
svderr(r->msg);
}
/* NTOA: Converts a positive integer to an ASCII string (abuf)
No leading zeros are generated.
*/
UNCH *ntoa(i)
int i;
{
static UNCH buf[1 + 3*sizeof(int) + 1];
sprintf((char *)buf, "%d", i);
return buf;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

477
usr.bin/sgmls/sgmls/sgml1.c Normal file
View File

@ -0,0 +1,477 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
#define ETDCON (tags[ts].tetd->etdmod->ttype) /* ETD content flags. */
/* SGML: Main SGML driver routine.
*/
enum sgmlevent sgmlnext(rcbdafp, rcbtagp)
struct rcbdata *rcbdafp;
struct rcbtag *rcbtagp;
{
while (prologsw && !conactsw) {
int oconact;
conact = parsepro();
conactsw = 0; /* Assume sgmlact() will not be skipped. */
switch(conact) {
case PIS_:
case EOD_:
case APP_: /* APPINFO */
conactsw = 1; /* We can skip sgmlact() in opening state. */
break;
case DAF_:
newetd = stagreal = ETDCDATA;
conact = stag(datarc = DAF_);
conactsw = 1; /* We can skip sgmlact() in opening state. */
prologsw = 0; /* End the prolog. */
break;
case DCE_:
case MSS_:
/* prcon[2].tu.thetd holds the etd for the document element. */
newetd = stagreal = prcon[2].tu.thetd;
stagmin = MINSTAG; /* This tag was minimized. */
/* It's an error if the start tag of the document element
is not minimizable. */
if (BITOFF(newetd->etdmin, SMO))
sgmlerr(226, conpcb, (UNCH *)0, (UNCH *)0);
oconact = conact; /* Save conact. */
conact = stag(0); /* Start the document element. */
conactsw = 1; /* conact needs processing. */
prologsw = 0; /* The prolog is finished. */
if (oconact == MSS_) {
if (msplevel==0) conpcb = getpcb((int)ETDCON);
conpcb = mdms(tbuf, conpcb); /* Parse the marked section
start. */
}
break;
default: /* STE_: not defined in SGMLACT.H. */
if (msplevel==0) conpcb = getpcb((int)ETDCON);
prologsw = 0; /* End the prolog. */
break;
}
}
for (;;) {
unsigned swact; /* Switch action: saved conact, new, or sgmlact.*/
if (conactsw) {
conactsw = 0;
swact = conact;
contersw = contersv;
}
else {
conact = parsecon(tbuf, conpcb);
swact = sgmlact((UNCH)(conact != EOD_ ? conact : LOP_));
}
switch (swact) {
case MD_: /* Process markup declaration. */
parsenm(tbuf, NAMECASE); /* Get declaration name. */
if (!ustrcmp(tbuf+1, key[KUSEMAP])) mdsrmuse(tbuf);
else sgmlerr(E_MDNAME, conpcb, tbuf+1, (UNCH *)0);
continue;
case MDC_: /* Process markup declaration comment. */
if (*FPOS!=lex.d.mdc)
parsemd(tbuf, NAMECASE, (struct parse *)0, NAMELEN);
continue;
case MSS_: /* Process marked section start. */
conpcb = mdms(tbuf, conpcb);
continue;
case MSE_: /* Process marked section end (drop to LOP_). */
if (mdmse()) conpcb = getpcb((int)ETDCON);
continue;
case PIS_: /* Return processing instruction (string). */
if (entpisw) rcbdafp->data = data;
else {
parselit(tbuf, &pcblitc, PILEN, lex.d.pic);
rcbdafp->data = tbuf;
}
rcbdafp->datalen = datalen;
rcbdafp->contersw = entpisw;
entpisw = 0; /* Reset for next time.*/
scbset(); /* Update location in current scb. */
return SGMLPIS;
case APP_:
rcbdafp->data = tbuf;
rcbdafp->datalen = ustrlen(tbuf);
rcbdafp->contersw = 0;
scbset();
return SGMLAPP;
case ETG_: /* Return end-tag. */
charmode = 0; /* Not in char mode unless CDATA or RCDATA.*/
if (msplevel==0) conpcb = getpcb((int)ETDCON);
rcbtagp->contersw = tags[ts+1].tflags;
rcbtagp->tagmin = etagimsw ? MINETAG : etagmin;
rcbtagp->curgi = tags[ts+1].tetd->etdgi;
rcbtagp->ru.oldgi = tags[ts].tetd->etdgi;
if (etagmin==MINSTAG) rcbtagp->tagreal =
BADPTR(stagreal) ? stagreal : (PETD)stagreal->etdgi;
else rcbtagp->tagreal =
BADPTR(etagreal) ? etagreal : (PETD)etagreal->etdgi;
rcbtagp->etictr = etictr;
rcbtagp->srmnm = tags[ts].tsrm!=SRMNULL ? tags[ts].tsrm[0]->ename
: 0;
scbset(); /* Update location in current scb. */
return SGMLETG;
case STG_: /* Return start-tag. */
charmode = 0; /* Not in char mode unless CDATA or RCDATA.*/
if (!conrefsw && msplevel==0) conpcb = getpcb((int)ETDCON);
rcbtagp->contersw = tags[ts].tflags;
rcbtagp->tagmin = dostag ? MINSTAG : stagmin;
rcbtagp->curgi = tags[ts].tetd->etdgi;
/* Get attribute list if one was defined for this element. */
rcbtagp->ru.al = !tags[ts].tetd->adl ? 0 :
rcbtagp->tagmin==MINNONE ? al : tags[ts].tetd->adl;
rcbtagp->tagreal = BADPTR(stagreal)?stagreal:(PETD)stagreal->etdgi;
rcbtagp->etictr = etictr;
rcbtagp->srmnm = tags[ts].tsrm!=SRMNULL ? tags[ts].tsrm[0]->ename
: 0;
scbset(); /* Update location in current scb. */
return SGMLSTG;
case DAF_: /* Return data in source entity buffer. */
charmode = 1;
rcbdafp->datalen = datalen;
rcbdafp->data = data;
rcbdafp->contersw = contersw | entdatsw;
contersw = entdatsw = 0;/* Reset for next time.*/
scbset(); /* Update location in current scb. */
return SGMLDAF;
case CON_: /* Process conact after returning REF_. */
conactsw = 1;
contersv = contersw;
case REF_: /* Return RE found. */
if (badresw) {
badresw = 0;
sgmlerr(E_CHARS, &pcbconm, tags[ts].tetd->etdgi+1, (UNCH *)0);
continue;
}
charmode = 1;
rcbdafp->contersw = contersw;
contersw = 0; /* Reset for next time.*/
scbset(); /* Update location in current scb. */
return SGMLREF;
case EOD_: /* End of source document entity. */
if (mslevel != 0) sgmlerr(139, conpcb, (UNCH *)0, (UNCH *)0);
idrck(); /* Check idrefs. */
scbset(); /* Update location in current scb. */
return SGMLEOD;
default: /* LOP_: Loop again with no action. */
continue;
}
}
}
/* PCBSGML: State and action table for action codes returned to text processor
by SGML.C.
Columns are based on SGMLACT.H values minus DAF_, except that end
of document has input code LOP_, regardless of its action code.
*/
/* Symbols for state names (end with a number). */
#define ST1 0 /* Just had a start tag. */
#define NR1 2 /* Just had an RS or RE. */
#define DA1 4 /* Just had some data. */
#define NR2 6 /* Just had an RE; RE pending. */
#define ST2 8 /* Had only markup since last RE/RS; RE pending. */
static UNCH sgmltab[][11] = {
/*daf_ etg_ md_ mdc_ mss_ mse_ pis_ ref_ stg_ rsr_ eod */
{DA1 ,DA1 ,ST1 ,ST1 ,ST1 ,ST1 ,ST1 ,NR1 ,ST1 ,NR1 ,ST1 },/*st1*/
{DAF_,ETG_,MD_ ,MDC_,MSS_,MSE_,PIS_,LOP_,STG_,LOP_,EOD_},
{DA1 ,DA1 ,ST1 ,ST1 ,ST1 ,ST1 ,ST1 ,NR2 ,ST1 ,NR1 ,ST1 },/*nr1*/
{DAF_,ETG_,MD_ ,MDC_,MSS_,MSE_,PIS_,LOP_,STG_,LOP_,EOD_},
{DA1 ,DA1 ,DA1 ,DA1 ,DA1 ,DA1 ,DA1 ,NR2 ,ST1 ,NR1 ,ST1 },/*da1*/
{DAF_,ETG_,MD_ ,MDC_,MSS_,MSE_,PIS_,LOP_,STG_,LOP_,EOD_},
{DA1 ,DA1 ,ST2 ,ST2 ,ST2 ,ST2 ,ST2 ,NR2 ,ST1 ,NR2 ,ST1 },/*nr2*/
{CON_,ETG_,MD_ ,MDC_,MSS_,MSE_,PIS_,REF_,CON_,LOP_,EOD_},
{DA1 ,DA1 ,ST2 ,ST2 ,ST2 ,ST2 ,ST2 ,NR1 ,ST1 ,NR2 ,ST1 },/*st2*/
{CON_,ETG_,MD_ ,MDC_,MSS_,MSE_,PIS_,REF_,CON_,LOP_,EOD_},
};
int scbsgmst = ST1; /* SCBSGML: trailing stag or markup; ignore RE. */
int scbsgmnr = NR1; /* SCBSGML: new record; do not ignore RE. */
/* SGMLACT: Determine action to be taken by SGML.C based on current state and
specified input.
For start or end of a plus exception element, push or pop the
pcbsgml stack.
Return to caller with action code.
*/
#ifdef USE_PROTOTYPES
int sgmlact(UNCH conret)
#else
int sgmlact(conret)
UNCH conret; /* Action returned to SGML.C by content parse. */
#endif
{
int action;
if (conret==STG_ && GET(tags[ts].tflags, TAGPEX))
{++pss; scbsgml[pss].snext = ST1;}
scbsgml[pss].sstate = scbsgml[pss].snext;
scbsgml[pss].snext = sgmltab[scbsgml[pss].sstate]
[scbsgml[pss].sinput = conret-DAF_];
scbsgml[pss].saction = sgmltab[scbsgml[pss].sstate+1][scbsgml[pss].sinput];
TRACEGML(scbsgml, pss, conactsw, conact);
action = scbsgml[pss].saction;
if (conret==ETG_ && GET(tags[ts+1].tflags, TAGPEX)) {
pss--;
/* An included subelement affects the enclosing state like a
processing instruction (or MDC_ or MD_),
that is to say NR1 is changed to ST1 and NR2 to ST2. */
scbsgml[pss].sstate = scbsgml[pss].snext;
scbsgml[pss].snext = sgmltab[scbsgml[pss].sstate][PIS_ - DAF_];
}
return action;
}
/* GETPCB: Choose pcb for new or resumed element.
*/
struct parse *getpcb(etdcon)
int etdcon; /* Content type of new or resumed element. */
{
if (BITON(etdcon, MGI)) {
return(BITON(etdcon, MCHARS) ? &pcbconm : &pcbcone);
}
if (BITON(etdcon, MCDATA) || BITON(etdcon, MRCDATA)) {
charmode = 1;
return(BITON(etdcon, MCDATA) ? &pcbconc : (rcessv = es, &pcbconr));
}
return(&pcbconm);
}
struct markup *sgmlset(swp)
struct switches *swp;
{
/* Initialize variables based on switches structure members. */
sw = *swp;
rbufs = (UNCH *)rmalloc((UNS)3+sw.swbufsz) + 3; /* DOS file read area. */
TRACEPRO(); /* Set trace switches for prolog. */
msginit(swp);
ioinit(swp);
sdinit();
return &lex.m;
}
/* Points for each capacity, indexed by *CAP in sgmldecl.h. We'll replace
2 with the real NAMELEN at run time. */
static UNCH cappoints[] = {
1,
2,
1,
2,
2,
2,
2,
2,
1,
2,
2,
1,
2,
2,
2,
2,
2
};
static long capnumber[NCAPACITY];
static long maxsubcap[NCAPACITY];
VOID sgmlend(p)
struct sgmlcap *p;
{
int i;
for (; es >= 0; --es)
if (FILESW)
fileclos();
capnumber[NOTCAP] = ds.dcncnt;
capnumber[EXGRPCAP] = ds.pmexgcnt;
capnumber[ELEMCAP] = ds.etdcnt+ds.etdercnt;
capnumber[EXNMCAP] = ds.pmexcnt;
capnumber[GRPCAP] = ds.modcnt;
capnumber[ATTCAP] = ds.attcnt;
capnumber[ATTCHCAP] = ds.attdef;
capnumber[AVGRPCAP] = ds.attgcnt;
capnumber[IDCAP] = ds.idcnt;
capnumber[IDREFCAP] = ds.idrcnt;
capnumber[ENTCAP] = ds.ecbcnt;
capnumber[ENTCHCAP] = ds.ecbtext;
capnumber[MAPCAP] = ds.srcnt + ds.srcnt*lex.s.dtb[0].mapdata;
capnumber[NOTCHCAP] = ds.dcntext;
capnumber[TOTALCAP] = 0;
for (i = 1; i < NCAPACITY; i++) {
if (cappoints[i] > 1)
cappoints[i] = NAMELEN;
capnumber[i] += maxsubcap[i]/cappoints[i];
capnumber[TOTALCAP] += (long)capnumber[i] * cappoints[i];
}
p->number = capnumber;
p->points = cappoints;
p->limit = sd.capacity;
p->name = captab;
for (i = 0; i < NCAPACITY; i++) {
long excess = capnumber[i]*cappoints[i] - sd.capacity[i];
if (excess > 0) {
char buf[sizeof(long)*3 + 1];
sprintf(buf, "%ld", excess);
sgmlerr(162, (struct parse *)0,
(UNCH *)captab[i], (UNCH *)buf);
}
}
}
VOID sgmlsubcap(v)
long *v;
{
int i;
for (i = 0; i < NCAPACITY; i++)
if (v[i] > maxsubcap[i])
maxsubcap[i] = v[i];
}
int sgmlsdoc(ptr)
UNIV ptr;
{
struct entity *e;
union etext etx;
etx.x = ptr;
e = entdef(indocent, ESF, &etx);
if (!e)
return -1;
return entopen(e);
}
/* SGMLGENT: Get a data entity.
Returns:
-1 if the entity does not exist
-2 if it is not a data entity
1 if it is an external entity
2 if it is an internal cdata entity
3 if it is an internal sdata entity
*/
int sgmlgent(iname, np, tp)
UNCH *iname;
PNE *np;
UNCH **tp;
{
PECB ep; /* Pointer to an entity control block. */
ep = entfind(iname);
if (!ep)
return -1;
switch (ep->estore) {
case ESN:
if (np)
*np = ep->etx.n;
return 1;
case ESC:
if (tp)
*tp = ep->etx.c;
return 2;
case ESX:
if (tp)
*tp = ep->etx.c;
return 3;
}
return -2;
}
/* Mark an entity. */
int sgmlment(iname)
UNCH *iname;
{
PECB ep;
int rc;
ep = entfind(iname);
if (!ep)
return -1;
rc = ep->mark;
ep->mark = 1;
return rc;
}
int sgmlgcnterr()
{
return msgcnterr();
}
/* This is for error handling functions that want to print a gi backtrace. */
UNCH *getgi(i)
int i;
{
return i >= 0 && i <= ts ? tags[i].tetd->etdgi + 1 : NULL;
}
/* Returns the value of prologsw for the use by error handling functions. */
int inprolog()
{
return prologsw;
}
/* Used by the error handling functions to access scbs. */
int getlocation(level, locp)
int level;
struct location *locp;
{
if (level < 0 || level > es)
return 0;
if (locp) {
int es = level;
/* source macros access a variable called `es' */
locp->filesw = FILESW;
locp->rcnt = RCNT;
locp->ccnt = CCNT;
locp->ename = ENTITY + 1;
locp->fcb = SCBFCB;
locp->curchar = CC;
locp->nextchar = NEXTC;
}
return 1;
}
int sgmlloc(linenop, filenamep)
unsigned long *linenop;
char **filenamep;
{
int level = es;
int es;
for (es = level; es >= 0 && !FILESW; es--)
;
if (es < 0)
return 0;
*linenop = RCNT;
*filenamep = ioflid(SCBFCB);
return 1;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

499
usr.bin/sgmls/sgmls/sgml2.c Normal file
View File

@ -0,0 +1,499 @@
/* Added exiterr() for terminal errors to prevent SGML.MSG errors. */
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
static int iorc; /* Return code from io* functions */
/* ENTDEF: Process an entity definition and return the pointer to it.
The entity text must be in permanent storage.
There is no checking to see if the entity already exists;
the caller must have done that.
*/
#ifdef USE_PROTOTYPES
PECB entdef(UNCH *ename, UNCH estore, union etext *petx)
#else
PECB entdef(ename, estore, petx)
UNCH *ename; /* Entity name (with length and EOS). */
UNCH estore; /* Entity storage class. */
union etext *petx; /* Ptr to entity text union. */
#endif
{
PECB p;
p = (PECB)hin((THASH)etab, ename, hash(ename, ENTHASH), ENTSZ);
memcpy((UNIV)&p->etx, (UNIV)petx, ETEXTSZ);
p->estore = estore;
TRACEECB("ENTDEF", p);
return(p);
}
/* ENTFIND: If an entity exists, return ptr to its ecb.
Return NULL if it is not defined.
*/
PECB entfind(ename)
UNCH *ename; /* Entity name (with length and EOS). */
{
PECB p;
p = (PECB)hfind((THASH)etab, ename, hash(ename, ENTHASH));
TRACEECB("ENTFIND", p);
return p;
}
/* ENTREF: Process a general or parameter entity reference.
If the entity is defined it returns the return code from ENTOPEN.
It returns ENTUNDEF for undefined parameter entity references
and for general entity references when defaulting is not allowed.
Otherwise, it uses the default entity text.
*/
int entref(ename)
UNCH *ename; /* Entity name (with length and EOS). */
{
PECB ecb; /* Entity control block. */
/* Get the entity control block, if the entity has been defined. */
if ((ecb = (PECB)hfind((THASH)etab, ename, hash(ename, ENTHASH)))==0
|| ecb->estore == 0) {
if ( ename[1]==lex.d.pero
|| ecbdeflt==0
|| (ecb = usedef(ename))==0 ) {
sgmlerr(ename[1] == lex.d.pero || ecbdeflt == 0 ? 35 : 150,
(struct parse *)0, ename+1, (UNCH *)0);
return(ENTUNDEF);
}
}
return(entopen(ecb));
}
/* ENTOPEN: Open a newly referenced entity.
Increment the stack pointer (es) and initialize the new entry.
ENTDATA if entity is CDATA or SDATA, ENTPI if it is PI,
0 if normal and all o.k.; <0 if not.
*/
int entopen(ecb)
struct entity *ecb; /* Entity control block. */
{
int i; /* Loop counter. */
/* See if we have exceeded the entity nesting level. */
if (es>=ENTLVL) {
sgmlerr(34, (struct parse *)0, ecb->ename+1, ntoa(ENTLVL));
return(ENTMAX);
}
/* If entity is an etd, pi, or data, return it without creating an scb. */
switch (ecb->estore) {
case ESN:
if (NEXTYPE(ecb->etx.n)!=ESNSUB) {
if (!NEDCNDEFINED(ecb->etx.n))
sgmlerr(78, (struct parse *)0, NEDCN(ecb->etx.n)+1,
ecb->ename+1);
}
else {
#if 0
if (!NEID(ecb->etx.n)) {
sgmlerr(149, (struct parse *)0, ecb->ename + 1, (UNCH *)0);
return ENTFILE;
}
#endif
if (sw.nopen >= sd.subdoc)
sgmlerr(188, (struct parse *)0,
(UNCH *)NULL, (UNCH *)NULL);
}
data = (UNCH *)ecb->etx.n;
entdatsw = NDECONT;
return(ENTDATA);
case ESC:
case ESX:
datalen = ustrlen(ecb->etx.c);
data = ecb->etx.c;
entdatsw = (ecb->estore==ESC) ? CDECONT : SDECONT;
return(ENTDATA);
case ESI:
datalen = ustrlen(ecb->etx.c);
data = ecb->etx.c;
entpisw = 4;
return(ENTPI);
}
/* If the same entity is already open, send msg and ignore it.
Level 0 needn't be tested, as its entity name is always *DOC.
*/
for (i = 0; ++i<=es;) if (scbs[i].ecb.enext==ecb) {
sgmlerr(36, (struct parse *)0, ecb->ename+1, (UNCH *)0);
return(ENTLOOP);
}
/* Update SCB if entity trace is wanted in messages or entity is a file.
(Avoid this at start when es==-1 or memory will be corrupted.)
*/
if (es >= 0 && (sw.swenttr || FILESW)) scbset();
/* Stack the new source control block (we know there is room). */
++es; /* Increment scbs index. */
RCNT = CCO = RSCC = 0; /* No records or chars yet. */
COPIEDSW = 0;
memcpy((UNIV)&ECB, (UNIV)ecb, (UNS)ENTSZ); /* Copy the ecb into the scb. */
ECBPTR = ecb; /* Save the ecb pointer in scb.ecb.enext. */
TRACEECB("ENTOPEN", ECBPTR);
/* For memory entities, the read buffer is the entity text.
The text starts at FBUF, so FPOS should be FBUF-1
because it is bumped before each character is read.
*/
if (ECB.estore<ESFM) {FPOS = (FBUF = ECB.etx.c)-1; return 0;}
/* For file entities, suspend any open file and do first read. */
if (ECB.etx.x == 0) {
--es;
switch (ecb->estore) {
case ESF:
sgmlerr(149, (struct parse *)0, ecb->ename + 1, (UNCH *)0);
break;
case ESP:
sgmlerr(229, (struct parse *)0, ecb->ename + 2, (UNCH *)0);
break;
default:
abort();
}
return ENTFILE;
}
fileopen(); /* Open new external file. */
if (iorc<0) { /* If open not successful: */
FPOS = FBUF-1; /* Clean CCNT for OPEN error msg.*/
filerr(32, ecb->ename+1);
--es; /* Pop the stack. */
return(ENTFILE);
}
filepend(es); /* Suspend any open file. */
fileread(); /* First read of file must be ok.*/
return 0;
}
/* ENTGET: Get next record of entity (if there is one).
Otherwise, close the file (if entity is a file) and
pop the entity stack. If nothing else is on the stack,
return -1 to advise the caller.
*/
int entget()
{
RSCC += (CCO = FPOS-FBUF);
/* Characters-in-record (ignore EOB/EOF). */
tagctr += CCO; /* Update tag length counter. */
switch (*FPOS) {
case EOBCHAR: /* End of file buffer: refill it. */
rbufs[-2] = FPOS[-2];
rbufs[-1] = FPOS[-1];
fileread(); /* Read the file. */
if (iorc > 0) break;
readerr:
filerr(31, ENTITY+1); /* Treat error as EOF. */
case EOFCHAR: /* End of file: close it. */
fileclos(); /* Call SGMLIO to close file. */
conterr:
if (es==0) { /* Report if it is primary file. */
FPOS = FBUF-1; /* Preserve CCNT for omitted end-tags. */
return -1;
}
case EOS: /* End of memory entity: pop the stack. */
TRACEECB("ENTPOP", ECBPTR);
if (COPIEDSW) {
frem((UNIV)(FBUF + 1));
COPIEDSW = 0;
}
--es; /* Pop the SCB stack. */
if (FBUF) break; /* Not a PEND file. */
filecont(); /* Resume previous file. */
if (iorc<0) { /* If CONT not successful: */
filerr(94, ENTITY+1);
goto conterr;
}
fileread(); /* Read the file. */
if (iorc<=0) goto readerr; /* If READ not successful: */
rbufs[-1] = SCB.pushback;
FPOS += CCO;
CCO = 0;
if (delmscsw && es==0) { /* End of DTD. */
delmscsw = 0;
*rbufs = lex.d.msc;
}
break;
}
return 0;
}
/* USEDEF: Use the default value for an entity reference.
Returns the ECB for the defaulted entity.
*/
PECB usedef(ename)
UNCH *ename; /* Entity name (with length and EOS). */
{
union etext etx; /* Save return from entgen. */
PECB ecb; /* Entity control block. */
PNE pne = 0; /* Ptr to NDATA entity control block. */
UNCH estore; /* Default entity storage type. */
if ((estore = ecbdeflt->estore)<ESFM) /* Default is an internal string. */
etx.c = ecbdeflt->etx.c;
else {
/* Move entity name into fpi. */
fpidf.fpinm = ename + 1;
if ((etx.x = entgen(&fpidf))==0) return (PECB)0;
if (estore==ESN) {
memcpy((UNIV)(pne=(PNE)rmalloc((UNS)NESZ)),(UNIV)ecbdeflt->etx.n,(UNS)NESZ);
NEID(pne) = etx.x;
etx.n = pne;
}
}
if (sw.swrefmsg) sgmlerr(45, (struct parse *)0, ename+1, (UNCH *)0);
++ds.ecbcnt;
ecb = entdef(ename, estore, &etx);
ecb->dflt = 1;
if (pne) NEENAME(pne) = ecb->ename;
return(ecb);
}
/* SCBSET: Set source control block to current location in the current entity.
This routine is called by SGML when it returns to the text
processor and by ERROR when it reports an error.
*/
VOID scbset()
{
if (es >= 0 && FBUF) {
CC = *FPOS;
if (*FPOS == DELNONCH)
NEXTC = FPOS[1];
else
NEXTC = 0;
CCO = FPOS + 1 - FBUF;
}
}
/* FILEOPEN: Call IOOPEN to open an external entity (file).
*/
VOID fileopen() /* Open an external entity's file. */
{
iorc = ioopen(ECB.etx.x, &SCBFCB);
}
/* FILEREAD: Call IOREAD to read an open external entity (file).
*/
VOID fileread() /* Read the current external entity's file. */
{
int newfile;
iorc = ioread(SCBFCB, rbufs, &newfile);
FPOS = (FBUF = rbufs) - 1; /* Actual read buffer. */
if (newfile) RCNT = 0;
}
/* FILEPEND: Call IOPEND to close an open external entity (file) temporarily.
*/
VOID filepend(es) /* Close the current external entity's file. */
int es; /* Local index to scbs. */
{
while (--es>=0) { /* Find last external file on stack. */
int off;
if (!FILESW) continue; /* Not an external file. */
if (!FBUF) continue; /* Already suspended. */
off = CCO;
assert(off >= -1);
if (off < 0) off = 0;
else CCO = 0;
FPOS -= CCO;
SCB.pushback = FPOS[-1];
FBUF = 0; /* Indicate pending file. */
RSCC += off; /* Update characters-in-record counter. */
tagctr += off; /* Update tag length counter. */
iopend(SCBFCB, off, rbufs);
return;
}
}
/* FILECONT: Call IOCONT to reopen an external entity (file).
*/
VOID filecont() /* Open an external entity's file. */
{
iorc = iocont(SCBFCB);
}
/* FILECLOS: Call IOCLOSE to close an open external entity (file).
*/
VOID fileclos() /* Close the current external entity's file. */
{
if (!SCBFCB)
return;
ioclose(SCBFCB);
/* The fcb will have been freed by sgmlio.
Make sure we don't access it again. */
SCBFCB = NULL;
}
/* ERROR: Interface to text processor SGML I/O services for error handling.
*/
VOID error(e)
struct error *e;
{
scbset(); /* Update location in source control block. */
msgprint(e);
}
/* PTRSRCH: Find a pointer in a list and return its index.
Search key must be on list as there is no limit test.
This routine is internal only -- not for user data.
*/
UNIV mdnmtab[] = {
(UNIV)key[KATTLIST],
(UNIV)key[KDOCTYPE],
(UNIV)key[KELEMENT],
(UNIV)key[KENTITY],
(UNIV)key[KLINKTYPE],
(UNIV)key[KLINK],
(UNIV)key[KNOTATION],
(UNIV)sgmlkey,
(UNIV)key[KSHORTREF],
(UNIV)key[KUSELINK],
(UNIV)key[KUSEMAP]
};
UNIV pcbtab[] = {
(UNIV)&pcbconc,
(UNIV)&pcbcone,
(UNIV)&pcbconm,
(UNIV)&pcbconr,
(UNIV)&pcbetag,
(UNIV)&pcbgrcm,
(UNIV)&pcbgrcs,
(UNIV)&pcbgrnm,
(UNIV)&pcbgrnt,
(UNIV)&pcblitc,
(UNIV)&pcblitp,
(UNIV)&pcblitr,
(UNIV)&pcblitt,
(UNIV)&pcblitv,
(UNIV)&pcbmd,
(UNIV)&pcbmdc,
(UNIV)&pcbmdi,
(UNIV)&pcbmds,
(UNIV)&pcbmsc,
(UNIV)&pcbmsi,
(UNIV)&pcbmsrc,
(UNIV)&pcbpro,
(UNIV)&pcbref,
(UNIV)&pcbstag,
(UNIV)&pcbval,
(UNIV)&pcbeal,
(UNIV)&pcbsd,
};
UNS ptrsrch(ptrtab, ptr)
UNIV ptrtab[];
UNIV ptr;
{
UNS i;
for (i = 0; ; ++i)
if (ptrtab[i] == ptr)
break;
return i;
}
/* MDERR: Process errors for markup declarations.
Prepare the special parameters that only exist for
markup declaration errors.
*/
VOID mderr(number, parm1, parm2)
UNS number; /* Error number. */
UNCH *parm1; /* Additional parameters (or NULL). */
UNCH *parm2; /* Additional parameters (or NULL). */
{
struct error err;
errorinit(&err, subdcl ? MDERR : MDERR2, number);
err.parmno = parmno;
err.subdcl = subdcl;
err.eparm[0] = (UNIV)parm1;
err.eparm[1] = (UNIV)parm2;
err.errsp = (sizeof(pcbtab)/sizeof(pcbtab[0])) + ptrsrch(mdnmtab,
(UNIV)mdname);
error(&err);
}
/* SGMLERR: Process errors for SGML parser.
*/
VOID sgmlerr(number, pcb, parm1, parm2)
UNS number; /* Error number. */
struct parse *pcb; /* Current parse control block. */
UNCH *parm1; /* Error message parameters. */
UNCH *parm2; /* Error message parameters. */
{
struct error err;
errorinit(&err, DOCERR, number);
if (!pcb) pcb = prologsw ? propcb : conpcb;
err.errsp = ptrsrch(pcbtab, (UNIV)pcb);
err.eparm[0] = (UNIV)parm1;
err.eparm[1] = (UNIV)parm2;
error(&err);
}
/* SAVERR: Save an error for possible later use.
*/
UNIV saverr(number, pcb, parm1, parm2)
UNS number; /* Error number. */
struct parse *pcb; /* Current parse control block. */
UNCH *parm1; /* Error message parameters. */
UNCH *parm2; /* Error message parameters. */
{
struct error err;
errorinit(&err, DOCERR, number);
if (!pcb) pcb = prologsw ? propcb : conpcb;
err.errsp = ptrsrch(pcbtab, (UNIV)pcb);
err.eparm[0] = (UNIV)parm1;
err.eparm[1] = (UNIV)parm2;
scbset();
return msgsave(&err);
}
/* SVDERR: Print a saved error.
*/
VOID svderr(p)
UNIV p;
{
msgsprint(p);
}
/* EXITERR: Process terminal errors for SGML parser.
*/
VOID exiterr(number, pcb)
UNS number; /* Error number. */
struct parse *pcb; /* Current parse control block. */
{
struct error err;
errorinit(&err, EXITERR, number);
if (!pcb) pcb = prologsw ? propcb : conpcb;
err.errsp = ptrsrch(pcbtab, (UNIV)pcb);
error(&err);
/* The error handler should have exited. */
abort();
}
/* SYNERR: Process syntax errors for SGML parser.
*/
VOID synerr(number, pcb)
UNS number; /* Error number. */
struct parse *pcb; /* Current parse control block. */
{
struct error err;
errorinit(&err, DOCERR, number);
err.errsp = ptrsrch(pcbtab, (UNIV)pcb);
error(&err);
}
/* FILERR: Process a file access error.
*/
VOID filerr(number, parm)
UNS number;
UNCH *parm;
{
struct error err;
errorinit(&err, FILERR, number);
err.eparm[0] = (UNIV)parm;
err.sverrno = errno;
error(&err);
}
/* ERRORINIT: Constructor for struct error.
*/
VOID errorinit(e, type, number)
struct error *e;
UNS type;
UNS number;
{
int i;
e->errtype = type;
e->errnum = number;
e->errsp = 0;
for (i = 0; i < MAXARGS; i++)
e->eparm[i] = 0;
e->parmno = 0;
e->subdcl = 0;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

View File

@ -0,0 +1,70 @@
/* This file controls the interface between the parser core and the auxiliary
functions in entgen.c, sgmlio.c, and sgmlmsg.c */
#include "std.h"
#include "entity.h"
#include "sgmldecl.h"
/* Error types (ERRTYPE) for calls to error-handling services
performed for SGML by the text processor (SGMLIO).
NOTE: Strings in these blocks have no lengths, but cannot exceed
NAMELEN (plus 1 more byte for the zero terminator).
*/
#define FILERR 0 /* Error: file access. */
#define DOCERR 1 /* Error: in document markup. */
#define MDERR 2 /* Error: in markup declaration with subdcl. */
#define MDERR2 3 /* Error: in markup declaration with no subdcl. */
#define EXITERR 4 /* Error: terminal error in document markup. */
/* Quantities affecting error messages and their arguments.
*/
#define MAXARGS 2 /* Maximum number of arguments in a msg. */
/* NOTE: Error handler must return, or next call to SGML must be RSET or END,
except for EXITERR errors which must not return.
*/
struct error { /* IPB for error messages. */
UNS errtype; /* Type of error: DOC, MD, MD2, FIL. */
UNS errnum; /* Error number. */
UNS errsp; /* Special parameter index in message file. */
int sverrno; /* Saved value of errno. */
int parmno; /* MDERROR: declaration parameter number. */
UNCH *subdcl; /* MDERROR: subject of declaration. */
UNIV eparm[MAXARGS]; /* Ptrs to arguments (no length, but EOS). */
};
struct location {
int filesw;
unsigned long rcnt;
int ccnt;
UNCH curchar;
UNCH nextchar;
UNCH *ename;
UNIV fcb;
};
int ioopen P((UNIV, UNIV*));
VOID ioclose P((UNIV));
int ioread P((UNIV, UNCH *, int *));
VOID iopend P((UNIV, int, UNCH *));
int iocont P((UNIV));
VOID ioinit P((struct switches *));
char *ioflid P((UNIV));
UNIV entgen P((struct fpi *));
VOID msgprint P((struct error *));
VOID msginit P((struct switches *));
UNIV msgsave P((struct error *));
VOID msgsprint P((UNIV));
VOID msgsfree P((UNIV));
int msgcnterr P((void));
int inprolog P((void));
UNCH *getgi P((int));
int getlocation P((int, struct location *));
UNIV rmalloc P((unsigned int));
UNIV rrealloc P((UNIV, UNS));
VOID frem P((UNIV));
VOID exiterr P((unsigned int,struct parse *));

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,84 @@
/* sgmldecl.h: SGML declaration parsing. */
#define QATTCNT 0
#define QATTSPLEN 1
#define QBSEQLEN 2
#define QDTAGLEN 3
#define QDTEMPLEN 4
#define QENTLVL 5
#define QGRPCNT 6
#define QGRPGTCNT 7
#define QGRPLVL 8
#define QLITLEN 9
#define QNAMELEN 10
#define QNORMSEP 11
#define QPILEN 12
#define QTAGLEN 13
#define QTAGLVL 14
#define NQUANTITY (QTAGLVL+1)
#define TOTALCAP 0
#define ENTCAP 1
#define ENTCHCAP 2
#define ELEMCAP 3
#define GRPCAP 4
#define EXGRPCAP 5
#define EXNMCAP 6
#define ATTCAP 7
#define ATTCHCAP 8
#define AVGRPCAP 9
#define NOTCAP 10
#define NOTCHCAP 11
#define IDCAP 12
#define IDREFCAP 13
#define MAPCAP 14
#define LKSETCAP 15
#define LKNMCAP 16
extern char *captab[];
struct sgmldecl {
long capacity[NCAPACITY];
long subdoc;
UNCH formal;
UNCH omittag;
UNCH shorttag;
UNCH shortref;
UNCH namecase[2]; /* case translation of general/entity names */
int quantity[NQUANTITY];
};
extern struct sgmldecl sd;
#define OMITTAG (sd.omittag)
#define SUBDOC (sd.subdoc)
#define SHORTTAG (sd.shorttag)
#define FORMAL (sd.formal)
#define ATTCNT (sd.quantity[QATTCNT])
#define ATTSPLEN (sd.quantity[QATTSPLEN])
#define BSEQLEN (sd.quantity[QBSEQLEN])
#define ENTLVL (sd.quantity[QENTLVL])
#define GRPGTCNT (sd.quantity[QGRPGTCNT])
#define GRPCNT (sd.quantity[QGRPCNT])
#define GRPLVL (sd.quantity[QGRPLVL])
#define LITLEN (sd.quantity[QLITLEN])
#define NAMELEN (sd.quantity[QNAMELEN])
#define NORMSEP (sd.quantity[QNORMSEP])
#define PILEN (sd.quantity[QPILEN])
#define TAGLEN (sd.quantity[QTAGLEN])
#define TAGLVL (sd.quantity[QTAGLVL])
#define NAMECASE (sd.namecase[0])
#define ENTCASE (sd.namecase[1])
#define YES 1
#define NO 0
#define UNUSED -1
#define UNKNOWN -2
#define UNDESC -3
#define UNKNOWN_SET -4
extern int asciicharset[];

View File

@ -0,0 +1,129 @@
/* SGMLFNSM.H: SGML function declarations (ANSI prototypes). */
VOID adlfree P((struct ad *, int));
VOID adlval P((int,struct etd *));
VOID aenttst P((int, UNCH *));
int allhit P((struct thdr *,unsigned long *,int,int));
VOID ambig P((void));
VOID ambigfree P((void));
int amemget P((struct ad *,int,UNCH *));
int anmget P((int,UNCH *));
int anmtgrp P((struct parse *,struct ad *,int,UNS *,int));
int antvget P((int,UNCH *,UNCH **));
int anyhit P((unsigned long *));
int attval P((int,UNCH *,int,struct ad *));
VOID charrefa P((UNCH *));
int charrefn P((UNCH *, struct parse *));
int context P((struct etd *,struct thdr *,struct mpos *,UNCH *,int));
struct etd **copygrp P((struct etd **,unsigned int));
int datachar P((int, struct parse *));
struct dcncb *dcnfind P((UNCH *));
VOID destack P((void));
int econtext P((struct thdr *,struct mpos *,UNCH *));
VOID endprolog P((void));
struct entity *entfind P((UNCH *));
int entopen P((struct entity *));
/* VOID eposset P((void)); NOT YET IN USE. */
VOID error P((struct error *));
VOID errorinit P((struct error *, unsigned, unsigned));
int etag P((void));
int etagetd P((struct parse *));
VOID etdadl P((struct etd *));
VOID etdcan P((UNCH *));
struct etd *etddef P((UNCH *));
struct etd *etdref P((UNCH *));
VOID exclude P((void));
VOID fileclos P((void));
VOID filecont P((void));
VOID fileopen P((void));
VOID filepend P((int));
VOID fileread P((void));
VOID filerr P((unsigned, UNCH *));
VOID fixdatt P((struct dcncb *));
struct parse *getpcb P((int));
int groupopt P((struct thdr *,struct mpos *));
int groupreq P((struct etd *,struct thdr *,struct mpos *));
int grpsz P((struct thdr *,int));
int hash P((UNCH *,int));
struct hash *hfind P((struct hash **,UNCH *,int));
struct hash *hin P((struct hash **,UNCH *,int,unsigned int));
int iddef P((UNCH *));
VOID idrck P((void));
struct fwdref *idref P((UNCH *));
VOID idreftst P((int,UNCH *));
int ingrp P((struct etd **,struct etd *));
VOID initatt P((struct ad *));
int mapsrch P((struct map *,UNCH *));
VOID mdadl P((UNCH *));
int mdattdef P((int, int));
VOID mddtde P((UNCH *));
VOID mddtds P((UNCH *));
VOID mdelem P((UNCH *));
VOID mdentity P((UNCH *));
VOID mderr P((unsigned int,UNCH *,UNCH *));
struct parse *mdms P((UNCH *,struct parse *));
int mdmse P((void));
VOID mdnadl P((UNCH *));
VOID mdnot P((UNCH *));
VOID mdsrmdef P((UNCH *));
VOID mdsrmuse P((UNCH *));
int netetd P((struct parse *));
VOID newtoken P((struct thdr *,struct mpos *,UNCH *));
int nstetd P((void));
UNCH *ntoa P((int));
int offbit P((unsigned long *,int,int));
int parsecon P((UNCH *,struct parse *));
int parsefpi P((struct fpi *));
struct thdr *parsegcm P((struct parse *,struct thdr *,struct thdr *));
VOID parselit P((UNCH *,struct parse *,unsigned int,UNCH));
struct thdr *parsemod P((int));
int parsepro P((void));
VOID parseseq P((UNCH *,int));
VOID parsetag P((struct parse *));
int parseval P((UNCH *,unsigned int,UNCH *));
int pexmex P((struct etd *));
unsigned int ptrsrch P((UNIV *,UNIV));
UNCH *pubfield P((UNCH *,UNCH *,UNCH,UNS *));
UNCH *replace P((UNCH *,UNCH *));
UNCH *sandwich P((UNCH *,UNCH *,UNCH *));
UNIV saverr P((unsigned int,struct parse *,UNCH *,UNCH *));
VOID scbset P((void));
VOID sdinit P((void));
VOID setcurchar P((int));
VOID setdtype P((void));
int sgmlact P((UNCH));
int sgmldecl P((void));
VOID sgmlerr P((unsigned int,struct parse *,UNCH *,UNCH *));
int shortref P((int,struct parse *));
struct srh *srhfind P((UNCH *));
VOID stack P((struct etd *));
int stag P((int));
int stagetd P((struct parse *));
VOID startdtd P((void));
UNCH *savenm P((UNCH *));
UNCH *savestr P((UNCH *));
VOID storedatt P((PNE));
VOID svderr P((UNIV));
VOID synerr P((unsigned int,struct parse *));
int testend P((struct thdr *,struct mpos *,int,int));
int tokenopt P((struct thdr *,struct mpos *));
int tokenreq P((struct etd *,struct thdr *,struct mpos *));
UNS vallen P((int,int,UNCH *));
struct dcncb *dcndef P((UNCH *));
struct entity *entdef P((UNCH *,UNCH,union etext *));
int entget P((void));
int entref P((UNCH *));
struct etd *etdset P((struct etd *,UNCH,struct thdr *,struct etd **,
struct etd **, struct entity **));
struct hash *hout P((struct hash **,UNCH *,int));
struct fpi *mdextid P((UNCH *,struct fpi *,UNCH *,UNCH *,struct ne *));
int parse P((struct parse *));
struct ad *parseatt P((struct ad *,UNCH *));
unsigned int parsegrp P((struct etd **,struct parse *, UNCH *));
unsigned int parsngrp P((struct dcncb **,struct parse *, UNCH *));
int parsemd P((UNCH *,int,struct parse *,unsigned int));
UNCH *parsenm P((UNCH *,int));
UNCH *parsetkn P((UNCH *,UNCH,int));
UNCH *s2valnm P((UNCH *,UNCH *,UNCH,int));
struct srh *srhdef P((UNCH *));
int tokdata P((UNCH *, int));
struct entity *usedef P((UNCH *));

View File

@ -0,0 +1,20 @@
/* SGMLINCL.H: Include file for parser core. */
#ifndef SGMLINCL /* Don't include this file more than once. */
#define SGMLINCL 1
#include "config.h"
#include "std.h"
#include "entity.h" /* Templates for entity control blocks. */
#include "action.h" /* Action names for all parsing. */
#include "adl.h" /* Definitions for attribute list processing. */
#include "error.h" /* Symbols for error codes. */
#include "etype.h" /* Definitions for element type processing. */
#include "keyword.h" /* Definitions for keyword processing. */
#include "lextoke.h" /* Symbols for tokenization lexical classes. */
#include "source.h" /* Templates for source entity control blocks. */
#include "synxtrn.h" /* Declarations for concrete syntax constants. */
#include "sgmlxtrn.h" /* External variable declarations. */
#include "trace.h" /* Declarations for internal trace functions. */
#include "sgmlmain.h"
#include "sgmlaux.h"
#include "sgmlfnsm.h" /* ANSI C: Declarations for SGML functions. */
#endif /* ndef SGMLINCL */

View File

@ -0,0 +1,384 @@
/* sgmlio.c -
IO functions for core parser.
Written by James Clark (jjc@jclark.com).
*/
/* SGML must see a file in which records start with RS and end with
RE, and EOFCHAR (Ctl-Z) is present at the end. This module must
supply these characters if they are not naturally present in the
file. SGML will open two files at a time: when an entity is
nested, the new file is opened before closing the old in order to
make sure the open is successful. If it is, the original open file
is closed temporarily (IOPEND); when the stack is popped, the new
file is closed and the original file is re-opened (IOCONT). SGML
will check error returns for the initial open of a file and all
reads, and for re-openings when the stack is popped, but not for
closes. Returning <0 indicates an error; 0 or more is a successful
operation, except for IOREAD where the return value is the number
of characters read, and must exceed 0 to be successful. The first
READ must always be successful, and normally consists of just
priming the buffer with EOBCHAR (or RS EOBCHAR). SGMLIO must
assure that there is an EOBCHAR at the end of each block read,
except for the last block of the entity, which must have an
EOFCHAR.
SGML views an entity as a contiguous whole, without regard to its
actual form of storage. SGMLIO supports entities that are
equivalent to a single file of one or more records, or to a
concatenation of files.
*/
/* Uses only stream I/O. This module should be portable to most ANSI
systems. */
/* We try to ensure that if an IO operation fails, then errno will contain
a meaningful value (although it may be zero.) */
#include "config.h"
#ifdef HAVE_O_NOINHERIT
#include <fcntl.h>
#include <io.h>
#endif /* HAVE_O_NOINHERIT */
#include "sgmlaux.h" /* Include files for auxiliary functions.. */
#ifdef HAVE_O_NOINHERIT
#define FOPENR(file) nifopen(file)
FILE *nifopen P((char *));
#else /* not HAVE_O_NOINHERIT */
#define FOPENR(file) fopen((file), "r")
#endif /* not HAVE_O_NOINHERIT */
struct iofcb { /* I/O file control block. */
FILE *fp; /* File handle. */
fpos_t off; /* Offset in file of current read block. */
char *next; /* Next file (NULL if no more). */
char *file; /* Current file (no length byte). */
int pendoff; /* Offset into line when file suspended. */
char bol; /* Non-zero if currently at beginning of line. */
char first; /* Non-zero if the first read. */
char wasbol; /* Non-zero if current block was at beginning of line. */
char canseek;
UNCH *pendbuf; /* Saved partial buffer for suspended file
that can't be closed and reopened. */
};
static char *lastfile; /* The name of the last file closed. */
static int bufsize; /* Size of buffer passed to ioread(). */
static char ismagic[256]; /* Table of magic chars that need to be prefixed
by DELNONCH. */
static int stdinused = 0;
static char *nextstr P((char *)); /* Iterate over list of strings. */
static FILE *openfile P((char *, char *));
static int closefile P((FILE *));
static int isreg P((FILE *));
VOID ioinit(swp)
struct switches *swp;
{
ismagic[EOBCHAR] = 1;
ismagic[EOFCHAR] = 1;
ismagic[EOS] = 1;
ismagic[(UNCH)DELNONCH] = 1;
ismagic[(UNCH)GENRECHAR] = 1;
bufsize = swp->swbufsz;
}
int ioopen(id, pp)
UNIV id;
UNIV *pp;
{
struct iofcb *f;
char *s;
errno = 0;
if (!id)
return -1;
s = id;
if (!*s)
return -1;
f = (struct iofcb *)rmalloc((UNS)sizeof(struct iofcb));
f->file = s;
f->next = nextstr(s);
errno = 0;
f->fp = openfile(f->file, &f->canseek);
f->bol = 1;
f->first = 1;
f->pendbuf = 0;
*pp = (UNIV)f;
return f->fp ? 1 : -1;
}
VOID ioclose(p)
UNIV p;
{
struct iofcb *f = (struct iofcb *)p;
if (f->fp)
closefile(f->fp);
lastfile = f->file;
frem((UNIV)f);
}
VOID iopend(p, off, buf)
UNIV p;
int off;
UNCH *buf;
{
struct iofcb *f = (struct iofcb *)p;
if (!f->canseek) {
UNCH *s;
for (s = buf + off; *s != EOFCHAR && *s != EOBCHAR; s++)
;
s++;
f->pendbuf = (UNCH *)rmalloc((UNS)(s - buf - off));
memcpy((UNIV)f->pendbuf, (UNIV)(buf + off), (UNS)(s - buf - off));
return;
}
f->bol = 0;
if (f->wasbol) {
if (off == 0)
f->bol = 1;
else
off--;
}
f->pendoff = off;
if (f->fp) {
fclose(f->fp);
f->fp = 0;
}
}
int iocont(p)
UNIV p;
{
struct iofcb *f = (struct iofcb *)p;
int c = EOF;
int off = f->pendoff;
if (!f->canseek)
return 0;
errno = 0;
f->fp = FOPENR(f->file);
if (!f->fp)
return -1;
if (fsetpos(f->fp, &f->off))
return -1;
while (--off >= 0) {
c = getc(f->fp);
if (c != EOF && ismagic[c])
off--;
}
if (c == '\n')
f->bol = 1;
if (ferror(f->fp))
return -1;
return 0;
}
/* Return -1 on error, otherwise the number of bytes read. The
strategy is to concatenate the files, insert a RS at the beginning of
each line, and change each '\n' into a RE. The returned data
shouldn't cross a file boundary, otherwise error messages might be
inaccurate. The first read must always succeed. */
int ioread(p, buf, newfilep)
UNIV p;
UNCH *buf;
int *newfilep;
{
int i = 0;
struct iofcb *f = (struct iofcb *)p;
FILE *fp;
int c;
*newfilep = 0;
if (f->first) {
buf[i] = EOBCHAR;
f->first = 0;
return 1;
}
if (f->pendbuf) {
for (i = 0;
(buf[i] = f->pendbuf[i]) != EOBCHAR && buf[i] != EOFCHAR;
i++)
;
frem((UNIV)f->pendbuf);
f->pendbuf = 0;
return i + 1;
}
fp = f->fp;
for (;;) {
errno = 0;
if (f->canseek && fgetpos(fp, &f->off))
f->canseek = 0;
errno = 0;
c = getc(fp);
if (c != EOF)
break;
if (ferror(fp))
return -1;
if (closefile(fp) == EOF)
return -1;
if (!f->next){
f->fp = 0;
buf[0] = EOFCHAR;
return 1;
}
f->file = f->next;
f->next = nextstr(f->next);
*newfilep = 1;
errno = 0;
fp = f->fp = openfile(f->file, &f->canseek);
if (!fp)
return -1;
f->bol = 1;
}
if (f->bol) {
f->bol = 0;
buf[i++] = RSCHAR;
f->wasbol = 1;
}
else
f->wasbol = 0;
errno = 0;
for (;;) {
if (c == '\n') {
f->bol = 1;
buf[i++] = RECHAR;
break;
}
if (ismagic[c]) {
buf[i++] = DELNONCH;
buf[i++] = SHIFTNON(c);
}
else
buf[i++] = c;
if (i >= bufsize - 2)
break;
c = getc(fp);
if (c == EOF) {
if (ferror(fp))
return -1;
/* This is in the middle of a line. */
break;
}
}
buf[i++] = EOBCHAR;
return i;
}
static char *nextstr(p)
char *p;
{
p = strchr(p, '\0');
return *++p ? p : 0;
}
/* Return the filename associated with p. If p is NULL, return the filename
of the last file closed. */
char *ioflid(p)
UNIV p;
{
if (!p)
return lastfile;
return ((struct iofcb *)p)->file;
}
static
FILE *openfile(name, seekp)
char *name;
char *seekp;
{
FILE *fp;
if (strcmp(name, STDINNAME) == 0) {
if (stdinused)
return 0;
stdinused = 1;
*seekp = 0;
return stdin;
}
fp = FOPENR(name);
if (fp)
*seekp = isreg(fp);
return fp;
}
/* Return -1 on error, 0 otherwise. */
static
int closefile(fp)
FILE *fp;
{
if (fp == stdin) {
stdinused = 0;
clearerr(fp);
return 0;
}
else
return fclose(fp);
}
#ifdef HAVE_O_NOINHERIT
/* This is the same as fopen(name, "r") except that it tells DOS that
the file descriptor should not be inherited by child processes. */
FILE *nifopen(name)
char *name;
{
int fd = open(name, O_RDONLY|O_NOINHERIT|O_TEXT);
if (fd < 0)
return 0;
return fdopen(fd, "r");
}
#endif /* HAVE_O_NOINHERIT */
#ifdef HAVE_SYS_STAT_H
#include <sys/types.h>
#include <sys/stat.h>
#ifndef S_ISREG
#ifdef S_IFMT
#ifdef S_IFREG
#define S_ISREG(m) (((m) & S_IFMT) == S_IFREG)
#endif /* S_IFREG */
#endif /* S_IFMT */
#endif /* not S_ISREG */
#endif /* HAVE_SYS_STAT_H */
/* Return 1 if fp might be associated with a regular file. 0
otherwise. We check this because on many Unix systems lseek() will
succeed on a (pseudo-)terminal although terminals aren't seekable in
the way we need. */
static
int isreg(fp)
FILE *fp;
{
#ifdef S_ISREG
struct stat sb;
/* This assumes that a system that has S_ISREG will also have
fstat() and fileno(). */
if (fstat(fileno(fp), &sb) == 0)
return S_ISREG(sb.st_mode);
#endif /* S_ISREG */
return 1;
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
comment-column: 30
End:
*/

View File

@ -0,0 +1,101 @@
/* SGMLMAIN: Main interface to SGML services.
Preprocessor variable names are the only supported interface
to data maintained by SGML. They are defined in this file or in adl.h.
*/
/* Return control block types (RCBTYPE) from calls to parser (SGML):
Names and strings follow the convention for the IPBs.
*/
enum sgmlevent {
SGMLEOD, /* End of document. */
SGMLDAF, /* Data found. */
SGMLSTG, /* Start-tag found. */
SGMLETG, /* End-tag found. */
SGMLREF, /* Record end found. */
SGMLPIS, /* Processing instruction (string). */
SGMLAPP /* APPINFO (string) */
};
struct rcbdata { /* Return control block: DAF EOD REF PIS APP. */
UNS contersw; /* 1=context error; 2,4,8=data type; 0=not. */
UNS datalen; /* Length of data or PI (0=single nonchar). */
UNCH *data; /* Data, PI, single nonSGML, or NDATA ecb ptr. */
};
struct rcbtag { /* Return control block for STG and ETG. */
UNS contersw; /* 1=context error; 2=NET enabled; 0/0=not. */
UNS tagmin; /* Minim: NONE NULL NET DATA; implied by S/ETAG */
UNCH *curgi; /* Start-tag (or end-tag) GI. */
union {
struct ad *al; /* Start-tag: attribute list. */
UNCH *oldgi; /* End-tag: resumed GI. */
} ru;
struct ad *lal; /* Start-tag: link attribute list (UNUSED). */
UNS format; /* Format class for default processing. */
struct etd *tagreal; /* Dummy etd or ptr to GI that implied this tag.*/
int etictr; /* Number of elements on stack with NET enabled.*/
UNCH *srmnm; /* Current SHORTREF map name (NULL=#EMPTY). */
};
/* Accessors for rcbdata and rcbtag. */
/* Datatype abbreviations: C=unsigned char S=string U=unsigned int L=4 bytes
A=array P=ptr to structure N=name (see sgmlcb.h)
*/
/* Data control block fields: processing instructions (SGMLPIS).
*/
#define PDATA(d) ((d).data) /*S PI string. */
#define PDATALEN(d) ((d).datalen) /*U Length of PI string. */
#define PIESW(d) (((d).contersw & 4)) /*U 1=PIDATA entity returned. */
/* Data control block fields: other data types.
*/
#define CDATA(d) ((d).data) /*S CDATA content string. */
#define CDATALEN(d) ((d).datalen) /*U Length of CDATA content string. */
#define CONTERSW(d) (((d).contersw &1))/*U 1=CDATA or TAG out of context. */
#define CDESW(d) (((d).contersw & 2)) /*U 1=CDATA entity returned. */
#define SDESW(d) (((d).contersw & 4)) /*U 1=SDATA entity returned. */
#define NDESW(d) (((d).contersw & 8)) /*U 1=NDATA entity returned. */
#define NEPTR(d) ((PNE)(d).data) /*P Ptr to NDATA control block. */
#define MARKUP(d) ((d).data) /*A Markup delimiter strings. */
#define DTYPELEN(d) ((d).datalen) /*U Length of doc type name +len+EOS. */
#define DOCTYPE(d) ((d).data) /*S Document type name (with len+EOS). */
#define ADATA(d) ((d).data) /*S APPINFO */
#define ADATALEN(d) ((d).datalen) /*U Length of APPINFO string. */
/* Tag control block fields.
*/
#define ALPTR(t) ((t).ru.al) /*P Ptr to SGML attribute list. */
#define CURGI(t) ((t).curgi+1) /*N GI of started or ended element. */
#define OLDGI(t) ((t).ru.oldgi) /*S GI of resumed element. */
#define TAGMIN(t) (t).tagmin /*U Minimization for current tag. */
#define TAGREAL(t) ((t).tagreal) /*P Dummy etd that implied this tag. */
#define TAGRLNM(t) ((UNCH *)(t).tagreal) /*P GI of tag that implied this tag.*/
#define ETISW(t) (((t).contersw & 2)) /*U 1=NET delimiter enabled by ETI. */
#define PEXSW(t) (((t).contersw & 4)) /*U 1=Element was plus exception. */
#define MTYSW(t) (((t).contersw & 8)) /*U 1=Element is empty. */
#define ETICTR(t) ((t).etictr) /*U Number of active NET delimiters. */
#define SRMNM(t) ((t).srmnm) /*S Name of current SHORTREF map. */
#define SRMCNT(t) ((t).contersw) /*U Number of SHORTREF maps defined. */
#define FORMAT(t) ((t).format) /*U Format class.*/
/* These function names are chosen so as to be distinct in the first 6
letters. */
/* Initialize. */
struct markup *sgmlset P((struct switches *));
/* Cleanup and return capacity usage statistics. */
VOID sgmlend P((struct sgmlcap *));
/* Set document entity. */
int sgmlsdoc P((UNIV));
/* Get entity. */
int sgmlgent P((UNCH *, PNE *, UNCH **));
/* Mark an entity. Return is non-zero if already marked.*/
int sgmlment P((UNCH *));
/* Get the next sgml event. */
enum sgmlevent sgmlnext P((struct rcbdata *, struct rcbtag *));
/* Get the error count. */
int sgmlgcnterr P((void));
/* Get the current location. */
int sgmlloc P((unsigned long *, char **));
/* Write out the SGML declaration. */
VOID sgmlwrsd P((FILE *));
/* Note subdocument capacity usage. */
VOID sgmlsubcap P((long *));

View File

@ -0,0 +1,514 @@
/* sgmlmsg.c -
message handling for core parser
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#include "sgmlaux.h"
#include "msg.h"
static nl_catd catd;
#define TEXT_SET 1 /* message set number for text of messages */
#define HEADER_SET 2 /* message set number for header strings */
#define PARM_SET 3 /* message set number for special parameters */
#ifdef HAVE_EXTENDED_PRINTF
#define xfprintf fprintf
#else
extern int xfprintf VP((FILE *, char *,...));
#endif
#define SIZEOF(v) (sizeof(v)/sizeof(v[0]))
static char *gettext P((int));
static char *getheader P((int));
static char *getparm P((int));
static VOID elttrace P((FILE *, int));
static int printit P((FILE *, struct error *));
static char *transparm P((UNCH *, char *));
static VOID spaces P((FILE *, int));
#define PARMBUFSIZ 50
static char parmbuf[PARMBUFSIZ*2];
static char *parmbuf1 = parmbuf;
static char *parmbuf2 = parmbuf + PARMBUFSIZ;
static char *prog; /* program name */
static int sweltr; /* non-zero means print an element trace */
static int swenttr; /* non-zero means print an entity trace */
static int cnterr = 0;
static VOID (*die) P((void));
static char *headers[] = {
"In file included",
"SGML error", /* parameters: type, severity, number */
"Unsupported feature", /* type U errors */
"Error", /* for type R errors */
"Warning", /* severity type I */
" at %s, %.0sline %lu", /* ignore entity name and ccnt */
" at entity %s, line %lu",
"%.0s%.0s in declaration parameter %d", /* ignore first two parameters */
"%.0s in declaration parameter %d", /* ignore first parameter */
"%.0s", /* parse mode */
" at end of file",
" at end of entity",
" at record start",
" at record end",
" at \"%c\"",
" at \"\\%03o\"",
" accessing \"%s\"",
"Element structure:"
};
/* Indexes into headers[] */
#define HDRPFX 0
#define HDRALL 1
#define HDRUNSUP 2
#define HDRSYS 3
#define HDRWARN 4
#define HDRLOC 5
#define HDRELOC 6
#define HDRMD 7
#define HDRMD2 8
#define HDRMODE 9
#define HDREOF 10
#define HDREE 11
#define HDRRS 12
#define HDRRE 13
#define HDRPRT 14
#define HDRCTL 15
#define HDRFIL 16
#define HDRELT 17
/* Special parameters (error::errsp) */
static char *parms[] = {
"character data",
"element content",
"mixed content",
"replaceable character data",
"tag close",
"content model group",
"content model occurrence indicator",
"name group",
"name token group",
"system data",
"parameter literal",
"attribute value literal",
"tokenized attribute value literal",
"minimum literal",
"markup declaration",
"markup declaration comment",
"ignored markup declaration",
"declaration subset",
"CDATA marked section",
"IGNORE marked section",
"RCDATA marked section",
"prolog",
"reference",
"attribute specification list",
"tokenized attribute value",
"attribute specification list close",
"SGML declaration",
"attribute definition list",
"document type",
"element",
"entity",
"link type",
"link set",
"notation",
"SGML",
"short reference mapping",
"link set use",
"short reference use",
};
static FILE *tfp; /* temporary file for saved messages */
struct saved {
long start;
long end;
char exiterr;
char countit;
};
VOID msgprint(e)
struct error *e;
{
if (printit(stderr, e))
++cnterr;
fflush(stderr);
if (e->errtype == EXITERR) {
if (die) {
(*die)();
abort();
}
else
exit(EXIT_FAILURE);
}
}
/* Save an error message. */
UNIV msgsave(e)
struct error *e;
{
struct saved *sv;
sv = (struct saved *)rmalloc(sizeof(struct saved));
if (!tfp) {
tfp = tmpfile();
if (!tfp)
exiterr(160, (struct parse *)0);
}
sv->start = ftell(tfp);
sv->countit = (char)printit(tfp, e);
sv->end = ftell(tfp);
sv->exiterr = (char)(e->errtype == EXITERR);
return (UNIV)sv;
}
/* Print a saved error message. */
VOID msgsprint(p)
UNIV p;
{
struct saved *sv = (struct saved *)p;
long cnt;
assert(p != 0);
assert(tfp != 0);
if (fseek(tfp, sv->start, SEEK_SET) < 0)
return;
/* Temporary files are opened in binary mode, so this is portable. */
cnt = sv->end - sv->start;
while (--cnt >= 0) {
int c = getc(tfp);
if (c == EOF)
break;
putc(c, stderr);
}
fflush(stderr);
if (sv->countit)
++cnterr;
if (sv->exiterr)
exit(EXIT_FAILURE);
}
/* Free a sved error message. */
VOID msgsfree(p)
UNIV p;
{
frem(p);
}
/* Return 1 if it should be counted as an error. */
static int printit(efp, e)
FILE *efp;
struct error *e;
{
int indent;
int countit;
int hdrcode;
int filelevel = -1, prevfilelevel = -1, toplevel;
struct location loc;
char type[2], severity[2];
assert(e->errnum < SIZEOF(messages));
assert(messages[e->errnum].text != NULL);
if (prog) {
fprintf(efp, "%s: ", prog);
indent = strlen(prog) + 2; /* don't rely on return value of fprintf */
/* Don't want to waste too much space on indenting. */
if (indent > 10)
indent = 4;
}
else
indent = 4;
for (toplevel = 0; getlocation(toplevel, &loc); toplevel++)
if (loc.filesw) {
prevfilelevel = filelevel;
filelevel = toplevel;
}
toplevel--;
if (e->errtype == FILERR) {
toplevel--;
filelevel = prevfilelevel;
}
if (swenttr && filelevel > 0) {
int level = 0;
int middle = 0; /* in the middle of a line */
do {
(void)getlocation(level, &loc);
if (loc.filesw) {
if (middle) {
fputs(":\n", efp);
spaces(efp, indent);
}
else
middle = 1;
xfprintf(efp, getheader(HDRPFX));
xfprintf(efp, getheader(HDRLOC), ioflid(loc.fcb),
loc.ename, loc.rcnt, loc.ccnt);
}
else if (middle)
xfprintf(efp, getheader(HDRELOC),
loc.ename, loc.rcnt + 1, loc.ccnt);
}
while (++level != filelevel);
if (middle) {
fputs(":\n", efp);
spaces(efp, indent);
}
}
/* We use strings for the type and severity,
so that the format can use %.0s to ignore them. */
type[0] = messages[e->errnum].type;
type[1] = '\0';
severity[0] = messages[e->errnum].severity;
severity[1] = '\0';
countit = (severity[0] != 'I');
if (!countit)
hdrcode = HDRWARN;
else if (type[0] == 'R')
hdrcode = HDRSYS;
else if (type[0] == 'U')
hdrcode = HDRUNSUP;
else
hdrcode = HDRALL;
xfprintf(efp, getheader(hdrcode), type, severity, e->errnum);
if (filelevel >= 0) {
(void)getlocation(filelevel, &loc);
xfprintf(efp, getheader(HDRLOC),
ioflid(loc.fcb), loc.ename, loc.rcnt, loc.ccnt);
while (filelevel < toplevel) {
++filelevel;
if (swenttr) {
(void)getlocation(filelevel, &loc);
xfprintf(efp, getheader(HDRELOC),
loc.ename, loc.rcnt + 1, loc.ccnt);
}
}
}
/* It is necessary to copy the result of getparm() because
the specification of catgets() says in can return a
pointer to a static buffer which may get overwritten
by the next call to catgets(). */
switch (e->errtype) {
case MDERR:
strncpy(parmbuf, getparm(e->errsp), PARMBUFSIZ*2 - 1);
xfprintf(efp, getheader(HDRMD), parmbuf,
(e->subdcl ? e->subdcl : (UNCH *)""), e->parmno);
break;
case MDERR2:
/* no subdcl parameter */
strncpy(parmbuf, getparm(e->errsp), PARMBUFSIZ*2 - 1);
xfprintf(efp, getheader(HDRMD2), parmbuf, e->parmno);
break;
case DOCERR:
case EXITERR:
if (toplevel < 0)
break;
strncpy(parmbuf, getparm(e->errsp), PARMBUFSIZ*2 - 1);
xfprintf(efp, getheader(HDRMODE), parmbuf);
switch (loc.curchar) {
case EOFCHAR:
xfprintf(efp, getheader(HDREOF));
break;
case RSCHAR:
xfprintf(efp, getheader(HDRRS));
break;
case RECHAR:
xfprintf(efp, getheader(HDRRE));
break;
case DELNONCH:
xfprintf(efp, getheader(HDRCTL), UNSHIFTNON(loc.nextchar));
break;
case EOS:
xfprintf(efp, getheader(HDREE));
break;
case EOBCHAR:
break;
default:
if (ISASCII(loc.curchar) && isprint(loc.curchar))
xfprintf(efp, getheader(HDRPRT), loc.curchar);
else
xfprintf(efp, getheader(HDRCTL), loc.curchar);
break;
}
break;
case FILERR:
if (getlocation(toplevel + 1, &loc))
xfprintf(efp, getheader(HDRFIL), ioflid(loc.fcb));
break;
}
fputs(":\n", efp);
if (e->errtype == FILERR && e->sverrno != 0) {
char *errstr = strerror(e->sverrno);
UNS len = strlen(errstr);
/* Strip a trailing newline if there is one. */
if (len > 0 && errstr[len - 1] == '\n')
len--;
spaces(efp, indent);
for (; len > 0; len--, errstr++)
putc(*errstr, efp);
fputs(":\n", efp);
}
spaces(efp, indent);
xfprintf(efp, gettext(e->errnum),
transparm((UNCH *)e->eparm[0], parmbuf1),
transparm((UNCH *)e->eparm[1], parmbuf2));
putc('\n', efp);
if (sweltr)
elttrace(efp, indent);
return countit;
}
/* Print an element trace. */
static VOID elttrace(efp, indent)
FILE *efp;
int indent;
{
int i = 1;
UNCH *gi;
gi = getgi(i);
if (!gi)
return;
spaces(efp, indent);
xfprintf(efp, getheader(HDRELT));
do {
fprintf(efp, " %s", gi);
gi = getgi(++i);
} while (gi);
putc('\n', efp);
}
static VOID spaces(efp, indent)
FILE *efp;
int indent;
{
while (--indent >= 0)
putc(' ', efp);
}
VOID msginit(swp)
struct switches *swp;
{
catd = swp->catd;
prog = swp->prog;
sweltr = swp->sweltr;
swenttr = swp->swenttr;
die = swp->die;
}
/* Return the error count. */
int msgcnterr()
{
return cnterr;
}
/* Transform a parameter into a form suitable for printing. */
static char *transparm(s, buf)
UNCH *s;
char *buf;
{
char *ptr;
int cnt;
if (!s)
return 0;
ptr = buf;
cnt = PARMBUFSIZ - 4; /* space for `...\0' */
while (*s) {
UNCH ch = *s++;
if (ch == DELNONCH) {
if (*s == '\0')
break;
ch = UNSHIFTNON(*s);
s++;
}
if (ch == DELCDATA || ch == DELSDATA)
;
else if (ch == '\\') {
if (cnt < 2)
break;
*ptr++ = '\\';
*ptr++ = '\\';
cnt -= 2;
}
else if (ISASCII(ch) && isprint(ch)) {
if (cnt < 1)
break;
*ptr++ = ch;
cnt--;
}
else {
if (cnt < 4)
break;
sprintf(ptr, "\\%03o", ch);
ptr += 4;
cnt -= 4;
}
}
if (!*s)
*ptr = '\0';
else
strcpy(ptr, "...");
return buf;
}
/* The message and set numbers in the catgets function must be > 0. */
static char *gettext(n)
int n;
{
assert(n > 0 && n < SIZEOF(messages));
assert(messages[n].text != 0);
return catgets(catd, TEXT_SET, n, messages[n].text);
}
static char *getheader(n)
int n;
{
assert(n >= 0 && n < SIZEOF(headers));
return catgets(catd, HEADER_SET, n + 1, headers[n]);
}
static char *getparm(n)
int n;
{
assert(n >= 0 && n < SIZEOF(parms));
return catgets(catd, PARM_SET, n + 1, parms[n]);
}
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

871
usr.bin/sgmls/sgmls/sgmls.1 Normal file
View File

@ -0,0 +1,871 @@
'\" t
.\" Uncomment the next line to get a man page accurate for MS-DOS
.\"nr Os 1
.\" Uncomment the next line if tracing is enabled.
.\"nr Tr 1
.if \n(.g .if !r Os .nr Os 0
.tr \(ts"
.ds S \s-1SGML\s0
.de TS
.br
.sp .5
..
.de TE
.br
.sp .5
..
.de TQ
.br
.ns
.TP \\$1
..
.TH SGMLS 1
.SH NAME
sgmls \- a validating SGML parser
.sp
An \*S System Conforming to
.if n .br
International Standard ISO 8879 \(em
.br
Standard Generalized Markup Language
.SH SYNOPSIS
.B sgmls
[
.B \-deglprsuv
]
[
.BI \-c file
]
.if \n(Os=1 \{\
[
.BI \-f file
]
.\}
[
.BI \-i name
]
.if \n(Tr \{\
[
.BI \-x flags
]
[
.BI \-y flags
]
.\}
[
.I filename\|.\|.\|.
]
.SH DESCRIPTION
.I Sgmls
parses and validates
the \*S document entity in
.I filename\|.\|.\|.
and prints on the standard output a simple \s-1ASCII\s0 representation of its
Element Structure Information Set.
(This is the information set which a structure-controlled
conforming \*S application should act upon.)
Note that the document entity may be spread amongst several files;
for example, the SGML declaration, document type declaration and document
instance set could each be in a separate file.
If no filenames are specified, then
.I sgmls
will read the document entity from the standard input.
A filename of
.B \-
can also be used to refer to the standard input.
.LP
The following options are available:
.TP
.BI \-c file
Write a report of capacity usage to
.IR file .
The report is in the format of a RACT result.
RACT is the Reference Application for Capacity Testing defined in the
Proposed American National Standard
.I
Conformance Testing for Standard Generalized Markup Language (SGL) Systems
(X3.190-199X),
Draft July 1991.
.TP
.B \-d
Warn about duplicate entity declarations.
.TP
.B \-e
Describe open entities in error messages.
Error messages always include the position of the most recently
opened external entity.
.if \n(Os=1 \{\
.TP
.BI \-f file
Redirect errors to
.IR file .
.\}
.TP
.B \-g
Show the \s-1GI\s0s of open elements in error messages.
.TP
.BI \-i name
Pretend that
.RS
.IP
.BI <!ENTITY\ %\ name\ \(tsINCLUDE\(ts>
.LP
occurs at the start of the document type declaration subset
in the \*S document entity.
Since repeated definitions of an entity are ignored,
this definition will take precedence over any other definitions
of this entity in the document type declaration.
Multiple
.B \-i
options are allowed.
If the \*S declaration replaces the reserved name
.B INCLUDE
then the new reserved name will be the replacement text of the entity.
Typically the document type declaration will contain
.IP
.BI <!ENTITY\ %\ name\ \(tsIGNORE\(ts>
.LP
and will use
.BI % name ;
in the status keyword specification of a marked section declaration.
In this case the effect of the option will be to cause the marked
section not to be ignored.
.RE
.TP
.B \-l
Output
.B L
commands giving the current line number and filename.
.TP
.B \-p
Parse only the prolog.
.I Sgmls
will exit after parsing the document type declaration.
Implies
.BR \-s .
.TP
.B \-r
Warn about defaulted references.
.TP
.B \-s
Suppress output.
Error messages will still be printed.
.TP
.B \-u
Warn about undefined elements: elements used in the DTD but not defined.
Also warn about undefined short reference maps.
.TP
.B \-v
Print the version number.
.if \n(Tr \{\
.TP
.BI \-x flags
.br
.ns
.TP
.BI \-y flags
Enable debugging output;
.B \-x
applies to the document body,
.B \-y
to the prolog.
Each character in the
.I flags
argument enables tracing of a particular activity.
.RS
.TP
.B t
Trace state transitions.
.TP
.B a
Trace attribute activity.
.TP
.B c
Trace context checking.
.TP
.B d
Trace declaration parsing.
.TP
.B e
Trace entities.
.TP
.B g
Trace groups.
.TP
.B i
Trace \s-1ID\s0s.
.TP
.B m
Trace marked sections.
.TP
.B n
Trace notations.
.RE
.\}
.SS "Entity Manager"
An external entity resides in one or more files.
The entity manager component of
.I sgmls
maps a sequence of files into an entity in three sequential stages:
.IP 1.
each carriage return character is turned into a non-SGML character;
.IP 2.
each newline character is turned into a record end character,
and at the same time
a record start character is inserted at the beginning of each line;
.IP 3.
the files are concatenated.
.LP
A system identifier is
interpreted as a list of filenames separated by
.if \n(Os=0 colons.
.if \n(Os=1 semi-colons.
A filename of
.B \-
can be used to refer to the standard input.
If no system identifier is supplied, then the entity manager will
attempt to generate a filename using the public identifier
(if there is one) and other information available to it.
Notation identifiers are not subject to this treatment.
This process is controlled by the environment variable
.BR \s-1SGML_PATH\s0 ;
this contains a
.if \n(Os=0 colon-separated
.if \n(Os=1 semicolon-separated
list of filename templates.
A filename template is a filename that may contain
substitution fields; a substitution field is a
.B %
character followed by a single letter that indicates the value
of the substitution.
If
.B \s-1SGML_PATH\s0
uses the
.B %S
field (the value of which is the system identifier),
then the entity manager will also use
.B \s-1SGML_PATH\s0
to generate a filename
when a system identifier that does not contain any
.if \n(Os=0 colons
.if \n(Os=1 semi-colons
is supplied.
The value of a substitution can either be a string
or it can be
.IR null .
The entity manager transforms the list of
filename templates into a list of filenames by substituting for each
substitution field and discarding any template
that contained a substitution field whose value was null.
It then uses the first resulting filename that exists and is readable.
Substitution values are transformed before being used for substitution:
firstly, any names that were subject to upper case substitution
are folded to lower case;
secondly,
.if \n(Os=0 \{\
.\" Unix
space characters are mapped to underscores
and slashes are mapped to percents.
.\}
.if \n(Os=1 \{\
.\" MS-DOS
the characters
.B +,./:=?
and space characters are deleted.
.\}
The value of the
.B %S
field is not transformed.
The values of substitution fields are as follows:
.TP
.B %%
A single
.BR % .
.TP
.B %D
The entity's data content notation.
This substitution will succeed only for external data entities.
.TP
.B %N
The entity, notation or document type name.
.TP
.B %P
The public identifier if there was a public identifier,
otherwise null.
.TP
.B %S
The system identifier if there was a system identifier
otherwise null.
.TP
.B %X
(This is provided mainly for compatibility with \s-1ARCSGML\s0.)
A three-letter string chosen as follows:
.LP
.RS
.ne 11
.TS
tab(&);
c|c|c s
c|c|c s
c|c|c|c
c|c|c|c
l|lB|lB|lB.
&&With public identifier
&&_
&No public&Device&Device
&identifier&independent&dependent
_
Data or subdocument entity&nsd&pns&vns
General SGML text entity&gml&pge&vge
Parameter entity&spe&ppe&vpe
Document type definition&dtd&pdt&vdt
Link process definition&lpd&plp&vlp
.TE
.LP
The device dependent version is selected if the public text class
allows a public text display version but no public text display
version was specified.
.RE
.TP
.B %Y
The type of thing for which the filename is being generated:
.TS
tab(&);
l lB.
SGML subdocument entity&sgml
Data entity&data
General text entity&text
Parameter entity&parm
Document type definition&dtd
Link process definition&lpd
.TE
.LP
The value of the following substitution fields will be null
unless a valid formal public identifier was supplied.
.TP
.B %A
Null if the text identifier in the
formal public identifier contains an unavailable text indicator,
otherwise the empty string.
.TP
.B %C
The public text class, mapped to lower case.
.TP
.B %E
The public text designating sequence (escape sequence)
if the public text class is
.BR \s-1CHARSET\s0 ,
otherwise null.
.TP
.B %I
The empty string if the owner identifier in the formal public identifier
is an \s-1ISO\s0 owner identifier,
otherwise null.
.TP
.B %L
The public text language, mapped to lower case,
unless the public text class is
.BR \s-1CHARSET\s0 ,
in which case null.
.TP
.B %O
The owner identifier (with the
.B +//
or
.B \-//
prefix stripped.)
.TP
.B %R
The empty string if the owner identifier in the formal public identifier
is a registered owner identifier,
otherwise null.
.TP
.B %T
The public text description.
.TP
.B %U
The empty string if the owner identifier in the formal public identifier
is an unregistered owner identifier,
otherwise null.
.TP
.B %V
The public text display version.
This substitution will be null if the public text class
does not allow a display version or if no version was specified.
If an empty version was specified, a value of
.B default
will be used.
.br
.ne 18
.SS "System declaration"
The system declaration for
.I sgmls
is as follows:
.LP
.TS
tab(&);
c1 s1 s1 s1 s1 s1 s1 s1 s
c s s s s s s s s
l l s s s s s s s
l l s s s s s s s
l l s s s s s s s
l l l s s s s s s
c s s s s s s s s
l l l l l l l l l
l l l l l l l l l
l l l l l l l l l
l l s s s s s s s
l l l s s s s s s
l l l s s s s s s
c s s s s s s s s
l l l l l l l l l.
SYSTEM "ISO 8879:1986"
CHARSET
BASESET&"ISO 646-1983//CHARSET
&\h'\w'"'u'International Reference Version (IRV)//ESC 2/5 4/0"
DESCSET&0\0128\00
CAPACITY&PUBLIC&"ISO 8879:1986//CAPACITY Reference//EN"
FEATURES
MINIMIZE&DATATAG&NO&OMITTAG&YES&RANK&NO&SHORTTAG&YES
LINK&SIMPLE&NO&IMPLICIT&NO&EXPLICIT&NO
OTHER&CONCUR&NO&SUBDOC&YES 1&FORMAL&YES
SCOPE&DOCUMENT
SYNTAX&PUBLIC&"ISO 8879:1986//SYNTAX Reference//EN"
SYNTAX&PUBLIC&"ISO 8879:1986//SYNTAX Core//EN"
VALIDATE
&GENERAL&YES&MODEL&YES&EXCLUDE&YES&CAPACITY&YES
&NONSGML&YES&SGML&YES&FORMAL&YES
.T&
c s s s s s s s s
l l l l l l l l l.
SDIF
&PACK&NO&UNPACK&NO
.TE
.LP
The memory usage of
.I sgmls
is not a function of the capacity points used by a document;
however,
.I sgmls
can handle capacities significantly greater than the reference capacity set.
.LP
In some environments,
higher values may be supported for the \s-1SUBDOC\s0 parameter.
.LP
Documents that do not use optional features are also supported.
For example, if
.B FORMAL\ NO
is specified in the \*S declaration,
public identifiers will not be required to be valid formal public identifiers.
.LP
Certain parts of the concrete syntax may be changed:
.RS
.LP
The shunned character numbers can be changed.
.LP
Eight bit characters can be assigned to
\s-1LCNMSTRT\s0, \s-1UCNMSTRT\s0, \s-1LCNMCHAR\s0 and \s-1UCNMCHAR\s0.
Declaring this requires that the syntax reference character set be declared
like this:
.RS
.ne 3
.TS
tab(&);
l l.
BASESET&"ISO Registration Number 100//CHARSET
&\h'\w'"'u'ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
DESCSET&0\0256\00
.TE
.RE
.LP
Uppercase substitution can be performed or not performed
both for entity names and for other names.
.LP
Either short reference delimiters assigned by the reference delimiter set
or no short reference delimiters are supported.
.LP
The reserved names can be changed.
.LP
The quantity set can be increased within certain limits
subject to there being sufficient memory available.
The upper limit on \s-1\%NAMELEN\s0 is 239.
The upper limits on
\s-1\%ATTCNT\s0, \s-1\%ATTSPLEN\s0, \s-1\%BSEQLEN\s0, \s-1\%ENTLVL\s0,
\s-1\%LITLEN\s0, \s-1\%PILEN\s0, \s-1\%TAGLEN\s0, and \s-1\%TAGLVL\s0
are more than thirty times greater than the reference limits.
The upper limit on
\s-1\%GRPCNT\s0, \s-1\%GRPGTCNT\s0, and \s-1\%GRPLVL\s0 is 253.
\s-1\%NORMSEP\s0
cannot be changed.
\s-1\%DTAGLEN\s0
are
\s-1\%DTEMPLEN\s0
irrelevant since
.I sgmls
does not support the
\s-1\%DATATAG\s0
feature.
.RE
.SS "\*S declaration"
The \*S declaration may be omitted,
the following declaration will be implied:
.TS
tab(&);
c1 s1 s1 s1 s1 s1 s1 s1 s
c s s s s s s s s
l l s s s s s s s.
<!SGML "ISO 8879:1986"
CHARSET
BASESET&"ISO 646-1983//CHARSET
&\h'\w'"'u'International Reference Version (IRV)//ESC 2/5 4/0"
DESCSET&\0\00\0\09\0UNUSED
&\0\09\0\02\0\09
&\011\0\02\0UNUSED
&\013\0\01\013
&\014\018\0UNUSED
&\032\095\032
&127\0\01\0UNUSED
.T&
l l l s s s s s s
l l s s s s s s s
l l l s s s s s s
c s s s s s s s s
l l l l l l l l l.
CAPACITY&PUBLIC&"ISO 8879:1986//CAPACITY Reference//EN"
SCOPE&DOCUMENT
SYNTAX&PUBLIC&"ISO 8879:1986//SYNTAX Reference//EN"
FEATURES
MINIMIZE&DATATAG&NO&OMITTAG&YES&RANK&NO&SHORTTAG&YES
LINK&SIMPLE&NO&IMPLICIT&NO&EXPLICIT&NO
OTHER&CONCUR&NO&SUBDOC&YES 99999999&FORMAL&YES
.T&
c s s s s s s s s.
APPINFO NONE>
.TE
with the exception that characters 128 through 254 will be assigned to
\s-1DATACHAR\s0.
When exporting documents that use characters in this range,
an accurate description of the upper half of the document character set
should be added to this declaration.
For ISO Latin-1, an appropriate description would be:
.br
.ne 5
.TS
tab(&);
l l.
BASESET&"ISO Registration Number 100//CHARSET
&\h'\w'"'u'ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
DESCSET&128\032\0UNUSED
&160\095\032
&255\0\01\0UNUSED
.TE
.SS "Output format"
The output is a series of lines.
Lines can be arbitrarily long.
Each line consists of an initial command character
and one or more arguments.
Arguments are separated by a single space,
but when a command takes a fixed number of arguments
the last argument can contain spaces.
There is no space between the command character and the first argument.
Arguments can contain the following escape sequences.
.TP
.B \e\e
A
.BR \e.
.TP
.B \en
A record end character.
.TP
.B \e|
Internal \s-1SDATA\s0 entities are bracketed by these.
.TP
.BI \e nnn
The character whose code is
.I nnn
octal.
.LP
A record start character will be represented by
.BR \e012 .
Most applications will need to ignore
.B \e012
and translate
.B \en
into newline.
.LP
The possible command characters and arguments are as follows:
.TP
.BI ( gi
The start of an element whose generic identifier is
.IR gi .
Any attributes for this element
will have been specified with
.B A
commands.
.TP
.BI ) gi
The end an element whose generic identifier is
.IR gi .
.TP
.BI \- data
Data.
.TP
.BI & name
A reference to an external data entity
.IR name ;
.I name
will have been defined using an
.B E
command.
.TP
.BI ? pi
A processing instruction with data
.IR pi .
.TP
.BI A name\ val
The next element to start has an attribute
.I name
with value
.I val
which takes one of the following forms:
.RS
.TP
.B IMPLIED
The value of the attribute is implied.
.TP
.BI CDATA\ data
The attribute is character data.
This is used for attributes whose declared value is
.BR \s-1CDATA\s0 .
.TP
.BI NOTATION\ nname
The attribute is a notation name;
.I nname
will have been defined using a
.B N
command.
This is used for attributes whose declared value is
.BR \s-1NOTATION\s0 .
.TP
.BI ENTITY\ name\|.\|.\|.
The attribute is a list of general entity names.
Each entity name will have been defined using an
.BR I ,
.B E
or
.B S
command.
This is used for attributes whose declared value is
.B \s-1ENTITY\s0
or
.BR \s-1ENTITIES\s0 .
.TP
.BI TOKEN\ token\|.\|.\|.
The attribute is a list of tokens.
This is used for attributes whose declared value is anything else.
.RE
.TP
.BI D ename\ name\ val
This is the same as the
.B A
command, except that it specifies a data attribute for an
external entity named
.IR ename .
Any
.B D
commands will come after the
.B E
command that defines the entity to which they apply, but
before any
.B &
or
.B A
commands that reference the entity.
.TP
.BI N nname
.IR nname.
Define a notation
This command will be preceded by a
.B p
command if the notation was declared with a public identifier,
and by a
.B s
command if the notation was declared with a system identifier.
A notation will only be defined if it is to be referenced
in an
.B E
command or in an
.B A
command for an attribute with a declared value of
.BR \s-1NOTATION\s0 .
.TP
.BI E ename\ typ\ nname
Define an external data entity named
.I ename
with type
.I typ
.RB ( \s-1CDATA\s0 ,
.B \s-1NDATA\s0
or
.BR \s-1SDATA\s0 )
and notation
.IR not.
This command will be preceded by one or more
.B f
commands giving the filenames generated by the entity manager from the system
and public identifiers,
by a
.B p
command if a public identifier was declared for the entity,
and by a
.B s
command if a system identifier was declared for the entity.
.I not
will have been defined using a
.B N
command.
Data attributes may be specified for the entity using
.B D
commands.
An external data entity will only be defined if it is to be referenced in a
.B &
command or in an
.B A
command for an attribute whose declared value is
.B \s-1ENTITY\s0
or
.BR \s-1ENTITIES\s0 .
.TP
.BI I ename\ typ\ text
Define an internal data entity named
.I ename
with type
.I typ
.RB ( \s-1CDATA\s0
or
.BR \s-1SDATA\s0 )
and entity text
.IR text .
An internal data entity will only be defined if it is referenced in an
.B A
command for an attribute whose declared value is
.B \s-1ENTITY\s0
or
.BR \s-1ENTITIES\s0 .
.TP
.BI S ename
Define a subdocument entity named
.IR ename .
This command will be preceded by one or more
.B f
commands giving the filenames generated by the entity manager from the system
and public identifiers,
by a
.B p
command if a public identifier was declared for the entity,
and by a
.B s
command if a system identifier was declared for the entity.
A subdocument entity will only be defined if it is referenced
in a
.B {
command
or in an
.B A
command for an attribute whose declared value is
.B \s-1ENTITY\s0
or
.BR \s-1ENTITIES\s0 .
.TP
.BI s sysid
This command applies to the next
.BR E ,
.B S
or
.B N
command and specifies the associated system identifier.
.TP
.BI p pubid
This command applies to the next
.BR E ,
.B S
or
.B N
command and specifies the associated public identifier.
.TP
.BI f filename
This command applies to the next
.B E
or
.B S
command and specifies an associated filename.
There will be more than one
.B f
command for a single
.B E
or
.B S
command if the system identifier used a
.if \n(Os=0 colon.
.if \n(Os=1 semi-colon.
.TP
.BI { ename
The start of the \*S subdocument entity
.IR ename ;
.I ename
will have been defined using a
.B S
command.
.TP
.BI } ename
The end of the \*S subdocument entity
.IR ename .
.TP
.BI L lineno\ file
.TQ
.BI L lineno
Set the current line number and filename.
The
.I filename
argument will be omitted if only the line number has changed.
This will be output only if the
.B \-l
option has been given.
.TP
.BI # text
An \s-1APPINFO\s0 parameter of
.I text
was specified in the \*S declaration.
This is not strictly part of the ESIS, but a structure-controlled
application is permitted to act on it.
No
.B #
command will be output if
.B \s-1APPINFO\s0\ \s-1NONE\s0
was specified.
A
.B #
command will occur at most once,
and may be preceded only by a single
.B L
command.
.TP
.B C
This command indicates that the document was a conforming \*S document.
If this command is output, it will be the last command.
An \*S document is not conforming if it references a subdocument entity
that is not conforming.
.SH BUGS
Some non-SGML characters in literals are counted as two characters for the
purposes of quantity and capacity calculations.
.SH "SEE ALSO"
The \*S Handbook, Charles F. Goldfarb
.br
\s-1ISO\s0 8879 (Standard Generalized Markup Language),
International Organization for Standardization
.SH ORIGIN
\s-1ARCSGML\s0 was written by Charles F. Goldfarb.
.LP
.I Sgmls
was derived from \s-1ARCSGML\s0 by James Clark (jjc@jclark.com),
to whom bugs should be reported.

View File

@ -0,0 +1,223 @@
/* Standard Generalized Markup Language Users' Group (SGMLUG)
SGML Parser Materials (ARCSGML 1.0)
(C) 1983-1988 Charles F. Goldfarb (assigned to IBM Corporation)
(C) 1988-1991 IBM Corporation
Licensed to the SGML Users' Group for distribution under the terms of
the following license: */
char license[] =
"SGMLUG hereby grants to any user: (1) an irrevocable royalty-free,\n\
worldwide, non-exclusive license to use, execute, reproduce, display,\n\
perform and distribute copies of, and to prepare derivative works\n\
based upon these materials; and (2) the right to authorize others to\n\
do any of the foregoing.\n";
#include "sgmlincl.h"
/* SGMLXTRN: Storage allocation and initialization for all public variables.
Exceptions: Constants lex????? and del????? are defined in
LEX?????.C modules; constants pcb????? are defined in PCB?????.c.
*/
int badresw = 0; /* 1=REF_ out of context; 0=valid. */
int charmode = 0; /* >0=in #CHARS; 0=not. */
int conactsw = 0; /* 1=return saved content action 0=get new one.*/
int conrefsw = 0; /* 1=content reference att specified; 0=no. */
int contersv = 0; /* Save contersw while processing pending REF. */
int contersw = 0; /* 1=element or #CHARS out of context; 0=valid. */
int datarc = 0; /* Return code for data: DAF_ or REF_. */
int delmscsw = 0; /* 1=DELMSC must be read on return to es==0. */
int didreq = 0; /* 1=required implied tag processed; 0=no. */
int dostag = 0; /* 1=retry newetd instead of parsing; 0=parse. */
int dtdsw = 0; /* DOCTYPE declaration found: 1=yes; 0=no. */
int entdatsw = 0; /* 2=CDATA entity; 4=SDATA; 8=NDATA; 0=none. */
int entpisw = 0; /* 4=PI entity occurred; 0=not. */
int eodsw = 0; /* 1=eod found in error; 0=not yet. */
int eofsw = 0; /* 1=eof found in body of document; 0=not yet. */
int es = -1; /* Index of current source in stack. */
int etagimct = 0; /* Implicitly ended elements left on stack. */
int etagimsw = 0; /* 1=end-tag implied by other end-tag; 0=not. */
int etagmin = MINNONE; /* Minim: NONE NULL NET DATA; implied by S/ETAG*/
int etictr = 0; /* Number of "NET enabled" tags on stack. */
int etisw = 0; /* 1=tag ended with eti; 0=did not. */
int indtdsw = 0; /* Are we in the DTD? 1=yes; 0=no. */
int mslevel = 0; /* Nesting level of marked sections. */
int msplevel = 0; /* Nested MS levels subject to special parse. */
int prologsw = 1; /* 1=in prolog; 0=not. */
int pss = 0; /* SGMLACT: scbsgml stack level. */
int sgmlsw = 0; /* SGML declaration found: 1=yes; 0=no. */
int stagmin = MINNONE; /* Minimization: NONE, NULL tag, implied by STAG*/
int tagctr = 0; /* Tag source chars read. */
int ts = -1; /* Index of current tag in stack. */
struct parse *propcb = &pcbpro; /* Current PCB for prolog parse. */
int aentctr = 0; /* Number of ENTITY tokens in this att list. */
int conact = 0; /* Return code from content parse. */
int conrefsv = 0; /* Save conrefsw when doing implied start-tag.*/
int dtdrefsw = 0; /* External DTD? 1=yes; 0=no. */
int etiswsv = 0; /* Save etisw when processing implied start-tag.*/
int grplvl = 0; /* Current level of nested grps in model. */
int idrctr = 0; /* Number of IDREF tokens in this att list. */
int mdessv = 0; /* ES level at start of markup declaration. */
int notadn = 0; /* Position of NOTATION attribute in list. */
int parmno = 0; /* Current markup declaration parameter number. */
int pexsw = 0; /* 1=tag valid solely because of plus exception.*/
int rcessv = 0; /* ES level at start of RCDATA content. */
int tagdelsw = 0; /* 1=tag ended with delimiter; 0=no delimiter. */
int tokencnt = 0; /* Number of tokens found in attribute value. */
struct entity *ecbdeflt = 0; /* #DEFAULT ecb (NULL if no default entity). */
struct etd *docetd = 0; /* The etd for the document as a whole. */
struct etd *etagreal = 0; /* Actual or dummy etd that implied this tag. */
struct etd *newetd = 0; /* The etd for a start- or end-tag recognized. */
struct etd *nextetd = 0; /* ETD that must come next (only one choice). */
struct etd *stagreal = 0; /* Actual or dummy etd that implied this tag. */
struct parse *conpcb = 0; /* Current PCB for content parse. */
UNCH *data = 0; /* Pointer to returned data in buffer. */
UNCH *mdname = 0; /* Name of current markup declaration. */
UNCH *ptcon = 0; /* Current pointer into tbuf. */
UNCH *ptpro = 0; /* Current pointer into tbuf. */
UNCH *rbufs = 0; /* DOS file read area: start position for read. */
UNCH *subdcl = 0; /* Subject of markup declaration (e.g., GI). */
int Tstart = 0; /* Save starting token for AND group testing. */
UNS conradn = 0; /* 1=CONREF attribute in list (0=no). */
UNS datalen = 0; /* Length of returned data in buffer. */
UNS entlen = 0; /* Length of TAG or EXTERNAL entity text. */
UNS idadn = 0; /* Number of ID attribute (0 if none). */
UNS noteadn = 0; /* Number of NOTATION attribute (0 if none). */
UNS reqadn = 0; /* Num of atts with REQUIRED default (0=none). */
int grplongs; /* Number of longs for GRPCNT bitvector. */
/* Variable arrays and structures.
*/
struct ad *al = 0; /* Current attribute list work area. */
struct dcncb *dcntab[1]; /* List of data content notation names. */
struct entity *etab[ENTHASH]; /* Entity hash table. */
struct etd *etdtab[ETDHASH]; /* Element type definition hash table. */
struct fpi fpidf; /* Fpi for #DEFAULT entity. */
struct id *itab[IDHASH]; /* Unique identifier hash table. */
struct etd **nmgrp = 0; /* Element name group */
PDCB *nnmgrp = 0; /* Notation name group */
struct restate *scbsgml = 0; /* SGMLACT: return action state stack. */
struct source *scbs = 0; /* Stack of open sources ("SCB stack"). */
struct srh *srhtab[1]; /* List of SHORTREF table headers. */
struct sgmlstat ds; /* Document statistics. */
struct switches sw; /* Parser control switches set by text proc. */
struct tag *tags = 0; /* Stack of open elements ("tag stack"). */
struct thdr *gbuf = 0; /* Buffer for creating group. */
struct thdr prcon[3]; /* 0-2: Model for *DOC content. */
struct thdr undechdr; /* 0:Default model hdr for undeclared content.*/
UNCH *dtype = 0; /* Document type name. */
UNCH *entbuf = 0; /* Buffer for entity reference name. */
UNCH fce[2]; /* String form of FCE char.
(fce[1] must be EOS).*/
/* Buffer for non-SGML character reference.*/
UNCH nonchbuf[2] = { DELNONCH };
UNCH *tbuf; /* Work area for tokenization. */
UNCH *lbuf = 0; /* In tbuf: Literal parse area.*/
UNCH *sysibuf = 0; /* Buffer for system identifiers. */
UNCH *pubibuf = 0; /* Buffer for public identifiers. */
UNCH *nmbuf = 0; /* Name buffer used by mdentity. */
struct mpos *savedpos;
/* Constants.
*/
struct map dctab[] = { /* Keywords for declared content parameter.*/
{ key[KRCDATA], MRCDATA+MPHRASE },
{ key[KCDATA], MCDATA+MPHRASE },
{ key[KANY], MANY+MCHARS+MGI },
{ key[KEMPTY], MNONE+MPHRASE },
{ NULL, 0 }
};
struct map deftab[] = { /* Default value keywords. */
{ key[KIMPLIED], DNULL },
{ key[KREQUIRED], DREQ },
{ key[KCURRENT], DCURR },
{ key[KCONREF], DCONR },
{ key[KFIXED], DFIXED},
{ NULL, 0}
};
struct map dvtab[] = { /* Declared value: keywords and type codes.*/
/* TYPE NUMBER */
/* grp ANMTGRP Case 1 0 Grp size */
/* grp member ANMTGRP Case 0 Position */
/* grp ANOTEGRP Case 1 1 Grp size */
{ key[KNOTATION], ANOTEGRP}, /* Case 1 Position */
{ key[KCDATA], ACHARS }, /* Case 2 Always 0 */
{ key[KENTITY], AENTITY }, /* Case 3 Normal 1 */
{ key[KID], AID }, /* Case 4 Normal 1 */
{ key[KIDREF], AIDREF }, /* Case 5 Normal 1 */
{ key[KNAME], ANAME }, /* Case 6 Normal 1 */
{ key[KNMTOKEN], ANMTOKE }, /* Case 7 Normal 1 */
{ key[KNUMBER], ANUMBER }, /* Case 8 Normal 1 */
{ key[KNUTOKEN], ANUTOKE }, /* Case 9 Normal 1 */
{ key[KENTITIES], AENTITYS}, /* Case A Normal 1 */
{ key[KIDREFS], AIDREFS }, /* Case B # tokens */
{ key[KNAMES], ANAMES }, /* Case C # tokens */
{ key[KNMTOKENS], ANMTOKES}, /* Case D # tokens */
{ key[KNUMBERS], ANUMBERS}, /* Case E # tokens */
{ key[KNUTOKENS], ANUTOKES}, /* Case F # tokens */
{ NULL, 0 } /* Case 0 ERROR */
};
struct map enttab[] = { /* Entity declaration second parameter. */
{ key[KCDATA], ESC },
{ key[KSDATA], ESX },
{ key[KMS], ESMS},
{ key[KPI], ESI },
{ key[KSTARTTAG], ESS },
{ key[KENDTAG], ESE },
{ key[KMD], ESMD},
{ NULL, 0 }
};
struct map exttab[] = { /* Keywords for external identifier. */
{ key[KSYSTEM], EDSYSTEM },
{ key[KPUBLIC], EDPUBLIC },
{ NULL, 0 }
};
struct map extettab[] = { /* Keywords for external entity type. */
{ key[KCDATA], ESNCDATA },
{ key[KNDATA], ESNNDATA },
{ key[KSDATA], ESNSDATA },
{ key[KSUBDOC], ESNSUB },
{ NULL, 0 }
};
struct map funtab[] = { /* Function character reference names. */
{ key[KRE], RECHAR },
{ key[KRS], RSCHAR },
{ key[KSPACE], SPCCHAR },
/* We should use an extra table for added functions. */
{ (UNCH *)"TAB", TABCHAR },
{ NULL, 0 }
};
struct map mstab[] = { /* Marked section keywords. */
{ key[KTEMP], MSTEMP },
{ key[KINCLUDE], MSTEMP }, /* Treat INCLUDE like TEMP; both are NOPs.*/
{ key[KRCDATA], MSRCDATA},
{ key[KCDATA], MSCDATA },
{ key[KIGNORE], MSIGNORE},
{ NULL, 0 }
};
struct map pubcltab[] = { /* Names for public text class. */
{ (UNCH *)"CAPACITY", FPICAP },
{ (UNCH *)"CHARSET", FPICHARS},
{ (UNCH *)"DOCUMENT", FPIDOC },
{ (UNCH *)"DTD", FPIDTD },
{ (UNCH *)"ELEMENTS", FPIELEM },
{ (UNCH *)"ENTITIES", FPIENT },
{ (UNCH *)"LPD", FPILPD },
{ (UNCH *)"NONSGML", FPINON },
{ (UNCH *)"NOTATION", FPINOT },
{ (UNCH *)"SHORTREF", FPISHORT},
{ (UNCH *)"SUBDOC", FPISUB },
{ (UNCH *)"SYNTAX", FPISYN },
{ (UNCH *)"TEXT", FPITEXT },
{ NULL, 0 }
};
UNCH indefent[] = "\12#DEFAULT"; /* Internal name: default entity name. */
UNCH indefetd[] = "\12*DOCTYPE"; /* Internal name: default document type. */
UNCH indocent[] = "\12*SGMLDOC"; /* Internal name: SGML document entity. */
UNCH indocetd[] = "\6*DOC"; /* Internal name: document level etd. */
UNCH indtdent[] = "\11*DTDENT"; /* Internal name: external DTD entity. */
struct etd dumetd[3];
struct entity *dumpecb;
UNCH sgmlkey[] = "SGML";

View File

@ -0,0 +1,121 @@
/* SGMLXTRN.H: External declarations for SGML public variables.
Exceptions: Constants lex????? and del????? are defined in
LEX?????.C modules; constants pcb????? are defined in PCB?????.c.
*/
#ifndef SGMLXTRN /* Don't include this file more than once. */
#define SGMLXTRN
extern int badresw; /* 1=REF_ out of context; 0=valid. */
extern int charmode; /* >0=in #CHARS; 0=not. */
extern int conactsw; /* 1=return saved content action 0=get new one.*/
extern int conrefsw; /* 1=content reference att specified; 0=no. */
extern int contersv; /* Save contersw while processing pending REF. */
extern int contersw; /* 1=element or #CHARS out of context; 0=valid. */
extern int datarc; /* Return code for data: DAF_ or REF_. */
extern int delmscsw; /* 1=DELMSC must be read on return to es==0. */
extern int didreq; /* 1=required implied tag processed; 0=no. */
extern int dostag; /* 1=retry newetd instead of parsing; 0=parse. */
extern int dtdsw; /* DOCTYPE declaration found: 1=yes; 0=no. */
extern int entdatsw; /* 2=CDATA entity; 4=SDATA; 8=NDATA; 0=none. */
extern int entpisw; /* 4=PI entity occurred; 0=not. */
extern int eodsw; /* 1=eod found in error; 0=not yet. */
extern int eofsw; /* 1=eof found in body of document; 0=not yet. */
extern int etagimct; /* Implicitly ended elements left on stack. */
extern int etagimsw; /* 1=end-tag implied by other end-tag; 0=not. */
extern int etagmin; /* Minim: NONE NULL NET DATA; implied by S/ETAG*/
extern int etictr; /* Number of "NET enabled" tags on stack. */
extern int etisw; /* 1=tag ended with eti; 0=did not. */
extern int indtdsw; /* Are we in the DTD? 1=yes; 0=no. */
extern int mslevel; /* Nesting level of marked sections. */
extern int msplevel; /* Nested MS levels subject to special parse. */
extern int prologsw; /* 1=in prolog; 0=not. */
extern int pss; /* SGMLACT: scbsgml stack level. */
extern int sgmlsw; /* SGML declaration found: 1=yes; 0=no. */
extern int stagmin; /* Minimization: NONE, NULL tag, implied by STAG*/
extern int tagctr; /* Tag source chars read. */
extern int ts; /* Index of current tag in stack. */
extern struct parse *propcb; /* Current PCB for prolog parse. */
extern int aentctr; /* Number of ENTITY tokens in this att list. */
extern int conact; /* Return code from content parse. */
extern int conrefsv; /* Save conrefsw when doing implied start-tag.*/
extern int dtdrefsw; /* External DTD? 1=yes; 0=no. */
extern int etiswsv; /* Save etisw when processing implied start-tag.*/
extern int grplvl; /* Current level of nested grps in model. */
extern int idrctr; /* Number of IDREF tokens in this att list. */
extern int mdessv; /* ES level at start of markup declaration. */
extern int notadn; /* Position of NOTATION attribute in list. */
extern int parmno; /* Current markup declaration parameter number. */
extern int pexsw; /* 1=tag valid solely because of plus exception.*/
extern int rcessv; /* ES level at start of RCDATA content. */
extern int tagdelsw; /* 1=tag ended with delimiter; 0=no delimiter. */
extern int tokencnt; /* Number of tokens found in attribute value. */
extern struct entity *ecbdeflt; /* #DEFAULT ecb (NULL if no default entity). */
extern struct etd *docetd; /* The etd for the document as a whole. */
extern struct etd *etagreal; /* Actual or dummy etd that implied this tag. */
extern struct etd *newetd; /* The etd for a start- or end-tag recognized. */
extern struct etd *nextetd; /* ETD that must come next (only one choice). */
extern struct etd *stagreal; /* Actual or dummy etd that implied this tag. */
extern struct parse *conpcb; /* Current PCB for content parse. */
extern UNCH *data; /* Pointer to returned data in buffer. */
extern UNCH *mdname; /* Name of current markup declaration. */
extern UNCH *ptcon; /* Current pointer into tbuf. */
extern UNCH *ptpro; /* Current pointer into tbuf. */
extern UNCH *rbufs; /* DOS file read area: start position for read. */
extern UNCH *subdcl; /* Subject of markup declaration (e.g., GI). */
extern int Tstart; /* Save starting token for AND group testing. */
extern UNS conradn; /* 1=CONREF attribute in list (0=no). */
extern UNS datalen; /* Length of returned data in buffer. */
extern UNS entlen; /* Length of TAG or EXTERNAL entity text. */
extern UNS idadn; /* Number of ID attribute (0 if none). */
extern UNS noteadn; /* Number of NOTATION attribute (0 if none). */
extern UNS reqadn; /* Num of atts with REQUIRED default (0=none). */
extern int grplongs; /* Number of longs for GRPCNT bitvector. */
/* Variable arrays and structures.
*/
extern struct ad *al; /* Current attribute list work area. */
extern struct dcncb *dcntab[];/* List of data content notation names. */
extern struct entity *etab[]; /* Entity hash table. */
extern struct etd *etdtab[]; /* Element type definition hash table. */
extern struct fpi fpidf; /* Fpi for #DEFAULT entity. */
extern struct id *itab[]; /* Unique identifier hash table. */
extern struct etd **nmgrp; /* Element name group */
extern PDCB *nnmgrp; /* Notation name group */
extern struct restate *scbsgml; /* SGMLACT: return action state stack. */
extern struct srh *srhtab[]; /* List of SHORTREF table headers. */
extern struct sgmlstat ds; /* Document statistics. */
extern struct switches sw; /* Parser control switches set by text proc. */
extern struct tag *tags; /* Stack of open elements ("tag stack"). */
extern struct thdr *gbuf; /* Buffer for creating group. */
extern struct thdr prcon[]; /* 0-2: Model for *DOC content. */
extern struct thdr undechdr; /* 0: Default model hdr for undeclared content. */
extern UNCH *dtype; /* Document type name. */
extern UNCH *entbuf; /* Buffer for entity reference name. */
extern UNCH fce[]; /* String form of FCE char (fce[1] must be EOS).*/
extern UNCH nonchbuf[]; /* Buffer for valid nonchar character reference.*/
extern UNCH *tbuf; /* Work area for tokenization. */
extern UNCH *lbuf; /* In tbuf: Literal parse area; TAGLEN limit.*/
extern struct entity *dumpecb; /* SRMNULL points to this. */
extern UNCH *sysibuf;
extern UNCH *pubibuf;
extern UNCH *nmbuf; /* Name buffer used by mdentity. */
extern struct mpos *savedpos;
/* Constants.
*/
extern int scbsgmnr; /* SCBSGML: new record; do not ignore RE. */
extern int scbsgmst; /* SCBSGML: trailing stag or markup; ignore RE. */
extern struct map dctab[]; /* Keywords for declared content parameter. */
extern struct map deftab[]; /* Default value keywords. */
extern struct map dvtab[]; /* Declared value: keywords and type codes.*/
extern struct map enttab[]; /* Entity declaration second parameter. */
extern struct map exttab[]; /* Keywords for external identifier. */
extern struct map extettab[]; /* Keywords for external entity type. */
extern struct map funtab[]; /* Function character reference names. */
extern struct map mstab[]; /* Marked section keywords. */
extern struct map pubcltab[]; /* Keywords for public text class. */
extern UNCH indefent[]; /* Internal name: default entity name. */
extern UNCH indefetd[]; /* Internal name: default document type. */
extern UNCH indocent[]; /* Internal name: SGML document entity. */
extern UNCH indocetd[]; /* Internal name: etd for document as a whole. */
extern UNCH indtdent[]; /* Internal name: external DTD entity. */
extern char license[]; /* SGML Users' Group free license. */
#endif /* ndef SGMLXTRN */

View File

@ -0,0 +1,114 @@
/* SOURCE.H: Entity and source control block structures and definitions.
*/
#define ENTHASH 503 /* Size of entity hash table. Must be prime. */
/* Entity storage class values for estore member of entity structure. */
#define EST 1 /* String: Tag (usually a fixed STARTGI). */
/* <MDENTITY sets these:> */
#define ESMD 2 /* String: Markup declaration. */
#define ESMS 3 /* String: Marked section. */
#define ESM 4 /* String: ordinary text. */
/* <ENTOPEN treats these specially:> */
#define ESS 5 /* ETD: Start-tag. */
#define ESE 6 /* ETD: End-tag. */
#define ESI 7 /* String: PI. */
#define ESX 8 /* String: SDATA general entity. */
#define ESC 9 /* String: CDATA general entity. */
/* </MDENTITY> <MDEXTID sets these:> */
#define ESFM 10 /* LPU: minimum external (file) storage class. */
#define ESN 10 /* XCB: N/C/SDATA or SUBDOC control block. */
/* </ENTOPEN> */
#define ESF 11 /* LPU: General entity. */
#define ESP 12 /* LPU: Parameter entity. */
#define ESD 13 /* LPU: Document type definition. */
#define ESL 14 /* LPU: Link process definition. */
#define ESK 15 /* LPU: Data content notation. */
/* </MDEXTID> */
union etext { /* Entity text. */
UNIV x; /* External ID generated by system. */
UNCH *c; /* Character string. */
struct ne *n; /* N/C/SDATA or SUBDOC entity control block. */
};
#define ETEXTSZ sizeof(union etext)
struct entity { /* Entity control block. */
struct entity *enext; /* Next entity in chain. */
UNCH *ename; /* Entity name with length and EOS. */
UNCH estore; /* Storage class (see values above). */
UNCH dflt; /* Declared as default entity. */
UNCH mark; /* For use by for sgmlment. */
union etext etx; /* Entity text. */
};
#define ENTSZ sizeof(struct entity)
typedef struct entity *PECB; /* Ptr to entity control block. */
typedef struct entity **TECB; /* Table of entity control blocks. */
struct source { /* Source control block. */
struct entity ecb; /* Entity control block. */
unsigned long rcnt; /* Source record number. */
int ccnt; /* Source record chars since last RS. */
int curoff; /* Offset of curchar (chars read in this block).*/
UNCH curchar; /* Current character. */
UNCH nextchar; /* If curchar was DELNONCH, next character. */
UNIV fcb; /* SGMLIO fcb ptr returned by OPEN. */
UNCH *fbuf; /* 1st char in buffer (0=PEND) or entity text. */
UNCH *fpos; /* Current char in buffer or entity text. */
UNCH pushback; /* Character before pend position */
char copied; /* Is this a copy of the internal entity? */
};
#define SCBSZ sizeof(struct source)
typedef struct source *PSCB; /* Ptr to source control block. */
extern int es; /* Index of current source in stack. */
extern struct source *scbs; /* Stack of open sources ("SCB stack"). */
/* Member definitions for source and entity control blocks.
*/
#define SCB (scbs[es]) /* Ptr to current source control block. */
#define ECB SCB.ecb /* Pointer to current entity control block. */
#define FBUF SCB.fbuf /* Pointer to start of entity buffer. */
#define FPOS SCB.fpos /* Pointer to current char of current source. */
#define RSCC SCB.ccnt /* CCNT at start of block (across EOB/EOS/EOF). */
#define CCO SCB.curoff /* Offset in read buffer of current char. */
#define CC SCB.curchar /* Current character of current source entity. */
#define NEXTC SCB.nextchar /* Next character in current source entity. */
#define CCNT (SCB.ccnt+CCO) /* Position of CC in current record (RS=0). */
#define RCNT SCB.rcnt /* Position of record in entity (origin=1). */
#define SCBFCB SCB.fcb /* Current file control block (if FILESW). */
#define ECBPTR ((ECB.enext)) /* Pointer to this entity's ECB. */
#define ENTITY ((ECB.ename)) /* Current entity name. */
#define FILESW (ECB.estore>=ESFM) /* 1=Entity is external file; 0=internal. */
#define NEWCC (++FPOS) /* Get next current character. */
#define REPEATCC (--FPOS) /* Repeat previous current character. */
#define COPIEDSW SCB.copied /* Non-zero means entity was copied. */
struct srh { /* Short reference map header. */
struct srh *enext; /* Next short reference map in chain. */
UNCH *ename; /* Short reference map name. */
TECB srhsrm; /* Ptr to short reference map. */
};
#define SRHSZ (sizeof(struct srh))
typedef struct srh *PSRH; /* Ptr to short reference map header. */
#define SRMNULL (&dumpecb) /* Dummy ptr to empty short reference map. */
/* Definitions for ENTOPEN/ENTREF return codes.
*/
#define ENTUNDEF -1 /* Callers of ENTOPEN: entity undefined. */
#define ENTLOOP -2 /* ENTOPEN: endless loop entity. */
#define ENTMAX -3 /* ENTOPEN: too many open entities. */
#define ENTFILE -4 /* ENTOPEN: file I/O error. */
#define ENTDATA -5 /* ENTOPEN: CDATA or SDATA entity. */
#define ENTPI -6 /* ENTOPEN: PI entity. */
/* Definitions for ENTDATA switches set in contersw.
*/
#define CDECONT 2 /* 0010 CDATA entity referenced. */
#define SDECONT 4 /* 0100 SDATA entity referenced. */
#define NDECONT 8 /* 1000 NDATA entity referenced. */
/* Definitions for manipulating signed source character counters.
*/
#define CTRSET(CTR) (CTR = (int) -(FPOS+1-FBUF)) /* Init source char ctr. */
#define CTRGET(CTR) (CTR + (int) (FPOS+1-FBUF)) /* Read source char ctr. */

116
usr.bin/sgmls/sgmls/std.h Normal file
View File

@ -0,0 +1,116 @@
/* std.h -
Include standard header files.
*/
#ifndef STD_H
#define STD_H 1
#include <stdio.h>
#include <ctype.h>
#include <errno.h>
#include <assert.h>
#ifdef SUPPORT_SUBDOC
#include <signal.h>
#endif /* SUPPORT_SUBDOC */
#ifndef STDDEF_H_MISSING
#include <stddef.h>
#endif /* not STDDEF_H_MISSING */
#ifndef LIMITS_H_MISSING
#include <limits.h>
#endif /* not LIMITS_H_MISSING */
#ifndef UINT_MAX
#define UINT_MAX (sizeof(unsigned int) == 2 ? 0x7fff : \
(sizeof(unsigned int) == 4 ? 0x7fffffff : cant_guess_UINT_MAX))
#endif
#ifdef VARARGS
#include <varargs.h>
#else
#include <stdarg.h>
#endif
#ifdef BSD_STRINGS
#include <strings.h>
#define memcpy(to, from, n) bcopy(from, to, n)
#define memcmp(p, q, n) bcmp(p, q, n)
#define strchr(s, c) index(s, c)
#define strrchr(s, c) rindex(s, c)
#else /* not BSD_STRINGS */
#include <string.h>
#endif /* not BSD_STRINGS */
#ifdef STRERROR_MISSING
#ifdef USE_PROTOTYPES
extern char *strerror(int);
#else
extern char *strerror();
#endif
#endif /* STRERROR_MISSING */
#ifdef STDLIB_H_MISSING
UNIV malloc();
UNIV calloc();
UNIV realloc();
char *getenv();
long atol();
#else /* not STDLIB_H_MISSING */
#include <stdlib.h>
#endif /* not STDLIB_H_MISSING */
#ifdef REMOVE_MISSING
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif /* HAVE_UNISTD_H */
#define remove unlink
#endif /* REMOVE_MISSING */
#ifdef RAISE_MISSING
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif /* HAVE_UNISTD_H */
#define raise(sig) kill(getpid(), sig)
#endif /* RAISE_MISSING */
#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#endif
#ifndef EXIT_FAILURE
#define EXIT_FAILURE 1
#endif
#ifndef EXIT_SUCCESS
#define EXIT_SUCCESS 0
#endif
#ifndef SEEK_SET
#define SEEK_SET 0
#define SEEK_CUR 1
#define SEEK_END 2
#endif
#ifdef FPOS_MISSING
typedef long fpos_t;
#define fsetpos(stream, pos) fseek(stream, *(pos), SEEK_SET)
#define fgetpos(stream, pos) ((*(pos) = ftell(stream)) == -1L)
#endif /* FPOS_MISSING */
/* Old BSD systems lack L_tmpnam and tmpnam(). This is a partial
emulation using mktemp(). It requires that the argument to tmpnam()
be non-NULL. */
#ifndef L_tmpnam
#define tmpnam_template "/tmp/sgmlsXXXXXX"
#define L_tmpnam (sizeof(tmpnam_template))
#undef tmpnam
#define tmpnam(buf) \
(mktemp(strcpy(buf, tmpnam_template)) == 0 || (buf)[0] == '\0' ? 0 : (buf))
#endif /* not L_tmpnam */
#ifndef errno
extern int errno;
#endif
#endif /* not STD_H */

View File

@ -0,0 +1,2 @@
/* This tells Borland C++ to allocate a 14k stack. */
unsigned _stklen = 14*1024;

View File

@ -0,0 +1,36 @@
/* strerror.c -
ANSI C strerror() function.
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifdef STRERROR_MISSING
#include <stdio.h>
char *strerror(n)
int n;
{
extern int sys_nerr;
extern char *sys_errlist[];
static char buf[sizeof("Error ") + 1 + 3*sizeof(int)];
if (n >= 0 && n < sys_nerr && sys_errlist[n] != 0)
return sys_errlist[n];
else {
sprintf(buf, "Error %d", n);
return buf;
}
}
#endif /* STRERROR_MISSING */
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,72 @@
/* SYNRF: Reserved names and other constants for reference concrete syntax.
*/
#include "config.h"
#include "entity.h" /* Templates for entity control blocks. */
#include "synxtrn.h" /* Declarations for concrete syntax constants. */
#include "adl.h" /* Definitions for attribute list processing. */
UNCH key[NKEYS][REFNAMELEN+1] = {
"ANY",
"ATTLIST",
"CDATA",
"CONREF",
"CURRENT",
"DEFAULT",
"DOCTYPE",
"ELEMENT",
"EMPTY",
"ENDTAG",
"ENTITIES",
"ENTITY",
"FIXED",
"ID",
"IDLINK",
"IDREF",
"IDREFS",
"IGNORE",
"IMPLIED",
"INCLUDE",
"INITIAL",
"LINK",
"LINKTYPE",
"MD",
"MS",
"NAME",
"NAMES",
"NDATA",
"NMTOKEN",
"NMTOKENS",
"NOTATION",
"NUMBER",
"NUMBERS",
"NUTOKEN",
"NUTOKENS",
"O",
"PCDATA",
"PI",
"POSTLINK",
"PUBLIC",
"RCDATA",
"RE",
"REQUIRED",
"RESTORE",
"RS",
"SDATA",
"SHORTREF",
"SIMPLE",
"SPACE",
"STARTTAG",
"SUBDOC",
"SYSTEM",
"TEMP",
"USELINK",
"USEMAP"
};
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,152 @@
/* SYNXTRN.H: External declarations for concrete syntax constants.
*/
/* Short References
*/
#define SRCT 32 /* Number of short reference delimiters. */
#define SRMAXLEN 3 /* Maximum length of a SHORTREF delimiter. */
#define SRNPRT 8 /* Number of non-printable SHORTREF delimiters. */
struct srdel {
struct map dtb[SRCT+2]; /* LEXCNM: Short reference delimiters. */
char *pdtb[SRNPRT+1]; /* LEXCNM: Printable form of unprintable SRs. */
int fce; /* LEXCNM: Index of first FCE in srdeltab. */
int hyp2; /* LEXCNM: Index of "two hyphens" in srdeltab. */
int data; /* LEXCNM: Index of first SR with data char. */
int hyp; /* LEXCNM: Index of hyphen in srdeltab. */
int prtmin; /* LEXCNM: Index of 1st printable SR. */
int spc; /* LEXCNM: Index of space in srdeltab. */
int lbr; /* LEXCNM: Index of left bracket in srdeltab. */
int rbr; /* LEXCNM: Index of right bracket in srdeltab. */
};
struct delim {
UNCH genre; /* LEXCON: Generated RE; cannot be markup. */
UNCH lit; /* LEXMARK: Char used as LIT delimiter.*/
UNCH lita; /* LEXMARK: Char used as LITA delimiter.*/
UNCH mdc; /* LEXLMS: Char used as MDC delimiter.*/
UNCH msc; /* LEXCON: Char used as MSC delimiter. */
UNCH net; /* LEXCON: Char used as NET when enabled.*/
UNCH pero; /* LEXMARK: Char used as PERO delimiter. */
UNCH pic; /* LEXCON: Char used as PIC delimiter.*/
UNCH tago; /* LEXCON: Char used as TAGO when enabled.*/
};
struct lexcode {
UNCH fce; /* LEXCNM: FRE character as entity reference. */
UNCH fre; /* LEXCON: Free character not an entity ref. */
UNCH litc; /* LEXLMS: Literal close delimiter enabled. */
UNCH msc; /* LEXLMS: Marked section close delim enabled. */
UNCH net; /* LEXCON: Null end-tag delimiter enabled. */
UNCH nonet; /* LEXCON: NET disabled; still used as ETI. */
UNCH spcr; /* LEXCNM: Space in use as SHORTREF delimiter. */
UNCH tago; /* LEXCON: Tag open delimiter enabled. */
UNCH cde; /* LEXLMS: CDATA/SDATA delimiters. */
};
struct lexical {
struct markup m; /* Markup strings for text processor. */
struct srdel s; /* Short reference delimiters. */
struct delim d; /* General delimiter characters. */
struct lexcode l; /* Lexical table code assignments. */
};
extern struct lexical lex; /* Delimiter set constants. */
extern UNCH lexcnm[]; /* Lexical table: mixed content. */
extern UNCH lexcon[]; /* Lexical table for content (except mixed). */
extern UNCH lexgrp[]; /* Lexical table for groups. */
extern UNCH lexlms[]; /* Lexical table: literals and marked sections. */
extern UNCH lexmark[]; /* Lexical table for markup. */
extern UNCH lexsd[]; /* Lexical table for SGML declaration. */
extern UNCH lextran[]; /* Case translation table for SGML names. */
extern UNCH lextoke[]; /* Lexical table for tokenization. */
extern UNCH *lextabs[]; /* List of all lexical tables. */
extern struct parse pcbconc; /* PCB: character data. */
extern struct parse pcbcone; /* PCB: element content (no data allowed). */
extern struct parse pcbconm; /* PCB: mixed content (data allowed). */
extern struct parse pcbconr; /* PCB: replaceable character data. */
extern struct parse pcbetag; /* PCB: end-tags. */
extern struct parse pcbgrcm; /* PCB: content model group. */
extern struct parse pcbgrcs; /* PCB: content model suffix. */
extern struct parse pcbgrnm; /* PCB: name group. */
extern struct parse pcbgrnt; /* PCB: name token group. */
extern struct parse pcblitc; /* PCB: literal with CDATA. */
extern struct parse pcblitp; /* PCB: literal with CDATA, parm & char refs. */
extern struct parse pcblitr; /* PCB: attribute value with general refs. */
extern struct parse pcblitt; /* PCB: tokenized attribute value. */
extern struct parse pcblitv; /* PCB: literal with CDATA, function char trans.*/
extern struct parse pcbmd; /* PCB: markup declaration. */
extern struct parse pcbmdc; /* PCB: comment declaration. */
extern struct parse pcbmdi; /* PCB: markup declaration (ignored). */
extern struct parse pcbmds; /* PCB: markup declaration subset. */
extern struct parse pcbmsc; /* PCB: marked section in CDATA mode. */
extern struct parse pcbmsi; /* PCB: marked section in IGNORE mode. */
extern struct parse pcbmsrc; /* PCB: marked section in RCDATA mode. */
extern struct parse pcbpro; /* PCB: prolog. */
extern struct parse pcbref; /* PCB: reference. */
extern struct parse pcbstag; /* PCB: start-tag. */
extern struct parse pcbval; /* PCB: attribute value. */
extern struct parse pcbeal; /* PCB: end of attribute list. */
extern struct parse pcbsd; /* PCB: SGML declaration. */
extern int pcbcnda; /* PCBCONM: data in buffer. */
extern int pcbcnet; /* PCBCONM: markup found or data buffer flushed.*/
extern int pcbmdtk; /* PCBMD: token expected. */
extern int pcbstan; /* PCBSTAG: attribute name expected. */
extern int pcblittda; /* PCBLITT: data character found */
#define KANY 0
#define KATTLIST 1
#define KCDATA 2
#define KCONREF 3
#define KCURRENT 4
#define KDEFAULT 5
#define KDOCTYPE 6
#define KELEMENT 7
#define KEMPTY 8
#define KENDTAG 9
#define KENTITIES 10
#define KENTITY 11
#define KFIXED 12
#define KID 13
#define KIDLINK 14
#define KIDREF 15
#define KIDREFS 16
#define KIGNORE 17
#define KIMPLIED 18
#define KINCLUDE 19
#define KINITIAL 20
#define KLINK 21
#define KLINKTYPE 22
#define KMD 23
#define KMS 24
#define KNAME 25
#define KNAMES 26
#define KNDATA 27
#define KNMTOKEN 28
#define KNMTOKENS 29
#define KNOTATION 30
#define KNUMBER 31
#define KNUMBERS 32
#define KNUTOKEN 33
#define KNUTOKENS 34
#define KO 35
#define KPCDATA 36
#define KPI 37
#define KPOSTLINK 38
#define KPUBLIC 39
#define KRCDATA 40
#define KRE 41
#define KREQUIRED 42
#define KRESTORE 43
#define KRS 44
#define KSDATA 45
#define KSHORTREF 46
#define KSIMPLE 47
#define KSPACE 48
#define KSTARTTAG 49
#define KSUBDOC 50
#define KSYSTEM 51
#define KTEMP 52
#define KUSELINK 53
#define KUSEMAP 54
#define NKEYS (KUSEMAP+1)
extern UNCH key[NKEYS][REFNAMELEN+1];
/* Holds the SGML keyword (not alterable by concrete syntax). */
extern UNCH sgmlkey[];

View File

@ -0,0 +1,76 @@
/* TOOLS.H: Definitions for type declarations, printing, bit handling, etc.
*/
#if CHAR_SIGNED
typedef unsigned char UNCH;
#else
typedef char UNCH;
#endif
#if CHAR_SIGNED
#define ustrcmp(s1, s2) strcmp((char *)(s1), (char *)(s2))
#define ustrcpy(s1, s2) strcpy((char *)(s1), (char *)(s2))
#define ustrchr(s, c) (UNCH *)strchr((char *)(s), c)
#define ustrncmp(s1, s2, n) strncmp((char *)(s1), (char *)(s2), n)
#define ustrncpy(s1, s2, n) strncpy((char *)(s1), (char *)(s2), n)
#define ustrlen(s1) strlen((char *)(s1))
#else
#define ustrcmp strcmp
#define ustrcpy strcpy
#define ustrchr strchr
#define ustrncmp strncmp
#define ustrncpy strncpy
#define ustrlen strlen
#endif
#if 0
int ustrcmp(UNCH *, UNCH *);
UNCH *ustrchr(UNCH *, int);
int ustrncmp(UNCH *, UNCH *, UNS);
int ustrncpy(UNCH *, UNCH *, UNS);
int ustrlen(UNCH *);
#endif
typedef unsigned UNS;
#ifdef USE_ISASCII
#define ISASCII(c) isascii(c)
#else
#define ISASCII(c) (1)
#endif
#ifdef BSD_STRINGS
#define MEMZERO(s, n) bzero(s, n)
#else /* not BSD_STRINGS */
#define MEMZERO(s, n) memset(s, '\0', n)
#endif /* not BSD_STRINGS */
/* Macros for bit manipulation.
*/
#define SET(word, bits) ((word) |= (bits)) /* Turn bits on */
#define RESET(word, bits) ((word) &= ~(bits)) /* Turn bits off */
#define GET(word, bits) ((word) & (bits)) /* 1=any bit on */
#define BITOFF(word, bits) (GET(word, bits)==0) /* 1=no bits on */
#define BITON(word, bits) ((word) & (bits)) /* 1=any bit on */
#define ETDCDATA (dumetd) /* Dummy etd pointer for #PCDATA. */
#define ETDNULL (dumetd + 1) /* Dummy etd pointer for null tag. */
#define ETDNET (dumetd + 2) /* Dummy etd pointer for NET delimiter. */
#define BADPTR(p) \
((p) == NULL || (p) == ETDCDATA || (p) == ETDNULL || (p) == ETDNET)
#define PTRNUM(p) ((p) == NULL ? 0 : ((p) - dumetd) + 1)
#ifdef USE_PROTOTYPES
#define P(parms) parms
#else
#define P(parms) ()
#endif
/* VP is used for prototypes of varargs functions. You can't have a
prototype if the function is defined using varargs.h rather than
stdarg.h. */
#ifdef VARARGS
#define VP(parms) ()
#else
#define VP(parms) P(parms)
#endif

113
usr.bin/sgmls/sgmls/trace.h Normal file
View File

@ -0,0 +1,113 @@
/* TRACE.H: Declarations for internal trace functions. */
#ifdef TRACE
/* Trace variables.
*/
extern int trace; /* Switch: 1=trace state transitions; 0=don't. */
extern int atrace; /* Switch: 1=trace attribute activity; 0=don't. */
extern int ctrace; /* Switch: 1=trace context checking; 0=don't. */
extern int dtrace; /* Switch: 1=trace declaration parsing; 0=don't.*/
extern int etrace; /* Switch: 1=trace entity activity; 0=don't.*/
extern int gtrace; /* Switch: 1=trace group creations; 0=don't. */
extern int itrace; /* Switch: 1=trace ID activity; 0=don't. */
extern int mtrace; /* Switch: 1=trace MS activity; 0=don't. */
extern int ntrace; /* Switch: 1=trace data notation activity. */
extern char emd[]; /* For "EMD" parameter type in dtrace calls. */
VOID traceadl P((struct ad *));
VOID tracecon P((int,int,int,struct parse *,int,int));
VOID tracedcn P((struct dcncb *));
VOID tracedsk P((struct tag *,struct tag *,int,int));
VOID traceecb P((char *,struct entity *));
VOID traceend P((char *,struct thdr *,struct mpos *,int,int,int));
VOID traceesn P((struct ne *));
VOID traceetd P((struct etd *));
VOID traceetg P((struct tag *,struct etd *,int,int));
VOID tracegi P((char *,struct etd *,struct thdr *,struct mpos *,int));
VOID tracegml P((struct restate *,int,int,int));
VOID tracegrp P((struct etd **));
VOID traceid P((char *,struct id *));
VOID tracemd P((char *));
VOID tracemod P((struct thdr *));
VOID tracems P((int,int,int,int));
VOID tracengr P((struct dcncb **));
VOID tracepcb P((struct parse *));
VOID tracepro P((void));
VOID traceset P((void));
VOID tracesrm P((char *,struct entity **,UNCH *));
VOID tracestg P((struct etd *,int,int,struct etd *,int));
VOID tracestk P((struct tag *,int,int));
VOID tracetkn P((int,UNCH *));
VOID traceval P((struct parse *,unsigned int,UNCH *,int));
#define TRACEADL(al) ((void)(atrace && (traceadl(al), 1)))
#define TRACECON(etagimct, dostag, datarc, pcb, conrefsw, didreq) \
((void)(gtrace \
&& (tracecon(etagimct, dostag, datarc, pcb, conrefsw, didreq), 1)))
#define TRACEDCN(dcn) ((void)(ntrace && (tracedcn(dcn), 1)))
#define TRACEDSK(pts, ptso, ts3, etictr) \
((void)(gtrace && (tracedsk(pts, ptso, ts3, etictr), 1)))
#define TRACEECB(action, p) \
((void)(etrace && (traceecb(action, p), 1)))
#define TRACEEND(stagenm, mod, pos, rc, opt, Tstart) \
((void)(ctrace && (traceend(stagenm, mod, pos, rc, opt, Tstart), 1)))
#define TRACEESN(p) \
((void)((etrace || atrace || ntrace) && (traceesn(p), 1)))
#define TRACEETD(p) ((void)(gtrace && (traceetd(p), 1)))
#define TRACEETG(pts, curetd, tsl, etagimct) \
((void)(gtrace && (traceetg(pts, curetd, tsl, etagimct), 1)))
#define TRACEGI(stagenm, gi, mod, pos, Tstart) \
((void)(ctrace && (tracegi(stagenm, gi, mod, pos, Tstart), 1)))
#define TRACEGML(scb, pss, conactsw, conact) \
((void)(trace && (tracegml(scb, pss, conactsw, conact), 1)))
#define TRACEGRP(p) ((void)(gtrace && (tracegrp(p), 1)))
#define TRACEID(action, p) ((void)(itrace && (traceid(action, p), 1)))
#define TRACEMD(p) ((void)(dtrace && (tracemd(p), 1)))
#define TRACEMOD(p) ((void)(gtrace && (tracemod(p), 1)))
#define TRACEMS(action, code, mslevel, msplevel) \
((void)(mtrace && (tracems(action, code, mslevel, msplevel), 1)))
#define TRACENGR(p) ((void)(gtrace && (tracengr(p), 1)))
#define TRACEPCB(p) ((void)(trace && (tracepcb(p), 1)))
#define TRACEPRO() (tracepro())
#define TRACESET() (traceset())
#define TRACESRM(action, pg, gi) \
((void)(etrace && (tracesrm(action, pg, gi), 1)))
#define TRACESTG(curetd, dataret, rc, nextetd, mexts) \
((void)(gtrace && (tracestg(curetd, dataret, rc, nextetd, mexts), 1)))
#define TRACESTK(pts, ts2, etictr) \
((void)(gtrace && (tracestk(pts, ts2, etictr), 1)))
#define TRACETKN(scope, lextoke) \
((void)(trace && (tracetkn(scope, lextoke), 1)))
#define TRACEVAL(pcb, atype, aval, tokencnt) \
((void)(atrace && (traceval(pcb, atype, aval, tokencnt), 1)))
#else /* not TRACE */
#define TRACEADL(al) /* empty */
#define TRACECON(etagimct, dostag, datarc, pcb, conrefsw, didreq) /* empty */
#define TRACEDCN(dcn) /* empty */
#define TRACEDSK(pts, ptso, ts3, etictr) /* empty */
#define TRACEECB(action, p) /* empty */
#define TRACEEND(stagenm, mod, pos, rc, opt, Tstart) /* empty */
#define TRACEESN(p) /* empty */
#define TRACEETG(pts, curetd, tsl, etagimct) /* empty */
#define TRACEETD(p) /* empty */
#define TRACEGI(stagenm, gi, mod, pos, Tstart) /* empty */
#define TRACEGML(scb, pss, conactsw, conact) /* empty */
#define TRACEGRP(p) /* empty */
#define TRACEID(action, p) /* empty */
#define TRACEMD(p) /* empty */
#define TRACEMOD(p) /* empty */
#define TRACEMS(action, code, mslevel, msplevel) /* empty */
#define TRACENGR(p) /* empty */
#define TRACEPCB(p) /* empty */
#define TRACEPRO() /* empty */
#define TRACESET() /* empty */
#define TRACESRM(action, pg, gi) /* empty */
#define TRACESTG(curetd, dataret, rc, nextetd, mexts) /* empty */
#define TRACESTK(pts, ts2, etictr) /* empty */
#define TRACETKN(scope, lextoke) /* empty */
#define TRACEVAL(pcb, atype, aval, tokencnt) /* empty */
#endif /* not TRACE */

View File

@ -0,0 +1,465 @@
#include "sgmlincl.h" /* #INCLUDE statements for SGML parser. */
#ifdef TRACE
#include "context.h"
/* Token status: RCHIT RCMISS RCEND RCREQ RCNREQ */
#define STATUX tags[ts].status
/* Trace variables.
*/
int trace = 0; /* Switch: 1=trace state transitions; 0=don't. */
int atrace = 0; /* Switch: 1=trace attribute activity; 0=don't. */
int ctrace = 0; /* Switch: 1=trace context checking; 0=don't. */
int dtrace = 0; /* Switch: 1=trace declaration parsing; 0=don't.*/
int etrace = 0; /* Switch: 1=trace entity activity; 0=don't.*/
int gtrace = 0; /* Switch: 1=trace group creations; 0=don't. */
int itrace = 0; /* Switch: 1=trace ID activity; 0=don't. */
int mtrace = 0; /* Switch: 1=trace MS activity; 0=don't. */
int ntrace = 0; /* Switch: 1=trace notation activity; 0=don't. */
char emd[] = "EMD"; /* For "EMD" parameter type in dtrace calls. */
/* Return a printable representation of c.
*/
static
char *printable(c)
int c;
{
static char buf[5];
if (c >= 040 && c < 0177) {
buf[0] = c;
buf[1] = '\0';
}
else
sprintf(buf, "\\%03o", (UNCH)c);
return buf;
}
static
VOID dotrace(s)
char *s;
{
trace = (s && strchr(s, 't') != 0);
atrace = (s && strchr(s, 'a') != 0);
ctrace = (s && strchr(s, 'c') != 0);
dtrace = (s && strchr(s, 'd') != 0);
etrace = (s && strchr(s, 'e') != 0);
gtrace = (s && strchr(s, 'g') != 0);
itrace = (s && strchr(s, 'i') != 0);
mtrace = (s && strchr(s, 'm') != 0);
ntrace = (s && strchr(s, 'n') != 0);
}
/* TRACESET: Set switches for tracing body of document.
*/
VOID traceset()
{
dotrace(sw.trace);
if (trace||atrace||ctrace||dtrace||etrace||gtrace||itrace||mtrace||ntrace)
fprintf(stderr,
"TRACESET: state=%d;att=%d;con=%d;dcl=%d;ent=%d;grp=%d;id=%d;ms=%d;dcn=%d.\n",
trace, atrace, ctrace, dtrace, etrace, gtrace, itrace,
mtrace, ntrace);
}
/* TRACEPRO: Set switches for tracing prolog.
*/
VOID tracepro()
{
dotrace(sw.ptrace);
if (trace||atrace||dtrace||etrace||gtrace||mtrace||ntrace)
fprintf(stderr,
"TRACEPRO: state=%d; att=%d; dcl=%d; ent=%d; grp=%d; ms=%d; dcn=%d.\n",
trace, atrace, dtrace, etrace, gtrace, mtrace, ntrace);
}
/* TRACEPCB: Trace character just parsed and other pcb data.
*/
VOID tracepcb(pcb)
struct parse *pcb;
{
fprintf(stderr, "%-8s %2u-%2u-%2u-%2u from %s [%3d] in %s, %d:%d.\n",
pcb->pname, pcb->state, pcb->input, pcb->action,
pcb->newstate, printable(*FPOS), *FPOS, ENTITY+1, RCNT,
RSCC+FPOS+1-FBUF);
}
/* TRACETKN: Trace character just read during token parse.
*/
VOID tracetkn(scope, lextoke)
int scope;
UNCH lextoke[]; /* Lexical table for token and name parses. */
{
fprintf(stderr, "TOKEN %2d-%2d from %s [%3d] in %s, %d:%d.\n",
scope, lextoke[*FPOS],
printable(*FPOS), *FPOS, ENTITY+1, RCNT,
RSCC+FPOS+1-FBUF);
}
/* TRACEGML: Trace state of main SGML driver routine.
*/
VOID tracegml(scb, pss, conactsw, conact)
struct restate *scb;
int pss, conactsw, conact;
{
fprintf(stderr,
"SGML%02d %2d-%2d-%2d-%2d in main driver; conactsw=%d; conact=%d.\n",
pss, scb[pss].sstate, scb[pss].sinput, scb[pss].saction,
scb[pss].snext, conactsw, conact);
}
/* TRACEVAL: Trace parse of an attribute value that is a token list.
*/
VOID traceval(pcb, atype, aval, tokencnt)
struct parse *pcb;
UNS atype; /* Type of token list expected. */
UNCH *aval; /* Value string to be parsed as token list. */
int tokencnt; /* Number of tokens found in attribute value. */
{
fprintf(stderr,
"%-8s %2d-%2d-%2d-%2d at %p, atype=%02x, tokencnt=%d: ",
pcb->pname, pcb->state, pcb->input, pcb->action,
pcb->newstate, (UNIV)aval, atype, tokencnt);
fprintf(stderr, "%s\n", aval);
}
/* TRACESTK: Trace entry just placed on tag stack.
*/
VOID tracestk(pts, ts2, etictr)
struct tag *pts; /* Stack entry for this tag. */
int ts2; /* Stack depth. */
int etictr; /* Number of "netok" tags on stack. */
{
fprintf(stderr,
"STACK %s begun; stack depth %d; tflag=%02x; etictr=%d",
pts->tetd->etdgi+1, ts2, pts->tflags, etictr);
fprintf(stderr, " srm=%s.\n",
pts->tsrm!=SRMNULL ? (char *)(pts->tsrm[0]->ename+1) : "#EMPTY");
}
/* TRACEDSK: Trace entry just removed from tag stack.
*/
VOID tracedsk(pts, ptso, ts3, etictr)
struct tag *pts; /* Stack entry for new open tag. */
struct tag *ptso; /* Stack entry for tag just ended. */
int ts3; /* Stack depth. */
int etictr; /* Number of "netok" tags on stack. */
{
fprintf(stderr,
"DESTACK %s ended; otflag=%02x; %s resumed; depth=%d; tflag=%02x; etictr=%d",
ptso->tetd->etdgi+1, ptso->tflags,
pts->tetd->etdgi+1, ts3, pts->tflags, etictr);
fprintf(stderr, " srm=%s.\n",
pts->tsrm!=SRMNULL ? (char *)(pts->tsrm[0]->ename+1) : "#EMPTY");
}
/* TRACECON: Trace interactions between content parse and stag/context
processing.
*/
VOID tracecon(etagimct, dostag, datarc, pcb, conrefsw, didreq)
int etagimct; /* Implicitly ended elements left on stack. */
int dostag; /* 1=retry newetd instead of parsing; 0=parse. */
int datarc; /* Return code for data: DAF_ or REF_ or zero. */
struct parse *pcb; /* Parse control block for this parse. */
int conrefsw; /* 1=content reference att specified; 0=no. */
int didreq; /* 1=required implied empty tag processed; 0=no.*/
{
fprintf(stderr,
"CONTENT etagimct=%d dostag=%d datarc=%d pname=%s action=%d \
conrefsw=%d didreq=%d\n",
etagimct, dostag, datarc, pcb->pname, pcb->action,
conrefsw, didreq);
}
/* TRACESTG: Trace start-tag context validation input and results.
*/
VOID tracestg(curetd, dataret, rc, nextetd, mexts)
struct etd *curetd; /* The etd for this tag. */
int dataret; /* Data pending: DAF_ REF_ 0=not #PCDATA. */
int rc; /* Return code from context or other test. */
struct etd *nextetd; /* The etd for a forced start-tag (if rc==2). */
int mexts; /* >0=stack level of minus grp; -1=plus; 0=none.*/
{
fprintf(stderr,
"STARTTAG newetd=%p; dataret=%d; rc=%d; nextetd=%p; mexts=%d.\n",
(UNIV)curetd, dataret, rc, (UNIV)nextetd, mexts);
}
/* TRACEETG: Trace end-tag matching test on stack.
*/
VOID traceetg(pts, curetd, tsl, etagimct)
struct tag *pts; /* Stack entry for this tag. */
struct etd *curetd; /* The etd for this tag. */
int tsl; /* Temporary stack level for looping. */
int etagimct; /* Num of implicitly ended tags left on stack. */
{
fprintf(stderr,
"ENDTAG tsl=%d; newetd=%p; stacketd=%p; tflags=%02x; etagimct=%d.\n",
tsl, (UNIV)curetd, (UNIV)pts->tetd, pts->tflags, etagimct);
}
/* TRACEECB: Trace entity control block activity.
*/
VOID traceecb(action, p)
char *action;
struct entity *p;
{
static char estype1[] = " TMMMSEIXCNFPDLK";
static char estype2[] = " DS ";
if (!p)
return;
fprintf(stderr,
"%-8s (es=%d) type %c%c entity %s at %p containing ",
action, es, estype1[p->estore], estype2[p->estore], p->ename+1,
(UNIV)p);
if (p->estore==ESN && strcmp(action, "ENTDEF"))
traceesn(p->etx.n);
else if (p->etx.x==0)
fprintf(stderr, "[NOTHING]");
else
fprintf(stderr, "%s",
p->etx.c[0] ? (char *)p->etx.c : "[EMPTY]");
putc('\n', stderr);
}
/* TRACEDCN: Trace data content notation activity.
*/
VOID tracedcn(p)
struct dcncb *p;
{
fprintf(stderr,
"DCN dcn=%p; adl=%p; notation is %s\n",
(UNIV)p, (UNIV)p->adl, p->ename+1);
if (p->adl)
traceadl(p->adl);
}
/* TRACEESN: Print a data entity control block.
*/
VOID traceesn(p)
PNE p;
{
fprintf(stderr, "ESN Entity name is %s; entity type is %s.\n",
(NEENAME(p)!=0) ? ((char *)NEENAME(p))+1 : "[UNDEFINED]",
/* NEXTYPE(p)); */
(NEXTYPE(p)==1 ? "CDATA" : (NEXTYPE(p)==2 ? "NDATA" : "SDATA")));
fprintf(stderr, " System ID is %s\n",
(NEID(p)!=0) ? (char *)NEID(p) : "[UNDEFINED]");
if (p->nedcn!=0)
tracedcn(p->nedcn);
}
/* TRACESRM: Print the members of a short reference map.
*/
VOID tracesrm(action, pg, gi)
char *action;
TECB pg;
UNCH *gi;
{
int i = 0; /* Loop counter. */
if (pg==SRMNULL)
fprintf(stderr, "%-8s SHORTREF table empty for %s.\n", action, gi);
else {
fprintf(stderr, "%-8s %s at %p mapped for %s.\n",
action, pg[0]->ename+1, (UNIV)pg,
gi ? (char *)gi : "definition");
while (++i<=lex.s.dtb[0].mapdata)
if (pg[i])
fprintf(stderr, "%14s%02u %p %s\n",
"SR", i, (UNIV)pg[i], pg[i]->ename+1);
}
}
/* TRACEADL: Print an attribute definition list.
*/
VOID traceadl(al)
struct ad al[];
{
int i=0;
fprintf(stderr, "ADLIST %p %d membe%s; %d attribut%s\n",
(UNIV)al, ADN(al), ADN(al)==1 ? "r" : "rs", AN(al),
AN(al)==1 ? "e" : "es");
while (++i<=ADN(al)) {
fprintf(stderr,
(BITOFF(ADFLAGS(al,i), AGROUP) && ADTYPE(al,i)<=ANOTEGRP)
? " %p %-8s %02x %02x %2d %2d %p %p\n"
: " %p %-8s %02x %02x %2d %2d %p %p\n",
&al[i], ADNAME(al,i), ADFLAGS(al,i), ADTYPE(al,i), ADNUM(al,i),
ADLEN(al,i), ADVAL(al,i), ADDATA(al,i).x);
if (ADVAL(al,i)) {
fprintf(stderr, "%s", ADVAL(al,i));
if (ADTYPE(al,i)==AENTITY && ADDATA(al,i).n!=0) {
fprintf(stderr, "=>");
traceesn(ADDATA(al,i).n);
}
else if (ADTYPE(al,i)==ANOTEGRP)
fprintf(stderr, "=>%s",
(ADDATA(al,i).x->dcnid!=0)
? (char *)ADDATA(al,i).x->dcnid
: "[UNDEFINED]");
}
else
fprintf(stderr, "[%s]",
GET(ADFLAGS(al,i), AREQ)
? "REQUIRED"
: (GET(ADFLAGS(al,i), ACURRENT) ? "CURRENT" : "NULL"));
}
fprintf(stderr, "\n");
}
/* TRACEMOD: Print the members of a model.
*/
VOID tracemod(pg)
struct thdr pg[];
{
fprintf(stderr, "MODEL %p %02x %d\n",
(UNIV)&pg[0], pg[0].ttype, pg[0].tu.tnum);
if ((pg[0].ttype & MKEYWORD) == 0) {
int i;
for (i = 1; i < pg[0].tu.tnum + 2; i++) {
if (GET(pg[i].ttype, TTMASK) == TTETD)
fprintf(stderr, " %p %02x %s\n",
(UNIV)&pg[i], pg[i].ttype, pg[i].tu.thetd->etdgi+1);
else if (GET(pg[i].ttype, TTMASK) == TTCHARS)
fprintf(stderr, " %p %02x %s\n",
(UNIV)&pg[i], pg[i].ttype, "#PCDATA");
else
fprintf(stderr, " %p %02x %d\n",
(UNIV)&pg[i], pg[i].ttype, pg[i].tu.tnum);
}
}
fprintf(stderr, "\n");
}
/* TRACEGRP: Print the members of a name (i.e., etd) group.
*/
VOID tracegrp(pg)
struct etd *pg[];
{
int i = -1; /* Loop counter. */
fprintf(stderr, "ETDGRP %p\n", (UNIV)pg);
while (pg[++i]!=0)
fprintf(stderr, " %p %s\n", (UNIV)pg[i], pg[i]->etdgi+1);
}
/* TRACENGR: Print the members of a notation (i.e., dcncb) group.
*/
VOID tracengr(pg)
struct dcncb *pg[];
{
int i = -1; /* Loop counter. */
fprintf(stderr, "DCNGRP %p\n", (UNIV)pg);
while (pg[++i]!=0)
fprintf(stderr, " %p %s\n", (UNIV)pg[i], pg[i]->ename+1);
}
/* TRACEETD: Print an element type definition.
*/
VOID traceetd(p)
struct etd *p; /* Pointer to an etd. */
{
fprintf(stderr,
"ETD etd=%p %s min=%02x cmod=%p ttype=%02x mex=%p, pex=%p, ",
(UNIV)p, p->etdgi+1, p->etdmin, (UNIV)p->etdmod,
p->etdmod->ttype, (UNIV)p->etdmex, (UNIV)p->etdpex);
fprintf(stderr, "adl=%p, srm=%s.\n",
(UNIV)p->adl,
(p->etdsrm==SRMNULL)
? "#EMPTY"
: (p->etdsrm) ? (char *)(p->etdsrm[0]->ename+1) : "#CURRENT");
}
/* TRACEID: Print an ID control block.
*/
VOID traceid(action, p)
char *action;
struct id *p; /* Pointer to an ID. */
{
fprintf(stderr, "%-8s %s at %p is %s; ", action, p->idname+1, (UNIV)p,
p->iddefed ? "defined" : "undefined");
fprintf(stderr, "last ref=%p\n", (UNIV)p->idrl);
}
/* TRACEMD: Trace a markup declaration parameter.
*/
VOID tracemd(parmid)
char *parmid; /* Parameter identifier. */
{
fprintf(stderr, "MDPARM %-8s for %-8s, token %02d, type %02u, %s.\n",
mdname, subdcl ? (char *)subdcl : "[NONE]", parmno, pcbmd.action, parmid);
}
/* TRACEMS: Trace marked section activity.
*/
VOID tracems(action, code, mslevel, msplevel)
int action; /* 1=began new level; 0=resumed previous. */
int code;
int mslevel; /* Nesting level of marked sections. */
int msplevel; /* Nested MS levels subject to special parse. */
{
fprintf(stderr,
"MS%c %2d %s nesting level %d (msp %d).\n",
(action ? ' ' : 'E'), code, (action ? "began" : "resumed"),
mslevel, msplevel);
}
static
VOID tracehits(h)
unsigned long *h;
{
int i;
fprintf(stderr, " H=");
for (i = grplongs - 1; i >= 0; --i)
fprintf(stderr, "%0*lx", LONGBITS/4, h[i]);
}
/* TRACEGI: Trace GI testing stages in CONTEXT.C processing.
*/
VOID tracegi(stagenm, gi, mod, pos, Tstart)
char *stagenm;
struct etd *gi; /* ETD of new GI. */
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
int Tstart; /* Initial T for this group. */
{
int i = 0; /* Loop counter. */
fprintf(stderr, "%-10s %d:", stagenm, P);
while (++i<=P)
fprintf(stderr, " %d-%d", pos[i].g, pos[i].t);
fprintf(stderr, " (%u) gocc=%02x gtype=%02x gnum=%d",
M, GOCC, GTYPE, GNUM);
tracehits(H);
fprintf(stderr, " status=%d Tstart=%d\n", STATUX, Tstart);
fprintf(stderr,
"=>%-8s tocc=%02x ttype=%02x thetd=%p (%s) gietd=%p (%s)\n",
tags[ts].tetd->etdgi+1, TOCC, TTYPE, (UNIV)TOKEN.tu.thetd,
(TTYPE
? (TTYPE==TTETD ? (char *)(TOKEN.tu.thetd->etdgi+1) : "#GROUP")
: "#PCDATA"),
(UNIV)gi,
(gi==ETDCDATA ? "#PCDATA" : (char *)(gi->etdgi+1)));
}
/* TRACEEND: Trace testing for end of group in CONTEXT.C processing.
*/
VOID traceend(stagenm, mod, pos, rc, opt, Tstart)
char *stagenm;
struct thdr mod[]; /* Model of current open element. */
struct mpos pos[]; /* Position in open element's model. */
int rc; /* Return code: RCNREQ RCHIT RCMISS RCEND */
int opt; /* ALLHIT parm: 1=test optionals; 0=ignore. */
int Tstart; /* Initial T for this group. */
{
int i = 0; /* Loop counter. */
fprintf(stderr, "%-10s %d:", stagenm, P);
while (++i<=P)
fprintf(stderr, " %d-%d", pos[i].g, pos[i].t);
fprintf(stderr, " (%u) gocc=%02x gtype=%02x gnum=%d",
M, GOCC, GTYPE, GNUM);
tracehits(H);
fprintf(stderr, " status=%d Tstart=%d\n", STATUX, Tstart);
fprintf(stderr, "=>%-8s tocc=%02x ttype=%02x thetd=%p (%s)",
tags[ts].tetd->etdgi+1, TOCC, TTYPE, (UNIV)TOKEN.tu.thetd,
(TTYPE
? (TTYPE==TTETD ? (char *)(TOKEN.tu.thetd->etdgi+1) : "#GROUP")
: "#PCDATA"));
fprintf(stderr, " rc=%d offbitT=%d allhit=%d\n",
rc, offbit(H, (int)T, GNUM), allhit(&GHDR, H, 0, opt));
}
#endif /* TRACE */
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1,147 @@
/* unix.cfg: Configuration file for sgmls on Unix. */
/* A list of filename templates to use for searching for external entities.
The filenames are separated by the character specified in PATH_FILE_SEP.
See sgmls.man for details. */
#define DEFAULT_PATH "/usr/local/lib/sgml/%O/%C/%T:%N.%X:%N.%D"
/* The character that separates the filenames templates. */
#define PATH_FILE_SEP ':'
/* The character that separates filenames in a system identifier.
Usually the same as PATH_FILE_SEP. */
#define SYSID_FILE_SEP ':'
/* The environment variable that contains the list of filename templates. */
#define PATH_ENV_VAR "SGML_PATH"
/* MIN_DAT_SUBS_FROM and MIN_DATS_SUBS_TO tell sgmls how to transform a name
or system identifier into a legal filename. A character in
MIN_DAT_SUBS_FROM will be transformed into the character in the
corresponding position in MIN_DAT_SUBS_TO. If there is no such
position, then the character is removed. */
/* This says that spaces should be transformed to underscores, and
slashes to percents. */
#define MIN_DAT_SUBS_FROM " /"
#define MIN_DAT_SUBS_TO "_%"
/* Define this to allow tracing. */
/* #define TRACE 1 */
/* Define this you want support for subdocuments. This is implemented
using features that are not part of Standard C, so you might not want
to define it if you are porting to a new system. Otherwise I suggest
you leave it defined. */
#define SUPPORT_SUBDOC 1
/* Define HAVE_EXTENDED_PRINTF if your *printf functions supports
X/Open extensions; if they do, then, for example,
printf("%2$s%1$s", "bar", "foo")
should print `foobar'. */
/* #define HAVE_EXTENDED_PRINTF 1 */
/* Define HAVE_CAT if your system provides the X/Open message
catalogue functions catopen() and catgets(), and you want to use them.
An implementations of these functions is included and will be used if
you don't define this. On SunOS 4.1.1, if you do define this you
should set CC=/usr/xpg2bin/cc in the makefile. */
/* #define HAVE_CAT 1 */
#ifdef __STDC__
/* Define this if your compiler supports prototypes. */
#define USE_PROTOTYPES 1
#endif
/* Can't use <stdarg.h> without prototypes. */
#ifndef USE_PROTOTYPES
#define VARARGS 1
#endif
/* If your compiler defines __STDC__ but doesn't provide <stdarg.h>,
you must define VARARGS yourself here. */
/* #define VARARGS 1 */
/* Define this if you do not have strerror(). */
#define STRERROR_MISSING 1
/* Define this unless the character testing functions in ctype.h
are defined for all values representable as an unsigned char. You do
not need to define this if your system is ANSI C conformant. You
should define for old Unix systems. */
/* #define USE_ISASCII 1 */
/* Define this if your system provides the BSD style string operations
rather than ANSI C ones (eg bcopy() rather than memcpy(), and index()
rather than strchr()). */
/* #define BSD_STRINGS 1 */
/* Define this if you have getopt(). */
#define HAVE_GETOPT 1
/* Define this if you have access(). */
#define HAVE_ACCESS 1
/* Define this if you have <unistd.h>. */
#define HAVE_UNISTD_H 1
/* Define this if you have <sys/stat.h>. */
#define HAVE_SYS_STAT_H 1
/* Define this if you have waitpid(). */
#define HAVE_WAITPID 1
/* Define this if your system is POSIX.1 (ISO 9945-1:1990) compliant. */
#define POSIX 1
/* Define this if you have the vfork() system call. */
#define HAVE_VFORK 1
/* Define this if you have <vfork.h>. */
#define HAVE_VFORK_H 1
/* Define this if you don't have <stdlib.h> */
/* #define STDLIB_H_MISSING 1 */
/* Define this if you don't have <stddef.h> */
/* #define STDDEF_H_MISSING 1 */
/* Define this if you don't have <limits.h> */
/* #define LIMITS_H_MISSING 1 */
/* Define this if you don't have remove(); unlink() will be used instead. */
#define REMOVE_MISSING 1
/* Define this if you don't have raise(); kill() will be used instead. */
#define RAISE_MISSING 1
/* Define this if you don't have fsetpos() and fgetpos(). */
#define FPOS_MISSING 1
/* Universal pointer type. */
/* If your compiler doesn't fully support void *, change `void' to `char'. */
typedef void *UNIV;
/* If your compiler doesn't support void as a function return type,
change `void' to `int'. */
typedef void VOID;
/* If you don't have an ANSI C conformant <limits.h>, define
CHAR_SIGNED as 1 or 0 according to whether the `char' type is signed.
The <limits.h> on some versions of System Release V 3.2 is not ANSI C
conformant: the value of CHAR_MIN is 0 even though the `char' type is
signed. */
/* #define CHAR_SIGNED 1 */
/* #define CHAR_SIGNED 0 */
#ifndef CHAR_SIGNED
#include <limits.h>
#if CHAR_MIN < 0
#define CHAR_SIGNED 1
#else
#define CHAR_SIGNED 0
#endif
#endif /* not CHAR_SIGNED */
/* Assume the system character set is ISO Latin-1. */
#include "latin1.h"

View File

@ -0,0 +1,98 @@
/* unixproc.c -
Unix implementation of run_process().
Written by James Clark (jjc@jclark.com).
*/
#include "config.h"
#ifdef SUPPORT_SUBDOC
#ifdef POSIX
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#endif /* POSIX */
#include "std.h"
#include "entity.h"
#include "appl.h"
#ifndef POSIX
#define WIFSTOPPED(s) (((s) & 0377) == 0177)
#define WIFSIGNALED(s) (((s) & 0377) != 0 && ((s) & 0377 != 0177))
#define WIFEXITED(s) (((s) & 0377) == 0)
#define WEXITSTATUS(s) (((s) >> 8) & 0377)
#define WTERMSIG(s) ((s) & 0177)
#define WSTOPSIG(s) (((s) >> 8) & 0377)
#define _SC_OPEN_MAX 0
#define sysconf(name) (20)
typedef int pid_t;
#endif /* not POSIX */
#ifndef HAVE_VFORK
#define vfork() fork()
#endif /* not HAVE_VFORK */
#ifdef HAVE_VFORK_H
#include <vfork.h>
#endif /* HAVE_VFORK_H */
int run_process(argv)
char **argv;
{
pid_t pid;
int status;
int ret;
/* Can't trust Unix implementations to support fflush(NULL). */
fflush(stderr);
fflush(stdout);
pid = vfork();
if (pid == 0) {
/* child */
int i;
int open_max = (int)sysconf(_SC_OPEN_MAX);
for (i = 3; i < open_max; i++)
(void)close(i);
execvp(argv[0], argv);
appl_error(E_EXEC, argv[0], strerror(errno));
fflush(stderr);
_exit(127);
}
if (pid < 0) {
appl_error(E_FORK, strerror(errno));
return -1;
}
/* parent */
while ((ret = wait(&status)) != pid)
if (ret < 0) {
appl_error(E_WAIT, strerror(errno));
return -1;
}
if (WIFSIGNALED(status)) {
appl_error(E_SIGNAL, argv[0], WTERMSIG(status));
return -1;
}
/* Must have exited normally. */
return WEXITSTATUS(status);
}
#endif /* SUPPORT_SUBDOC */
/*
Local Variables:
c-indent-level: 5
c-continued-statement-offset: 5
c-brace-offset: -5
c-argdecl-indent: 0
c-label-offset: -5
End:
*/

View File

@ -0,0 +1 @@
char *version_string = "1.1";

View File

@ -0,0 +1,568 @@
/* xfprintf.c -
X/Open extended v?fprintf implemented in terms of v?fprintf.
Written by James Clark (jjc@jclark.com).
*/
/* Compile with:
-DVARARGS to use varargs.h instead of stdarg.h
-DLONG_DOUBLE_MISSING if your compiler doesn't like `long double'
-DFP_SUPPORT to include floating point stuff
*/
#include "config.h"
#ifndef HAVE_EXTENDED_PRINTF
#include "std.h"
#ifdef lint
/* avoid stupid lint warnings */
#undef va_arg
#define va_arg(ap, type) (ap, (type)0)
#endif
#ifdef FP_SUPPORT
#ifdef LONG_DOUBLE_MISSING
typedef double long_double;
#else
typedef long double long_double;
#endif
#endif /* FP_SUPPORT */
#ifndef __STDC__
#define const /* as nothing */
#endif
#ifdef USE_PROTOTYPES
#define P(parms) parms
#else
#define P(parms) ()
#endif
#ifdef VARARGS
typedef int (*printer)();
#else
typedef int (*printer)(UNIV, const char *, ...);
#endif
enum arg_type {
NONE,
INT,
UNSIGNED,
LONG,
UNSIGNED_LONG,
#ifdef FP_SUPPORT
DOUBLE,
LONG_DOUBLE,
#endif /* FP_SUPPORT */
PCHAR,
PINT,
PLONG,
PSHORT
};
union arg {
int i;
unsigned u;
long l;
unsigned long ul;
#ifdef FP_SUPPORT
double d;
long_double ld;
#endif /* FP_SUPPORT */
char *pc;
UNIV pv;
int *pi;
short *ps;
long *pl;
};
#define NEXT 0
#define MISSING 10
struct spec {
enum arg_type type;
char pos;
char field_width;
char precision;
};
#define FLAG_CHARS "-+ #0"
static int parse_spec P((const char **, struct spec *));
static int find_arg_types P((const char *, enum arg_type *));
static void get_arg P((enum arg_type, va_list *, union arg *));
static int do_arg P((UNIV, printer, const char *, enum arg_type, union arg *));
static int xdoprt P((UNIV, printer, const char *, va_list));
static int printit P((UNIV, printer, const char *, va_list, int, union arg *));
static int maybe_positional P((const char *));
/* Return 1 if sucessful, 0 otherwise. **pp points to character after % */
static int parse_spec(pp, sp)
const char **pp;
struct spec *sp;
{
char modifier = 0;
sp->pos = NEXT;
if (isdigit((unsigned char)(**pp)) && (*pp)[1] == '$') {
if (**pp == '0')
return 0;
sp->pos = **pp - '0';
*pp += 2;
}
while (**pp != '\0' && strchr(FLAG_CHARS, **pp))
*pp += 1;
/* handle the field width */
sp->field_width = MISSING;
if (**pp == '*') {
*pp += 1;
if (isdigit((unsigned char)**pp) && (*pp)[1] == '$') {
if (**pp == '0')
return 0;
sp->field_width = **pp - '0';
*pp += 2;
}
else
sp->field_width = NEXT;
}
else {
while (isdigit((unsigned char)**pp))
*pp += 1;
}
/* handle the precision */
sp->precision = MISSING;
if (**pp == '.') {
*pp += 1;
if (**pp == '*') {
*pp += 1;
if (isdigit((unsigned char)**pp) && (*pp)[1] == '$') {
if (**pp == '0')
return 0;
sp->precision = **pp - '0';
*pp += 2;
}
else
sp->precision = NEXT;
}
else {
while (isdigit((unsigned char)**pp))
*pp += 1;
}
}
/* handle h l or L */
if (**pp == 'h' || **pp == 'l' || **pp == 'L') {
modifier = **pp;
*pp += 1;
}
switch (**pp) {
case 'd':
case 'i':
sp->type = modifier == 'l' ? LONG : INT;
break;
case 'o':
case 'u':
case 'x':
case 'X':
sp->type = modifier == 'l' ? UNSIGNED_LONG : UNSIGNED;
break;
#ifdef FP_SUPPORT
case 'e':
case 'E':
case 'f':
case 'g':
case 'G':
sp->type = modifier == 'L' ? LONG_DOUBLE : DOUBLE;
break;
#endif /* FP_SUPPORT */
case 'c':
sp->type = INT;
break;
case 's':
sp->type = PCHAR;
break;
case 'p':
/* a pointer to void has the same representation as a pointer to char */
sp->type = PCHAR;
break;
case 'n':
if (modifier == 'h')
sp->type = PSHORT;
else if (modifier == 'l')
sp->type = PLONG;
else
sp->type = PINT;
break;
case '%':
sp->type = NONE;
break;
default:
return 0;
}
*pp += 1;
return 1;
}
static int find_arg_types(format, arg_type)
const char *format;
enum arg_type *arg_type;
{
int i, pos;
const char *p;
struct spec spec;
for (i = 0; i < 9; i++)
arg_type[i] = NONE;
pos = 0;
p = format;
while (*p)
if (*p == '%') {
p++;
if (!parse_spec(&p, &spec))
return 0;
if (spec.type != NONE) {
int n;
if (spec.pos == NEXT)
n = pos++;
else
n = spec.pos - 1;
if (n < 9) {
enum arg_type t = arg_type[n];
if (t != NONE && t != spec.type)
return 0;
arg_type[n] = spec.type;
}
}
if (spec.field_width != MISSING) {
int n;
if (spec.field_width == NEXT)
n = pos++;
else
n = spec.field_width - 1;
if (n < 9) {
enum arg_type t = arg_type[n];
if (t != NONE && t != INT)
return 0;
arg_type[n] = INT;
}
}
if (spec.precision != MISSING) {
int n;
if (spec.precision == NEXT)
n = pos++;
else
n = spec.precision - 1;
if (n < 9) {
enum arg_type t = arg_type[n];
if (t != NONE && t != INT)
return 0;
arg_type[n] = INT;
}
}
}
else
p++;
return 1;
}
static void get_arg(arg_type, app, argp)
enum arg_type arg_type;
va_list *app;
union arg *argp;
{
switch (arg_type) {
case NONE:
break;
case INT:
argp->i = va_arg(*app, int);
break;
case UNSIGNED:
argp->u = va_arg(*app, unsigned);
break;
case LONG:
argp->l = va_arg(*app, long);
break;
case UNSIGNED_LONG:
argp->ul = va_arg(*app, unsigned long);
break;
#ifdef FP_SUPPORT
case DOUBLE:
argp->d = va_arg(*app, double);
break;
case LONG_DOUBLE:
argp->ld = va_arg(*app, long_double);
break;
#endif /* FP_SUPPORT */
case PCHAR:
argp->pc = va_arg(*app, char *);
break;
case PINT:
argp->pi = va_arg(*app, int *);
break;
case PSHORT:
argp->ps = va_arg(*app, short *);
break;
case PLONG:
argp->pl = va_arg(*app, long *);
break;
default:
abort();
}
}
static int do_arg(handle, func, buf, arg_type, argp)
UNIV handle;
printer func;
const char *buf;
enum arg_type arg_type;
union arg *argp;
{
switch (arg_type) {
case NONE:
return (*func)(handle, buf);
case INT:
return (*func)(handle, buf, argp->i);
case UNSIGNED:
return (*func)(handle, buf, argp->u);
case LONG:
return (*func)(handle, buf, argp->l);
case UNSIGNED_LONG:
return (*func)(handle, buf, argp->ul);
#ifdef FP_SUPPORT
case DOUBLE:
return (*func)(handle, buf, argp->d);
case LONG_DOUBLE:
return (*func)(handle, buf, argp->ld);
#endif /* FP_SUPPORT */
case PCHAR:
return (*func)(handle, buf, argp->pc);
case PINT:
return (*func)(handle, buf, argp->pi);
case PSHORT:
return (*func)(handle, buf, argp->ps);
case PLONG:
return (*func)(handle, buf, argp->pl);
default:
abort();
}
/* NOTREACHED */
}
static int printit(handle, func, p, ap, nargs, arg)
UNIV handle;
printer func;
const char *p;
va_list ap;
int nargs;
union arg *arg;
{
char buf[512]; /* enough for a spec */
int count = 0;
int pos = 0;
while (*p)
if (*p == '%') {
char *q;
struct spec spec;
const char *start;
int had_field_width;
union arg *argp;
union arg a;
int res;
start = ++p;
if (!parse_spec(&p, &spec))
abort(); /* should have caught it in find_arg_types */
buf[0] = '%';
q = buf + 1;
if (spec.pos != NEXT)
start += 2;
/* substitute in precision and field width if necessary */
had_field_width = 0;
while (start < p) {
if (*start == '*') {
char c;
int n, val;
start++;
if (!had_field_width && spec.field_width != MISSING) {
c = spec.field_width;
had_field_width = 1;
}
else
c = spec.precision;
if (c == NEXT)
n = pos++;
else {
start += 2;
n = c - 1;
}
if (n >= nargs)
val = va_arg(ap, int);
else
val = arg[n].i;
/* ignore negative precision */
if (val >= 0 || q[-1] != '.') {
(void)sprintf(q, "%d", val);
q = strchr(q, '\0');
}
}
else
*q++ = *start++;
}
*q++ = '\0';
argp = 0;
if (spec.type != NONE) {
int n = spec.pos == NEXT ? pos++ : spec.pos - 1;
if (n >= nargs) {
get_arg(spec.type, &ap, &a);
argp = &a;
}
else
argp = arg + n;
}
res = do_arg(handle, func, buf, spec.type, argp);
if (res < 0)
return -1;
count += res;
}
else {
if ((*func)(handle, "%c", *p++) < 0)
return -1;
count++;
}
return count;
}
/* Do a quick check to see if it may contains any positional thingies. */
static int maybe_positional(format)
const char *format;
{
const char *p;
p = format;
for (;;) {
p = strchr(p, '$');
if (!p)
return 0;
if (p - format >= 2
&& isdigit((unsigned char)p[-1])
&& (p[-2] == '%' || p[-2] == '*'))
break; /* might be a positional thingy */
}
return 1;
}
static int xdoprt(handle, func, format, ap)
UNIV handle;
printer func;
const char *format;
va_list ap;
{
enum arg_type arg_type[9];
union arg arg[9];
int nargs, i;
if (!find_arg_types(format, arg_type))
return -1;
for (nargs = 0; nargs < 9; nargs++)
if (arg_type[nargs] == NONE)
break;
for (i = nargs; i < 9; i++)
if (arg_type[i] != NONE)
return -1;
for (i = 0; i < nargs; i++)
get_arg(arg_type[i], &ap, arg + i);
return printit(handle, func, format, ap, nargs, arg);
}
#ifdef VARARGS
static int do_fprintf(va_alist) va_dcl
#else
static int do_fprintf(UNIV p, const char *format,...)
#endif
{
#ifdef VARARGS
UNIV p;
const char *format;
#endif
va_list ap;
int res;
#ifdef VARARGS
va_start(ap);
p = va_arg(ap, UNIV);
format = va_arg(ap, char *);
#else
va_start(ap, format);
#endif
res = vfprintf((FILE *)p, format, ap);
va_end(ap);
return res;
}
#ifdef VARARGS
int xfprintf(va_alist) va_dcl
#else
int xfprintf(FILE *fp, const char *format, ...)
#endif
{
#ifdef VARARGS
FILE *fp;
char *format;
#endif
va_list ap;
int res;
#ifdef VARARGS
va_start(ap);
fp = va_arg(ap, FILE *);
format = va_arg(ap, char *);
#else
va_start(ap, format);
#endif
if (maybe_positional(format))
res = xdoprt((UNIV)fp, do_fprintf, format, ap);
else
res = vfprintf(fp, format, ap);
va_end(ap);
return res;
}
int xvfprintf(fp, format, ap)
FILE *fp;
const char *format;
va_list ap;
{
int res;
if (maybe_positional(format))
res = xdoprt((UNIV)fp, do_fprintf, format, ap);
else
res = vfprintf(fp, format, ap);
return res;
}
#endif /* not HAVE_EXTENDED_PRINTF */

View File

@ -0,0 +1,18 @@
#
# Bmakefile for sgmlsasp
#
# $id$
#
PROG= sgmlsasp
SRCS+= sgmlsasp.c replace.c
CFLAGS+= -I${.CURDIR}/../libsgmls -I${.CURDIR}/../sgmls
LDADD= ${LIBSGMLS}
DPADD= ${LIBSGMLS}
.include "../Makefile.inc"
.include <bsd.prog.mk>

View File

@ -0,0 +1,467 @@
/* replace.c
Parse ASP style replacement file.
Written by James Clark (jjc@jclark.com). */
#include "sgmlsasp.h"
#include "replace.h"
#define TABLE_SIZE 251
struct table_entry {
enum event_type type;
char *gi;
struct replacement replacement;
struct table_entry *next;
};
struct replacement_table {
struct table_entry *table[TABLE_SIZE];
};
struct buffer {
char *s;
unsigned len;
unsigned size;
};
/* Tokens returned by get_token(). */
#define STRING 1
#define STAGO 2
#define ETAGO 3
#define PLUS 4
static int get P((void));
static int peek P((void));
static int get_token P((void));
static void scan_name P((struct buffer *, int));
static struct replacement *define_replacement
P((struct replacement_table *, enum event_type, char *));
static struct replacement_item **parse_string
P((struct replacement_item **, int));
static UNIV xmalloc P((unsigned));
static UNIV xrealloc P((UNIV, unsigned));
static struct replacement_item **add_replacement_data
P((struct replacement_item **, char *, unsigned));
static struct replacement_item **add_replacement_attr
P((struct replacement_item **, char *));
static int hash P((enum event_type, char *));
static NO_RETURN void parse_error VP((char *,...));
static VOID buffer_init P((struct buffer *));
static VOID buffer_append P((struct buffer *, int));
static char *buffer_extract P((struct buffer *));
#if 0
static VOID buffer_free P((struct buffer *));
#endif
#define buffer_length(buf) ((buf)->len)
#define NEW(type) ((type *)xmalloc(sizeof(type)))
static int current_lineno;
static char *current_file;
static FILE *fp;
struct replacement_table *make_replacement_table()
{
int i;
struct replacement_table *tablep;
tablep = NEW(struct replacement_table);
for (i = 0; i < TABLE_SIZE; i++)
tablep->table[i] = 0;
return tablep;
}
void load_replacement_file(tablep, file)
struct replacement_table *tablep;
char *file;
{
int tok;
struct buffer name;
buffer_init(&name);
errno = 0;
fp = fopen(file, "r");
if (!fp) {
if (errno)
error("can't open `%s': %s", file, strerror(errno));
else
error("can't open `%s'", file);
}
current_lineno = 1;
current_file = file;
tok = get_token();
while (tok != EOF) {
struct replacement *p;
struct replacement_item **tail;
enum event_type type;
if (tok != STAGO && tok != ETAGO)
parse_error("syntax error");
type = tok == STAGO ? START_ELEMENT : END_ELEMENT;
scan_name(&name, '>');
p = define_replacement(tablep, type, buffer_extract(&name));
tok = get_token();
if (tok == PLUS) {
if (p)
p->flags |= NEWLINE_BEGIN;
tok = get_token();
}
tail = p ? &p->items : 0;
while (tok == STRING) {
tail = parse_string(tail, type == START_ELEMENT);
tok = get_token();
}
if (tok == PLUS) {
if (p)
p->flags |= NEWLINE_END;
tok = get_token();
}
}
fclose(fp);
}
static
struct replacement_item **parse_string(tail, recog_attr)
struct replacement_item **tail;
int recog_attr;
{
struct buffer buf;
unsigned len;
buffer_init(&buf);
for (;;) {
int c = get();
if (c == '\"')
break;
if (recog_attr && c == '[') {
if (buffer_length(&buf)) {
len = buffer_length(&buf);
tail = add_replacement_data(tail, buffer_extract(&buf), len);
}
scan_name(&buf, ']');
tail = add_replacement_attr(tail, buffer_extract(&buf));
}
else {
if (c == '\\') {
c = get();
switch (c) {
case EOF:
parse_error("unfinished string at end of file");
case 's':
buffer_append(&buf, ' ');
break;
case 'n':
buffer_append(&buf, '\n');
break;
case 't':
buffer_append(&buf, '\t');
break;
case 'r':
buffer_append(&buf, '\r');
break;
case 'f':
buffer_append(&buf, '\f');
break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
{
int val = c - '0';
c = peek();
if ('0' <= c && c <= '7') {
(void)get();
val = val*8 + (c - '0');
c = peek();
if ('0' <= c && c <= '7') {
(void)get();
val = val*8 + (c - '0');
}
}
buffer_append(&buf, val);
break;
}
default:
buffer_append(&buf, c);
break;
}
}
else
buffer_append(&buf, c);
}
}
len = buffer_length(&buf);
if (len > 0)
tail = add_replacement_data(tail, buffer_extract(&buf), len);
return tail;
}
static
struct replacement_item **add_replacement_data(tail, buf, n)
struct replacement_item **tail;
char *buf;
unsigned n;
{
if (!tail)
free(buf);
else {
*tail = NEW(struct replacement_item);
(*tail)->type = DATA_REPL;
(*tail)->u.data.n = n;
(*tail)->next = 0;
(*tail)->u.data.s = buf;
tail = &(*tail)->next;
}
return tail;
}
static
struct replacement_item **add_replacement_attr(tail, name)
struct replacement_item **tail;
char *name;
{
if (!tail)
free(name);
else {
*tail = NEW(struct replacement_item);
(*tail)->type = ATTR_REPL;
(*tail)->next = 0;
(*tail)->u.attr = name;
tail = &(*tail)->next;
}
return tail;
}
static
int get_token()
{
int c;
for (;;) {
c = get();
while (isspace(c))
c = get();
if (c != '%')
break;
do {
c = get();
if (c == EOF)
return EOF;
} while (c != '\n');
}
switch (c) {
case '+':
return PLUS;
case '<':
c = peek();
if (c == '/') {
(void)get();
return ETAGO;
}
return STAGO;
case '"':
return STRING;
case EOF:
return EOF;
default:
parse_error("bad input character `%c'", c);
}
}
static
void scan_name(buf, term)
struct buffer *buf;
int term;
{
int c;
for (;;) {
c = get();
if (c == term)
break;
if (c == '\n' || c == EOF)
parse_error("missing `%c'", term);
if (fold_general_names) {
if (islower((unsigned char)c))
c = toupper((unsigned char)c);
}
buffer_append(buf, c);
}
if (buffer_length(buf) == 0)
parse_error("empty name");
buffer_append(buf, '\0');
}
static
int get()
{
int c = getc(fp);
if (c == '\n')
current_lineno++;
return c;
}
static
int peek()
{
int c = getc(fp);
if (c != EOF)
ungetc(c, fp);
return c;
}
struct replacement *lookup_replacement(tablep, type, name)
struct replacement_table *tablep;
enum event_type type;
char *name;
{
int h = hash(type, name);
struct table_entry *p;
for (p = tablep->table[h]; p; p = p->next)
if (strcmp(name, p->gi) == 0 && type == p->type)
return &p->replacement;
return 0;
}
/* Return 0 if already defined. */
static
struct replacement *define_replacement(tablep, type, name)
struct replacement_table *tablep;
enum event_type type;
char *name;
{
int h = hash(type, name);
struct table_entry *p;
for (p = tablep->table[h]; p; p = p->next)
if (strcmp(name, p->gi) == 0 && type == p->type)
return 0;
p = NEW(struct table_entry);
p->next = tablep->table[h];
tablep->table[h] = p;
p->type = type;
p->gi = name;
p->replacement.flags = 0;
p->replacement.items = 0;
return &p->replacement;
}
static
VOID buffer_init(buf)
struct buffer *buf;
{
buf->size = buf->len = 0;
buf->s = 0;
}
static
char *buffer_extract(buf)
struct buffer *buf;
{
char *s = buf->s;
buf->s = 0;
buf->len = 0;
buf->size = 0;
return s;
}
#if 0
static
VOID buffer_free(buf)
struct buffer *buf;
{
if (buf->s) {
free((UNIV)buf->s);
buf->s = 0;
buf->size = buf->size = 0;
}
}
#endif
static
VOID buffer_append(buf, c)
struct buffer *buf;
int c;
{
if (buf->len >= buf->size) {
if (!buf->size)
buf->s = (char *)xmalloc(buf->size = 10);
else
buf->s = (char *)xrealloc((UNIV)buf->s, buf->size *= 2);
}
buf->s[buf->len] = c;
buf->len += 1;
}
static
int hash(type, s)
enum event_type type;
char *s;
{
unsigned long h = 0, g;
while (*s != 0) {
h <<= 4;
h += *s++;
if ((g = h & 0xf0000000) != 0) {
h ^= g >> 24;
h ^= g;
}
}
h ^= (int)type;
return (int)(h % TABLE_SIZE);
}
static
UNIV xmalloc(n)
unsigned n;
{
UNIV p = (UNIV)malloc(n);
if (!p)
parse_error("out of memory");
return p;
}
static
UNIV xrealloc(p, size)
UNIV p;
unsigned size;
{
p = (UNIV)realloc(p, size);
if (!p)
parse_error("out of memory");
return p;
}
static NO_RETURN
#ifdef VARARGS
void parse_error(va_alist) va_dcl
#else
void parse_error(char *message,...)
#endif
{
char buf[512];
#ifdef VARARGS
char *message;
#endif
va_list ap;
#ifdef VARARGS
va_start(ap);
message = va_arg(ap, char *);
#else
va_start(ap, message);
#endif
vsprintf(buf, message, ap);
va_end(ap);
error("%s:%d: %s", current_file, current_lineno, buf);
}

View File

@ -0,0 +1,35 @@
/* replace.h
Interface to replacement file parser. */
enum replacement_type {
DATA_REPL,
ATTR_REPL
};
struct replacement_item {
union {
char *attr;
struct {
char *s;
unsigned n;
} data;
} u;
enum replacement_type type;
struct replacement_item *next;
};
#define NEWLINE_BEGIN 01
#define NEWLINE_END 02
struct replacement {
unsigned flags;
struct replacement_item *items;
};
enum event_type { START_ELEMENT, END_ELEMENT };
struct replacement_table *make_replacement_table P((void));
void load_replacement_file P((struct replacement_table *, char *));
struct replacement *
lookup_replacement P((struct replacement_table *, enum event_type, char *));

View File

@ -0,0 +1,30 @@
.\" -*- nroff -*-
.TH SGMLSASP 1
.SH NAME
sgmlsasp \- translate output of sgmls using ASP replacement files
.SH SYNOPSIS
.B sgmls
.RB [ \-n ]
.I replacement_file\|.\|.\|.
.SH DESCRIPTION
.I sgmlsasp
translates the standard input using the specification in
.I replacement_file\|.\|.\|.
and writes the result to the standard output.
The standard input must be in the format output by
.IR sgmls .
Each replacement file must be in the format of an
Amsterdam SGML parser (ASP) replacement file;
this format is described in the ASP documentation.
Duplicate replacements are silently ignored.
The
.B \-n
option disables upper-case substitution (folding) for names in
replacement files; this option should be used with concrete syntaxes
that do not specify upper-case substitution for general names (that
is, names that are not entity names).
.SH BUGS
References to external data entities are ignored.
(Support for external data entities is not implemented in ASP.)
.SH "SEE ALSO"
.IR sgmls (1)

View File

@ -0,0 +1,278 @@
/* sgmlsasp.c
Translate sgmls output using ASP replacement file.
Written by James Clark (jjc@jclark.com). */
#include "sgmlsasp.h"
#include "sgmls.h"
#include "replace.h"
#include "getopt.h"
/* Non-zero if general (non-entity) names should be folded to upper case. */
int fold_general_names = 1;
static char *program_name;
static char last_char = '\n';
static void output_begin_line P((void));
static void output_data P((struct sgmls_data *, int));
static void output_pi P((char *, unsigned));
static void output_token P((char *));
static void output_attribute P((struct sgmls_attribute *));
static void output_data_char P((int));
static void output_replacement
P((struct replacement *, struct sgmls_attribute *));
static void do_file P((FILE *, struct replacement_table *));
static void usage P((void));
static void input_error P((int, char *, unsigned long));
#define output_char(c) (last_char = (c), putchar(c))
int main(argc, argv)
int argc;
char **argv;
{
struct replacement_table *tablep;
int i;
int opt;
program_name = argv[0];
while ((opt = getopt(argc, argv, "n")) != EOF)
switch (opt) {
case 'n':
fold_general_names = 0;
break;
case '?':
usage();
default:
assert(0);
}
if (argc - optind <= 0)
usage();
tablep = make_replacement_table();
for (i = optind; i < argc; i++)
load_replacement_file(tablep, argv[i]);
(void)sgmls_set_errhandler(input_error);
do_file(stdin, tablep);
exit(0);
}
static
void usage()
{
fprintf(stderr, "usage: %s [-n] replacement_file...\n", program_name);
exit(1);
}
static
void input_error(num, str, lineno)
int num;
char *str;
unsigned long lineno;
{
error("Error at input line %lu: %s", lineno, str);
}
static
void do_file(fp, tablep)
FILE *fp;
struct replacement_table *tablep;
{
struct sgmls *sp;
struct sgmls_event e;
sp = sgmls_create(fp);
while (sgmls_next(sp, &e))
switch (e.type) {
case SGMLS_EVENT_DATA:
output_data(e.u.data.v, e.u.data.n);
break;
case SGMLS_EVENT_ENTITY:
/* XXX what should we do here? */
break;
case SGMLS_EVENT_PI:
output_pi(e.u.pi.s, e.u.pi.len);
break;
case SGMLS_EVENT_START:
output_replacement(lookup_replacement(tablep,
START_ELEMENT, e.u.start.gi),
e.u.start.attributes);
sgmls_free_attributes(e.u.start.attributes);
break;
case SGMLS_EVENT_END:
output_replacement(lookup_replacement(tablep, END_ELEMENT, e.u.end.gi),
0);
break;
case SGMLS_EVENT_SUBSTART:
break;
case SGMLS_EVENT_SUBEND:
break;
case SGMLS_EVENT_APPINFO:
break;
case SGMLS_EVENT_CONFORMING:
break;
default:
abort();
}
sgmls_free(sp);
}
static
void output_data(v, n)
struct sgmls_data *v;
int n;
{
int i;
for (i = 0; i < n; i++) {
char *s = v[i].s;
int len = v[i].len;
for (; len > 0; len--, s++)
output_data_char(*s);
}
}
static
void output_pi(s, len)
char *s;
unsigned len;
{
for (; len > 0; len--, s++)
output_data_char(*s);
}
static
void output_replacement(repl, attributes)
struct replacement *repl;
struct sgmls_attribute *attributes;
{
struct replacement_item *p;
struct sgmls_attribute *a;
int i;
if (!repl)
return;
if (repl->flags & NEWLINE_BEGIN)
output_begin_line();
for (p = repl->items; p; p = p->next)
switch (p->type) {
case DATA_REPL:
for (i = 0; i < p->u.data.n; i++)
output_char(p->u.data.s[i]);
break;
case ATTR_REPL:
for (a = attributes; a; a = a->next)
if (strcmp(a->name, p->u.attr) == 0) {
output_attribute(a);
break;
}
break;
default:
abort();
}
if (repl->flags & NEWLINE_END)
output_begin_line();
}
static
void output_attribute(p)
struct sgmls_attribute *p;
{
switch (p->type) {
case SGMLS_ATTR_IMPLIED:
break;
case SGMLS_ATTR_CDATA:
output_data(p->value.data.v, p->value.data.n);
break;
case SGMLS_ATTR_TOKEN:
{
char **token = p->value.token.v;
int n = p->value.token.n;
if (n > 0) {
int i;
output_token(token[0]);
for (i = 1; i < n; i++) {
output_char(' ');
output_token(token[i]);
}
}
}
break;
case SGMLS_ATTR_ENTITY:
{
struct sgmls_entity **v = p->value.entity.v;
int n = p->value.entity.n;
int i;
for (i = 0; i < n; i++) {
if (i > 0)
output_char(' ');
output_token(v[i]->is_internal
? v[i]->u.internal.name
: v[i]->u.external.name);
}
}
break;
case SGMLS_ATTR_NOTATION:
if (p->value.notation)
output_token(p->value.notation->name);
break;
default:
abort();
}
}
static
void output_token(s)
char *s;
{
for (; *s; s++)
output_char(*s);
}
static
void output_data_char(c)
int c;
{
if (c != RSCHAR) {
if (c == RECHAR)
c = '\n';
output_char(c);
}
}
static
void output_begin_line()
{
if (last_char != '\n')
output_char('\n');
}
NO_RETURN
#ifdef VARARGS
void error(va_alist) va_dcl
#else
void error(char *message,...)
#endif
{
#ifdef VARARGS
char *message;
#endif
va_list ap;
fprintf(stderr, "%s: ", program_name);
#ifdef VARARGS
va_start(ap);
message = va_arg(ap, char *);
#else
va_start(ap, message);
#endif
vfprintf(stderr, message, ap);
va_end(ap);
fputc('\n', stderr);
fflush(stderr);
exit(EXIT_FAILURE);
}

View File

@ -0,0 +1,26 @@
/* sgmlsasp.h */
#include "config.h"
#include "std.h"
#ifdef USE_PROTOTYPES
#define P(parms) parms
#else
#define P(parms) ()
#endif
#ifdef __GNUC__
#define NO_RETURN volatile
#else
#define NO_RETURN /* as nothing */
#endif
#ifdef VARARGS
#define VP(parms) ()
#else
#define VP(parms) P(parms)
#endif
NO_RETURN void error VP((char *,...));
extern int fold_general_names;

147
usr.bin/sgmls/unix.cfg Normal file
View File

@ -0,0 +1,147 @@
/* unix.cfg: Configuration file for sgmls on Unix. */
/* A list of filename templates to use for searching for external entities.
The filenames are separated by the character specified in PATH_FILE_SEP.
See sgmls.man for details. */
#define DEFAULT_PATH "/usr/local/lib/sgml/%O/%C/%T:%N.%X:%N.%D"
/* The character that separates the filenames templates. */
#define PATH_FILE_SEP ':'
/* The character that separates filenames in a system identifier.
Usually the same as PATH_FILE_SEP. */
#define SYSID_FILE_SEP ':'
/* The environment variable that contains the list of filename templates. */
#define PATH_ENV_VAR "SGML_PATH"
/* MIN_DAT_SUBS_FROM and MIN_DATS_SUBS_TO tell sgmls how to transform a name
or system identifier into a legal filename. A character in
MIN_DAT_SUBS_FROM will be transformed into the character in the
corresponding position in MIN_DAT_SUBS_TO. If there is no such
position, then the character is removed. */
/* This says that spaces should be transformed to underscores, and
slashes to percents. */
#define MIN_DAT_SUBS_FROM " /"
#define MIN_DAT_SUBS_TO "_%"
/* Define this to allow tracing. */
/* #define TRACE 1 */
/* Define this you want support for subdocuments. This is implemented
using features that are not part of Standard C, so you might not want
to define it if you are porting to a new system. Otherwise I suggest
you leave it defined. */
#define SUPPORT_SUBDOC 1
/* Define HAVE_EXTENDED_PRINTF if your *printf functions supports
X/Open extensions; if they do, then, for example,
printf("%2$s%1$s", "bar", "foo")
should print `foobar'. */
/* #define HAVE_EXTENDED_PRINTF 1 */
/* Define HAVE_CAT if your system provides the X/Open message
catalogue functions catopen() and catgets(), and you want to use them.
An implementations of these functions is included and will be used if
you don't define this. On SunOS 4.1.1, if you do define this you
should set CC=/usr/xpg2bin/cc in the makefile. */
/* #define HAVE_CAT 1 */
#ifdef __STDC__
/* Define this if your compiler supports prototypes. */
#define USE_PROTOTYPES 1
#endif
/* Can't use <stdarg.h> without prototypes. */
#ifndef USE_PROTOTYPES
#define VARARGS 1
#endif
/* If your compiler defines __STDC__ but doesn't provide <stdarg.h>,
you must define VARARGS yourself here. */
/* #define VARARGS 1 */
/* Define this if you do not have strerror(). */
#define STRERROR_MISSING 1
/* Define this unless the character testing functions in ctype.h
are defined for all values representable as an unsigned char. You do
not need to define this if your system is ANSI C conformant. You
should define for old Unix systems. */
/* #define USE_ISASCII 1 */
/* Define this if your system provides the BSD style string operations
rather than ANSI C ones (eg bcopy() rather than memcpy(), and index()
rather than strchr()). */
/* #define BSD_STRINGS 1 */
/* Define this if you have getopt(). */
#define HAVE_GETOPT 1
/* Define this if you have access(). */
#define HAVE_ACCESS 1
/* Define this if you have <unistd.h>. */
#define HAVE_UNISTD_H 1
/* Define this if you have <sys/stat.h>. */
#define HAVE_SYS_STAT_H 1
/* Define this if you have waitpid(). */
#define HAVE_WAITPID 1
/* Define this if your system is POSIX.1 (ISO 9945-1:1990) compliant. */
#define POSIX 1
/* Define this if you have the vfork() system call. */
#define HAVE_VFORK 1
/* Define this if you have <vfork.h>. */
#define HAVE_VFORK_H 1
/* Define this if you don't have <stdlib.h> */
/* #define STDLIB_H_MISSING 1 */
/* Define this if you don't have <stddef.h> */
/* #define STDDEF_H_MISSING 1 */
/* Define this if you don't have <limits.h> */
/* #define LIMITS_H_MISSING 1 */
/* Define this if you don't have remove(); unlink() will be used instead. */
#define REMOVE_MISSING 1
/* Define this if you don't have raise(); kill() will be used instead. */
#define RAISE_MISSING 1
/* Define this if you don't have fsetpos() and fgetpos(). */
#define FPOS_MISSING 1
/* Universal pointer type. */
/* If your compiler doesn't fully support void *, change `void' to `char'. */
typedef void *UNIV;
/* If your compiler doesn't support void as a function return type,
change `void' to `int'. */
typedef void VOID;
/* If you don't have an ANSI C conformant <limits.h>, define
CHAR_SIGNED as 1 or 0 according to whether the `char' type is signed.
The <limits.h> on some versions of System Release V 3.2 is not ANSI C
conformant: the value of CHAR_MIN is 0 even though the `char' type is
signed. */
/* #define CHAR_SIGNED 1 */
/* #define CHAR_SIGNED 0 */
#ifndef CHAR_SIGNED
#include <limits.h>
#if CHAR_MIN < 0
#define CHAR_SIGNED 1
#else
#define CHAR_SIGNED 0
#endif
#endif /* not CHAR_SIGNED */
/* Assume the system character set is ISO Latin-1. */
#include "latin1.h"