151 lines
6.4 KiB
Plaintext
151 lines
6.4 KiB
Plaintext
#
|
|
# Copyright (c) 1994
|
|
# Open Software Foundation, Inc.
|
|
#
|
|
# Permission is hereby granted to use, copy, modify and freely distribute
|
|
# the software in this file and its documentation for any purpose without
|
|
# fee, provided that the above copyright notice appears in all copies and
|
|
# that both the copyright notice and this permission notice appear in
|
|
# supporting documentation. Further, provided that the name of Open
|
|
# Software Foundation, Inc. ("OSF") not be used in advertising or
|
|
# publicity pertaining to distribution of the software without prior
|
|
# written permission from OSF. OSF makes no representations about the
|
|
# suitability of this software for any purpose. It is provided "as is"
|
|
# without express or implied warranty.
|
|
#
|
|
|
|
instant - a formatting application for OSF SGML instances
|
|
____________________________________________________________________________
|
|
|
|
Requirements
|
|
|
|
ANSI C compiler (gcc is one)
|
|
|
|
sgmls 1.1 -- sgml parser from James Clark. Based on Goldfarb's ARC parser.
|
|
|
|
Vanilla unix make
|
|
|
|
POSIX C libraries
|
|
|
|
|
|
Files for instant program
|
|
|
|
Module Function
|
|
------ --------
|
|
browse.c interactive browser
|
|
general.h general definitions
|
|
info.c print information about the instances
|
|
main.c main entry, arg parsing, instance reading
|
|
tables.c table-specific formatting routines (TeX and tbl)
|
|
traninit.c translator initialization (read spec, etc.)
|
|
translate.c main translator
|
|
translate.h structure definitions for translation code
|
|
tranvar.c routines for handling "special variables"
|
|
util.c general utilities
|
|
|
|
|
|
Also required
|
|
|
|
1. Translation spec (transpec) files. (../transpecs/*.ts)
|
|
2. SDATA mapping files for mapping sdata entities. (../transpecs/*.sdata)
|
|
3. Character mapping files for mapping characters. (../transpecs/*.cmap)
|
|
|
|
|
|
Platforms tried on
|
|
|
|
OSF1 1.3 (i486)
|
|
Ultrix 4.2 (mips)
|
|
HP-UX 9.01 (hp 9000/700)
|
|
AIX 3.2 (rs6000)
|
|
SunOS [missing strerror()]
|
|
|
|
____________________________________________________________________________
|
|
|
|
General outline of program
|
|
------- ------- -- -------
|
|
|
|
To summarize in a sentence, instant reads the output of sgmls, builds a tree
|
|
of the instnace in memory, then traverses the tree in in-order, processing
|
|
the nodes according to a translation spec.
|
|
|
|
Element tree storage
|
|
------- ---- -------
|
|
|
|
The first thing instant must do is read the ESIS (output of sgmls) from the
|
|
specified file or stdin, forming a tree in memory. (Nothing makes sense
|
|
without an instance...) Each element of the instance is a node in the tree,
|
|
stored as a structure called Element_t. Elements contain content (what
|
|
else?), which is a mixture of data (#PCDATA, #CDATA, #RCDATA - all the same
|
|
in the ESIS), child elements, and PIs. Each 'chunk' of content is referred
|
|
to by a Content_t structure. A Content_t contains an enum that can point to
|
|
a string (data or PI), another Element_t. For example, if a <p> element
|
|
contains some characters, an <emphasis> element, some more characters,
|
|
a <function> element, then some more characters, it has 5 Content_t children
|
|
as an array.
|
|
|
|
Element_t's have pointers to their parents, and a next element in a linked
|
|
list (they're stored as a linked list, for cases when you'd want to quickly
|
|
travers all the nodes, in no particular order).
|
|
For convenience, Element_t's have an array of pointers to it's child
|
|
Element_t's. These are just pointers to the same thing contained in the
|
|
Content_t array, without the intervening data or PIs. This makes it easier
|
|
for the program to traverse the elements of the tree (it does not have to
|
|
be concerned with skipping data, etc.). There is an analagous array of
|
|
pointers for the data content, an array of (char *)'s. This makes it easier
|
|
to consider the immediate character content of an element.
|
|
|
|
Attributes are kept as an array of name-value mappings (using the typedef
|
|
Mapping_t). ID attributes are also stored in a separate list of ID value -
|
|
element pointer pairs so that it is quick to find an element by ID.
|
|
|
|
Other information kept about each element (in the Element_t struct) includes
|
|
the line number in the EISI where the element occurs, the input filename.
|
|
(These depend on sgmls being run with the "-l" option.) Also stored is
|
|
an element's order in its parent's collection of children and an element's
|
|
depth in the tree.
|
|
|
|
Translation specs
|
|
----------- -----
|
|
|
|
A translation spec is read into a linked list in memory. As the instance
|
|
tree is traversed, the transpecs are searched in order for a match. As a
|
|
rule, one should position the less specific transpecs later in the file.
|
|
Also, specs for seldom-used element are best placed later in the file, since
|
|
it takes cpu cycles to skip over them for each of the more-used elements.
|
|
|
|
During translation of a particular element, the list of Content_t structures
|
|
are processed in order. If a content 'chunk' is data, it is printed to
|
|
the output stream. If it is an element, the translation routine is called
|
|
for that elemen, recursively. Hence, in-order traversal.
|
|
|
|
Miscellaneous information displays
|
|
------------- ----------- --------
|
|
|
|
There are several informational display options available. They include:
|
|
- a list of element usage (-u) -- lists each element in the instance,
|
|
it's attributes, number of children, parent, data content 'nodes'.
|
|
- statistics about elements (-S) -- lists elements, number of times
|
|
each is used, percent of elements that this is, total char content
|
|
in that element, average number of characters in they element.
|
|
- show context of each element (-x) -- lists each element and its
|
|
context, looking up the tree (this is the same context that
|
|
would match the Context field of a transpec).
|
|
- show the hierarchy of the instance (-h) -- show an ascii 'tree' of
|
|
the instance, where elements deeper in the tree are indented more.
|
|
Numbers after the element name in parentheses are the number of
|
|
element content nodes and the number of data content nodes.
|
|
|
|
Interactive browser
|
|
----------- -------
|
|
|
|
Originally, the interactive browser was intended as a debugging aid for
|
|
the code developer. It was a way to examine a particular node of the
|
|
instance tree or process a subtree without being distracted by the rest
|
|
of the instance. Many of the commands test functionality of the query
|
|
and search code (such as testing whether a certain element has a given
|
|
relationship to the current position in the tree).
|
|
|
|
|
|
____________________________________________________________________________
|
|
|