freebsd-nq/contrib/byacc/NOTES-btyacc-Changes
Baptiste Daroussin 0c8de5b03c Update to byacc 20140409
Among all the modifications, this new byacc also solves a 14 year old bug [1]

PR:		bin/23254 [1]
Submitted by:	marka@nominum.com [1]
MFC after:	3 weeks
2014-04-23 05:57:45 +00:00

386 lines
18 KiB
Plaintext

Tom Shields, March 17, 2014
PARKING LOT ISSUES:
-------------------
- verify debian packaging still works?
- there are no #line directives in y.tab.i, other than those that come
from the input file and the skeleton file; to fix this, would need to
count output lines in externs_file and add 'write_externs_lineno()'
similar to 'write_code_lineno()'
- if there are no defined symbols, the .tab.h file isn't empty (weird case,
may not be worth fixing)
- consider: treat []-actions identical to {}-actions if not processing a
backtracking parser (avoids test case error)?
BTYACC CHANGES CURRENTLY DEFERRED, BY FILE:
-------------------------------------------
push.skel
- skeleton for a 'push' parser
- needs to be upgraded match the structure of yaccpar.skel
defs.h
- adopt '%include' changes
- adopt '%define'/'%ifdef'/'%endif'
- adopt -E flag to print preprocessed grammar to stdout
error.c
- adopt '%include' changes
- NOTE: there is a btyacc change that might be worth adopting in byacc
[FileError() refactoring to eliminate duplicated code in most of the
error message functions]
main.c
- adopt '%define' changes
- adopt '-DNAME' command line option to define preprocessor variable NAME
- adopt -E flag to print preprocessed grammar to stdout
- adopt '-S skeleton_file' command line option to select an alternate parser
skeleton file
- the skeleton file named by the -S flag is used as provided to open the
file; consider a change to this behavior to check whether the named file
has a path prefix, and if not, look in 'installation' directory if the
file is not found in the working directory
output.c
- adopt '%include' changes
reader.c
- adopt '%include' changes
- adopt '%define'/'%ifdef'/'%endif' changes
- adopt -E flag to print preprocessed grammar to stdout
- NOTE: there is a btyacc change that might be worth adopting in byacc
[copy_string() & copy_comment() refactoring to eliminate duplicated
code in copy_text() and copy_union()]
warshall.c
- NOTE: there is a btyacc change that might be worth adopting in byacc
[shifting 'mask' incrementally rather than literal '1' by a variable
amount each time thru the loop]
================================================================================
new files:
----------
skel2c
- modified from btyacc distribution: don't generate #include defs.h
- extended syntax recognized to include '%% insert VERSION here', generating
the defines for YYMAJOR, YYMINOR and YYPATCH at that point
- made generated tables type 'const char *const' to match skelton.c from
byacc-20130925 baseline
- added code to append text for write_section() to end of generated skeleton.c
- remove conversion of tab to \t in generated skeleton.c
- extended syntax recognized to include '%%ifdef', '%%ifndef', '%%else' and
'%%endif'; used in yaccpar.skel to bracket code that is specific to
backtracking
yaccpar.skel.old
- created from skeleton.c in byacc-20140101 baseline; use of this skeleton
will create a version of skeleton.c that is close to that in the
byacc-20140101 baseline
- eliminated 'body_3' and 'trailer_2' skeleton segments - no need to generate
yyerror() invocation dynamically; YYERROR_CALL() is already generated
earlier, and so can be used in the skeleton to simplify
- added 'const' to types in '%% tables' section to match what skel2c,
start_int_table() and state_str_table() generate
- added a few cosmetic changes (e.g., added some additional comments,
reworded debugging output to match yaccpar.skel, changed yygrowstack()
to return YYENOMEM for 'out of memory' error, rather than -1, to match
yaccpar.skel; changed yyparse() return value from 1 to 2 for the
'out of memory' error to match yaccpar.skel)
- added '#ifndef'/'#endif' around '#define YYINITSTACKSIZE 200' to allow
the value to be changed at compile time
- changed 'printf(' to 'fprintf(stderr, '; added stack depth (yydepth) to
debugging output from yaccpar.skel
- use 'YYINT' rather than 'short' for integer table types
yaccpar.skel
- renamed from btyaccpa.ske, merged with btyacc-c.ske
- modified from btyacc distribution to match the latest byacc-20140101
skeleton structure & data structures
- make local functions static
- change "virtual memory exceeded" to "memory exhausted" for bison
compatibility
- change debug output generation from printf/puts/putc onto stdout to use
fprintf/fputs/fputc onto stderr; include
stack depth and whether or not in trial parsing
- changed types of generated string tables to be 'const pointer to const char'
- check all malloc()/realloc() return values, ensure return value of
yyparse() = 2 if parsing failed due to memory exhaustion
- change YYDBPR() macro to YYSTYPE_TOSTRING(); define semantics as delivering
a char* value representing a semantic value (e.g., yylval or yyval, or the
contents of an entry on the semantic stack); additional parameter passed:
grammar symbol # (to assist interpretation of semantic value)
- change YYPOSN to YYLTYPE and yyposn to yylloc (position corresponding to
yylval) for bison compatibility; add yyloc (corresponding to yyval)
- move default definition of YYLTYPE into output.c, generating a typedef
- add '#if defined(YYLTYPE) || defined(YYLTYPE_IS_DECLARED)'/'#endif' around
all lines specific to position processing
- add '#if defined(YYDESTRUCT_CALL)'/'#endif' around all lines specific to
semantic & position stack processing to reclaim memory associated with
discarded symbols
- add '%%ifdef YYBTYACC'/'%%endif' around all lines specific to backtrack
parsing; converted by skel2c into '#if defined(YYBTYACC)'/'#endif'
- distinguish between "yacc stack overflow" and "memory exhausted" situations
- consolidated termination cleanup code; introduced yyreturn, set to 2 after
labels yyoverflow/yyenomem, set to 1 after label yyabort, set to 0 after
label yyaccept; all termination cases jump to label yyreturn, which does
any cleanup then returns yyreturn value
- replaced YYDELETEVAL & YYDELETEPOSN user-supplied macro capability by
implementation of byacc-generated yydestruct() as defined by bison
compatible %destructor mechanism
- moved invocation of 'YYREDUCEPOSNFUNC' macro to immediately prior to, rather
than after, execution of final rule action (so that, at some future
date, implementation extensions can be added to enable custom calculation
of locations associated with non-terminals within rule actions); deleted
unnecessary flag 'reduce_posn'; deleted 'YYCALLREDUCEPOSN' macro; deleted
C++ variant of 'YYREDUCEPOSNFUNC' invocation
- adopt approach similar to bison for default computation of yyloc; change
macro 'YYREDUCEPOSNFUNC' name to 'YYLLOC_DEFAULT' for bison compatibility;
added 'yyerror_loc_range[2]' to hold start & end locations for error
situations that pop the stack
- use 'YYINT' rather than 'short' for integer table types, and for indexing
parser tables
readskel.c
http://www.verisign.com/index.html- replaced error() with fprintf()
mstring.h
- moved contents of mstring.h to defs.h - mstring.h is obsolete
mstring.c
- replaced include of mstring.h with defs.h
- changed 'START' to 'HEAD' to remove conflict with 'START' used for
the start symbol defined in defs.h
modified byacc files:
---------------------
skeleton.c
- skeleton.c is now generated from the appropriate skeleton file by 'skel2c'
configure.in
- added configuration for --enable-btyacc option; if 'yes' add '-DYYBTYACC'
to DEFINES in makefile.in; --enable-btyacc defaults to 'no'
- added configuration for --with-max-table-size option; if present,
overrides the value of MAXTABLE defined in defs.h
- regenerate configure using autoconf
makefile.in
- added mstring.c to C_FILES
- added mstring$o to OBJS
- added @DEFINES@ as value of DEFINES make variable
- added new make variable SKELETON with value 'yaccpar.skel'
- added rule to generate skeleton.c from $(SKELETON), depending on skel2c
and makefile
- added rm -f skeleton.c distclean rule
- moved dependency on makefile from only main$o & skeleton$o to $(OBJS),
since if ./configure is run changing, for example, from --enable-btyacc
to --disable-btyacc, all files must be recompiled to ensure a clean
executable
- add @MAXTABLE@ for optional '-DMAXTABLE=nnn' if configured using
--with-max-table-size=nnn
- changed 'cd test && rn 0f test-*'to 'rm -f $(testdir)/test-*'
test/run_test.sh
- ???
test/run_make.sh
- ???
defs.h
- moved contents of mstring.h to defs.h - mstring.h is obsolete
- added <limits.h> to get the various system defined machine limits;
changed definitions of MAXCHAR, MAXSHORT, MINSHORT & BITS_PER_WORD to use
defines from <limits.h>; changed definitions of BIT and SETBIT to use
value of BITS_PER_WORD
- added typedef for __compar_fn_t, conditioned on _COMPAR_FN_T being
undefined (at least for Mac OSX environment)
- adopt new symbol class values ACTION and ARGUMENT
- adopt changes/additions used by inherited attribute processing
- clean up locations of extern function definitions to match where they
actually live in source files
- adopt error functions from inherited attribute processing; added new error
functions
- added keyword code LOCATIONS for %locations
- added keyword code DESTRUCTOR for %destructor
- added extern decl for 'int locations'; true if %locations present
- added extern decl for 'int backtrack'; initialized to 0 (= false), set to
1 (= true) if -B flag is present
- added extern decl for 'int destructor'; true if at least one %destructor
present in grammar spec file
- define 'YYINT' as the smallest C type that can be used to address a
table of size 'MAXTABLE'; define 'YYINT' based on the value of
'MAXTABLE' using the standard system type size definitions from <limits.h>;
define 'MAXYYINT' and 'MINYYINT' accordingly
- change 'Value_t' and 'Index_t' to 'YYINT' from 'short'
- allow 'MAXTABLE' to be defined by '-DMAXTABLE=nnn' at compile-time
closure.c
- changed print_closure(), print_EFF() and print_first_derives() to 'static';
added fwd declarations
- changed 'short' to 'Value_t' (in some instances, 'Value_t' was already
used for variables/parameters that were related to variables/parameters
declared as 'short'
error.c
- adopt error functions from inherited attribute processing; added a few
additional inherited attribute error functions
graph.c
- changed 'short' to 'Value_t' (in some instances, 'Value_t' was already
used for variables/parameters that were related to variables/parameters
declared as 'short'
lalr.c
- changed MAXSHORT to MAXYYINT
lr0.c
- changed MAXSHORT to MAXYYINT
- changed 'short' to 'Value_t' (in some instances, 'Value_t' was already
used for variables/parameters that were related to variables/parameters
declared as 'short'
main.c
- changed 'short' to 'Value_t' (in some instances, 'Value_t' was already
used for variables/parameters that were related to variables/parameters
declared as 'short'
mkpar.c
- backtracking attempts to resolve shift/reduce and reduce/reduce conflicts
output.c
- generate prefix & YYPREFIX defines into externs file (-i, .tab.i) and
code file (-r, .code.c); generate into output file (.tab.c) only if not
using -r option; eliminates doubled output of prefix aliases if -r with
no -i in y.tab.c and y.code.c or if -r & -i in y.tab.i and y.code.c
- changed types of generated string tables to be 'const pointer to const char'
- adopt backtracking as an alternative in cases where otherwise we have a
conflict in the parsing actions (3, rather than 2, choices)
- wrap defines file with (where "yy" is value of 'symbol_prefix')
#ifndef __yy_defines_h_
#define _yy_defines_h_
<defines>
#endif
- avoid writing %%xdecls skeleton section twice if -r used
- eliminated 'body_3' and 'trailer_2' skeleton segments - no need to generate
yyerror() invocation dynamically; YYERROR_CALL() is already generated
earlier, and can be used in the pareser skeleton
- if -P flag (pure_parser), add yylloc as 2nd parameter to yylex()
(declaration & call)
- change YYPOSN to YYLTYPE and yyposn to yylloc (position corresponding to
yylval) for bison compatibility; add yyloc (corresponding to yyval)
- generate yylloc parameters for yylex & yyerror if %locations present
- add location as 1st parameter to declaraion & invocation of yyerror() if
%locations present
- output backtrack parsing tables if -B flag is present
- added generation of yystos[] with output_accessing_symbols() to allow
translation from a parser internal state number to the corresponding
grammar symbol number [0 .. nsyms) of the accessing symbol of that parser
state; used in the generated code for YYDESTRUCT_CALL() &
YYSTYPE_TOSTRING() to enable the correct semantic value union tag to be
determined when executing the implementation of YYDESTRUCT_CALL() or
YYSTYPE_TOSTRING() (similar to yystos[] in bison)
- added to output_prefix(): yystos; yycindex & yyctable if compiling
backtracking; yyloc & yylloc if %locations used
- extended yyname[] to include all grammar symbols, not just the terminal
symbols: '$end', 'error', '$accept', all non-terminals, including internally
generated non-terminals for embedded actions in rules, and 'illegal-symbol'
(which bison spells '$undefined'); '$end' already defined as a symbol 0,
rathern than adding 'end-of-file' as the name of symbol 0; added
'illegal-symbol' from byacc-20140101 (NOTE: the comment in the code that
says byacc does not predefine '$end' and '$error' is incorrect; however,
both bison and byacc spell '$error' as 'error')
- added generation of #define YYTRANSLATE() from byacc-20140101, but changed
the definition for the undefined symbol case because it is no longer in
yyname[YYMAXTOKEN+1] but rather occurs after the last non-terminal symbol;
added #define YYUNDFTOKEN to contain the index in yyname of 'illegal-symbol'
- generate YYLTYPE in output_ltype() as a struct like for bison rather than
using #define in yaccpar.skel
- added 'write_code_lineno' invocation at start of 'output_prefix'
- added 'write_code_lineno' invocation at start of 'output_pure_parser'
- added 'write_code_lineno' invocation prior to generation of #include
for externs file
- added 'write_code_lineno' invocation after 1st 'write_section(fp, xdecls)'
- added '++outline;' prior to output of '#define YYTRANSLATE' - this was
actually causing almost all of the invocations of 'write_code_lineno' to
put out the correct #line directive
- corrected 'write_code_lineno' - the line number in a #line directive is
the number of the next line, not the number of the #line line
- changed MAXSHORT to MAXYYINT; changed 'high' local static from 'int' to
'long' so that it can get higher than 'MAXYYINT' without machine-dependent
behavior; changed related formats from '%d' to '%ld'
- generate 'YYINT' rather than 'short' for integer table types
- generate YYDESTRUCT_DECL & YYDESTRUCT_CALL macros, similar to YYERROR_DECL
and YYERROR_CALL macros, that can be redefined by user, if desired, to add
additional parameters to yydestruct() (and even change the 'yydestruct'
function name)
- if at least one %destructor present, generate yydestruct(); 1st parameter
is a string indicating the context in which yydestruct() is invoked
(e.g., discarding input token, discarding state on stack, cleanup when
aborting); 2nd parameter is the internal grammar symbol number [0..nsyms)
of the accessing symbol of the parser state on the top of the stack; 3rd
parameter is a pointer to the semantic value to be reclaimed associated
with the grammar symbol in the 2nd parameter; if %locations is defined,
the 4th parameter is a pointer to the position value to be reclaimed
associated with the grammar symbol in the 2nd parameter
reader.c
- adopt []-actions, similar to {}-actions; {}-actions are only executed when
not in trial mode, but []-actions are executed regardless of mode
- adopt new symbol class values ACTION and ARGUMENT
- adopt inherited attributes (syntax resembles arguments to non-terminal
symbols)
- adopt keyword table lookup from btyacc, modified to handle equivalence
of '-' and '_' in spelling of keywords
- adopt refactoring of tag table creation into cache_tag() for use in
multiple locations
- added new error functions in place of btyacc's generic error() function
- changed '0' to 'NULL' for pointer initialization
- reworked for-loop at end of get_line (part of DEFERRED '%ifdef/%endif' change)
- added %locations directive for bison compatibility to enable position
processing
- added decl for 'int locations'; true if %locations present
- added decl 'int backtrack'; initialized to 0 (= false), set to
1 (= true) if -B flag is present
- process %locations if present, set location = 1
- only process []-actions and only generate 'if (!yytrial)' prefix for
{}-actions if backtracking is enabled
- add decl for 'int destructor'; true if at least one %destructor is present
- add %destructor directive to enable semantic & position stack processing to
reclaim memory associated with discarded symbols
- process bison compatible %destructor (set destructor = 1); support @$ in
%destructor code to reference the position value if %locations is defined
- changed 'short' to 'Value_t' (in some instances, 'Value_t' was already
used for variables/parameters that were related to variables/parameters
declared as 'short'
- if %locations present, support @N and @$ syntax as for bison to reference
the locations associated with the N-th rhs symbol and the lhs symbol,
respectively
symtab.c
- initialize fields added to 'struct bucket' for non-terminal symbol
inherited attributes
verbose.c
- for parse states with conflicts, the contents of the y.output file include
the trial shift and/or trial reduce actions
- added output to the end of the verbose report showing the correspondance
between grammar symbol #, internal parser symbol #, and grammar symbol name
- changed 'short' to 'Value_t' (in some instances, 'Value_t' was already
used for variables/parameters that were related to variables/parameters
declared as 'short'
yacc.1
- added options 'P', 'V', 'y' and '-o output_file' to the yacc command
synopsis (already covered in the description section)
- added options 'B', 'D' and 'L' to the yacc command synopsis; added text in
the description section
- added %locations description to the extensions section