326 lines
14 KiB
Plaintext
326 lines
14 KiB
Plaintext
Perl Compiler Kit, Version alpha4
|
|
|
|
Copyright (c) 1996, 1997, Malcolm Beattie
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
it under the terms of either:
|
|
|
|
a) the GNU General Public License as published by the Free
|
|
Software Foundation; either version 1, or (at your option) any
|
|
later version, or
|
|
|
|
b) the "Artistic License" which comes with this kit.
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either
|
|
the GNU General Public License or the Artistic License for more details.
|
|
|
|
You should have received a copy of the Artistic License with this kit,
|
|
in the file named "Artistic". If not, you can get one from the Perl
|
|
distribution. You should also have received a copy of the GNU General
|
|
Public License, in the file named "Copying". If not, you can get one
|
|
from the Perl distribution or else write to the Free Software Foundation,
|
|
Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
|
|
|
|
CHANGES
|
|
|
|
New since alpha3
|
|
Anonymous subs work properly with C and CC.
|
|
Heuristics for forcing compilation of apparently unused subs/methods.
|
|
Subs which use the AutoLoader module are forcibly loaded at compile-time.
|
|
Slightly faster compilation.
|
|
Handles slightly more complex code within a BEGIN { }.
|
|
Minor bug fixes.
|
|
|
|
New since alpha2
|
|
CC backend now supports ".." and s//e.
|
|
Xref backend generates cross-reference reports
|
|
Cleanups to fix benign but irritating "-w" warnings
|
|
Minor cxstack fix
|
|
New since alpha1
|
|
Working CC backend
|
|
Shared globs and pre-initialised hash support
|
|
Some XSUB support
|
|
Assorted bug fixes
|
|
|
|
INSTALLATION
|
|
|
|
(1) You need perl5.002 or later.
|
|
|
|
(2) If you want to compile and run programs with the C or CC backends
|
|
which undefine (or redefine) subroutines, then you need to apply a
|
|
one-line patch to perl itself. One or two of the programs in perl's
|
|
own test suite do this. The patch is in file op.patch. It prevents
|
|
perl from calling free() on OPs with the magic sequence number (U16)-1.
|
|
The compiler declares all OPs as static structures and uses that magic
|
|
sequence number.
|
|
|
|
(3) Type
|
|
perl Makefile.PL
|
|
to write a personalised Makefile for your system. If you want the
|
|
bytecode modules to support reading bytecode from strings (instead of
|
|
just from files) then add the option
|
|
-DINDIRECT_BGET_MACROS
|
|
into the middle of the definition of the CCCMD macro in the Makefile.
|
|
Your C compiler may need to be able to cope with Standard C for this.
|
|
I haven't tested this option yet with an old pre-Standard compiler.
|
|
|
|
(4) If your platform supports dynamic loading then just type
|
|
make
|
|
and you can then use
|
|
perl -Iblib/arch -MO=foo bar
|
|
to use the compiler modules (see later for details).
|
|
If you need/want instead to make a statically linked perl which
|
|
contains the appropriate modules, then type
|
|
make perl
|
|
make byteperl
|
|
and you can then use
|
|
./perl -MO=foo bar
|
|
to use the compiler modules.
|
|
In both cases, the byteperl executable is required for running standalone
|
|
bytecode programs. It is *not* a standard perl+XSUB perl executable.
|
|
|
|
USAGE
|
|
|
|
As of the alpha3 release, the Bytecode, C and CC backends are now all
|
|
functional enough to compile almost the whole of the main perl test
|
|
suite. In the case of the CC backend, any failures are all due to
|
|
differences and/or known bugs documented below. See the file TESTS.
|
|
In the following examples, you'll need to replace "perl" by
|
|
perl -Iblib/arch
|
|
if you have built the extensions for a dynamic loading platform but
|
|
haven't installed the extensions completely. You'll need to replace
|
|
"perl" by
|
|
./perl
|
|
if you have built the extensions into a statically linked perl binary.
|
|
|
|
(1) To compile perl program foo.pl with the C backend, do
|
|
perl -MO=C,-ofoo.c foo.pl
|
|
Then use the cc_harness perl program to compile the resulting C source:
|
|
perl cc_harness -O2 -o foo foo.c
|
|
|
|
If you are using a non-ANSI pre-Standard C compiler that can't handle
|
|
pre-declaring static arrays, then add -DBROKEN_STATIC_REDECL to the
|
|
options you use:
|
|
perl cc_harness -O2 -o foo -DBROKEN_STATIC_REDECL foo.c
|
|
If you are using a non-ANSI pre-Standard C compiler that can't handle
|
|
static initialisation of structures with union members then add
|
|
-DBROKEN_UNION_INIT to the options you use. If you want command line
|
|
arguments passed to your executable to be interpreted by perl (e.g. -Dx)
|
|
then compile foo.c with -DALLOW_PERL_OPTIONS. Otherwise, all command line
|
|
arguments passed to foo will appear directly in @ARGV. The resulting
|
|
executable foo is the compiled version of foo.pl. See the file NOTES for
|
|
extra options you can pass to -MO=C.
|
|
|
|
There are some constraints on the contents on foo.pl if you want to be
|
|
able to compile it successfully. Some problems can be fixed fairly easily
|
|
by altering foo.pl; some problems with the compiler are known to be
|
|
straightforward to solve and I'll do so soon. The file Todo lists a
|
|
number of known problems. See the XSUB section lower down for information
|
|
about compiling programs which use XSUBs.
|
|
|
|
(2) To compile foo.pl with the CC backend (which generates actual
|
|
optimised C code for the execution path of your perl program), use
|
|
perl -MO=CC,-ofoo.c foo.pl
|
|
|
|
and proceed just as with the C backend. You should almost certainly
|
|
use an option such as -O2 with the subsequent cc_harness invocation
|
|
so that your C compiler uses optimisation. The C code generated by
|
|
the Perl compiler's CC backend looks ugly to humans but is easily
|
|
optimised by C compilers.
|
|
|
|
To make the most of this compiler backend, you need to tell the
|
|
compiler when you're using int or double variables so that it can
|
|
optimise appropriately (although this part of the compiler is the most
|
|
buggy). You currently do that by naming lexical variables ending in
|
|
"_i" for ints, "_d" for doubles, "_ir" for int "register" variables or
|
|
"_dr" for double "register" variables. Here "register" is a promise
|
|
that you won't pass a reference to the variable into a sub which then
|
|
modifies the variable. The compiler ought to catch attempts to use
|
|
"\$i" just as C compilers catch attempts to do "&i" for a register int
|
|
i but it doesn't at the moment. Bugs in the CC backend may make your
|
|
program fail in mysterious ways and give wrong answers rather than just
|
|
crash in boring ways. But, hey, this is an alpha release so you knew
|
|
that anyway. See the XSUB section lower down for information about
|
|
compiling programs which use XSUBs.
|
|
|
|
If your program uses classes which define methods (or other subs which
|
|
are not exported and not apparently used until runtime) then you'll
|
|
need to use -u compile-time options (see the NOTES file) to force the
|
|
subs to be compiled. Future releases will probably default the other
|
|
way, do more auto-detection and provide more fine-grained control.
|
|
|
|
Since compiled executables need linking with libperl, you may want
|
|
to turn libperl.a into a shared library if your platform supports
|
|
it. For example, with Digital UNIX, do something like
|
|
ld -shared -o libperl.so -all libperl.a -none -lc
|
|
and with Linux/ELF, rebuild the perl .c files with -fPIC (and I
|
|
also suggest -fomit-frame-pointer for Linux on Intel architetcures),
|
|
do "make libperl.a" and then do
|
|
gcc -shared -Wl,-soname,libperl.so.5 -o libperl.so.5.3 `ar t libperl.a`
|
|
and then
|
|
# cp libperl.so.5.3 /usr/lib
|
|
# cd /usr/lib
|
|
# ln -s libperl.so.5.3 libperl.so.5
|
|
# ln -s libperl.so.5 libperl.so
|
|
# ldconfig
|
|
When you compile perl executables with cc_harness, append -L/usr/lib
|
|
otherwise the -L for the perl source directory will override it. For
|
|
example,
|
|
perl -Iblib/arch -MO=CC,-O2,-ofoo3.c foo3.bench
|
|
perl cc_harness -o foo3 -O2 foo3.c -L/usr/lib
|
|
ls -l foo3
|
|
-rwxr-xr-x 1 mbeattie xzdg 11218 Jul 1 15:28 foo3
|
|
You'll probably also want to link your main perl executable against
|
|
libperl.so; it's nice having an 11K perl executable.
|
|
|
|
(3) To compile foo.pl into bytecode do
|
|
perl -MO=Bytecode,-ofoo foo.pl
|
|
To run the resulting bytecode file foo as a standalone program, you
|
|
use the program byteperl which should have been built along with the
|
|
extensions.
|
|
./byteperl foo
|
|
Any extra arguments are passed in as @ARGV; they are not interpreted
|
|
as perl options. If you want to load chunks of bytecode into an already
|
|
running perl program then use the -m option and investigate the
|
|
byteload_fh and byteload_string functions exported by the B module.
|
|
See the NOTES file for details of these and other options (including
|
|
optimisation options and ways of getting at the intermediate "assembler"
|
|
code that the Bytecode backend uses).
|
|
|
|
(3) There are little Bourne shell scripts and perl programs to aid with
|
|
some common operations: assemble, disassemble, run_bytecode_test,
|
|
run_test, cc_harness, test_harness, test_harness_bytecode.
|
|
|
|
(4) Walk the op tree in execution order printing terse info about each op
|
|
perl -MO=Terse,exec foo.pl
|
|
|
|
(5) Walk the op tree in syntax order printing lengthier debug info about
|
|
each op. You can also append ",exec" to walk in execution order, but the
|
|
formatting is designed to look nice with Terse rather than Debug.
|
|
perl -MO=Debug foo.pl
|
|
|
|
(6) Produce a cross-reference report of the line numbers at which all
|
|
variables, subs and formats are defined and used.
|
|
perl -MO=Xref foo.pl
|
|
|
|
XSUBS
|
|
|
|
The C and CC backends can successfully compile some perl programs which
|
|
make use of XSUB extensions. [I'll add more detail to this section in a
|
|
later release.] As a prerequisite, such extensions must not need to do
|
|
anything in their BOOT: section which needs to be done at runtime rather
|
|
than compile time. Normally, the only code in the boot_Foo() function is
|
|
a list of newXS() calls which xsubpp puts there and the compiler handles
|
|
saving those XS subs itself. For each XSUB used, the C and CC compiler
|
|
will generate an initialiser in their C output which refers to the name
|
|
of the relevant C function (XS_Foo_somesub). What is not yet automated
|
|
is the necessary commands and cc command-line options (e.g. via
|
|
"perl cc_harness") which link against the extension libraries. For now,
|
|
you need the XSUB extension to have installed files in the right format
|
|
for using as C libraries (e.g. Foo.a or Foo.so). As the Foo.so files (or
|
|
your platform's version) aren't suitable for linking against, you will
|
|
have to reget the extension source and rebuild it as a static extension
|
|
to force the generation of a suitable Foo.a file. Then you need to make
|
|
a symlink (or copy or rename) of that file into a libFoo.a suitable for
|
|
cc linking. Then add the appropriate -L and -l options to your
|
|
"perl cc_harness" command line to find and link against those libraries.
|
|
You may also need to fix up some platform-dependent environment variable
|
|
to ensure that linked-against .so files are found at runtime too.
|
|
|
|
DIFFERENCES
|
|
|
|
The result of running a compiled Perl program can sometimes be different
|
|
from running the same program with standard perl. Think of the compiler
|
|
as having a slightly different implementation of the language Perl.
|
|
Unfortunately, since Perl has had a single implementation until now,
|
|
there are no formal standards or documents defining what behaviour is
|
|
guaranteed of Perl the language and what just "happens to work".
|
|
Some of the differences below are almost impossible to change because of
|
|
the way the compiler works. Others can be changed to produce "standard"
|
|
perl behaviour if it's deemed proper and the resulting performance hit
|
|
is accepted. I'll use "standard perl" to mean the result of running a
|
|
Perl program using the perl executable from the perl distribution.
|
|
I'll use "compiled Perl program" to mean running an executable produced
|
|
by this compiler kit ("the compiler") with the CC backend.
|
|
|
|
Loops
|
|
Standard perl calculates the target of "next", "last", and "redo"
|
|
at run-time. The compiler calculates the targets at compile-time.
|
|
For example, the program
|
|
|
|
sub skip_on_odd { next NUMBER if $_[0] % 2 }
|
|
NUMBER: for ($i = 0; $i < 5; $i++) {
|
|
skip_on_odd($i);
|
|
print $i;
|
|
}
|
|
|
|
produces the output
|
|
024
|
|
with standard perl but gives a compile-time error with the compiler.
|
|
|
|
Context of ".."
|
|
The context (scalar or array) of the ".." operator determines whether
|
|
it behaves as a range or a flip/flop. Standard perl delays until
|
|
runtime the decision of which context it is in but the compiler needs
|
|
to know the context at compile-time. For example,
|
|
@a = (4,6,1,0,0,1);
|
|
sub range { (shift @a)..(shift @a) }
|
|
print range();
|
|
while (@a) { print scalar(range()) }
|
|
generates the output
|
|
456123E0
|
|
with standard Perl but gives a compile-time error with compiled Perl.
|
|
|
|
Arithmetic
|
|
Compiled Perl programs use native C arithemtic much more frequently
|
|
than standard perl. Operations on large numbers or on boundary
|
|
cases may produce different behaviour.
|
|
|
|
Deprecated features
|
|
Features of standard perl such as $[ which have been deprecated
|
|
in standard perl since version 5 was released have not been
|
|
implemented in the compiler.
|
|
|
|
Others
|
|
I'll add to this list as I remember what they are.
|
|
|
|
BUGS
|
|
|
|
Here are some things which may cause the compiler problems.
|
|
|
|
The following render the compiler useless (without serious hacking):
|
|
* Use of the DATA filehandle (via __END__ or __DATA__ tokens)
|
|
* Operator overloading with %OVERLOAD
|
|
* The (deprecated) magic array-offset variable $[ does not work
|
|
* The following operators are not yet implemented for CC
|
|
goto
|
|
sort with a non-default comparison (i.e. a named sub or inline block)
|
|
* You can't use "last" to exit from a non-loop block.
|
|
|
|
The following may give significant problems:
|
|
* BEGIN blocks containing complex initialisation code
|
|
* Code which is only ever referred to at runtime (e.g. via eval "..." or
|
|
via method calls): see the -u option for the C and CC backends.
|
|
* Run-time lookups of lexical variables in "outside" closures
|
|
|
|
The following may cause problems (not thoroughly tested):
|
|
* Dependencies on whether values of some "magic" Perl variables are
|
|
determined at compile-time or runtime.
|
|
* For the C and CC backends: compile-time strings which are longer than
|
|
your C compiler can cope with in a single line or definition.
|
|
* Reliance on intimate details of global destruction
|
|
* For the Bytecode backend: high -On optimisation numbers with code
|
|
that has complex flow of control.
|
|
* Any "-w" option in the first line of your perl program is seen and
|
|
acted on by perl itself before the compiler starts. The compiler
|
|
itself then runs with warnings turned on. This may cause perl to
|
|
print out warnings about the compiler itself since I haven't tested
|
|
it thoroughly with warnings turned on.
|
|
|
|
There is a terser but more complete list in the Todo file.
|
|
|
|
Malcolm Beattie
|
|
2 September 1996
|