<HTML>
<HEAD><TITLE>APR Design Document</TITLE></HEAD>
<BODY>
<h1>Design of APR</h1>

<p>The Apache Portable Run-time libraries have been designed to provide a common
interface to low-level routines across any platform. The original goal of APR
was to combine all code in Apache to one common code base. This turned out not
to be the correct approach, so the goal of APR has changed. There are places
where common code is not a good thing. For example, how to map requests
to either threads or processes should be platform specific. APR's place
is now to combine any code that can be safely combined without sacrificing
performance.</p>

<p>To this end we have created a set of operations that are required for
cross-platform development. There may be other types that are desired, and those
will be implemented in the future.</p>

<p>This document discusses the structure of APR and how best to contribute
code to the effort.</p>
<h2>APR On Windows and Netware</h2>

<p>APR on Windows and Netware is different from APR on all other systems,
because those platforms don't use autoconf. On Unix, apr_private.h (private to
APR) and apr.h (public, used by applications that use APR) are generated by
autoconf from acconfig.h and apr.h.in respectively. On Windows (and Netware),
apr_private.h and apr.h are created from apr_private.hw (apr_private.hwn)
and apr.hw (apr.hwn) respectively.</p>

<p> <strong>
    If you add code to acconfig.h or tests to configure.in or aclocal.m4,
    please give some thought to whether or not Windows and Netware need
    these additions as well. A general rule of thumb is that if it is
    a feature macro, such as APR_HAS_THREADS, Windows and Netware need it.
    In other words, if the definition is going to be used in a public APR
    header file, such as apr_general.h, Windows needs it.

    The only time it is safe to add a macro or test without also adding
    the macro to apr*.h[n]w is if the macro tells APR how to build. For
    example, a test for a header file does not need to be added to Windows.
</strong></p>
<h2>APR Features</h2>

<p>One of the goals of APR is to provide a common set of features across all
platforms. This is an admirable goal, but it is not realistic. We cannot
expect to be able to implement ALL features on ALL platforms. So we are
going to do the next best thing: provide a common interface to ALL APR
features on MOST platforms.</p>

<p>APR developers should create FEATURE MACROS for any feature that is not
available on ALL platforms. This should be a simple definition which has
the form:</p>

<code>APR_HAS_FEATURE</code>

<p>This macro should evaluate to true if APR has this feature on this platform.
For example, Linux and Windows have mmap'ed files, and APR is providing an
interface for mmap'ing a file. On both Linux and Windows, APR_HAS_MMAP
should evaluate to one, and the apr_mmap_* functions should map files into
memory and return the appropriate status codes.</p>

<p>If your OS of choice does not have mmap'ed files, APR_HAS_MMAP should
evaluate to zero, and none of the apr_mmap_* functions should be defined. The
second step is a precaution that will allow us to break at compile time if a
programmer tries to use unsupported functions.</p>
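<p>As a rough sketch of how such a feature macro might be consumed, the
following C fragment guards both the declaration and the call site.
<code>DEMO_HAS_MMAP</code> and the two <code>demo_*</code> functions are
hypothetical stand-ins for illustration, not APR names.</p>

```c
/* Hypothetical stand-in for a generated feature macro such as APR_HAS_MMAP.
 * Define it as 0 before this point to compile the fallback path instead. */
#ifndef DEMO_HAS_MMAP
#define DEMO_HAS_MMAP 1
#endif

#if DEMO_HAS_MMAP
/* Only defined when the feature exists, so code that calls it on an
 * unsupported platform fails at compile time rather than at run time. */
static const char *demo_mmap_backend(void)
{
    return "mmap";
}
#endif

/* Application code picks a strategy by testing the feature macro. */
static const char *demo_read_strategy(void)
{
#if DEMO_HAS_MMAP
    return demo_mmap_backend();
#else
    return "read";
#endif
}
```

<p>Because the macro always evaluates to zero or one, callers can test it with
a plain <code>#if</code> and do not need <code>#ifdef</code>.</p>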
<h2>APR types</h2>

<p>The base types in APR</p>

<ul>
<li>dso<br>
    Shared library routines
<li>mmap<br>
    Memory-mapped files
<li>poll<br>
    Polling I/O
<li>time<br>
    Time
<li>user<br>
    Users and groups
<li>locks<br>
    Process and thread locks (critical sections)
<li>shmem<br>
    Shared memory
<li>file_io<br>
    File I/O, including pipes
<li>atomic<br>
    Atomic integer operations
<li>strings<br>
    String handling routines
<li>memory<br>
    Pool-based memory allocation
<li>passwd<br>
    Reading passwords from the terminal
<li>tables<br>
    Tables and hashes
<li>network_io<br>
    Network I/O
<li>threadproc<br>
    Threads and processes
<li>misc<br>
    Any APR type which doesn't have any other place to belong. This
    should be used sparingly.
<li>support<br>
    Functions meant to be used across multiple APR types. This area
    is for internal functions only. If a function is exposed, it should
    not be put here.
</ul>
<h2>Directory Structure</h2>

<p>Each type has a base directory. Inside this base directory are
subdirectories, which contain the actual code. These subdirectories are named
after the platforms they are compiled on. Unix is also used as a common
directory. If the code you are writing is POSIX based, you should look at the
code in the unix directory. A good rule of thumb is that if more than half
your code needs to be ifdef'ed out, and the structures required for your code
are substantively different from the POSIX code, you should create a new
directory.</p>

<p>Currently, the APR code is written for Unix, BeOS, Windows, and OS/2. An
example of the directory structure is the file I/O directory:</p>

<pre>
apr
  |
   ->  file_io
          |
           ->  unix            The Unix and common base code
          |
           ->  win32           The Windows code
          |
           ->  os2             The OS/2 code
</pre>

<p>Note that BeOS does not have a directory. This is because BeOS is currently
using the Unix directory for its file_io.</p>

<p>There are a few special top level directories. These are test and include.
Test is a directory which stores all test programs. It is expected
that if a new type is developed, there will also be a new test program, to
help people port this new type to different platforms. A small document
describing how to create new tests that integrate with the test suite can be
found in the test/ directory. Include is a directory which stores all
required APR header files for external use.</p>
<h2>Creating an APR Type</h2>

<p>The current design of APR requires that most APR types be incomplete.
It is not possible to write flexible portable code if programs can access
the internals of APR types. This is because different platforms are
likely to define different native types. There are only two exceptions to
this rule:</p>

<ul>
<li>The first exception to this rule is if the type can only reasonably be
    implemented one way. For example, time is a complete type because there
    is only one reasonable time implementation.

<li>The second exception to the incomplete type rule can be found in
    apr_portable.h. This file defines the native types for each platform.
    Using these types, it is possible to extract native types for any APR type.
</ul>
<p>For this reason, each platform defines a structure in its own directory.
Those structures are then typedef'ed in an external header file. For example,
in file_io/unix/fileio.h:</p>

<pre>
    struct apr_file_t {
        apr_pool_t *cntxt;
        int filedes;
        FILE *filehand;
        ...
    };
</pre>

<p>In include/apr_file_io.h:</p>
<pre>
    typedef struct apr_file_t apr_file_t;
</pre>

<p>This will cause a compiler error if somebody tries to access the filedes
field in this structure. Windows does not have a filedes field, so obviously,
it is important that programs not be able to access these.</p>
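<p>The incomplete-type pattern can be sketched in a single C file as follows.
The names <code>demo_file_t</code>, <code>demo_file_wrap</code>,
<code>demo_file_is_open</code> and <code>demo_file_destroy</code> are
hypothetical, and the split between a public header and a per-platform source
file is collapsed into one listing for brevity. In a real layout, callers would
only ever see the part marked as public.</p>

```c
#include <stdlib.h>

/* --- public part (would live in a hypothetical include/demo_file.h) ---
 * Callers only see this incomplete type and the functions below. */
typedef struct demo_file_t demo_file_t;

demo_file_t *demo_file_wrap(int native_fd);
int demo_file_is_open(const demo_file_t *file);
void demo_file_destroy(demo_file_t *file);

/* --- private part (would live in a per-platform source file) ---
 * Another platform is free to define completely different fields here. */
struct demo_file_t {
    int filedes;   /* native descriptor, invisible to callers */
};

demo_file_t *demo_file_wrap(int native_fd)
{
    demo_file_t *file = malloc(sizeof(*file));
    if (file != NULL) {
        file->filedes = native_fd;
    }
    return file;
}

int demo_file_is_open(const demo_file_t *file)
{
    return file != NULL && file->filedes >= 0;
}

void demo_file_destroy(demo_file_t *file)
{
    free(file);
}
```

<p>Code that includes only the public part can hold and pass a
<code>demo_file_t *</code>, but any attempt to dereference it is rejected by
the compiler because the structure layout is unknown there.</p>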
<p>You may notice the apr_pool_t field. Most APR types have this field. This
type is used to allocate memory within APR. Because these types carry a pool,
any APR function that operates on them can allocate memory if it needs to. This
is very important and it is one of the reasons that APR works. If you create a
new type, you must add a pool to it. If you do not, then all functions that
operate on that type will need a pool argument.</p>

<h2>New Function</h2>

<p>When creating a new function, please try to adhere to these rules.</p>

<ul>
<li>  Result arguments should be the first arguments.
<li>  If a function needs a pool, it should be the last argument.
<li>  These rules are flexible, especially if it makes the code easier
      to understand because it mimics a standard function.
</ul>
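<p>A minimal sketch of a function that follows these argument rules is shown
below. Everything in it is an assumption made for illustration:
<code>demo_pool_t</code> is a toy fixed-size allocator standing in for a real
pool, and <code>demo_status_t</code>, <code>demo_palloc</code> and
<code>demo_strdup</code> are hypothetical names, not APR functions.</p>

```c
#include <stddef.h>
#include <string.h>

/* Toy bump allocator standing in for a real pool. It does no alignment
 * handling, which is acceptable for the char data used in this sketch. */
typedef struct demo_pool_t {
    char storage[256];
    size_t used;
} demo_pool_t;

typedef int demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_ENOMEM  1

/* Hand out the next free bytes from the pool, or NULL when it is full. */
static void *demo_palloc(demo_pool_t *pool, size_t size)
{
    void *mem;
    if (size > sizeof(pool->storage) - pool->used) {
        return NULL;
    }
    mem = pool->storage + pool->used;
    pool->used += size;
    return mem;
}

/* Result argument first, inputs in the middle, pool last. */
static demo_status_t demo_strdup(char **result, const char *src,
                                 demo_pool_t *pool)
{
    size_t len = strlen(src) + 1;
    char *copy = demo_palloc(pool, len);
    if (copy == NULL) {
        return DEMO_ENOMEM;
    }
    memcpy(copy, src, len);
    *result = copy;
    return DEMO_SUCCESS;
}
```

<p>Returning the status code and passing the result through the first argument
keeps the return value free for error reporting, which is the reason for the
ordering.</p>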
<h2>Documentation</h2>

<p>Whenever a new function is added to APR, it MUST be documented. New
functions will not be committed unless there are docs to go along with them.
The documentation should be a comment block above the function in the header
file.</p>

<p>The format for the comment block is:</p>

<pre>
    /**
     * Brief description of the function
     * @param param_1_name explanation
     * @param param_2_name explanation
     * @param param_n_name explanation
     * @tip Any extra information people should know.
     * @deffunc function prototype if required
     */
</pre>

<p>For an actual example, look at any file in the include directory. The
reason the docs are in the header files is to ensure that the docs always
reflect the current code. If you change parameters or return values for a
function, please be sure to update the documentation.</p>
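<p>To make the comment format concrete, here is one way a filled-in block might
look. The function <code>demo_count_char</code> is hypothetical and exists only
to give the tags something to describe. The definition is included so the
listing compiles on its own, although in practice only the commented prototype
would sit in the header.</p>

```c
#include <stddef.h>

/**
 * Count the occurrences of a character in a string.
 * @param count Result argument, set to the number of matches
 * @param str The NUL-terminated string to scan
 * @param ch The character to look for
 * @tip Returns nonzero and leaves count untouched if str is NULL.
 * @deffunc int demo_count_char(size_t *count, const char *str, char ch)
 */
int demo_count_char(size_t *count, const char *str, char ch);

int demo_count_char(size_t *count, const char *str, char ch)
{
    size_t n = 0;
    if (str == NULL) {
        return 1;
    }
    for (; *str != '\0'; ++str) {
        if (*str == ch) {
            ++n;
        }
    }
    *count = n;
    return 0;
}
```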
<h2>APR Error reporting</h2>

<p>Most APR functions should return an apr_status_t type. The only time an
APR function does not return an apr_status_t is if it absolutely CAN NOT
fail. An example of this would be filling out an array when you know you are
not beyond the array's range. If it cannot fail on your platform, but it
could conceivably fail on another platform, it should return an apr_status_t.
Unless you are sure, return an apr_status_t.</p>

<strong>
     This includes functions that return TRUE/FALSE values. How that
     is handled is discussed below.
</strong>

<p>All platforms return errno values unchanged. Each platform can also have
one system error type, which can be returned after an offset is added.
There are five types of error values in APR, each with its own offset.</p>
<!-- This should be turned into a table, but I am lazy today -->
<pre>
    Name                     Purpose
0)                           This is 0 for all platforms and isn't really defined
                             anywhere, but it is the offset for errno values.
                             (This has no name because it isn't actually defined,
                             but for completeness we are discussing it here).

1) APR_OS_START_ERROR        This is platform dependent, and is the offset at which
                             APR errors start to be defined. Error values are
                             defined as anything which caused the APR function to
                             fail. APR errors in this range should be named
                             APR_E* (e.g. APR_ENOSOCKET)

2) APR_OS_START_STATUS       This is platform dependent, and is the offset at which
                             APR status values start. Status values do not indicate
                             success or failure, and should be returned if
                             APR_SUCCESS does not make sense. APR status codes in
                             this range should be named APR_* (e.g. APR_DETACH)

3) APR_OS_START_USEERR       This is platform dependent, and is the offset at which
                             APR apps can begin to add their own error codes.

4) APR_OS_START_SYSERR       This is platform dependent, and is the offset at which
                             system error values begin.
</pre>

<strong>The difference in naming between APR_OS_START_ERROR and
APR_OS_START_STATUS mentioned above allows programmers to easily determine if
the error code indicates an error condition or a status condition.</strong>
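<p>The range layout above can be sketched as a small classifier. The numeric
offsets below are assumptions chosen only so the example compiles. The real
values are platform dependent and come from apr_errno.h. The
<code>demo_*</code> and <code>DEMO_*</code> names are hypothetical.</p>

```c
/* Illustrative offsets only; real values are platform dependent. */
#define DEMO_OS_START_ERROR   20000
#define DEMO_OS_START_STATUS  70000
#define DEMO_OS_START_USEERR  120000
#define DEMO_OS_START_SYSERR  720000

typedef enum {
    DEMO_KIND_SUCCESS,     /* zero */
    DEMO_KIND_ERRNO,       /* offset 0: errno values, returned unchanged */
    DEMO_KIND_APR_ERROR,   /* library errors */
    DEMO_KIND_APR_STATUS,  /* library status values */
    DEMO_KIND_USER,        /* reserved for applications */
    DEMO_KIND_SYSTEM       /* native system errors plus an offset */
} demo_kind_t;

/* Decide which range a non-negative status value falls into. */
static demo_kind_t demo_classify(int status)
{
    if (status == 0) {
        return DEMO_KIND_SUCCESS;
    }
    if (status < DEMO_OS_START_ERROR) {
        return DEMO_KIND_ERRNO;
    }
    if (status < DEMO_OS_START_STATUS) {
        return DEMO_KIND_APR_ERROR;
    }
    if (status < DEMO_OS_START_USEERR) {
        return DEMO_KIND_APR_STATUS;
    }
    if (status < DEMO_OS_START_SYSERR) {
        return DEMO_KIND_USER;
    }
    return DEMO_KIND_SYSTEM;
}
```

<p>Because each kind of value owns a disjoint range, a single integer can carry
any of them without a separate tag.</p>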
<p>If your function has multiple return codes that all indicate success, but
with different results, or if your function can only return PASS/FAIL, you
should still return an apr_status_t. In the first case, define one
APR status code for each return value. An example of this is
<code>apr_proc_wait</code>, which can only return APR_CHILDDONE,
APR_CHILDNOTDONE, or an error code. In the second case, please return
APR_SUCCESS for PASS, and define a new APR status code for failure. An
example of this is <code>apr_compare_users</code>, which can only return
APR_SUCCESS, APR_EMISMATCH, or an error code.</p>
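<p>A PASS/FAIL function written in this style might look like the sketch below.
The function <code>demo_compare_ids</code> and the numeric values of the
<code>DEMO_*</code> codes are hypothetical. The point is only that "no match"
and "bad input" are reported through distinct codes in the same return
type.</p>

```c
#include <stddef.h>

typedef int demo_status_t;

#define DEMO_SUCCESS     0
#define DEMO_EBADARG     20001  /* hypothetical error code: call failed */
#define DEMO_EMISMATCH   70001  /* hypothetical code: compared, not equal */

/* Compare two ids. PASS is DEMO_SUCCESS, FAIL is DEMO_EMISMATCH, and a
 * genuine error (missing input) gets its own code. */
static demo_status_t demo_compare_ids(const int *left, const int *right)
{
    if (left == NULL || right == NULL) {
        return DEMO_EBADARG;
    }
    return (*left == *right) ? DEMO_SUCCESS : DEMO_EMISMATCH;
}
```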
<p>All of these definitions can be found in apr_errno.h for all platforms. When
an error occurs in an APR function, the function must return an error code.
If the error occurred in a system call and that system call uses errno to
report an error, then the code is returned unchanged. For example: </p>

<pre>
    if (open(fname, oflags, 0777) < 0)
        return errno;
</pre>

<p>The next place an error can occur is a system call that uses some error value
other than the primary error value on a platform. This can also be handled
by APR applications. For example:</p>

<pre>
    if (CreateFile(fname, oflags, sharemod, NULL,
                   createflags, attributes, 0) == INVALID_HANDLE_VALUE)
        return (GetLastError() + APR_OS_START_SYSERR);
</pre>

<p>These two examples implement the same function for two different platforms.
Even if the underlying problem is the same on both platforms, this
will result in two different error codes being returned. This is OKAY, and
is correct for APR. APR relies on the fact that most of the time an error
occurs, the program logs the error and continues; it does not try to
programmatically solve the problem. This does not mean we have not provided
support for programmatically solving the problem; it just isn't the default
case. We'll get to how this problem is solved in a little while.</p>
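<p>The offset arithmetic used for native system errors can be sketched
portably as a pair of helpers. The offset value and the
<code>demo_from_os_error</code> and <code>demo_to_os_error</code> names are
assumptions for illustration. They are not taken from apr_errno.h.</p>

```c
#define DEMO_OS_START_SYSERR 720000  /* hypothetical offset */

typedef int demo_status_t;

/* Shift a native (non-errno) system error into its own range.
 * Zero stays zero so that success is preserved. */
static demo_status_t demo_from_os_error(int native)
{
    return native == 0 ? 0 : native + DEMO_OS_START_SYSERR;
}

/* Recover the original native code, for example to hand it to the
 * platform's own message lookup. */
static int demo_to_os_error(demo_status_t status)
{
    return status == 0 ? 0 : status - DEMO_OS_START_SYSERR;
}
```

<p>Adding and subtracting a constant is reversible, so the native code is never
lost on its way through the portable return type.</p>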
<p>If the error occurs in an APR function but is not due to a system call,
and is instead an APR error or just a status code from APR, then the
appropriate code should be returned. These codes are defined in apr_errno.h
and should be self explanatory.</p>

<p>No APR code should ever return a code between APR_OS_START_USEERR and
APR_OS_START_SYSERR; those codes are reserved for APR applications.</p>

<p>To programmatically correct an error in a running application, the error
codes need to be consistent across platforms. APR
has provided macros to test for status code equivalency. For example, to
determine if the code that you received from the APR function means EOF, you
would use the macro APR_STATUS_IS_EOF().</p>
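<p>One plausible shape for such a test macro is sketched below. It accepts
either the errno form or an offset native form of the same condition.
<code>DEMO_STATUS_IS_ENOENT</code>, the offset, and the native code value are
all assumptions made for this example and are not copied from APR.</p>

```c
#include <errno.h>

#define DEMO_OS_START_SYSERR        720000  /* hypothetical offset */
#define DEMO_NATIVE_FILE_NOT_FOUND  2       /* hypothetical native code */

/* True when the status means "no such file", whichever platform-specific
 * form it arrived in. The argument is evaluated more than once, so pass
 * a plain variable or constant. */
#define DEMO_STATUS_IS_ENOENT(s) \
    ((s) == ENOENT || \
     (s) == DEMO_OS_START_SYSERR + DEMO_NATIVE_FILE_NOT_FOUND)
```

<p>The caller writes one test and never needs to know which platform produced
the value.</p>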
<p>Why did APR take this approach? There are two ways to deal with error
codes portably.</p>

<ol type=1>
<li>  Return the same error code across all platforms.
<li>  Return platform specific error codes and convert them when necessary.
</ol>

<p>The first problem with option 1 is that it takes time to convert error
codes to a common code, and most of the time programs want to just output
an error string. If we convert all errors to a common subset, we have four
steps to output an error string:</p>

<pre>
    make syscall that fails
        convert to common error code                 step 1
        return common error code
            check for success
            call error output function               step 2
                convert back to system error         step 3
                output error string                  step 4
</pre>

<p>By keeping the errors platform specific, we can output error strings in two
steps.</p>

<pre>
    make syscall that fails
        return error code
            check for success
            call error output function               step 1
                output error string                  step 2
</pre>

<p>The second problem with option 1 is that it is a lossy conversion. For
example, Windows and OS/2 have a couple hundred error codes, but POSIX errno
only defines about 50 errno values. This means that if we convert to a
canonical error value immediately, there is no way for the programmer to
get the actual system error.</p>
<p>Less often, programs change their execution based on what error was returned.
This is no more expensive using option 2 than it is using option 1, but we
put the onus of converting the error code on the programmers themselves.
For example, using option 1:</p>

<pre>
    make syscall that fails
        convert to common error code
        return common error code
            decide execution based on common error code
</pre>

<p>Using option 2:</p>

<pre>
    make syscall that fails
        return error code
            convert to common error code (using ap_canonical_error)
            decide execution based on common error code
</pre>

<p>Finally, there is one more operation on error codes. You can get a string
that explains in human readable form what has happened. To do this using
APR, call apr_strerror().</p>
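<p>As a rough illustration of the string lookup described above, the sketch
below dispatches on the range a code falls in and fills a caller-supplied
buffer. The function <code>demo_strerror</code>, the offset, and the
<code>DEMO_ENOSOCKET</code> value are hypothetical. The buffer-based signature
is an assumption about how such a function is typically shaped, not a statement
of the library's API.</p>

```c
#include <stdio.h>
#include <string.h>

#define DEMO_OS_START_ERROR  20000                      /* hypothetical */
#define DEMO_ENOSOCKET       (DEMO_OS_START_ERROR + 1)  /* hypothetical */

/* Write a human readable message for status into buf and return buf.
 * The message is truncated if it does not fit. */
static char *demo_strerror(int status, char *buf, size_t bufsize)
{
    const char *msg;

    if (bufsize == 0) {
        return buf;
    }
    if (status == 0) {
        msg = "Success";
    } else if (status < DEMO_OS_START_ERROR) {
        msg = strerror(status);   /* errno range: ask the C library */
    } else if (status == DEMO_ENOSOCKET) {
        msg = "No socket was provided";
    } else {
        msg = "Unrecognized status code";
    }
    snprintf(buf, bufsize, "%s", msg);
    return buf;
}
```

<p>No conversion to a common code happens on this path. The value is looked up
in whatever form the failing call returned it.</p>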