was not fun and I am not entirely certain of the correctness, but it seems
to work. (in fact, side by side testing of this code vs the x86 version
turned up hidden bugs in the x86 code).
testing and real-life applications:
1) If you returned from the thread function, you got a segv instead of
calling _exit() with your return code.
2) clean up some bogus stack management. There was also an underflow
on function return.
3) when making syscalls, the kernel is expecting to have to leave space
for the function's return address. We need to duplicate this. It was
an accident that the rfork syscall actually worked here. :-/
the number of times I have given this to people and got asked: why isn't
it in libc? It is impossible to do this without assembler glue to reset
the stack for the new child process.
int rfork_thread(flags, stack_addr, start_fnc, start_arg)
int flags; Flags to rfork system call. See rfork(2).
void *stack_addr; Top of stack for thread.
int (*start_fnc)(void *); Address of thread function to call in child.
void *start_arg; Argument to pass to the thread function in child.
This is deliberately not documented or prototyped in includes until the
corresponding alpha version is written.
a bug in some ftp servers (most notably ftp.vmunix.com) which report the
size of a file correctly in ascii mode, but report it as 0 in binary mode.
Reported by: asmodai
Also remove an unneeded initialization.
Sort out the size / length confusion. Always try to report the *real* file
size in the url_stat structure, no matter how much of it is actually being
sent, and try to detect inconsistencies between sizes.
Rearrange the request loop to avoid having to add meaningless code just to
silence compiler warnings.
Switch to a more sensible and consistent interface for the _http_parse*()
functions.
32-bit type (rather than define his own type based on the type of box
being compiled on).
Submitted by: Mark Abene <phiber@radicalmedia.com>
(however I applied a slightly different fix)
strdup()) rather than pointing it at something that's free()d
(via freeaddrinfo(res)) before the function returns.
I appreciate that this is an API change, but it's the only way
(AFAIK) of doing this without breaking existing code that uses
rcmd{,_af}().
Pointed out by: phkmalloc
than requested. Instead, inform the caller of the real offset by modifying
the offset field in the original struct url, and let him decide how to handle
the situation.
pthread_cond_signal(), pthread_cond_broadcast(), and pthread_cond_timedwait().
Do not dump core in pthread_cond_timedwait() (due to a NULL pointer
dereference) if attempting to wait on an uninitialized condition variable.
PR: bin/18099
fetchStat*(). In most cases, either fetchGet*() or fetchXGet*() is a wrapper
around the other; in all cases, calling fetchGet*() is identical to calling
fetchXGet*() with the second argument set to NULL.
outside the loop inspects it to determine whether or not we succeeded in
retrieving the requested document. This fixes a bug where fetchGetHTTP()
would return a FILE with an invalid file descriptor if it hit the redirect
limit without locating the requested document.
or not interrupted system calls will be restarted. This fixes a bug where
fetch(1) would hang (potentially forever) if a server stopped responding,
because the signal handler would absorb the user's efforts to interrupt the
transfer.
via IPv6, the hostname is trimed due to the length of IPv6 address.
This change saves it as possible.
I have a grudge against the shortage of UT_HOSTSIZE.
to be applied to the value given. This does not break installed
/etc/login.conf files, since un-suffixed numbers are interpreted as
they were before.
PR: 19750
Submitted by: Paul Herman <pherman@frenchfries.net>
moved around, but the acutal functional changes are small.
Add support for site-internal redirects (where the Location: header gives a
path instead of an absolute URI)
Pointed out by: kuriyama
with fdisk, ensure that they are a multiple of the sector size in length.
- Axe all the 1024 cylinder checks as they are no longer relevant with the
fixed bootstrap.
more robust, and somewhat more efficient. It also handles authorization and
redirects properly, and supports timeouts like the FTP code.
Many thanks to Umemoto-san for his assistance with IPv6 support, both here
and in other parts of libfetch.
management involving rcmd_af(), getaddrinfo(), freeaddrinfo(), etc.
We set *ahost to point to ai->canonname; and later free the ai-> stuff
and still leave the old pointers in *ahost to the freed data.
Perhaps the best way to deal with this is a static buffer or a static
strdup() that is freed on the next iteration or something. This gives
me headaches just thinking about this.
The new 'AJ' default for malloc() tripped this up.
of the processing of the recursion, "scan" would be pointing to O_CH
(or O_QUEST), which would then be interpreted as being the end character
for altoffset().
We avoid this by properly increasing scan before leaving the switch.
Without this, something like (a?b?)?cc would result in a g->moffset of
1 instead of 2.
I added a case to the soon-to-be-imported regex(3) test code to catch
this error.
string may be found (from the beginning of the pattern), the point
at which must is found minus that offset may actually point to some
place before the start of the text.
In that case, make start = start.
Alternatively, this could be tested for in the preceding if, but it
did not occur to me. :-)
Caught by: regex(3) test code
use a CHAR_MIN-based array, like elsewhere in the code.
Remove a number of unused variables (some due to the above change, one
that was left after a number of optimizing steps through the source).
Brucified by: bde
remove (comment out) functions defined or depricated elsewhere:
bsearch, lfind, lsearch, insque, remque
change hcreate to take a size_t rather than uint (essentially the same)
since hcreate/hdestroy are now in <search.h>, remove private search.h
in lib/libc/db/hash/
add $FreeBSD tags to hsearch.c
- permit numeric scopeid, be more careful about buffer size
TODO: 2nd arg type should be socklen_t for RFC2553 conformance,
but due to include file dependency it is not a easy thing to do
(netdb.h does not have socklen_t)
soon to be committed syscall stubs. These calls will be used to get
and set capability state associated with executables.
Obtained from: TrustedBSD Project
interface addresses in a portable manner, without headache of SIOCGIFCONF
or sysctl. it is in bsdi/openbsd/netbsd already.
from kame tree (actually, mandatory for latest kame tree).
when parsing certain DNS records during a reverse address resolution. Thus
when code tries to examine the returned host name, it dereferences a null
pointer :-(
Problem noticed by: ps
VIS_HTTPSTYLE is a new encoding style for use in vis(), strvis() and
strvisx() that escapes characters according to RFC 1808 (URI encoding).
Since decoding of these require different detection of start-points of
escaped characters, VIS_HTTPSTYLE can be given as flag to unvis().
unvis() will then properly decode URIs.
A new function appeared, strunvisx(): strunvisx() behaves similar as
strunvis(), with one exception: It has an additional flag parameter,
which is passed to unvis() to archive the effect I described above.
previous commits.
At the time we search the pattern for the "must" string, we now compute
the longest offset from the beginning of the pattern at which the must
string might be found. If that offset is found to be infinite (through
use of "+" or "*"), we set it to -1 to disable the heuristics applied
later.
After we are done with pre-matching, we use that offset and the point in
the text at which the must string was found to compute the earliest
point at which the pattern might be found.
Special care should be taken here. The variable "start" is passed to the
automata-processing functions fast() and slow() to indicate the point in
the text at which they should start working from. The real beginning of
the text is passed in a struct match variable m, which is used to check
for anchors. That variable, though, is initialized with "start", so we
must not adjust "start" before "m" is properly initialized.
Simple tests showed a speed increase from 100% to 400%, but they were
biased in that regexec() was called for the whole file instead of line
by line, and parenthized subexpressions were not searched for.
This change adds a single integer to the size of the "guts" structure,
and does not change the ABI.
Further improvements possible:
Since the speed increase observed here is so huge, one intuitive
optimization would be to introduce a bias in the function that computes
the "must" string so as to prefer a smaller string with a finite offset
over a larger one with an infinite offset. Tests have shown this to be a
bad idea, though, as the cost of false pre-matches far outweights the
benefits of a must offset, even in biased situations.
A number of other improvements suggest themselves, though:
* identify the cases where the pattern is identical to the must
string, and avoid entering fast() and slow() in these cases.
* compute the maximum offset from the must string to the end of
the pattern, and use that to set the point at which fast() and
slow() should give up trying to find a match, and return then
return to pre-matching.
* return all the way to pre-matching if a "match" was found and
later invalidated by back reference processing. Since back
references are evil and should be avoided anyway, this is of
little use.
The BM algorithm works by scanning the pattern from right to left,
and jumping as many characters as viable based on the text's mismatched
character and the pattern's already matched suffix.
This typically enable us to test only a fraction of the text's characters,
but has a worse performance than the straight-forward method for small
patterns. Because of this, the BM algorithm will only be used if the
pattern size is at least 4 characters.
Notice that this pre-matching is done on the largest substring of the
regular expression that _must_ be present on the text for a succesful
match to be possible at all.
For instance, "(xyzzy|grues)" will yield a null "must" substring, and,
therefore, not benefit from the BM algorithm at all. Because of the
lack of intelligence of the algorithm that finds the "must" string,
things like "charjump|matchjump" will also yield a null string. To
optimize that, "(char|match)jump" should be used.
The setup time (at regcomp()) for the BM algorithm will most likely
outweight any benefits for one-time matches. Given the slow regex(3)
we have, this is unlikely to be even perceptible, though.
The size of a regex_t structure is increased by 2*sizeof(char*) +
256*sizeof(int) + strlen(must)*sizeof(int). This is all inside the
regex_t's "guts", which is allocated dynamically by regcomp(). If
allocation of either of the two tables fail, the other one is freed.
In this case, the straight-forward algorithm is used for pre-matching.
Tests exercising the code path affected have shown a speed increase of
50% for "must" strings of length four or five.
API and ABI remain unchanged by this commit.
The patch submitted on the PR was not used, as it was non-functional.
PR: 14342