freebsd-skq

Author	SHA1	Message	Date
jasone	c614695539	Clean up manipulation of chunk page map elements to remove some tenuous assumptions about whether bits are set at various times. This makes adding other flags safe. Reorganize functions in order to inline i{m,c,p,s,re}alloc(). This allows the entire fast-path call chains for malloc() and free() to be inlined. [1] Suggested by: [1] Stuart Parmenter <stuart@mozilla.com>	2008-02-08 00:35:56 +00:00
bde	efcf10f47b	Use a better method of scaling by 2**k. Instead of adding to the exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-write at an unfortunate time for pipelines. In some cases involving exp2 on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned. Details specific to expm1: - the saving is closer to 12 cycles than to 40 for expm1* on i386 (A64). For some reason it is much larger for negative args. - also convert to __FBSDID().	2008-02-07 09:42:19 +00:00
bde	22e608f1ce	Use a better method of scaling by 2**k. Instead of adding to the exponent bits of the reduced result, construct 2**k (hopefully in parallel with the construction of the reduced result) and multiply by it. This tends to be much faster if the construction of 2**k is actually in parallel, and might be faster even with no parallelism since adjustment of the exponent requires a read-modify-write at an unfortunate time for pipelines. In some cases involving exp2 on amd64 (A64), this change saves about 40 cycles or 30%. I think it is inherently only about 12 cycles faster in these cases and the rest of the speedup is from partly-accidentally avoiding compiler pessimizations (the construction of 2**k is now manually scheduled for good results, and -O2 doesn't always mess this up). In most cases on amd64 (A64) and i386 (A64) the speedup is about 20 cycles. The worst case that I found is expf on ia64 where this change is a pessimization of about 10 cycles or 5%. The manual scheduling for plain exp[f] is harder and not as tuned. The change to ld128/s_exp2l.c has not been tested.	2008-02-07 03:17:05 +00:00
des	67c8e0948c	Add missing #include Spotted by: tinderbox Submitted by: Pietro Cerutti <gahr@gahr.ch> Pointy hat to: des	2008-02-06 23:25:29 +00:00
des	ddda03a2e0	Yet another pointy hat: when I zapped FBSDprivate_1.1, I forgot to move its contents to FBSDprivate_1.0.	2008-02-06 20:45:46 +00:00
des	b4e1ea3e1c	Add pthread_mutex_isowned_np() here as well; libthr and libkse are supposed to have identical functionality. MFC after: 2 weeks	2008-02-06 20:44:29 +00:00
des	f006d1f25a	Remove unnecessary prototype.	2008-02-06 20:43:19 +00:00
des	0cd1685caf	Add pthread_mutex_isowned_np() so there is no need for an additional prototype next to the implementation. MFC after: 2 weeks	2008-02-06 20:42:35 +00:00
des	ee907d59af	Previous commit had a typo that resulted in symbol versioning being (silently) disabled for libkse... Pointy hat to: des	2008-02-06 20:33:59 +00:00
des	03dc3be400	Give libkse the same treatment as libthr re. symbol versioning. MFC after: 2 weeks	2008-02-06 20:30:48 +00:00
des	3086491df6	Convert pthread.map to the format expected by version_gen.awk, and modify the Makefile accordingly; libthr now explicitly uses libc's Versions.def. MFC after: 2 weeks	2008-02-06 20:25:00 +00:00
des	85a226c62d	Remove incorrectly added FBSDprivate_1.1 namespace, and move symbols which are new in FreeBSD 8 to the appropriate namespace.	2008-02-06 20:20:29 +00:00
des	053e111aba	Per discussion on -threads, rename _islocked_np() to _isowned_np().	2008-02-06 19:34:31 +00:00
des	d129ae8c34	Add necessary cast for tolower() argument. Submitted by: Joerg Sonnenberger <joerg@britannica.bec.de> MFC after: 1 week	2008-02-06 11:39:55 +00:00
bde	bd06cb56ab	As for the float trig functions and logf, use a minimax polynomial that is specialized for float precision. The new polynomial has degree 5 instead of 11, and a maximum error of 2**-27.74 ulps instead of 2**-30.64. This doesn't affect the final error significantly; the maximum error was and is about 0.9101 ulps on amd64 -O1 and the number of cases with an error of > 0.5 ulps is actually reduced by epsilon despite the larger error in the polynomial. This is about 15% faster on amd64 (A64), i386 (A64) and ia64. The asm version is still used instead of this on i386 since it is faster and more accurate.	2008-02-06 06:35:21 +00:00
jasone	44c343f8fa	Track dirty unused pages so that they can be purged if they exceed a threshold, according to the 'F' MALLOC_OPTIONS flag. This obsoletes the 'H' flag. Try to realloc() large objects in place. This substantially speeds up incremental large reallocations in the common case. Fix a bug in arena_ralloc() that caused relocation of sub-page objects even if the old and new sizes were in the same size class. Maintain trees of runs and simplify the per-chunk page map. This allows logarithmic-time searching for sufficiently large runs in arena_run_alloc(), whereas the previous algorithm required linear time in the worst case. Break various large functions into smaller sub-functions, and inline only the functions that are in the fast path for small object allocation/deallocation. Remove an unnecessary check in base_pages_alloc_mmap(). Avoid integer division in choose_arena() for the NO_TLS case on single-CPU systems.	2008-02-06 02:59:54 +00:00
matteo	da1547d460	set WARNS to 1: with WARNS=2 an aliasing error in a file generated by rpcgen from include/rpcsvc/rex.x is exposed and I really don't know how to fix it. MFC after: 1 week	2008-02-05 20:03:45 +00:00
jkoshy	5c5b24c7a8	Document the return type for gelf_fsize(3). Submitted by: kaiw	2008-02-04 18:50:28 +00:00
des	c089534891	After careful consideration (and a brief discussion with attilio@), change the semantics of pthread_mutex_islocked_np() to return true if and only if the mutex is held by the current thread. Obviously, change the regression test to match. MFC after: 2 weeks	2008-02-04 12:35:23 +00:00
matteo	5f3ef8ab8c	Fix incorrect handling of malloc failures PR: bin/83369 MFC after: 1 week	2008-02-04 07:56:36 +00:00
des	72d185548f	Add pthread_mutex_islocked_np(), a cheap way to verify that a mutex is locked. This is intended primarily to support the userland equivalent of the various *_ASSERT_LOCKED() macros we have in the kernel. MFC after: 2 weeks	2008-02-03 22:38:10 +00:00
ume	97fd4b42a1	Remove incomplete support of AI_ALL and AI_V4MAPPED. Reported by: "Heiko Wundram (Beenic)" <wundram__at__beenic.net>	2008-02-03 19:07:55 +00:00
phk	13132840a1	Give sendfile(2) a SF_SYNC flag which makes it wait until all mbufs referencing the file's VM pages are returned from the network stack, making changes to the file safe. This flag does not guarantee that the data has been transmitted to the other end.	2008-02-03 15:54:41 +00:00
jkoshy	234cf2fb65	Correct a typo.	2008-02-03 06:04:38 +00:00
deischen	9dcd92a867	When reinitializing a lockuser, don't assume that the lock is in use. If it is in use, use the watched request, otherwise use the lockuser's own request. Only allocate a lockuser request if both requests are null. PR: 119920 Tested by (6.x): Landon Fuller <landonf -at- bikemonkey -dot- org>	2008-01-31 19:38:26 +00:00
jhb	d8dba28f83	The devstat(3) manpage claims that only <devstat.h> is needed as a prerequisite for using this interface. However, the 'statinfo' struct actually references CPUSTATES from <sys/resource.h>, so in fact it requires <sys/resource.h> to compile. Use a nested include of <sys/resource.h> to make the code match the docs. Reported by: Pietro Cerutti gahr | gahr.ch	2008-01-31 16:55:12 +00:00
kaiw	e57ce2b6e6	Add hook routine archive_write_ar_finish() which writes the 'ar' global header if nothing else has been written before the closing of the archive. This will change the behaviour when creating archives without members, i.e., instead of generating a 0-size archive file, an archive with just the global header (8 bytes in total) will be created and it is indeed a valid archive by the definition of libarchive, thus subsequent operation on this archive will be accepted. This especially solves the failure caused by following sequence: (several ports do) % ar cru libfoo.a # without specifying obj files % ranlib libfoo.a Reviewed by: kientzle, jkoshy Approved by: kientzle Approved by: jkoshy (mentor) Reported by: erwin MFC after: 1 month	2008-01-31 08:11:01 +00:00
kientzle	213915cb49	Add a test to verify compatibility with archives with odd hardlinks. I need to extend this to test pax extended archives with bodies attached to hardlinks and other less-common cases.	2008-01-31 07:47:38 +00:00
kientzle	512df3b205	Tighten up the heuristic that decides whether or not we should obey or ignore the size field on a hardlink entry. In particular, if we're reading a non-POSIX archive, we should always ignore the size field. This should fix both the audio/xmcd port and the math/unixstat port. Thanks to: Pav Lucistnik for pointing these two ports out to me. MFC after: 7 days	2008-01-31 07:41:45 +00:00
trhodes	46c986723b	Update this manual page to describe the extattr_list_file() and the extattr_list_fd() functions. PR: 108142 Submitted by: Richard Dawe <rich@phekda.gotadsl.co.uk> Reviewed by: kientzle	2008-01-29 18:15:38 +00:00
das	b2c068251b	Adjust the exponent before converting the result from double to float precision. This fixes some double rounding problems for subnormals and simplifies things a bit.	2008-01-28 01:19:07 +00:00
yar	ac1e4103b9	Our fts(3) API, as inherited from 4.4BSD, suffers from integer fields in FTS and FTSENT structs being too narrow. In addition, the narrow types creep from there into fts.c. As a result, fts(3) consumers, e.g., find(1) or rm(1), can't handle file trees an ordinary user can create, which can have security implications. To fix the historic implementation of fts(3), OpenBSD and NetBSD have already changed <fts.h> in somewhat incompatible ways, so we are free to do so, too. This change is a superset of changes from the other BSDs with a few more improvements. It doesn't touch fts(3) functionality; it just extends integer types used by it to match modern reality and the C standard. Here are its points: o For C object sizes, use size_t unless it's 100% certain that the object will be really small. (Note that fts(3) can construct pathnames _much_ longer than PATH_MAX for its consumers.) o Avoid the short types because on modern platforms using them results in larger and slower code. Change shorts to ints as follows: - For variables that count simple, limited things like states, use plain vanilla `int' as it's the type of choice in C. - For a limited number of bit flags use `unsigned' because signed bit-wise operations are implementation-defined, i.e., unportable, in C. o For things that should be at least 64 bits wide, use long long and not int64_t, as the latter is an optional type. See FTSENT.fts_number aka FTS.fts_bignum. Extending fts_number `to satisfy future needs' is pointless because there is fts_pointer, which can be used to link to arbitrary data from an FTSENT. However, there already are fts(3) consumers that require fts_number, or fts_bignum, to have at least 64 bits, so we must allow for them. o For the tree depth, use `long'. This is a trade-off between making this field too wide and allowing for 64-bit inode numbers and/or chain-mounted filesystems. On the one hand, `long' is almost enough for 32-bit filesystems on a 32-bit platform (our ino_t is uint32_t now). On the other hand, platforms with a 64-bit (or wider) `long' will be ready for 64-bit inode numbers, as well as for several 32-bit filesystems mounted one under another. Note that fts_level has to be signed because -1 is a magic value for it, FTS_ROOTPARENTLEVEL. o For the `nlinks' local var in fts_build(), use `long'. The logic in fts_build() requires that `nlinks' be signed, but our nlink_t currently is uint16_t. Therefore let's make the signed var wide enough to be able to represent 2^16-1 in pure C99, and even 2^32-1 on a 64-bit platform. Perhaps the logic should be changed just to use nlink_t, but it can be done later w/o breaking fts(3) ABI any more because `nlinks' is just a local var. This commit also includes supporting stuff for the fts change: o Preserve the old versions of fts(3) functions through libc symbol versioning because the old versions appeared in all our former releases. o Bump __FreeBSD_version just in case. There is a small chance that some ill-written third-party apps may fail to build or work correctly if compiled after this change. o Update the fts(3) manpage accordingly. In particular, remove references to fts_bignum, which was a FreeBSD-specific hack to work around the too narrow types of FTSENT members. Now fts_number is at least 64 bits wide (long long) and fts_bignum is an undocumented alias for fts_number kept around for compatibility reasons. According to Google Code Search, the only big consumers of fts_bignum are in our own source tree, so they can be fixed easily to use fts_number. o Mention the change in src/UPDATING. PR: bin/104458 Approved by: re (quite a while ago) Discussed with: deischen (the symbol versioning part) Reviewed by: -arch (mostly silence); das (generally OK, but we didn't agree on some types used; assuming that no objections on -arch let me stick to my opinion)	2008-01-26 17:09:40 +00:00
bde	997d2d26fb	Fix a harmless type error in 1.9.	2008-01-25 21:09:21 +00:00
des	a0d60019af	Fix a regression introduced in rev 1.99: replace fclose(f) with a comment explaining why f cannot possibly be a valid FILE * at this point. MFC after: 1 day	2008-01-23 20:57:59 +00:00
kientzle	417364f2db	Track version # from the portable release.	2008-01-23 05:48:07 +00:00
kientzle	d164e15296	Explain a subtle API change that was made recently. Even though I believe this is a good change, it does have the potential to break certain clients, so it's good to document the reasoning behind the change.	2008-01-23 05:47:08 +00:00
kientzle	5c89a8c35a	Properly pad symlinks when writing cpio "newc" format. Thanks to: Jesse Barker for reporting this. MFC after: 7 days	2008-01-23 05:43:26 +00:00
ache	061b803830	Fix longstanding mb/wc functions segfault if an error occurs inside _<encoding>_init(). Currently only _EUC_init() was affected.	2008-01-23 03:05:35 +00:00
ache	28095b28d0	Better fix for longstanding segfault. Don't touch current locale at all on unknown encoding. Previous fix resets it to POSIX.	2008-01-23 02:17:27 +00:00
ache	76c6a978cc	1) Add (void) cast to _none_init() (while I am here) 2) Fix longstanding segfault in mb/wc code when unknown encoding is specified in the locale file (mb/wc functions become NULL in that case).	2008-01-23 01:57:26 +00:00
trhodes	3c543fe5ae	Xref flopen.3 which references this manual page. PR: 112650	2008-01-22 15:56:48 +00:00
ache	c52b8566b4	Introduce new encoding: "ASCII". It differs from the default C/POSIX "NONE" mainly by a stricter 8bit check for the mbtowc/wctomb family, returning EILSEQ.	2008-01-21 23:48:12 +00:00
bde	babf3acb0e	Fix cutoffs. This is just a cleanup and an optimization for unusual cases which are used mainly by regression tests. As usual, the cutoff for tiny args was not correctly translated to float precision. It was 2**-54 but 2**-24 works. It must be about 2**-precision, since the error from approximating log(1+x) by x is about the same as |x|. Exhaustive testing shows that 2**-24 gives perfect rounding in round-to-nearest mode. Similarly for the cutoff for being small, except this is not used by so many other functions. It was 2**-29 but 2**-15 works. It must be a bit smaller than sqrt(2**-precision), since the error from approximating log(1+x) by x-x*x/2 is about the same as x*x. Exhaustive testing shows that 2**-15 gives a maximum error of 0.5052 ulps in round-to-nearest mode. The algorithm for the general case is only good for 0.8388 ulps, so this is sufficient (but it loses slightly on i386 -- then extra precision gives 0.5032 ulps for the general case). While investigating this, I noticed that optimizing the usual case by falling into a middle case involving a simple polynomial evaluation (return x-x*x/2 instead of x here) is not such a good idea since it gives an enormous pessimization of tinier args on machines for which denormals are slow. Float x*x/2 is denormal when |x| ~< 2**-64 and x*x/2 is evaluated in float precision, so it can easily be denormal for normal x. This is even more interesting for general polynomial evaluations. Multiplying out large powers of x is normally a good optimization since it reduces dependencies, but it creates denormals starting with quite large x.	2008-01-21 13:46:21 +00:00
bde	ef5ed15ee4	Oops, when merging from the float version to the double versions, don't forget to translate "float" to "double". ucbtest didn't detect the bug, but exhaustive testing of the float case relative to the double case eventually did. The bug only affects args x with |x| ~> 2**19*(pi/2) on non-i386 (i386 is broken in a different way for large args).	2008-01-20 04:09:44 +00:00
bde	b3048e4365	Remove the float version of the kernel of arg reduction for pi/2, since it should never have existed and it has not been used for many years (floats are reduced faster using doubles). All relevant changes (just the workaround for broken assignment) have been merged to the double version.	2008-01-19 22:50:50 +00:00
bde	2005bbb395	Do an ordinary assignment in STRICT_ASSIGN() except for floats until there is a problem with non-floats (when i386 defaults to extra precision). This essentially restores yesterday's behaviour for doubles on i386 (since generic rint() isn't used and everywhere else assumed working assignment), but for arches that use the generic rint() it finishes restoring some of 1995's behaviour (don't waste time doing unnecessary store/load).	2008-01-19 22:05:14 +00:00
bde	8499893373	Use STRICT_ASSIGN() for exp2f() and exp2() instead of a volatile variable hack for exp2f() only. The volatile variable had a surprisingly large cost for exp2f() -- 19 cycles or 15% on i386 in the worst case observed. This is only partly explained by there being several references to the variable, only one of which benefited from it being volatile. Arches that have working assignment are likely to benefit even more from not having any volatile variable. exp2() now has a chance of working with extra precision on i386. exp2() has even more references to the variable, so it would have been pessimized more by simply declaring the variable as volatile. Even the temporary volatile variable for STRICT_ASSIGN() costs 5-10% on i386 (A64), so I will change STRICT_ASSIGN() to do an ordinary assignment until i386 defaults to extra precision.	2008-01-19 21:37:14 +00:00
bde	3b6da800e8	Use STRICT_ASSIGN() for __kernel_rem_pio2f() and __kernel_rem_pio2() instead of a volatile cast hack for the float version only. The cast hack broke with gcc-4, but this was harmless since the float version hasn't been used for a few years. Merge from the float version so that the double version has a chance of working on i386 with extra precision. See k_rem_pio2f.c rev.1.8 for the original hack. Convert to __FBSDID().	2008-01-19 20:02:55 +00:00
bde	6d3cabaca6	Use STRICT_ASSIGN() for log1pf() and log1p() instead of a volatile cast hack for log1pf() only. The cast hack broke with gcc-4, resulting in ~1 million errors of more than 1 ulp, with a maximum error of ~1.5 ulps. Now the maximum error for log1pf() on i386 is 0.5034 ulps again (this depends on extra precision), and log1p() has a chance of working with extra precision. See s_log1pf.c 1.8 for the original hack. (It claims only 62343 large errors). Convert to __FBSDID(). Another thing broken with gcc-4 is the static const hack used for rcsids.	2008-01-19 18:13:21 +00:00
bde	aaee2ef564	Use STRICT_ASSIGN() instead of assorted direct volatile hacks to work around assignments not working for gcc on i386. Now volatile hacks for rint() and rintf() don't needlessly pessimize so many arches and the remaining pessimizations (for arm and powerpc) can be avoided centrally. This cleans up after s_rint.c 1.3 and 1.13 and s_rintf.c 1.3 and 1.9: - s_rint.c 1.13 broke 1.3 by only using a volatile cast hack in 1 place when it was needed in 2 places, and the volatile cast hack stopped working with gcc-4. These bugs only affected correctness tests on i386 since i386 normally uses asm rint() and doesn't support the extra precision mode that would break assignments of doubles. - s_rintf.c 1.9 improved(?) on 1.3 by using a volatile variable hack instead of an extra-precision variable hack, but it declared 2 variables as volatile when only 1 variable needed to be volatile. This only affected speed tests on i386 since i386 uses asm rintf().	2008-01-19 16:37:57 +00:00

11634 Commits