freebsd-dev

Author	SHA1	Message	Date
Eitan Adler	5a51239a71	libc/locale: fix an off-by-one in newlocale Reported by: zrj@DragonFlyBSD.org	2017-12-29 14:56:46 +00:00
Pedro F. Giffuni	91fb056ed6	SPDX: Fix some License ID tags for libc.	2017-12-27 21:21:03 +00:00
Pedro F. Giffuni	d915a14ef0	libc: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using mis-identified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-25 17:12:48 +00:00
Pedro F. Giffuni	8a16b7a18f	General further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:49:47 +00:00
Pedro F. Giffuni	df57947f08	spdx: initial adoption of licensing ID tags. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Initially, only tag files that use BSD 4-Clause "Original" license. RelNotes: yes Differential Revision: https://reviews.freebsd.org/D13133	2017-11-18 14:26:50 +00:00
Bryan Drewery	dc8507e1f7	__setrunelocale: Fix asprintf(3) failure not returning an error. Also fix the style of the asprintf(3) call in __collate_load_tables_l(). Both of these lines were modified away from snprintf(3) during the import from DragonFly/Illumos. Reviewed by: jilles (briefly over shoulder) MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-09-29 16:30:50 +00:00
David Chisnall	c0cd38223c	Document some invariants for the XLC_ enum. These can't be reordered without breaking other code. Document that and add some static asserts to ensure that anyone who tries gets build failures.	2017-09-07 17:51:35 +00:00
Pedro F. Giffuni	be53a489c6	libc: minor indent(1) cleanups. Illumos and Schillix are adopting some of the locale code and our style(9) sometimes matches the Solaris cstyle, so the changes are also useful as a way to reduce diffs. No functional change. Discussed with: Joerg Schilling MFC after: 1 week	2017-08-26 16:11:21 +00:00
Enji Cooper	5e84ba7a43	localeconv(3): start sentences on new lines Reported by: make manlint MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-05-23 07:09:26 +00:00
Warner Losh	fbbd9655e5	Renumber copyright clause 4 Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96	2017-02-28 23:42:47 +00:00
Pedro F. Giffuni	ba7161ed72	Move __hidden attribute towards the end of the declaration. Apple had them at the start but moving them to the end is better for faster reading and fits better what is done in other FreeBSD headers. MFC after: 5 days	2016-12-31 15:30:00 +00:00
Eric van Gyzen	81f91de9f7	Fix error reporting from wcstof() When wcstof() skipped initial space and then parsing failed, it set endptr to the first non-space character. Fix it to correctly report failure by setting endptr to the beginning of the input string. The fix is from theraven@, who fixed this bug in wcstod() and wcstold() in r227753. While I'm here: Move assignments out of declarations in wcstod() and wcstold(). This is against my personal preference, but it is our agreed style(9). Set endptr correctly on malloc() failure in all three functions. Remove an incorrect comment: This is pointer arithmetic, so the code was not actually making that assumption. wcstold() advanced the wcp pointer beyond leading whitespace and then reset it back to the beginning of the string. Do not reset it. This seems to have no functional effect, since strtold_l() also skips leading whitespace. I'm making the change to keep this function consistent with wcstof() and wcstod(), and because the C11 spec prescribes the use of iswspace() to skip leading space. Reported by: libc++ unit test for std::stof(std::wstring) MFC after: 8 days Sponsored by: Dell EMC	2016-11-20 20:13:22 +00:00
Ruslan Bukin	77bc2a1cd6	Locale fix for endian big (EB) machines. We have locale files generated on EL machines (e.g. during cross-build on amd64 host), but then we are using them on EB machines (e.g. MIPS64EB), so proceed byte-swap if necessary. All the libc tests passed successfully, including Russian collation. Tested by: br@, Hongyan Xia <hx242@cam.ac.uk> Sponsored by: DARPA, AFRL Sponsored by: HEIF5 Differential Revision: https://reviews.freebsd.org/D8281	2016-11-01 13:54:44 +00:00
Ed Schouten	718fe473dd	Change the return type of freelocale(3) to void. Our version of this function currently returns an integer indicating failure or success, whereas POSIX specifies that this function has no return value. It returns void. Patch up the header, sources and man page to use the right type. While there, use the opportunity to simplify the body of this function. Theoretically speaking, this change breaks the ABI of this function. That said, I have yet to find any code that makes use of freelocale()'s return value. I couldn't find any of it in the base system, nor did an exp-run reveal any breakage caused by this change. PR: 211394 (exp-run)	2016-07-29 17:18:47 +00:00
Pedro F. Giffuni	5c4d28f5dc	libc: tag the Rune initialization function prototypes visibility as hidden. It is good practice to export as few symbols as possible from your shared libraries, so use the GCC visibility attribute in this case, matching what Apple's libc does. Reference: https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/CppRuntimeEnv/Articles/SymbolVisibility.html Hinted by: Apple's libc 1082.20.4 MFC after: 1 week	2016-07-19 20:22:13 +00:00
Baptiste Daroussin	dee0bbbdca	Revert 302324 and properly fix the crash with ISO-8859-5 locales PR: 211135 Reported by: jkim Tested by: jkim MFC after: 2 days	2016-07-15 23:03:20 +00:00
Andrey A. Chernov	12eae8c8f3	1) Eliminate possibility to call __*collate_range_cmp() with incomplete locale (which causes core dump) by removing whole 'table' argument by which it passed. 2) Restore __collate_range_cmp() in __sccl(). 3) Collating [a-z] range in regcomp() only for single bytes locales (we can't do it now for other ones). In previous state only first 256 wchars are considered and all others are just silently dropped from the range.	2016-07-14 09:07:25 +00:00
Andrey A. Chernov	1daad8f5ad	Back out non-collating [a-z] ranges. Instead of changing whole course to another POSIX-permitted way for consistency and uniformity I decide to completely ignore missing regex functionality and concentrate on fixing bugs in what we have now, too many small obstacles instead, counting ports.	2016-07-14 08:18:12 +00:00
Andrey A. Chernov	5a5807dd4c	Remove broken support for collation in [a-z] type ranges. Only first 256 wide chars are considered currently, all other are just dropped from the range. Proper implementation require reverse tables database lookup, since objects are really big as max UTF-8 (1114112 code points), so just the same scanning as it was for 256 chars will slow things down. POSIX does not require collation for [a-z] type ranges and does not prohibit it for non-POSIX locales. POSIX require collation for ranges only for POSIX (or C) locale which is equal to ASCII and binary for other chars, so we already have it. No other *BSD implements collation for [a-z] type ranges. Restore ABI compatibility with unused now __collate_range_cmp() which is visible from outside (will be removed later).	2016-07-10 03:49:38 +00:00
Baptiste Daroussin	a8bacd6f93	Fix a bad test resulting in a segfault with ISO-8859-5 locales Reported by: Lauri Tirkkonen from Illumos Approved by: re@ (gjb)	2016-07-03 15:00:12 +00:00
Pedro F. Giffuni	3c2c0c0443	libc/locale: Fix type breakage in __collate_range_cmp(). When collation support was brought in, the second and third arguments in __collate_range_cmp() were changed from int to wchar_t, breaking the ABI. Change them to a "char" type which makes more sense and keeps the ABI compatible. Also introduce __wcollate_range_cmp() which does work with wide characters. This function is used only internally in libc so we don't export it. Use the new function in glob(3), fnmatch(3), and regexec(3). PR: 179721 Suggested by: ache. jilles MFC after: 3 weeks (perhaps partial only)	2016-06-05 19:12:52 +00:00
Andrey A. Chernov	2f423a266a	For EILSEQ case in mbsnrtowcs() and wcsnrtombs() update src to point to the character after the one this conversion stopped at. PR: 209907 Submitted by: Roel Standaert <roel@abittechnical.com> (partially) MFC after: 3 days	2016-05-31 18:44:33 +00:00
Pedro F. Giffuni	32223c1b7d	libc: spelling fixes. Mostly on comments.	2016-04-30 01:24:24 +00:00
Baptiste Daroussin	49c4407313	Restore the original ascii.c from prior to r290494 It was doing the right thing, there was no need to "fail" to reinvent it from none.c Pointy hat: bapt Submitted by: ache	2016-04-21 07:36:11 +00:00
Baptiste Daroussin	d3591d68a9	Check the returned value of memchr(3) before using it Reported by: Coverity CID: 1338530	2016-04-20 20:44:30 +00:00
Pedro F. Giffuni	513004a23d	libc: replace 0 with NULL for pointers. While here also cleanup some surrounding code; particularly drop some malloc() casts. Found with devel/coccinelle. Reviewed by: bde (previous version - all new bugs are mine)	2016-04-10 19:33:58 +00:00
Andrey A. Chernov	ae7abb26b1	SJIS encoding don't have single byte characters >= 224 MFC after: 1 week	2016-04-04 15:56:14 +00:00
Andrey A. Chernov	e08c3b7c11	EUC-type encodings don't have single byte characters >= 128 This change should not be MFCed until new collate will be MFCed first, because our old EUC tables have some hacks for missing codesets.	2016-04-04 02:43:35 +00:00
Pedro F. Giffuni	45256214eb	mbtowc(3): set errno to EILSEQ if an incomplete character is passed. According to POSIX, The mbtowc() function shall fail if: [EILSEQ] An invalid character sequence is detected. Reviewed by: bapt Differential Revision: https://reviews.freebsd.org/D5496 Obtained from: OpenBSD (Ingo Schwarze) MFC after: 1 month	2016-03-01 19:15:34 +00:00
Enji Cooper	0b5cc81d3b	Link localeconv(3) to localeconv_l(3) MFC after: 3 days	2015-11-25 09:12:30 +00:00
Baptiste Daroussin	87101cb572	return "US-ASCII" instead of "POSIX" for "C" and "POSIX" locales as it used to be in previous version of the locales. Returning "POSIX" has too many fallouts.	2015-11-10 08:11:27 +00:00
Baptiste Daroussin	403105944d	nl_langinfo: Simplify case ladder The NONE:US-ASCII case isn't necessary. The "NONE:" case will handle US-ASCII, so let's remove the redundant handling. Submitted by: marino Obtained from: DragonflyBSD	2015-11-09 22:29:47 +00:00
Baptiste Daroussin	22b87a3555	Readd ascii.c forgotten in r290618	2015-11-09 22:11:37 +00:00
Baptiste Daroussin	473aa0b7ee	locales: Enforce US-ASCII encoding (limited to 7-bit) The US-ASCII format was getting treated identically to POSIX. It is supposed to throw an ILSEQ errno if a value of 0x80 or greater is encountered, so let's bring back the "ASCII" handling. While here, change nl_codeset to return US-ASCII only when the encoding really is "US-ASCII". Before "C" and "POSIX" encoding returned this string, so now they return "POSIX". Discussed with: ache Submitted by: marino Obtained from: DragonflyBSD	2015-11-09 22:06:22 +00:00
Baptiste Daroussin	e58504783b	Fix mbtowc not setting EILSEQ on an Incomplete multibyte sequence for eucJP encoding	2015-11-02 22:56:24 +00:00
Baptiste Daroussin	d8ed03efe5	locales: Fix eucJP sorting (broken upstream?) Sorting eucJP text with "sort" resulted in an illegal sequence while "gsort" worked. This was traced back to mbrtowc handling which was broken for eucJP (probably eucCN, eucKR, and eucTW as well). This small fix took hours to figure out. The OR operation to build the wide character requires an unsigned character to work correctly. The euc wcrtowc conversion is probably broken upstream in Illumos as well. Triggered by: misc/freebsd-doc-ja in ports (encoded in eucJP) Submitted by: marino Obtained from: DragonflyBSD	2015-11-01 21:02:30 +00:00
Baptiste Daroussin	d79cdd21de	libc: Fix (and improve) nl_langinfo (CODESET) The output of "locale charmap" is identical to the result of nl_langinfo (CODESET) for any given locale. The logic for returning the codeset was very simplistic. It just returned portion of the locale name after the period (e.g. en_FR.ISO8859-1 returned "ISO8859-1"). When softlinks were added to locales, this broke. e.g.: en_US returned "" en_FR.UTF8 returned "UTF8" en_FR.UTF-8 returned "UTF-8" zh_Hant_HK.Big5HKSCS returned "Big5HKSCS" zh_Hant_TW.Big5 returned "Big5" es_ES@euro returned "" In order to fix this properly, the named locale cannot be used to determine the encoding. This information was almost available in the rune data. Unfortunately, all the single byte encodings were listed as "NONE" encoding. So I adjusted localedef tool to provide more information about the encoding. For example, instead of "NONE", the LC_CTYPE used by fr_FR.ISO8859-15 is now encoded as "NONE:ISO8859-15". The locale handlers now check if the first four characters of the encoding is "NONE" and if so, treats it as a single-byte encoding. The nl_langinfo handling of CODESET was adjusted accordingly. Now the following is returned: en_US returns "ISO8859-1" fr_FR.UTF8 returns "UTF-8" fr_FR.UTF-8 returns "UTF-8" zh_Hant_HK.Big5HKSCS returns "Big5" zh_Hant_TW.Big5 returns "Big5" es_ES@euro returns "ISO8859-15" as before, "C" and "POSIX" locales return "US-ASCII". This is a big improvement. The result of nl_langinfo can never be a zero-length string and it will always be exclusively one of the values of the character maps of /usr/src/tools/tools/locale/etc/final-maps. Submitted by: marino Obtained from: DragonflyBSD	2015-11-01 12:00:55 +00:00
Baptiste Daroussin	76e6db686e	collate: Fix expansion substitutions (broken upstream too) Through testing, the user noted that some Cyrillic characters were not sorting correctly, and this was confirmed. After extensive testing and review, the localedef tool was eliminated as the culprit. The substitutions were encoded correctly in LC_COLLATE. The error was mainly in wcscoll where character expansions were mishandled. The main directive pass routines had to be written to go back for a new collation value when the "state" variable was set. Before pointers were being advanced, the second lookup was getting applied to the wrong character, etc. The "eat expansion codes" section on collate.c also had a bug. Later on, the "state" variable logic was changed to only set if next code was greater than zero (rather than >= 0). Some additional cleanups got captured from previous work: 1) The previous commit moved the binary search comment from the correct location to a wrong location because it's wrong upstream in Illumos. The comment has little value so I just removed it. 2) Don't check if pointers are null before freeing, this is redundant as free() handles null pointers. 3) The two binary search trees were standardized wrt initialization 4) On the binary search trees, a negative "high" exits rather than checking the table count again. Submitted by: marino Obtained from: DragonflyBSD	2015-10-23 23:24:03 +00:00
Baptiste Daroussin	332fe83717	libc/collate: minor tweaks / fix The main "fix" here is properly setting a collate loading error for each early return. Tweaks include removing unnecessary null checks, adding assertions (from Illumos) and a couple of variables to reduce code differences and improve readability. For normal use, there are no functional changes here. Obtained from: DragonflyBSD, Illumos	2015-10-22 14:29:19 +00:00
Baptiste Daroussin	c25f5140e9	Include sys/*.h earlier Reported by: kib	2015-10-14 12:46:05 +00:00
Baptiste Daroussin	f5dde0166d	Commit log from Dragonfly: FreeBSD extended ctypes to include numbers (e.g. isnumber()) but never actually implemented it. The isnumber() function was equivalent to the isdigit() function in every case. Now that DragonFly's ctype source files have number definitions, the number ctype can finally be implemented. It's given a new flag _CTYPE_N. The isalnum() and iswalnum() functions have been changed to use this flag rather than the _CTYPE_D digit flag. While isalnum(), isnumber(), and their wide equivalents now return different values in locale cases, the ishexnumber() and iswhexnumber() functions are unchanged. They are still aliases for isxdigit() and iswxdigit(). Also change ctype.h for isdigit and isxdigit to use sbistype like the other functions. Obtained from: dragonfly	2015-10-13 20:43:49 +00:00
Baptiste Daroussin	becbad1f6e	Merge from head	2015-10-13 19:44:36 +00:00
Craig Rodrigues	c83f3fc4b4	Use ANSI C prototypes. Eliminates -Wold-style-definition warnings.	2015-09-20 20:50:18 +00:00
Baptiste Daroussin	23a32822d2	Merge from HEAD	2015-08-25 20:14:50 +00:00
Ed Schouten	57c69b1478	Make UTF-8 parsing and generation more strict. - in mbrtowc() we need to disallow codepoints above 0x10ffff. - In wcrtomb() we need to disallow codepoints between 0xd800 and 0xdfff. Reviewed by: bapt Differential Revision: https://reviews.freebsd.org/D3399	2015-08-25 09:16:09 +00:00
Baptiste Daroussin	eaa94ab419	Fix typo	2015-08-09 12:20:22 +00:00
Baptiste Daroussin	28a20bb3f5	Use more asprintf Plug memory leak introduced in previous asprintf addition	2015-08-09 12:13:30 +00:00
Baptiste Daroussin	b89704cee7	Use asprintf/free instead of snprintf	2015-08-09 11:50:50 +00:00
Baptiste Daroussin	5e4bbc69de	Remove useless variable	2015-08-09 11:47:01 +00:00
Baptiste Daroussin	81eb7d7e4b	Readd checking utf16 surrogates that are invalid in utf8	2015-08-09 10:36:25 +00:00
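
Commits 57c69b1478 and 81eb7d7e4b tighten UTF-8 handling by rejecting code points above 0x10ffff and the UTF-16 surrogate range 0xd800 to 0xdfff. The following is a minimal sketch of that rule, not the libc implementation: the names utf8_scalar_ok() and utf8_encode_strict() are hypothetical and were chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if cp may be encoded as UTF-8, else 0. */
int
utf8_scalar_ok(uint32_t cp)
{
	if (cp > 0x10ffff)			/* beyond the Unicode range */
		return (0);
	if (cp >= 0xd800 && cp <= 0xdfff)	/* UTF-16 surrogates */
		return (0);
	return (1);
}

/*
 * Hypothetical strict encoder: writes cp into buf (at least 4 bytes) and
 * returns the number of bytes written, or 0 if cp is rejected.
 */
size_t
utf8_encode_strict(uint32_t cp, unsigned char *buf)
{
	assert(buf != NULL);
	if (!utf8_scalar_ok(cp))
		return (0);
	if (cp < 0x80) {
		buf[0] = (unsigned char)cp;
		return (1);
	}
	if (cp < 0x800) {
		buf[0] = (unsigned char)(0xc0 | (cp >> 6));
		buf[1] = (unsigned char)(0x80 | (cp & 0x3f));
		return (2);
	}
	if (cp < 0x10000) {
		buf[0] = (unsigned char)(0xe0 | (cp >> 12));
		buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
		buf[2] = (unsigned char)(0x80 | (cp & 0x3f));
		return (3);
	}
	buf[0] = (unsigned char)(0xf0 | (cp >> 18));
	buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3f));
	buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
	buf[3] = (unsigned char)(0x80 | (cp & 0x3f));
	return (4);
}
```

A caller would treat a return value of 0 as the point at which a real wcrtomb() reports EILSEQ; the sketch leaves errno untouched to stay self-contained.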

619 Commits
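
Commit 81f91de9f7 describes the contract that wcstof() should honor when parsing fails after leading whitespace: endptr must point back at the beginning of the input, not at the first non-space character. A small sketch of how a caller might rely on that contract follows. The wrapper name wcstof_consumed() is hypothetical; only wcstof() itself is a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical wrapper: parse a float from s, store it in *out, and return
 * the number of wide characters consumed. A return value of 0 means that no
 * conversion was performed, which only works if the implementation resets
 * endptr to the start of the string on failure.
 */
size_t
wcstof_consumed(const wchar_t *s, float *out)
{
	wchar_t *end;

	assert(s != NULL && out != NULL);
	*out = wcstof(s, &end);
	return ((size_t)(end - s));
}
```

With the behavior the commit describes, an input such as L"  abc" yields 0 consumed characters, while L"  1.5x" consumes the two spaces and the three characters of the number.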