Commit Graph

7 Commits

Author SHA1 Message Date
Kyle Evans
693f88c9da iconv_std: complete the //IGNORE support
Previously, it would only ignore failures due to csmapper conversion
failure.  It may be the case that the input string contains invalid
sequences that also need to be ignored.

A good example of //IGNORE application is sanitizing user- or remotely-
specified strings that are expected to be UTF-8; perhaps as part of a
pipeline that will feed the result into a system less tested against or
tolerant of illegal UTF-8 sequences.

Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D34345
2022-08-11 11:42:20 -05:00
Pedro F. Giffuni
5e53a4f90f lib: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using mis-identified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-26 02:00:33 +00:00
Tijl Coosemans
1243a98e38 Remove the const qualifier from iconv(3) to comply with POSIX:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/iconv.html

Adjust all code that calls iconv.

PR:		199099
Exp-run by:	antoine
MFC after:	2 weeks
2015-04-15 09:09:20 +00:00
Tijl Coosemans
9ca40936af - In the libiconv module for ISO 2022 restore the original order of the
fields of a private struct such that variables of this type are
  initialised correctly.  Fixes conversion from ISO 2022.
  Also do this in the BIG5 module to prevent similar errors in the future.
- In the libiconv module for EUC-TW replace 2^cs with 1<<cs.  Fixes
  conversion from EUC-TW.
- Synchronise iconv code with NetBSD.  In most cases this only updates
  the RCS id because the changes are already there or are NetBSD specific.
  + libc/iconv/citrus_csmapper.c: Add a comment.
  + libc/iconv/citrus_db_factory.c: Remove put16().
  + libc/iconv/citrus_iconv.c: Return EINVAL on error.
  + libc/iconv/citrus_mapper.c: Return EINVAL on error.
  + libc/iconv/citrus_memstream.c: Fix type of a variable.
  + libc/iconv/citrus_prop.h: Sync definition of _CITRUS_PROP_HINT_END.
  + libc/iconv/citrus_stdenc.c: Return EINVAL on error.
  + libiconv_modules/mapper_std/citrus_mapper_std.c: Plug memory leak.

Obtained from:	NetBSD
MFC after:	2 weeks
2014-04-01 10:36:11 +00:00
Hiroki Sato
7c5b23111c Add ICONV_{GET,SET}_ILSEQ_INVALID iconvctl. GNU iconv returns EILSEQ
when there is an invalid character in the output codeset while it is
valid in the input.  However, POSIX requires iconv() to perform an
implementation-defined conversion on the character.  So, Citrus iconv converts
such a character to a special character which means it is invalid in the
output codeset.

This is not a problem in most cases but some software like libxml2 depends
on GNU's behavior to determine if a character is output as-is or another form
such as a character entity (&#NNN;).
2013-11-25 01:26:06 +00:00
Peter Wemm
7900abff04 As a followup to r252547, propate const down the call stack. 2013-07-03 18:27:45 +00:00
Gabor Kovesdan
ad30f8e79b Add the BSD-licensed Citrus iconv to the base system with default off
setting. It can be built by setting the WITH_ICONV knob. While this
knob is unset, the library part, the binaries, the header file and
the metadata files will not be built or installed so it makes no impact
on the system if left turned off.

This work is based on the iconv implementation in NetBSD but a great
number of improvements and feature additions have been included:

- Some utilities have been added. There is a conversion table generator,
  which can compare conversion tables to reference data generated by
  GNU libiconv. This helps ensuring conversion compatibility.
- UTF-16 surrogate support and some endianness issues have been fixed.
- The rather chaotic Makefiles to build metadata have been refactored
  and cleaned up, now it is easy to read and it is also easier to add
  support for new encodings.
- A bunch of new encodings and encoding aliases have been added.
- Support for 1->2, 1->3 and 1->4 mappings, which is needed for
  transliterating with flying accents as GNU does, like "u.
- Lots of warnings have been fixed, the major part of the code is
  now WARNS=6 clean.
- New section 1 and section 5 manual pages have been added.
- Some GNU-specific calls have been implemented:
  iconvlist(), iconvctl(), iconv_canonicalize(), iconv_open_into()
- Support for GNU's //IGNORE suffix has been added.
- The "-" argument for stdin is now recognized in iconv(1) as per POSIX.
- The Big5 conversion module has been fixed.
- The iconv.h header files is supposed to be compatible with the
  GNU version, i.e. sources should build with base iconv.h and
  GNU libiconv. It also includes a macro magic to deal with the
  char ** and const char ** incompatibility.
- GNU compatibility: "" or "char" means the current local
  encoding in use
- Various cleanups and style(9) fixes.

Approved by:	delphij (mentor)
Obtained from:	The NetBSD Project
Sponsored by:	Google Summer of Code 2009
2011-02-25 00:04:39 +00:00