iconv_open - allocate descriptor for character set conversion
#include <iconv.h>
iconv_t iconv_open (const char* tocode, const char* fromcode);
The iconv_open function allocates a conversion descriptor suitable for converting
byte sequences from character encoding fromcode to character encoding tocode.
The values permitted for fromcode and tocode and the supported combinations
are system dependent. For the libiconv library, the following encodings
are supported, in all combinations.
- European languages
ASCII, ISO-8859-{1,2,3,4,5,7,9,10,13,14,15,16}, KOI8-R, KOI8-U, KOI8-RU, CP{1250,1251,1252,1253,1254,1257},
CP{850,866}, Mac{Roman,CentralEurope,Iceland,Croatian,Romania}, Mac{Cyrillic,Ukraine,Greek,Turkish},
Macintosh
- Semitic languages
ISO-8859-{6,8}, CP{1255,1256}, CP862, Mac{Hebrew,Arabic}
- Japanese
EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP, ISO-2022-JP-2, ISO-2022-JP-1
- Chinese
EUC-CN, HZ, GBK, GB18030, EUC-TW, BIG5, CP950, BIG5-HKSCS, ISO-2022-CN, ISO-2022-CN-EXT
- Korean
EUC-KR, CP949, ISO-2022-KR, JOHAB
- Armenian
ARMSCII-8
- Georgian
Georgian-Academy, Georgian-PS
- Tajik
KOI8-T
- Thai
TIS-620, CP874, MacThai
- Laotian
MuleLao-1, CP1133
- Vietnamese
VISCII, TCVN, CP1258
- Platform specifics
HP-ROMAN8, NEXTSTEP
- Full Unicode
UTF-8
UCS-2, UCS-2BE, UCS-2LE
UCS-4, UCS-4BE, UCS-4LE
UTF-16, UTF-16BE, UTF-16LE
UTF-32, UTF-32BE, UTF-32LE
UTF-7
C99, JAVA
- Full Unicode, in terms of uint16_t or uint32_t (with machine dependent endianness and alignment)
UCS-2-INTERNAL, UCS-4-INTERNAL
- Locale dependent, in terms of char or wchar_t (with machine dependent endianness and alignment, and with semantics depending on the OS and the current LC_CTYPE locale facet)
char, wchar_t
When configured with the option --enable-extra-encodings, libiconv
also provides support for a few extra encodings:
- European languages
CP{437,737,775,852,853,855,857,858,860,861,863,865,869,1125}
- Semitic languages
CP864
- Japanese
EUC-JISX0213, Shift_JISX0213, ISO-2022-JP-3
- Turkmen
TDS565
- Platform specifics
RISCOS-LATIN1
The empty encoding name "" is equivalent to "char": it denotes
the locale dependent character encoding.
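As a minimal sketch of the empty encoding name, the hypothetical helper below
(open_from_locale_encoding is not part of the library) first adopts the locale
from the environment and then passes "" as fromcode. It assumes the
implementation accepts the empty name as described above.

```c
#include <iconv.h>
#include <locale.h>

/* Hypothetical helper: opens a descriptor that converts from the
   locale dependent encoding (the empty name "") to tocode. */
iconv_t open_from_locale_encoding(const char *tocode)
{
    /* Adopt the locale from the environment. If this call fails,
       the program stays in the default "C" locale. */
    setlocale(LC_CTYPE, "");

    /* "" stands for the encoding of the current LC_CTYPE locale. */
    return iconv_open(tocode, "");
}
```

The caller still has to compare the result against (iconv_t)(-1) and release
the descriptor with iconv_close.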
When the string "//TRANSLIT" is appended to tocode, transliteration is
activated. This means that when a character cannot be represented in the
target character set, it can be approximated through one or several
similar-looking characters.
When the string "//IGNORE" is appended to tocode, characters that cannot be
represented in the target character set will be silently discarded.
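The suffixes are part of the tocode string itself. The following sketch shows
one way to build such a string; open_with_suffix and the 128-byte buffer are
illustrative assumptions, not part of the library, and reporting an overlong
name as EINVAL is a choice made here.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>

/* Hypothetical helper: appends suffix ("//TRANSLIT" or "//IGNORE")
   to tocode and opens the resulting conversion. */
iconv_t open_with_suffix(const char *tocode, const char *fromcode,
                         const char *suffix)
{
    char target[128];   /* assumed large enough for common names */
    int n = snprintf(target, sizeof target, "%s%s", tocode, suffix);

    if (n < 0 || (size_t)n >= sizeof target) {
        errno = EINVAL; /* name did not fit; treated as unsupported */
        return (iconv_t)(-1);
    }
    return iconv_open(target, fromcode);
}
```

For example, open_with_suffix("ASCII", "UTF-8", "//TRANSLIT") requests a
transliterating conversion to ASCII. The exact replacement characters chosen
during transliteration depend on the implementation.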
The resulting conversion
descriptor can be used with iconv any number of times. It remains valid
until deallocated using iconv_close.
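The lifetime described above can be sketched as follows: one descriptor is
opened, used for two separate conversions, and then closed. The helper names
(convert_with, latin1_to_utf8_twice) and the sample strings are assumptions
made for this illustration only.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Converts the NUL-terminated string in through cd and stores a
   NUL-terminated result in out. Returns 0 on success, -1 on failure. */
static int convert_with(iconv_t cd, const char *in, char *out, size_t outsize)
{
    char *inptr = (char *)in;     /* iconv takes char **, input is only read */
    size_t inleft = strlen(in);
    char *outptr = out;
    size_t outleft;

    if (outsize == 0)
        return -1;
    outleft = outsize - 1;        /* keep one byte for the terminator */

    if (iconv(cd, &inptr, &inleft, &outptr, &outleft) == (size_t)(-1))
        return -1;
    *outptr = '\0';
    return 0;
}

/* Opens one descriptor, reuses it for two strings, then closes it. */
int latin1_to_utf8_twice(char *out1, char *out2, size_t outsize)
{
    int rc = 0;
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");

    if (cd == (iconv_t)(-1))
        return -1;
    if (convert_with(cd, "caf\xE9", out1, outsize) != 0)
        rc = -1;
    if (rc == 0 && convert_with(cd, "na\xEFve", out2, outsize) != 0)
        rc = -1;
    if (iconv_close(cd) != 0)     /* the descriptor is invalid after this */
        rc = -1;
    return rc;
}
```

Both encodings used here are stateless, so no reset is needed between the two
calls.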
A conversion descriptor contains a conversion state. After creation using
iconv_open, the descriptor is in the initial state. Using iconv modifies the
descriptor's conversion state. (This implies that a conversion descriptor
cannot be used in multiple threads simultaneously.)
To bring the state back to the initial state, call iconv with NULL as the
inbuf argument.
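A reset might look like the sketch below. The wrapper name reset_descriptor is
an assumption; it passes NULL for the output arguments as well, so nothing is
written and only the state is reset.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: returns cd to its initial conversion state.
   Returns 0 on success, -1 on failure. */
int reset_descriptor(iconv_t cd)
{
    /* NULL inbuf requests a reset; NULL outbuf means no shift
       sequence is written to any output buffer. */
    if (iconv(cd, NULL, NULL, NULL, NULL) == (size_t)(-1))
        return -1;
    return 0;
}
```

Such a reset is typically useful before reusing a descriptor for unrelated
input, for instance after a conversion was abandoned partway through.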
The iconv_open function returns a freshly allocated
conversion descriptor. In case of error, it sets errno and returns (iconv_t)(-1).
The following error can occur, among others:
- EINVAL
The conversion from fromcode to tocode is not supported by the implementation.
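Since the supported combinations are system dependent, a program may want to
probe for a conversion before relying on it. The following sketch checks the
return value and errno as described above; conversion_supported is a
hypothetical name, not a library function.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the conversion can be opened, 0 if the implementation
   reports EINVAL (not supported), and -1 on any other error. */
int conversion_supported(const char *tocode, const char *fromcode)
{
    iconv_t cd = iconv_open(tocode, fromcode);

    if (cd == (iconv_t)(-1)) {
        if (errno == EINVAL)
            return 0;
        fprintf(stderr, "iconv_open: %s\n", strerror(errno));
        return -1;
    }
    iconv_close(cd);   /* the probe does not keep the descriptor */
    return 1;
}
```

Note that the error value is (iconv_t)(-1), not NULL, so the comparison needs
the cast shown here.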
UNIX98
iconv(3), iconv_close(3)