Table of Contents
mbrtowc - convert a multibyte sequence to a wide character
#include <wchar.h>
size_t mbrtowc (wchar_t* pwc, const char* s, size_t n, mbstate_t* ps);
The
main case for this function is when s is not NULL and pwc is not NULL. In
this case, the mbrtowc function inspects at most n bytes of the multibyte
string starting at s, extracts the next complete multibyte character, converts
it to a wide character and stores it at *pwc. It updates the shift state
*ps. If the converted wide character is not L'\0', it returns the number of
bytes that were consumed from s. If the converted wide character is L'\0',
it resets the shift state *ps to the initial state and returns 0.
If the
n bytes starting at s do not contain a complete multibyte character, mbrtowc
keeps track of the partial multibyte character by updating *ps and returns
(size_t)(-2). This can happen even if n >= MB_CUR_MAX, if the multibyte string
contains redundant shift sequences.
If the multibyte string starting at
s contains an invalid multibyte sequence before the next complete character,
mbrtowc returns (size_t)(-1) and sets errno to EILSEQ. In this case, the
effects on *ps are undefined.
A different case is when s is not NULL but
pwc is NULL. In this case the mbrtowc function behaves as above, excepts
that it does not store the converted wide character in memory.
A third case
is when s is NULL. In this case, pwc and n are ignored. If *ps contains no
partially accumulated multibyte character, the mbrtowc function puts *ps
in the initial state and returns 0; otherwise it returns (size_t)(-1) and
sets errno to EILSEQ.
In all of the above cases, if ps is a NULL pointer,
a static anonymous state only known to the mbrtowc function is used instead.
The mbrtowc function returns the number of bytes parsed from
the multibyte sequence starting at s, if a non-L'\0' wide character was recognized.
It returns 0, if a L'\0' wide character was recognized. It returns (size_t)(-1)
and sets errno to EILSEQ, if an invalid multibyte sequence was encountered.
It returns (size_t)(-2) if it couldn't parse a complete multibyte character,
meaning that the remaining bytes should be fed to mbrtowc in a new call.
ISO/ANSI C, UNIX98
mbsrtowcs(3)
The behaviour
of mbrtowc depends on the LC_CTYPE category of the current locale.
Table of Contents