C <wchar.h> - mbrtowc() Function
The C <wchar.h> mbrtowc() function converts the multibyte character whose first byte is pointed to by str to a wide character and stores it at the location pointed to by pwc. The function returns the length in bytes of the multibyte character.
The function uses (and updates) the shift state described by ps. If ps is a null pointer, the function uses its own internal shift state, which is altered as necessary only by calls to this function.
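As a rough illustration of how an explicit shift state carries over between calls, the sketch below feeds a string to mbrtowc() one byte at a time. The helper name decode_bytewise is hypothetical and not part of <wchar.h>; the sketch assumes the caller has already selected a suitable locale with setlocale().

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of <wchar.h>): feeds the first n bytes of s
   to mbrtowc() one byte at a time. Returns the number of characters that
   were completed, or -1 if an invalid sequence was found. */
long decode_bytewise(const char *s, size_t n) {
    assert(s != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);      /* start in the initial shift state */

    long completed = 0;
    for (size_t i = 0; i < n; i++) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, s + i, 1, &state);
        if (r == (size_t)-1)
            return -1;                    /* invalid multibyte sequence */
        if (r == (size_t)-2)
            continue;                     /* partial character is kept in state */
        completed++;                      /* r is 0 or 1: a character was completed */
    }
    return completed;
}
```

Because the same state object is passed to every call, a character split across several calls can still be assembled. Passing a null pointer as ps would use the function's internal state instead, which is shared by all such calls in the program.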
If str points to a null character, the function resets the shift state and returns zero after storing the wide null character at pwc.
If str is a null pointer, the function resets the shift state, ignoring parameters pwc and max (no character is stored at pwc).
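The two special cases above can be checked with a small sketch. The helper name check_special_cases is hypothetical, and the sketch assumes the state object starts in the initial shift state.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of <wchar.h>): returns 1 if both special
   cases behave as described, 0 otherwise. */
int check_special_cases(void) {
    mbstate_t state;
    memset(&state, 0, sizeof state);
    assert(mbsinit(&state));              /* a zeroed object is the initial state */

    /* Case 1: str points to a null character.
       Zero is returned and the wide null character is stored at pwc. */
    wchar_t wc = L'x';
    size_t r1 = mbrtowc(&wc, "", 1, &state);
    int case1_ok = (r1 == 0 && wc == L'\0');

    /* Case 2: str is a null pointer.
       pwc and max are ignored, so wc keeps its previous value. */
    wc = L'x';
    size_t r2 = mbrtowc(&wc, NULL, 0, &state);
    int case2_ok = (r2 == 0 && wc == L'x');

    return case1_ok && case2_ok;
}
```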
Syntax
size_t mbrtowc(wchar_t *pwc, const char *str, size_t max, mbstate_t *ps);
Parameters
pwc | Specify pointer to the wide character for output.
str | Specify pointer to the multibyte character.
max | Specify maximum number of bytes in str that can be examined. size_t is an unsigned integral type.
ps | Specify pointer to the mbstate_t object used when interpreting the multibyte string.
Return Value
Returns the number of bytes from str used to produce the wide character.
If this was the null wide character, or if str is a null pointer, the function returns zero (in the first case, the null wide character is stored at pwc).
If the first max bytes of str form an incomplete (but potentially valid) multibyte character, the function returns (size_t)-2 (no value is stored at pwc).
Otherwise, if the bytes pointed to by str do not form a valid multibyte character (or the beginning of one), the function returns (size_t)-1 and sets errno to EILSEQ (no value is stored at pwc).
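Since the return type is the unsigned size_t, the two error values are best compared explicitly against (size_t)-1 and (size_t)-2 instead of being tested with a signed comparison. The sketch below shows one possible way to handle each return value; the helper name count_chars is hypothetical, and the sketch assumes the locale has already been set by the caller.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of <wchar.h>): counts the complete
   characters in the first n bytes of s. Counting stops at a null character
   or at an incomplete trailing character. Returns -1 on an invalid sequence. */
long count_chars(const char *s, size_t n) {
    assert(s != NULL);

    mbstate_t state;
    memset(&state, 0, sizeof state);      /* start in the initial shift state */

    long count = 0;
    while (n > 0) {
        wchar_t wc;
        size_t len = mbrtowc(&wc, s, n, &state);
        if (len == (size_t)-1)
            return -1;                    /* invalid sequence, errno is EILSEQ */
        if (len == (size_t)-2)
            break;                        /* incomplete character at the end */
        if (len == 0)
            break;                        /* null character reached */
        s += len;                         /* skip the bytes just consumed */
        n -= len;
        count++;
    }
    return count;
}
```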
Example:
The example below shows the usage of the mbrtowc() function.
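#include <locale.h>
#include <string.h>
#include <wchar.h>

void print_mb(const char* ch){
  mbstate_t temp;
  size_t len;
  wchar_t wc;
  const char* end = ch + strlen(ch);

  //start from the initial shift state
  memset(&temp, 0, sizeof temp);

  //stop at the null character (0), an invalid
  //sequence (-1) or an incomplete character (-2)
  while ((len = mbrtowc(&wc, ch, end - ch, &temp)) != 0
         && len != (size_t)-1 && len != (size_t)-2){
    wprintf(L"Next %zu bytes are the character %lc\n", len, wc);
    ch += len;
  }
}

int main(){
  setlocale(LC_ALL, "en_US.utf8");
  //the trailing \xC2 is an incomplete character
  const char* str = "\xE2\x88\x83y\xE2\x88\x80x\xC2";
  print_mb(str);
  return 0;
}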
The output of the above code will be:
Next 3 bytes are the character ∃
Next 1 bytes are the character y
Next 3 bytes are the character ∀
Next 1 bytes are the character x