class template
<codecvt>

std::codecvt_utf8_utf16

template < class Elem, unsigned long MaxCode = 0x10ffffUL, codecvt_mode Mode = (codecvt_mode)0 >
  class codecvt_utf8_utf16 : public codecvt <Elem, char, mbstate_t>
Convert between UTF-8 and UTF-16

Converts between multibyte sequences encoded in UTF-8 and UTF-16.

The facet uses Elem as its internal character type (encoded as UTF-16), and char as its external character type (encoded as UTF-8). Therefore:

Member in converts from UTF-8 to UTF-16.
Member out converts from UTF-16 to UTF-8.

Template parameters

Elem
The internal character type, aliased as member intern_type. This shall be a wide character type: wchar_t, char16_t or char32_t.
For 32-bit wide character types, each UTF-16 code unit produced by a conversion in is stored in its own wide character (as a 32-bit value).
The external character type in this facet is always char.
MaxCode
The largest code point that will be translated without reporting a conversion error.
Mode
Bitmask value of type codecvt_mode:
label            value  description
consume_header   4      An optional initial header sequence (BOM) is read to determine whether a multibyte sequence converted in is big-endian or little-endian.
generate_header  2      An initial header sequence (BOM) shall be generated to indicate whether a multibyte sequence converted out is big-endian or little-endian.
little_endian    1      The multibyte sequence generated on conversions out shall be little-endian (as opposed to the default big-endian).

Member types

The following aliases are member types of codecvt_utf8_utf16, inherited from codecvt:

member type  definition                           notes
intern_type  The first template parameter (Elem)  The internal character type (encoded as UTF-16).
extern_type  char                                 The external character type (encoded as UTF-8).
state_type   mbstate_t                            Conversion state type (see mbstate_t).
result       codecvt_base::result                 Enum type with the result of a conversion operation (see codecvt_base::result).

Public member functions inherited from codecvt


Conversion functions:
in, out, unshift

Character encoding properties:
always_noconv, encoding, length, max_length

Virtual protected member functions

The class defines its functionality through its virtual protected member functions:
member function    behavior in codecvt_utf8_utf16
do_always_noconv   Returns false (not all conversions will yield a noconv result).
do_encoding        Returns 0 (the external encoding is not fixed-width).
do_in              Converts from UTF-8 to UTF-16.
do_length          Returns the number of external characters that would be converted (for codecvt::length).
do_max_length      Returns the maximum length (in bytes) of an encoded code point.
do_out             Converts from UTF-16 to UTF-8.
do_unshift         Brings the mbstate_t object to an initial state.
(destructor)       Releases resources.

Example

// codecvt_utf8_utf16 example
#include <iostream>
#include <locale>
#include <string>
#include <codecvt>

int main ()
{
  std::wstring_convert<std::codecvt_utf8_utf16<char16_t>,char16_t> conversion;
  std::string mbs = conversion.to_bytes( u"\u4f60\u597d" );  // ni hao (你好)

  // print out hex value of each byte:
  std::cout << std::hex;
  for (std::string::size_type i=0; i<mbs.length(); ++i)
    std::cout << int(static_cast<unsigned char>(mbs[i])) << ' ';
  std::cout << '\n';

  return 0;
}


Output:

e4 bd a0 e5 a5 bd 
