Should Unicode literals be guaranteed to be well-formed?

TL;DR: Betteridge’s law applies. No.

Are you still here?

Unicode Literals

In C++20 there are 2 kinds and 6 forms of Unicode literals: character literals and string literals, each in UTF-8, UTF-16, and UTF-32 encodings. Each of them uses a distinct char type to signal in the type system what the encoding is for the …