@@ -840,7 +840,7 @@ There's another encoding that is able to encoding the full range of Unicode
 characters: UTF-8. UTF-8 is an 8-bit encoding, which means there are no issues
 with byte order in UTF-8. Each byte in a UTF-8 byte sequence consists of two
 parts: Marker bits (the most significant bits) and payload bits. The marker bits
-are a sequence of zero to six 1 bits followed by a 0 bit. Unicode characters are
+are a sequence of zero to four ``1`` bits followed by a ``0`` bit. Unicode characters are
 encoded like this (with x being payload bits, which when concatenated give the
 Unicode character):
 
@@ -853,12 +853,7 @@ Unicode character):
 +-----------------------------------+----------------------------------------------+
 | ``U-00000800`` ... ``U-0000FFFF`` | 1110xxxx 10xxxxxx 10xxxxxx                   |
 +-----------------------------------+----------------------------------------------+
-| ``U-00010000`` ... ``U-001FFFFF`` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx          |
-+-----------------------------------+----------------------------------------------+
-| ``U-00200000`` ... ``U-03FFFFFF`` | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
-+-----------------------------------+----------------------------------------------+
-| ``U-04000000`` ... ``U-7FFFFFFF`` | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
-|                                   | 10xxxxxx                                     |
+| ``U-00010000`` ... ``U-0010FFFF`` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx          |
 +-----------------------------------+----------------------------------------------+
 
 The least significant bit of the Unicode character is the rightmost x bit.
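
As a sanity check on the revised table, here is a minimal Python sketch that builds the byte sequence by hand from the marker and payload bit layout and can be compared against the built-in `str.encode("utf-8")`. The function name `utf8_encode` is hypothetical and not part of the patched document. The sketch assumes the first two table rows, which this diff does not show, are the standard one-byte and two-byte forms, and it does not special-case surrogate code points, which the built-in codec rejects.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point using the marker/payload layout (illustrative)."""
    if cp < 0 or cp > 0x10FFFF:
        # Upper bound after the change: U-0010FFFF, not U-7FFFFFFF.
        raise ValueError("code point outside U-00000000 ... U-0010FFFF")
    if cp <= 0x7F:
        # 0xxxxxxx: zero 1 bits, then a 0 bit, then 7 payload bits.
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (cp >> 6),
                      0b10000000 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (cp >> 12),
                      0b10000000 | ((cp >> 6) & 0x3F),
                      0b10000000 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (four 1 marker bits at most)
    return bytes([0b11110000 | (cp >> 18),
                  0b10000000 | ((cp >> 12) & 0x3F),
                  0b10000000 | ((cp >> 6) & 0x3F),
                  0b10000000 | (cp & 0x3F)])


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u20ac", "\U0001f600", "\U0010ffff"):
        # The hand-built bytes should match Python's own UTF-8 codec.
        print(f"U+{ord(ch):06X}", utf8_encode(ord(ch)).hex(" "),
              utf8_encode(ord(ch)) == ch.encode("utf-8"))
```

The longest lead byte produced is `11110xxx`, which is consistent with the new wording "zero to four ``1`` bits" and with the removal of the five-byte and six-byte rows.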