@@ -839,7 +839,7 @@ There's another encoding that is able to encoding the full range of Unicode
 characters: UTF-8. UTF-8 is an 8-bit encoding, which means there are no issues
 with byte order in UTF-8. Each byte in a UTF-8 byte sequence consists of two
 parts: Marker bits (the most significant bits) and payload bits. The marker bits
-are a sequence of zero to six 1 bits followed by a 0 bit. Unicode characters are
+are a sequence of zero to four ``1`` bits followed by a ``0`` bit. Unicode characters are
 encoded like this (with x being payload bits, which when concatenated give the
 Unicode character):
 
@@ -852,12 +852,7 @@ Unicode character):
 +-----------------------------------+----------------------------------------------+
 | ``U-00000800`` ... ``U-0000FFFF`` | 1110xxxx 10xxxxxx 10xxxxxx                   |
 +-----------------------------------+----------------------------------------------+
-| ``U-00010000`` ... ``U-001FFFFF`` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx          |
-+-----------------------------------+----------------------------------------------+
-| ``U-00200000`` ... ``U-03FFFFFF`` | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
-+-----------------------------------+----------------------------------------------+
-| ``U-04000000`` ... ``U-7FFFFFFF`` | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
-|                                   | 10xxxxxx                                     |
+| ``U-00010000`` ... ``U-0010FFFF`` | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx          |
 +-----------------------------------+----------------------------------------------+
 
 The least significant bit of the Unicode character is the rightmost x bit.
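
Review note: the narrowed table can be checked against Python's built-in encoder. The following is a minimal sketch, not code from this change. The function name `utf8_encode_code_point` is hypothetical. It assumes the standard byte layouts for the two shortest ranges (`0xxxxxxx` and `110xxxxx 10xxxxxx`), which lie outside the hunks shown here, and it rejects surrogates, which the table itself does not mention.

```python
def utf8_encode_code_point(cp: int) -> bytes:
    """Encode one code point by hand using the marker/payload bit layout."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point outside U-00000000 ... U-0010FFFF")
    if 0xD800 <= cp <= 0xDFFF:
        # Not part of the table: surrogates are not valid in UTF-8.
        raise ValueError("surrogate code points cannot be encoded")

    def cont(shift: int) -> int:
        # Continuation byte: marker bits 10, then six payload bits.
        return 0b10000000 | ((cp >> shift) & 0b00111111)

    if cp <= 0x7F:        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:       # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (cp >> 6), cont(0)])
    if cp <= 0xFFFF:      # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (cp >> 12), cont(6), cont(0)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (cp >> 18), cont(12), cont(6), cont(0)])


# Compare the hand-rolled result with the built-in codec.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600", "\U0010FFFF"):
    assert utf8_encode_code_point(ord(ch)) == ch.encode("utf-8")
```

The longest lead byte produced is `11110xxx`, which has four `1` marker bits, consistent with the "zero to four" wording introduced in the first hunk.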