Commit c26e095

Sync docs for practice exercise micro-blog (exercism#2454)
1 parent 9d057c5 commit c26e095

1 file changed

Lines changed: 26 additions & 30 deletions

@@ -1,41 +1,37 @@
 # Instructions

-You have identified a gap in the social media market for very very short
-posts. Now that Twitter allows 280 character posts, people wanting quick
-social media updates aren't being served. You decide to create your own
-social media network.
+You have identified a gap in the social media market for very very short posts.
+Now that Twitter allows 280 character posts, people wanting quick social media updates aren't being served.
+You decide to create your own social media network.

-To make your product noteworthy, you make it extreme and only allow posts
-of 5 or less characters. Any posts of more than 5 characters should be
-truncated to 5.
+To make your product noteworthy, you make it extreme and only allow posts of 5 or less characters.
+Any posts of more than 5 characters should be truncated to 5.

-To allow your users to express themselves fully, you allow Emoji and
-other Unicode.
+To allow your users to express themselves fully, you allow Emoji and other Unicode.

 The task is to truncate input strings to 5 characters.

 ## Text Encodings

 Text stored digitally has to be converted to a series of bytes.
 There are 3 ways to map characters to bytes in common use.
-* **ASCII** can encode English language characters. All
-characters are precisely 1 byte long.
-* **UTF-8** is a Unicode text encoding. Characters take between 1
-and 4 bytes.
-* **UTF-16** is a Unicode text encoding. Characters are either 2 or
-4 bytes long.
-
-UTF-8 and UTF-16 are both Unicode encodings which means they're capable of
-representing a massive range of characters including:
-* Text in most of the world's languages and scripts
-* Historic text
-* Emoji
-
-UTF-8 and UTF-16 are both variable length encodings, which means that
-different characters take up different amounts of space.
-
-Consider the letter 'a' and the emoji '😛'. In UTF-16 the letter takes
-2 bytes but the emoji takes 4 bytes.
-
-The trick to this exercise is to use APIs designed around Unicode
-characters (codepoints) instead of Unicode codeunits.
+
+- **ASCII** can encode English language characters.
+All characters are precisely 1 byte long.
+- **UTF-8** is a Unicode text encoding.
+Characters take between 1 and 4 bytes.
+- **UTF-16** is a Unicode text encoding.
+Characters are either 2 or 4 bytes long.
+
+UTF-8 and UTF-16 are both Unicode encodings which means they're capable of representing a massive range of characters including:
+
+- Text in most of the world's languages and scripts
+- Historic text
+- Emoji
+
+UTF-8 and UTF-16 are both variable length encodings, which means that different characters take up different amounts of space.
+
+Consider the letter 'a' and the emoji '😛'.
+In UTF-16 the letter takes 2 bytes but the emoji takes 4 bytes.
+
+The trick to this exercise is to use APIs designed around Unicode characters (codepoints) instead of Unicode codeunits.
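
The closing hint of the instructions can be illustrated with a minimal Python sketch. It is not part of this commit and not taken from any track's reference solution. It relies on Python's `str` being indexed by code point, so a plain slice truncates by characters in the sense the instructions use. The helper names `truncate`, `utf8_bytes`, and `utf16_code_units` are hypothetical and exist only for this example.

```python
def truncate(text: str, limit: int = 5) -> str:
    """Return at most `limit` Unicode code points from the start of `text`."""
    # Python strings are sequences of code points, so slicing never
    # cuts a multi-byte or surrogate-pair character in half.
    return text[:limit]


def utf8_bytes(text: str) -> int:
    """Number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units `text` occupies when encoded as UTF-16."""
    # "utf-16-le" is used so that no byte order mark is prepended.
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    print(truncate("Hello, world"))      # ASCII input
    print(truncate("😛😛😛😛😛😛😛"))    # emoji outside the Basic Multilingual Plane
    print(utf8_bytes("a"), utf8_bytes("😛"))
    print(utf16_code_units("a"), utf16_code_units("😛"))
```

Here 'a' is one code point and one UTF-16 code unit, while '😛' is one code point but two UTF-16 code units, which is why an API that counts code units would cut the emoji string in the wrong place. Note that a code point is still not always what a reader perceives as a single character: some emoji are sequences of several code points, which this sketch does not attempt to keep together.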
