Detection and Tiny sequences

**Describe the bug**
The issue pertains to the charset_normalizer.detect() method, which fails to perform valid encoding detection for expression sequences. Specifically, the method incorrectly recognizes the sequence as Big5 and returns it as a Chinese character. This behavior can be observed with the provided code snippet:

**To Reproduce**
Execute the code snippet, where the charset_normalizer.detect() method misidentifies the encoding of the sequence.
```
import charset_normalizer
text = b"What Actions Will Keep Us at 1.5-2\xbaC?"
text.decode(charset_normalizer.detect(text)["encoding"])
```


**Expected behavior**
The expected behavior can be demonstrated using the chardet library, which accurately recognizes the encoding as ISO 8859-1. The correct degree character is then returned from the sequence:

```
import chardet
text = b"What Actions Will Keep Us at 1.5-2\xbaC?"
text.decode(chardet.detect(text)["encoding"])
```

**Logs**

**charset-normalizer:**
```What Actions Will Keep Us at 1.5-2慢?```

**chardet:**
```What Actions Will Keep Us at 1.5-2ºC?```

```
{'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}  # chardet
{'encoding': 'Big5', 'language': 'Chinese', 'confidence': 1.0}  # charset-normalizer
```

**Desktop (please complete the following information):**
 - OS: Linux
 - Python version 3.10
 - Package version 3.3.2 / 2.1.1


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Detection and Tiny sequences #391

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Detection and Tiny sequences #391

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions