Commit 66df98d

Added codecs: pkzip*
1 parent e443fd4 commit 66df98d

9 files changed

Lines changed: 113 additions & 27 deletions

README.md

Lines changed: 3 additions & 0 deletions
@@ -191,6 +191,9 @@ o
 `navajo` | text <-> Navajo | only handles letters (not full words from the Navajo dictionary)
 `octal` | text <-> octal digits | dummy octal conversion (converts to 3-digits groups)
 `ordinal` | text <-> ordinal digits | dummy character ordinals conversion (converts to 3-digits groups)
+`pkzip_deflate` | text <-> deflated text | standard Zip-deflate compression/decompression
+`pkzip_bzip2` | text <-> Bzipped text | standard BZip2 compression/decompression
+`pkzip_lzma` | text <-> LZMA-compressed text | standard LZMA compression/decompression
 `radio` | text <-> radio words | aka NATO or radio phonetic alphabet
 `resistor` | text <-> resistor colors | aka resistor color codes
 `rot` | text <-> rot(N) ciphertext | aka Caesar cipher (N belongs to [1,25])

codext/VERSION.txt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.8.2
+1.8.3

codext/compressions/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# -*- coding: UTF-8 -*-
+from .gzipp import *
+from .pkzip import *
+

Lines changed: 0 additions & 1 deletion
@@ -16,7 +16,6 @@
 
 
 __examples__ = {'enc-dec(gzip)': ["test", "This is a test"]}
-__guess__ = ["gzip"]
 
 
 def gzip_encode(text, errors="strict"):

codext/compressions/pkzip.py

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+# -*- coding: UTF-8 -*-
+"""Pkzip Codec - pkzip content compression.
+
+NB: Not an encoding properly speaking.
+
+This codec:
+- en/decodes strings from str to str
+- en/decodes strings from bytes to bytes
+- decodes file content to str (read)
+- encodes file content from str to bytes (write)
+"""
+import zipfile
+
+from ..__common__ import *
+
+
+_str = ["test", "This is a test", "@random{1024}"]
+__examples1__ = {'enc-dec(pkzip-deflate|deflate)': _str}
+__examples2__ = {'enc-dec(pkzip_bz2|bzip2)': _str}
+__examples3__ = {'enc-dec(pkzip-lzma|lzma)': _str}
+
+
+if PY3:
+    def pkzip_encode(compression_type):
+        def _encode(text, errors="strict"):
+            c = zipfile._get_compressor(compression_type)
+            return c.compress(b(text)) + c.flush(), len(text)
+        return _encode
+
+
+    def pkzip_decode(compression_type):
+        def _decode(data, errors="strict"):
+            d = zipfile._get_decompressor(compression_type)
+            r = d.decompress(b(data))
+            return r, len(r)
+        return _decode
+
+
+    add("pkzip_deflate", pkzip_encode(8), pkzip_decode(8), r"(?:(?:pk)?zip[-_])?deflate",
+        entropy=7.9, examples=__examples1__, guess=["deflate"])
+
+    add("pkzip_bzip2", pkzip_encode(12), pkzip_decode(12), r"(?:(?:pk)?zip[-_])?bz(?:ip)?2",
+        entropy=7.9, examples=__examples2__, guess=["bz2"])
+
+    add("pkzip_lzma", pkzip_encode(14), pkzip_decode(14), r"(?:(?:pk)?zip[-_])?lzma",
+        entropy=7.9, examples=__examples3__, guess=["lzma"])
+
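
Note on the compression types: the integers 8, 12 and 14 passed to `pkzip_encode` and `pkzip_decode` correspond to the public constants `zipfile.ZIP_DEFLATED`, `zipfile.ZIP_BZIP2` and `zipfile.ZIP_LZMA`. The sketch below is not part of the commit. It reproduces the deflate and bzip2 round trips using only public standard-library calls, on the assumption that the private `zipfile` helpers produce a raw deflate stream (`wbits=-15`) and a plain `bz2` stream, as CPython's implementation appears to do. The names `COMPRESSION_TYPES`, `raw_deflate`, `raw_inflate` and `bzip2_roundtrip` are hypothetical and chosen for illustration only.

```python
import bz2
import zipfile
import zlib

# Public names for the integer compression types used in the add(...) calls.
COMPRESSION_TYPES = {
    "pkzip_deflate": zipfile.ZIP_DEFLATED,  # 8
    "pkzip_bzip2": zipfile.ZIP_BZIP2,       # 12
    "pkzip_lzma": zipfile.ZIP_LZMA,         # 14
}


def raw_deflate(data: bytes) -> bytes:
    """Compress to a raw deflate stream (no zlib header, no checksum)."""
    # Assumption: wbits=-15 mirrors what zipfile uses for ZIP_DEFLATED.
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()


def raw_inflate(data: bytes) -> bytes:
    """Decompress a raw deflate stream produced by raw_deflate()."""
    d = zlib.decompressobj(-15)
    return d.decompress(data) + d.flush()


def bzip2_roundtrip(data: bytes) -> bytes:
    """Compress then decompress with the incremental bz2 objects."""
    c = bz2.BZ2Compressor()
    compressed = c.compress(data) + c.flush()
    return bz2.BZ2Decompressor().decompress(compressed)


if __name__ == "__main__":
    sample = b"a test string"
    print(raw_inflate(raw_deflate(sample)) == sample)
    print(bzip2_roundtrip(sample) == sample)
```

Using the named constants instead of bare integers would make the `add(...)` calls self-describing, at no cost in behavior.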

codext/others/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -1,6 +1,5 @@
 # -*- coding: UTF-8 -*-
 from .dna import *
-from .gzipp import *
 from .html import *
 from .letters import *
 from .markdown import *

docs/enc/compressions.md

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+`codext` provides a few common compression codecs.
+
+-----
+
+### GZip
+
+**Codec** | **Conversions** | **Aliases** | **Comment**
+:---: | :---: | --- | ---
+`gzip` | data <-> GZipped data | | decoding tries with and without the file signature
+
+```python
+>>> codext.encode('test', "gzip")
+'\x1f\x8b\x08\x00\x0esÛ_\x02ÿ+I-.\x01\x00\x0c~\x7fØ\x04\x00\x00\x00'
+>>> codext.decode('\x1f\x8b\x08\x00\x0esÛ_\x02ÿ+I-.\x01\x00\x0c~\x7fØ\x04\x00\x00\x00', "gzip")
+'test'
+```
+
+-----
+
+### PKZip
+
+This implements multiple compression types available in the native [`zipfile`](https://docs.python.org/3/library/zipfile.html) library.
+
+**Codec** | **Conversions** | **Aliases** | **Comment**
+:---: | :---: | --- | ---
+`pkzip_deflate` | data <-> Deflated data | `deflate`, `zip_deflate` | Python3 only
+`pkzip_bzip2` | data <-> Bzipped data | `bz2`, `bzip2`, `zip_bz2` | Python3 only
+`pkzip_lzma` | data <-> LZMA-compressed data | `lzma`, `zip_lzma` | Python3 only
+
+```python
+>>> codecs.encode("a test string", "deflate")
+'KT(I-.Q(.)ÊÌK\x07\x00'
+>>> codecs.decode("KT(I-.Q(.)ÊÌK\x07\x00", "zip_deflate")
+'a test string'
+```
+
+```python
+>>> codecs.encode("a test string", "bzip2")
+'BZh91AY&SY°\x92µÏ\x00\x00\x01\x11\x80@\x00\x1c\x00 \x00"\x1a\x07¤ É\x88u\x95Á`Òñw$S\x85\t\x0b\t+\\ð'
+>>> codecs.decode("BZh91AY&SY°\x92µÏ\x00\x00\x01\x11\x80@\x00\"¡\x1c\x00 \x00\"\x1a\x07¤ É\x88u\x95Á`Òñw$S\x85\t\x0b\t+\\ð", "bz2")
+'a test string'
+```
+
+```python
+>>> codecs.encode("a test string", "lzma")
+'\t\x04\x05\x00]\x00\x00\x80\x00\x000\x88\n\x86\x94\\Uf\x14Þ\x82*\x11ëê\x93fÿý\x84 \x00'
+>>> codecs.decode("\t\x04\x05\x00]\x00\x00\x80\x00\x000\x88\n\x86\x94\\Uf\x14Þ\x82*\x11ëê\x93fÿý\x84 \x00", "zip_lzma")
+'a test string'
+```
+
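
Note on the aliases: the names in the **Aliases** column follow from the regular expressions passed to `add(...)` in `codext/compressions/pkzip.py`. The sketch below is not part of the commit. It checks candidate codec names against those three patterns with `re.fullmatch`, assuming that a name must match its pattern in full; how `codext` applies the patterns internally is not shown in this diff. The `PATTERNS` table and the `resolve` helper are hypothetical.

```python
import re

# Patterns copied from the add(...) calls in codext/compressions/pkzip.py.
PATTERNS = {
    "pkzip_deflate": r"(?:(?:pk)?zip[-_])?deflate",
    "pkzip_bzip2": r"(?:(?:pk)?zip[-_])?bz(?:ip)?2",
    "pkzip_lzma": r"(?:(?:pk)?zip[-_])?lzma",
}


def resolve(name):
    """Return the codec whose pattern fully matches `name`, else None."""
    for codec, pattern in PATTERNS.items():
        if re.fullmatch(pattern, name):
            return codec
    return None


if __name__ == "__main__":
    for alias in ("deflate", "zip_deflate", "bz2", "zip_bz2", "pkzip-lzma", "gzip"):
        print(alias, "->", resolve(alias))
```

Under that assumption, each pattern accepts the bare algorithm name as well as the `zip` and `pkzip` prefixed forms with either `-` or `_` as separator, which is slightly broader than the aliases listed in the table.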

docs/enc/others.md

Lines changed: 0 additions & 17 deletions
@@ -29,23 +29,6 @@ CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT
 
 -----
 
-### GZip
-
-This is, of course, not an encoding properly speaking, but it is implemented for the sake of convenience.
-
-**Codec** | **Conversions** | **Aliases** | **Comment**
-:---: | :---: | --- | ---
-`gzip` | data <-> GZipped data | | decoding tries with and without the file signature
-
-```python
->>> codext.encode('test', "gzip")
-'\x1f\x8b\x08\x00\x0esÛ_\x02ÿ+I-.\x01\x00\x0c~\x7fØ\x04\x00\x00\x00'
->>> codext.decode('\x1f\x8b\x08\x00\x0esÛ_\x02ÿ+I-.\x01\x00\x0c~\x7fØ\x04\x00\x00\x00', "gzip")
-'test'
-```
-
------
-
 ### HTML Entities
 
 This implements the full list of characters available at [this reference](https://dev.w3.org/html5/html-author/charref).

mkdocs.yml

Lines changed: 8 additions & 7 deletions
@@ -7,13 +7,14 @@ pages:
 - Features: features.md
 - Encodings: encodings.md
 - Encodings:
-  - 'Base': enc/base.md
-  - 'Binary': enc/binary.md
-  - 'Common': enc/common.md
-  - 'Cryptography': enc/crypto.md
-  - 'Languages': enc/languages.md
-  - 'Others': enc/others.md
-  - 'Steganography': enc/stegano.md
+  - Base: enc/base.md
+  - Binary: enc/binary.md
+  - Common: enc/common.md
+  - Compressions: enc/compressions.md
+  - Cryptography: enc/crypto.md
+  - Languages: enc/languages.md
+  - Others: enc/others.md
+  - Steganography: enc/stegano.md
 - 'String manipulations': manipulations.md
 - 'CLI tool': cli.md
 - 'Create your codec': howto.md
