javaobj-py3
###########
.. image:: https://img.shields.io/pypi/v/javaobj-py3.svg
:target: https://pypi.python.org/pypi/javaobj-py3/
:alt: Latest Version
.. image:: https://img.shields.io/pypi/l/javaobj-py3.svg
:target: https://pypi.python.org/pypi/javaobj-py3/
:alt: License
.. image:: https://travis-ci.org/tcalmant/python-javaobj.svg?branch=master
:target: https://travis-ci.org/tcalmant/python-javaobj
:alt: Travis-CI status
.. image:: https://coveralls.io/repos/tcalmant/python-javaobj/badge.svg?branch=master
:target: https://coveralls.io/r/tcalmant/python-javaobj?branch=master
:alt: Coveralls status
*python-javaobj* is a python library that provides functions for reading and
writing (writing is WIP currently) Java objects serialized or will be
deserialized by ``ObjectOutputStream``. This form of object representation is a
standard data interchange format in Java world.
The ``javaobj`` module exposes an API familiar to users of the standard library
``marshal``, ``pickle`` and ``json`` modules.
About this repository
=====================
This project is a fork of *python-javaobj* by Volodymyr Buell, originally from
`Google Code `_ and now hosted on
`GitHub `_.
This fork intends to work both on Python 2.7 and Python 3.4+.
Compatibility Warnings
======================
New implementation of the parser
--------------------------------
:Implementations: ``v1``, ``v2``
:Version: ``0.4.0``+
Since version 0.4.0, two implementations of the parser are available:
* ``v1``: the *classic* implementation of ``javaobj``, with a work in progress
implementation of a writer.
* ``v2``: the *new* implementation, which is a port of the Java project
`jdeserialize `_,
with support of the object transformer (with a new API) and of the ``numpy``
arrays loading.
You can use the ``v1`` parser to ensure that the behaviour of your scripts
doesn't change and to keep the ability to write down files.
You can use the ``v2`` parser for new developments
*which won't require marshalling* and as a *fallback* if the ``v1``
fails to parse a file.
Object transformers V1
----------------------
:Implementations: ``v1``
:Version: ``0.2.0``+
As of version 0.2.0, the notion of *object transformer* from the original
project as been replaced by an *object creator*.
The *object creator* is called before the deserialization.
This allows to store the reference of the converted object before deserializing
it, and avoids a mismatch between the referenced object and the transformed one.
Object transformers V2
----------------------
:Implementations: ``v2``
:Version: ``0.4.0``+
The ``v2`` implementation provides a new API for the object transformers.
Please look at the *Usage (V2)* section in this file.
Bytes arrays
------------
:Implementations: ``v1``
:Version: ``0.2.3``+
As of version 0.2.3, bytes arrays are loaded as a ``bytes`` object instead of
an array of integers.
Features
========
* Java object instance un-marshalling
* Java classes un-marshalling
* Primitive values un-marshalling
* Automatic conversion of Java Collections to python ones
(``HashMap`` => ``dict``, ``ArrayList`` => ``list``, etc.)
* Basic marshalling of simple Java objects (``v1`` implementation only)
Requirements
============
* Python >= 2.7 or Python >= 3.4
* ``enum34`` and ``typing`` when using Python <= 3.4 (installable with ``pip``)
* Maven 2+ (for building test data of serialized objects.
You can skip it if you do not plan to run ``tests.py``)
Usage (V1 implementation)
=========================
Un-marshalling of Java serialised object:
.. code-block:: python
import javaobj
with open("obj5.ser", "rb") as fd:
jobj = fd.read()
pobj = javaobj.loads(jobj)
print(pobj)
Or, you can use ``JavaObjectUnmarshaller`` object directly:
.. code-block:: python
import javaobj
with open("objCollections.ser", "rb") as fd:
marshaller = javaobj.JavaObjectUnmarshaller(fd)
pobj = marshaller.readObject()
print(pobj.value, "should be", 17)
print(pobj.next, "should be", True)
pobj = marshaller.readObject()
**Note:** The objects and methods provided by ``javaobj`` module are shortcuts
to the ``javaobj.v1`` package, for Compatibility purpose.
It is **recommended** to explicitly import methods and classes from the ``v1``
(or ``v2``) package when writing new code, in order to be sure that your code
won't need import updates in the future.
Usage (V2 implementation)
=========================
The following methods are provided by the ``javaobj.v2`` package:
* ``load(fd, *transformers, use_numpy_arrays=False)``:
Parses the content of the given file descriptor, opened in binary mode (`rb`).
The method accepts a list of custom object transformers. The default object
transformer is always added to the list.
The ``use_numpy_arrays`` flag indicates that the arrays of primitive type
elements must be loaded using ``numpy`` (if available) instead of using the
standard parsing technic.
* ``loads(bytes, *transformers, use_numpy_arrays=False)``:
This the a shortcut to the ``load()`` method, providing it the binary data
using a ``BytesIO`` object.
**Note:** The V2 parser doesn't have the marshalling capability.
Sample usage:
.. code-block:: python
import javaobj.v2 as javaobj
with open("obj5.ser", "rb") as fd:
pobj = javaobj.load(fd)
print(pobj.dump())
Object Transformer
-------------------
An object transformer can be called during the parsing of a Java object
instance or while loading an array.
The Java object instance parsing works in two main steps:
1. The transformer is called to create an instance of a bean that inherits
``JavaInstance``.
2. The latter bean is then called:
* When the object is written with a custom block data
* After the fields and annotations have been parsed, to update the content of
the Python bean.
Here is an example for a Java ``HashMap`` object. You can look at the code of
the ``javaobj.v2.transformer`` module to see the whole implementation.
.. code-block:: python
class JavaMap(dict, javaobj.v2.beans.JavaInstance):
"""
Inherits from dict for Python usage, JavaInstance for parsing purpose
"""
def __init__(self):
# Don't forget to call both constructors
dict.__init__(self)
JavaInstance.__init__(self)
def load_from_blockdata(self, parser, reader, indent=0):
"""
Reads content stored in a block data.
This method is called only if the class description has both the
``SC_EXTERNALIZABLE`` and ``SC_BLOCK_DATA`` flags set.
The stream parsing will stop and fail if this method returns False.
:param parser: The JavaStreamParser in use
:param reader: The underlying data stream reader
:param indent: Indentation to use in logs
:return: True on success, False on error
"""
# This kind of class is not supposed to have the SC_BLOCK_DATA flag set
return False
def load_from_instance(self, indent=0):
# type: (int) -> bool
"""
Load content from the parsed instance object.
This method is called after the block data (if any), the fields and
the annotations have been loaded.
:param indent: Indentation to use while logging
:return: True on success (currently ignored)
"""
# Maps have their content in their annotations
for cd, annotations in self.annotations.items():
# Annotations are associated to their definition class
if cd.name == "java.util.HashMap":
# We are in the annotation created by the handled class
# Group annotation elements 2 by 2
# (storage is: key, value, key, value, ...)
args = [iter(annotations[1:])] * 2
for key, value in zip(*args):
self[key] = value
# Job done
return True
# Couldn't load the data
return False
class MapObjectTransformer(javaobj.v2.api.ObjectTransformer):
"""
Creates a JavaInstance object with custom loading methods for the
classes it can handle
"""
def create_instance(self, classdesc):
# type: (JavaClassDesc) -> Optional[JavaInstance]
"""
Transforms a parsed Java object into a Python object
:param classdesc: The description of a Java class
:return: The Python form of the object, or the original JavaObject
"""
if classdesc.name == "java.util.HashMap":
# We can handle this class description
return JavaMap()
else:
# Return None if the class is not handled
return None