Commit 65c6a9a

Dev commits
1 parent 2aa2a8b commit 65c6a9a

6,782 files changed

Lines changed: 448852 additions & 123495 deletions


.github/lock.yml

Lines changed: 0 additions & 34 deletions
This file was deleted.

ADRs/0002-ONNX_Runtime.md

Lines changed: 58 additions & 0 deletions
# Onnx runtime module

## Status
Proposed

Proposed by: Adam Gibson (23-09-2020)

Discussed with: saudet

## Context

We need a way for nd4j to run onnx models that is easily compatible with the
onnx community. The gold standard for this
is using [onnxruntime](https://github.com/microsoft/onnxruntime/blob/master/docs/Java_API.md).

## Decision

We will use javacpp's onnxruntime bindings in a similar manner to [nd4j-tensorflow](../nd4j-tensorflow),
allowing nd4j to be used as an ndarray format that interops with onnxruntime.

We will implement a simple api similar to the [GraphRunner](../nd4j-tensorflow/src/main/java/org/nd4j/tensorflow/conversion/graphrunner/GraphRunner.java).
This will sit on top of javacpp's lower level onnxruntime bindings.

This module will follow a similar structure to the nd4j-tensorflow module,
focusing on INDArrays as a data interchange format but otherwise passing execution
down to onnxruntime.

The main api to the graph runner works as follows:

```java
try (GraphRunner runner = new GraphRunner(...)) {
    Map<String, INDArray> inputs = new HashMap<>();
    // ..initialize inputs
    Map<String, INDArray> outputs = runner.run(inputs);
    // process outputs...
}
```

The core logic will contain the following components:

1. Loading onnx pb files
2. A graph runner similar in nature to nd4j-tensorflow
3. Interop with onnxruntime's version of an ndarray/tensor

Using different accelerators/backends
-----------------------------------------

Similar to nd4j-tensorflow, which uses javacpp to select the specific version of
tensorflow to use, this module will rely on the user picking the right dependency
to link against. Different builds for cpu, gpu, .. exist [here](https://repo1.maven.org/maven2/org/bytedeco/tensorflow/1.15.3-1.5.4/).
The equivalent of this in onnxruntime can be found [here](https://repo1.maven.org/maven2/org/bytedeco/onnxruntime/1.4.0-1.5.4/).

The user will need to include the version of onnxruntime they wish to use,
similar to how you link against a particular implementation of a c library
or include a backend in nd4j. This will happen via maven.
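As a sketch of what that maven inclusion could look like, assuming the coordinates visible in the repository listing linked above (org.bytedeco:onnxruntime, version 1.4.0-1.5.4); the exact artifact, version, and any platform or gpu classifier should be taken from that listing rather than from this example:

```xml
<!-- Sketch only: pick the build (cpu, gpu, ..) by choosing the appropriate
     classifier or the corresponding -platform artifact from the linked repository. -->
<dependency>
  <groupId>org.bytedeco</groupId>
  <artifactId>onnxruntime</artifactId>
  <version>1.4.0-1.5.4</version>
</dependency>
```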

ADRs/0003-Import_IR.md

Lines changed: 251 additions & 0 deletions
# Import IR

## Status

Proposed

Proposed by: Adam Gibson (28-09-2020)

Discussed with: Paul Dubs

## Context

Currently, there is a gap between the way samediff/nd4j operations are implemented
and how other frameworks represent their models.

Keras, Tensorflow, and Pytorch use an attribute based format with names. Interop
between Onnx, Tensorflow, and Keras tends to follow this formula:

1. Map names to equivalent names in the other framework for each operation
configuration. Names means both op names and the associated attributes of the
operations, such as the strides and kernel sizes in Conv2D.
2. Map input/output tensors to the equivalent tensor type in each framework.
3. Set up the complete graph in the equivalent framework. Sometimes the
frameworks' concepts don't map 1 to 1, but they should output equivalent results
regardless. In order to do this, a framework sometimes needs to
add/remove operations to produce equivalent output in a different
graph. The [tensorflow onnx import](https://github.com/onnx/tensorflow-onnx#how-tf2onnx-works)
is a good example of this.

Samediff/nd4j represent their internal ops as a set of ordered
arguments for execution, in the form of:

1. t arguments: floating point arguments (float, double, ..)
2. integer arguments: integer arguments (long, integer)
3. boolean arguments: boolean arguments
4. data type arguments: data types for input/output
5. input arguments: ndarrays for input
6. output arguments: often optional (dynamically created) output ndarray
arguments. If the user wants to pass in outputs to control memory, they are
allowed to do so.
7. axis arguments: integer arguments that represent the dimension(s) for an
operation to be executed on.

[Reference implementation](https://github.com/KonduitAI/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/DynamicCustomOp.java#L58)

This maps well enough for execution, but not for file formats.

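To make the ordered-argument convention concrete, here is a rough sketch of assembling a conv1d call through the DynamicCustomOp reference implementation linked above, with the integer arguments passed purely by position. The constructor and method names follow that class, but treat the exact signatures (and the conv1d argument order, taken from the conv1d.cpp snippet in Appendix A) as assumptions to verify against the nd4j version in use.

```java
import java.util.Arrays;

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.api.ops.DynamicCustomOp;
import org.nd4j.linalg.factory.Nd4j;

public class OrderedArgsExample {
    public static void main(String[] args) {
        // NCW input: minibatch=1, channels=3, width=8; weights laid out as [kW, iC, oC]
        INDArray input = Nd4j.rand(new int[]{1, 3, 8});
        INDArray weights = Nd4j.rand(new int[]{2, 3, 4});

        // The op only sees positional arguments; the meaning of each integer
        // (kernel width, stride, padding, dilation, padding mode) is implicit
        // in its position, which is exactly the gap the attribute-based IR closes.
        DynamicCustomOp conv1d = new DynamicCustomOp("conv1d");
        conv1d.addInputArgument(input, weights);
        conv1d.addIArgument(2, 1, 0, 1, 1); // kW, sW, pW, dW, paddingMode (1 = SAME)

        INDArray output = Nd4j.exec(conv1d)[0];
        System.out.println(Arrays.toString(output.shape()));
    }
}
```
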
## Related Work

This may encourage future work to be done to the
[samediff file format](https://github.com/KonduitAI/deeplearning4j/blob/master/nd4j/ADRs/0001-SameDiff_File_Format.md).
The implementation of file format serialization via flatbuffers can be found
[here](https://github.com/eclipse/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/autodiff/samediff/SameDiff.java#L4748).
Of note here as prior work is the
[current code generation](https://github.com/KonduitAI/dl4j-dev-tools/blob/master/codegen/src/main/ops/org/nd4j/codegen/ops/CNN.kt#L28).

The definitions for the kotlin dsl can be found
[here](https://github.com/KonduitAI/dl4j-dev-tools/blob/master/codegen/src/main/kotlin/org/nd4j/codegen/dsl/OpBuilder.kt).

While it does have the intended description,
it's kotlin specific and is only available for a very small subset
of the ops, where pre-created objects exist
for specific operations. The goal of this ADR is to expand upon
that and make it language agnostic by providing this information in a
neutral file format that has code generation alongside it.

Current code generation efforts can be augmented using this file format.
More on this decision making can be found [here](https://github.com/KonduitAI/dl4j-dev-tools/blob/master/codegen/adr/0007-configuration_objects.md).

## Proposal

We expose a symbol based mapping in libnd4j in protobuf format, similar to how
other frameworks are doing it, as a bridge/intermediary format.

This makes it easier to implement interop with the other frameworks, because it
adds the information that is needed to define a direct mapping.

This could become a future file format depending on how the framework evolves. For
now, it is considered a workaround that makes writing import code easier and more
portable.

Similar to [ONNX](https://onnx.ai/) and [Tensorflow](https://tensorflow.org/),
we use protobuf to express an attribute based file format and map
samediff/nd4j operations to this format.

We use a translation layer that handles mapping from attributes to the ordered
arguments approach used in samediff/nd4j.

For each operation, we define a mapping process to/from this attribute format to the
order based execution format.

A separate but similar set of rules is used for mapping ndarrays.

This attribute based format is an Intermediary Representation that we then
"compile" to the equivalent calls in libnd4j.

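As an illustration of what such a per-op mapping rule boils down to, here is a minimal, self-contained sketch in plain Java (class and method names are hypothetical and do not come from the actual IR code) that "compiles" a named-attribute view of conv1d into the ordered integer arguments expected by libnd4j, following the INT_ARG order shown in Appendix A:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the attribute -> ordered-argument translation layer for a
// single op. All names here are hypothetical; the real mapping rules live in
// the IR definitions referenced below.
public class Conv1dAttributeMapping {

    // The required INT_ARG(...) order for conv1d, as seen in conv1d.cpp
    // (the optional layout/weight-format arguments are omitted for brevity).
    private static final List<String> CONV1D_IARG_ORDER = Arrays.asList(
            "kW", "sW", "pW", "dW", "paddingMode");

    // "Compile" named attributes into the positional integer-argument list.
    static long[] toOrderedIArgs(Map<String, Long> attributes) {
        return CONV1D_IARG_ORDER.stream()
                .mapToLong(name -> attributes.getOrDefault(name, 0L))
                .toArray();
    }

    public static void main(String[] args) {
        Map<String, Long> attrs = new LinkedHashMap<>();
        attrs.put("kW", 2L);          // kernel width
        attrs.put("sW", 1L);          // stride width
        attrs.put("pW", 0L);          // padding width
        attrs.put("dW", 1L);          // dilation width
        attrs.put("paddingMode", 1L); // 0 = VALID, 1 = SAME

        System.out.println(Arrays.toString(toOrderedIArgs(attrs)));
        // prints: [2, 1, 0, 1, 1]
    }
}
```
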
The format definitions for the IR can be found [here](./src/main/proto/nd4j/nd4j.proto).

## Consequences

Migration to an attribute based import format makes working with other deep
learning frameworks easier in the future.

### Drawbacks

1. Yet another file format.
2. Risk of migrating to a new file format in the future.
3. A lot of up front manual work to index the set of current operations.
4. Backwards compatibility: yet another thing to maintain. We wrote converters
for forward compatibility. We address this by specifying an opset
scheme similar to onnx.

### Advantages

1. Easy to maintain.
2. Backwards compatible.
3. Easily interops with other existing deep learning frameworks.
4. No additional dependencies beyond what's already normal.
5. Protobuf allows easy code generation for other languages.
6. Industry standard conventions are used over proprietary tooling, reducing
friction of adoption for people coming from other frameworks.
7. Straightforward mapping of arguments for import.
8. Provides an easy bridge to the existing libnd4j.
9. Allows automation of op descriptors in any language that understands how
to pass data to the c++ library.

## Appendix A: Comparison with other Frameworks, implicit vs. explicit

We can find the existing attributes from the conventions of the
libnd4j code base. The libnd4j [conv1d.cpp](https://github.com/KonduitAI/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/nn/convo/conv1d.cpp#L104)
file contains the following declaration:

```cpp
auto inputShapeInfo   = inputShape->at(0);
auto weightsShapeInfo = inputShape->at(1);
Nd4jLong const* biasShapeInfo = block.width() > 2 ? inputShape->at(2) : nullptr;

int kW = INT_ARG(0) > 0 ? INT_ARG(0) : static_cast<int>(shape::sizeAt(weightsShapeInfo, 0)); // filter(kernel) width
int sW = INT_ARG(1);                                              // strides width
int pW = INT_ARG(2);                                              // paddings width
int dW = INT_ARG(3);                                              // dilations width
int paddingMode = INT_ARG(4);                                     // 0-VALID, 1-SAME
int isNCW  = block.getIArguments()->size() > 5 ? !INT_ARG(5) : 1; // INT_ARG(4): 1-NWC, 0-NCW
int wFormat = block.getIArguments()->size() > 6 ? INT_ARG(6) : 0; // 0 - [kW, iC, oC], 1 - [oC, iC, kW], 2 - [oC, kW, iC]
```

We can see that there are macros in the libnd4j code base which reflect how
each argument is accessed. Each list of arguments has an expected order that we
need to explicitly map to a parseable structure.

In comparison, the
[onnx Convolution operator](https://github.com/onnx/onnx/blob/master/docs/Operators.md#Conv)
has *explicit* attributes of various types such as lists of ints and named
tensors.

As shown above, these concepts exist internally in the operations and layers
themselves in nd4j/samediff, but they are not exposed directly to the user.

A theoretical op descriptor from libnd4j is as follows:
```java
private String name;
private int nIn, nOut, tArgs, iArgs;
private boolean inplaceAble;
private List<String> inArgNames;
private List<String> outArgNames;
private List<String> tArgNames;
private List<String> iArgNames;
private List<String> bArgNames;
private OpDeclarationType opDeclarationType;

public enum OpDeclarationType {
    CUSTOM_OP_IMPL,
    BOOLEAN_OP_IMPL,
    LIST_OP_IMPL,
    LOGIC_OP_IMPL,
    OP_IMPL,
    DIVERGENT_OP_IMPL,
    CONFIGURABLE_OP_IMPL,
    REDUCTION_OP_IMPL,
    BROADCASTABLE_OP_IMPL,
    BROADCASTABLE_BOOL_OP_IMPL
}
```

It contains all the op declarations and fields associated with a descriptor.

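For illustration, a hypothetical filled-in view of this descriptor for conv1d might look like the following self-contained sketch. The argument names are taken from the INT_ARG(...) comments in the conv1d.cpp snippet above; the class, field, and value choices are illustrative only, not the actual descriptor API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical filled-in descriptor contents for the conv1d op.
public class Conv1dDescriptorExample {
    public static void main(String[] args) {
        String name = "conv1d";
        List<String> inArgNames = Arrays.asList("input", "weights", "bias"); // bias is optional
        List<String> outArgNames = Collections.singletonList("output");
        List<String> iArgNames = Arrays.asList(
                "kW", "sW", "pW", "dW", "paddingMode", "isNCW", "wFormat");
        List<String> tArgNames = Collections.emptyList(); // conv1d takes no floating point args
        List<String> bArgNames = Collections.emptyList(); // ...and no boolean args

        System.out.println(name + " ordered iArgs: " + iArgNames);
    }
}
```
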
In the libnd4j code base, we represent the op descriptor types above
*implicitly* through validation, as well as through the different macros present in the
code base that describe what an op execution looks like.

Validation for what can be present in the various names can be found
[here](https://github.com/KonduitAI/deeplearning4j/blob/master/libnd4j/include/ops/declarable/impl/DeclarableOp.cpp#L734-L765).

The set of macro declarations in libnd4j can be found
[here](https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/system/op_boilerplate.h).

## Appendix B: Format Comparison to other frameworks

An add op in tensorflow looks like:

```
op {
  name: "Add"
  input_arg {
    name: "x"
    type_attr: "T"
  }
  input_arg {
    name: "y"
    type_attr: "T"
  }
  output_arg {
    name: "z"
    type_attr: "T"
  }
  attr {
    name: "T"
    type: "type"
    allowed_values {
      list {
        type: DT_BFLOAT16
        type: DT_HALF
        type: DT_FLOAT
        type: DT_DOUBLE
        type: DT_UINT8
        type: DT_INT8
        type: DT_INT16
        type: DT_INT32
        type: DT_INT64
        type: DT_COMPLEX64
        type: DT_COMPLEX128
        type: DT_STRING
      }
    }
  }
}
```

Onnx's add can be found here:
https://github.com/onnx/onnx/blob/master/docs/Operators.md#Add

Onnx and tensorflow are purely attribute based formats.
