Commit 65c6a9a

Dev commits
1 parent 2aa2a8b commit 65c6a9a

6,782 files changed

Lines changed: 448852 additions & 123495 deletions


.github/lock.yml

Lines changed: 0 additions & 34 deletions
This file was deleted.

ADRs/0002-ONNX_Runtime.md

Lines changed: 58 additions & 0 deletions
# Onnx runtime module

## Status
Proposed

Proposed by: Adam Gibson (23-09-2020)

Discussed with: saudet

## Context

We need a way for nd4j to run onnx models that is easily compatible with the
onnx community. The gold standard for this
is using [onnxruntime](https://github.com/microsoft/onnxruntime/blob/master/docs/Java_API.md).

## Decision

We will use javacpp's onnxruntime bindings in a similar manner to [nd4j-tensorflow](../nd4j-tensorflow),
allowing nd4j to be used as an ndarray format that interops with onnxruntime.

We will implement a simple api similar to the [GraphRunner](../nd4j-tensorflow/src/main/java/org/nd4j/tensorflow/conversion/graphrunner/GraphRunner.java).
This will sit on top of javacpp's lower level onnxruntime bindings.

This module will follow a similar structure to the nd4j-tensorflow module,
focusing on INDArrays as a data interchange format but otherwise passing execution
down to onnxruntime.

The main api to the graph runner works as follows:

```java
try (GraphRunner runner = new GraphRunner(...)) {
    Map<String, INDArray> inputs = new HashMap<>();
    // ..initialize inputs
    Map<String, INDArray> outputs = runner.run(inputs);
    // process outputs...
}
```

The core logic will contain the following components:

1. Loading onnx pb files
2. A graph runner similar in nature to nd4j-tensorflow
3. Interop with onnxruntime's version of an ndarray/tensor

Using different accelerators/backends
-----------------------------------------

Similar to nd4j-tensorflow, which uses javacpp to select the specific version of
tensorflow to use, this module will rely on the user picking the right dependency
to link against. Different builds for cpu, gpu, .. exist [here](https://repo1.maven.org/maven2/org/bytedeco/tensorflow/1.15.3-1.5.4/).
The equivalent of this in onnxruntime can be found [here](https://repo1.maven.org/maven2/org/bytedeco/onnxruntime/1.4.0-1.5.4/).

The user will need to include the version of onnxruntime they wish to use,
similar to how you link against a particular implementation of a c library
or include a backend in nd4j. This will happen via maven.
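As a sketch of what that maven inclusion could look like, assuming the coordinates visible in the repository listing linked above (org.bytedeco:onnxruntime, version 1.4.0-1.5.4); the exact artifact, version, and any platform or gpu classifier should be taken from that listing rather than from this example:

```xml
<!-- Sketch only: pick the build (cpu, gpu, ..) by choosing the appropriate
     classifier or the corresponding -platform artifact from the linked repository. -->
<dependency>
  <groupId>org.bytedeco</groupId>
  <artifactId>onnxruntime</artifactId>
  <version>1.4.0-1.5.4</version>
</dependency>
```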

ADRs/0003-Import_IR.md

Lines changed: 251 additions & 0 deletions
# Import IR

## Status

Proposed

Proposed by: Adam Gibson (28-09-2020)

Discussed with: Paul Dubs

## Context

Currently, there is a gap between the way samediff/nd4j operations are implemented
and how other frameworks represent their models.

Keras, Tensorflow, and Pytorch use an attribute based format with names. Interop
between Onnx, Tensorflow, and Keras tends to follow this formula:

1. Map names to equivalent names in the other framework for each operation
configuration. Names means both op names and the associated attributes of the
operations, such as the strides and kernel sizes in Conv2D.
2. Map input/output tensors to the equivalent tensor type in each framework.
3. Set up the complete graph in the equivalent framework. Sometimes the
frameworks' concepts don't map 1 to 1, but they should output equivalent results
regardless. In order to do this, a framework sometimes needs to
add/remove operations to produce equivalent output in a different
graph. The [tensorflow onnx import](https://github.com/onnx/tensorflow-onnx#how-tf2onnx-works)
is a good example of this.

Samediff/nd4j represent their internal ops as a set of ordered
arguments for execution, in the form of:

1. t arguments: floating point arguments (float, double, ..)
2. integer arguments: integer arguments (long, integer)
3. boolean arguments: boolean arguments
4. data type arguments: data types for input/output
5. input arguments: ndarrays for input
6. output arguments: often optional (dynamically created) output ndarray
arguments. If the user wants to pass in outputs to control memory, they are
allowed to do so.
7. axis arguments: integer arguments that represent the dimension(s) for an
operation to be executed on.

[Reference implementation](https://github.com/KonduitAI/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/DynamicCustomOp.java#L58)

This maps well enough for execution, but not for file formats.

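To make the ordered-argument convention concrete, here is a rough sketch of assembling a conv1d call through the DynamicCustomOp reference implementation linked above, with the integer arguments passed purely by position. The constructor and method names follow that class, but treat the exact signatures (and the conv1d argument order, taken from the conv1d.cpp snippet in Appendix A) as assumptions to verify against the nd4j version in use.

```java
import java.util.Arrays;

import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.api.ops.DynamicCustomOp;
import org.nd4j.linalg.factory.Nd4j;

public class OrderedArgsExample {
    public static void main(String[] args) {
        // NCW input: minibatch=1, channels=3, width=8; weights laid out as [kW, iC, oC]
        INDArray input = Nd4j.rand(new int[]{1, 3, 8});
        INDArray weights = Nd4j.rand(new int[]{2, 3, 4});

        // The op only sees positional arguments; the meaning of each integer
        // (kernel width, stride, padding, dilation, padding mode) is implicit
        // in its position, which is exactly the gap the attribute-based IR closes.
        DynamicCustomOp conv1d = new DynamicCustomOp("conv1d");
        conv1d.addInputArgument(input, weights);
        conv1d.addIArgument(2, 1, 0, 1, 1); // kW, sW, pW, dW, paddingMode (1 = SAME)

        INDArray output = Nd4j.exec(conv1d)[0];
        System.out.println(Arrays.toString(output.shape()));
    }
}
```
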
## Related Work

This may encourage future work to be done to the
[samediff file format](https://github.com/KonduitAI/deeplearning4j/blob/master/nd4j/ADRs/0001-SameDiff_File_Format.md).
The implementation of file format serialization via flatbuffers can be found
[here](https://github.com/eclipse/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/autodiff/samediff/SameDiff.java#L4748).
Of note here as prior work is the
[current code generation](https://github.com/KonduitAI/dl4j-dev-tools/blob/master/codegen/src/main/ops/org/nd4j/codegen/ops/CNN.kt#L28).

The definitions for the kotlin dsl can be found
[here](https://github.com/KonduitAI/dl4j-dev-tools/blob/master/codegen/src/main/kotlin/org/nd4j/codegen/dsl/OpBuilder.kt).

While it does have the intended description,
it's kotlin specific and is only available for a very small subset
of the ops, where pre-created objects exist
for specific operations. The goal of this ADR is to expand upon
that and make it language agnostic by providing this information in a
neutral file format that has code generation alongside it.

Current code generation efforts can be augmented using this file format.
More on this decision making can be found [here](https://github.com/KonduitAI/dl4j-dev-tools/blob/master/codegen/adr/0007-configuration_objects.md).

## Proposal

We expose a symbol based mapping in libnd4j in protobuf format, similar to how
other frameworks are doing it, as a bridge/intermediary format.

This makes it easier to implement interop with the other frameworks, because it
adds the information that is needed to define a direct mapping.

This could become a future file format depending on how the framework evolves. For
now, it is considered a workaround that makes writing import code easier and more
portable.

Similar to [ONNX](https://onnx.ai/) and [Tensorflow](https://tensorflow.org/),
we use protobuf to express an attribute based file format and map
samediff/nd4j operations to this format.

We use a translation layer that handles mapping from attributes to the ordered
arguments approach used in samediff/nd4j.

For each operation, we define a mapping process to/from this attribute format to the
order based execution format.

A separate but similar set of rules is used for mapping ndarrays.

This attribute based format is an Intermediary Representation that we then
"compile" to the equivalent calls in libnd4j.

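As an illustration of what such a per-op mapping rule boils down to, here is a minimal, self-contained sketch in plain Java (class and method names are hypothetical and do not come from the actual IR code) that "compiles" a named-attribute view of conv1d into the ordered integer arguments expected by libnd4j, following the INT_ARG order shown in Appendix A:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the attribute -> ordered-argument translation layer for a
// single op. All names here are hypothetical; the real mapping rules live in
// the IR definitions referenced below.
public class Conv1dAttributeMapping {

    // The required INT_ARG(...) order for conv1d, as seen in conv1d.cpp
    // (the optional layout/weight-format arguments are omitted for brevity).
    private static final List<String> CONV1D_IARG_ORDER = Arrays.asList(
            "kW", "sW", "pW", "dW", "paddingMode");

    // "Compile" named attributes into the positional integer-argument list.
    static long[] toOrderedIArgs(Map<String, Long> attributes) {
        return CONV1D_IARG_ORDER.stream()
                .mapToLong(name -> attributes.getOrDefault(name, 0L))
                .toArray();
    }

    public static void main(String[] args) {
        Map<String, Long> attrs = new LinkedHashMap<>();
        attrs.put("kW", 2L);          // kernel width
        attrs.put("sW", 1L);          // stride width
        attrs.put("pW", 0L);          // padding width
        attrs.put("dW", 1L);          // dilation width
        attrs.put("paddingMode", 1L); // 0 = VALID, 1 = SAME

        System.out.println(Arrays.toString(toOrderedIArgs(attrs)));
        // prints: [2, 1, 0, 1, 1]
    }
}
```
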
The format definitions for the IR can be found [here](./src/main/proto/nd4j/nd4j.proto).

## Consequences

Migration to an attribute based import format makes working with other deep
learning frameworks easier in the future.

### Drawbacks

1. Yet another file format.
2. Risk of migrating to a new file format in the future.
3. A lot of up front manual work to index the set of current operations.
4. Backwards compatibility: yet another thing to maintain. We wrote converters
for forward compatibility. We address this by specifying an opset
scheme similar to onnx.

### Advantages

1. Easy to maintain.
2. Backwards compatible.
3. Easily interops with other existing deep learning frameworks.
4. No additional dependencies beyond what's already normal.
5. Protobuf allows easy code generation for other languages.
6. Industry standard conventions are used over proprietary tooling, reducing
friction of adoption for people coming from other frameworks.
7. Straightforward mapping of arguments for import.
8. Provides an easy bridge to the existing libnd4j.
9. Allows automation of op descriptors in any language that understands how
to pass data to the c++ library.

## Appendix A: Comparison with other Frameworks, implicit vs. explicit

We can find the existing attributes from the conventions of the
libnd4j code base. The libnd4j [conv1d.cpp](https://github.com/KonduitAI/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/nn/convo/conv1d.cpp#L104)
file contains the following declaration:

```cpp
auto inputShapeInfo   = inputShape->at(0);
auto weightsShapeInfo = inputShape->at(1);
Nd4jLong const* biasShapeInfo = block.width() > 2 ? inputShape->at(2) : nullptr;

int kW = INT_ARG(0) > 0 ? INT_ARG(0) : static_cast<int>(shape::sizeAt(weightsShapeInfo, 0)); // filter(kernel) width
int sW = INT_ARG(1);                                              // strides width
int pW = INT_ARG(2);                                              // paddings width
int dW = INT_ARG(3);                                              // dilations width
int paddingMode = INT_ARG(4);                                     // 0-VALID, 1-SAME
int isNCW  = block.getIArguments()->size() > 5 ? !INT_ARG(5) : 1; // INT_ARG(4): 1-NWC, 0-NCW
int wFormat = block.getIArguments()->size() > 6 ? INT_ARG(6) : 0; // 0 - [kW, iC, oC], 1 - [oC, iC, kW], 2 - [oC, kW, iC]
```

We can see that there are macros in the libnd4j code base which reflect how
each argument is accessed. Each list of arguments has an expected order that we
need to explicitly map to a parseable structure.

In comparison, the
[onnx Convolution operator](https://github.com/onnx/onnx/blob/master/docs/Operators.md#Conv)
has *explicit* attributes of various types such as lists of ints and named
tensors.

As shown above, these concepts exist internally in the operations and layers
themselves in nd4j/samediff, but they are not exposed directly to the user.

A theoretical op descriptor from libnd4j is as follows:
```java
private String name;
private int nIn, nOut, tArgs, iArgs;
private boolean inplaceAble;
private List<String> inArgNames;
private List<String> outArgNames;
private List<String> tArgNames;
private List<String> iArgNames;
private List<String> bArgNames;
private OpDeclarationType opDeclarationType;

public enum OpDeclarationType {
    CUSTOM_OP_IMPL,
    BOOLEAN_OP_IMPL,
    LIST_OP_IMPL,
    LOGIC_OP_IMPL,
    OP_IMPL,
    DIVERGENT_OP_IMPL,
    CONFIGURABLE_OP_IMPL,
    REDUCTION_OP_IMPL,
    BROADCASTABLE_OP_IMPL,
    BROADCASTABLE_BOOL_OP_IMPL
}
```

It contains all the op declarations and fields associated with a descriptor.

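For illustration, a hypothetical filled-in view of this descriptor for conv1d might look like the following self-contained sketch. The argument names are taken from the INT_ARG(...) comments in the conv1d.cpp snippet above; the class, field, and value choices are illustrative only, not the actual descriptor API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical filled-in descriptor contents for the conv1d op.
public class Conv1dDescriptorExample {
    public static void main(String[] args) {
        String name = "conv1d";
        List<String> inArgNames = Arrays.asList("input", "weights", "bias"); // bias is optional
        List<String> outArgNames = Collections.singletonList("output");
        List<String> iArgNames = Arrays.asList(
                "kW", "sW", "pW", "dW", "paddingMode", "isNCW", "wFormat");
        List<String> tArgNames = Collections.emptyList(); // conv1d takes no floating point args
        List<String> bArgNames = Collections.emptyList(); // ...and no boolean args

        System.out.println(name + " ordered iArgs: " + iArgNames);
    }
}
```
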
In the libnd4j code base, we represent the op descriptor types above
*implicitly* through validation, as well as through the different macros present in the
code base that describe what an op execution looks like.

Validation for what can be present in the various names can be found
[here](https://github.com/KonduitAI/deeplearning4j/blob/master/libnd4j/include/ops/declarable/impl/DeclarableOp.cpp#L734-L765).

The set of macro declarations in libnd4j can be found
[here](https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/system/op_boilerplate.h).

## Appendix B: Format Comparison to other frameworks

An add op in tensorflow looks like:

```
op {
  name: "Add"
  input_arg {
    name: "x"
    type_attr: "T"
  }
  input_arg {
    name: "y"
    type_attr: "T"
  }
  output_arg {
    name: "z"
    type_attr: "T"
  }
  attr {
    name: "T"
    type: "type"
    allowed_values {
      list {
        type: DT_BFLOAT16
        type: DT_HALF
        type: DT_FLOAT
        type: DT_DOUBLE
        type: DT_UINT8
        type: DT_INT8
        type: DT_INT16
        type: DT_INT32
        type: DT_INT64
        type: DT_COMPLEX64
        type: DT_COMPLEX128
        type: DT_STRING
      }
    }
  }
}
```

Onnx's add can be found here:
https://github.com/onnx/onnx/blob/master/docs/Operators.md#Add

Onnx and tensorflow are purely attribute based formats.
