Commit 438720d

Merge pull request #261 from scijava/scijava-ops-api/relax-reproducibility
package-info.java: relax reproducibility
2 parents 1a01b30 + 4db2dab commit 438720d

2 files changed

Lines changed: 52 additions & 6 deletions

docs/ops/doc/WritingYourOwnOpPackage.md

Lines changed: 43 additions & 0 deletions
@@ -8,6 +8,49 @@ You'll find this page organized into two broad sections. The first section descr
 
 SciJava Ops is designed for modularity, extensibility, and granularity - you can exploit these aspects by adhering to the following guidelines when writing Ops:
 
+### Fostering reproducibility
+
+[Determinism](https://en.wikipedia.org/wiki/Deterministic_algorithm) is a valuable aspect of scientific computing, as our goal is to facilitate reproducible science. As many powerful algorithms behave non-deterministically, due to factors such as internal state or parallel processing, SciJava Ops does not *require* Ops be deterministic. However, if your Op *can be* deterministic, we highly recommend you make it so. Consider the following algorithm:
+
+```java
+/**
+ * A simple noise adder
+ * @param data the input data
+ * @implNote op name="filter.addNoise" type=Inplace
+ */
+public static void addNoise(double[] data) {
+    Random r = new Random();
+    for(int i = 0; i < data.length; i++) {
+        // Add a number in [-0.5, 0.5)
+        data[i] += (r.nextDouble() - 0.5);
+    }
+}
+```
+
+We can make it deterministic by adding an `@Nullable` `seed` parameter, with a default value if the user does not pass one:
+
+```java
+/**
+ * A simple noise adder
+ * @param data the input data
+ * @param seed the seed for the {@link java.util.Random}
+ * @implNote op name="filter.addNoise" type=Inplace
+ */
+public static void addNoise(double[] data, @Nullable Long seed) {
+    // use default seed if not provided
+    if (seed == null) {
+        seed = 0xdeadbeefL; // long literal: an int literal cannot be assigned to a Long
+    }
+    Random r = new Random(seed);
+    for(int i = 0; i < data.length; i++) {
+        // Add a number in [-0.5, 0.5)
+        data[i] += (r.nextDouble() - 0.5);
+    }
+}
+```
+
+These small steps give workflows the best chance to return consistent results every time, even years later!
+
 ### Using Dependencies
 
 If you are writing an Op that performs many intermediate operations, there's a good chance someone else has written (and even optimized) some or all of them.
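
Reviewer note on the hunk above: the reproducibility claim for the seeded variant can be checked with a small standalone program. What follows is a minimal sketch under stated assumptions, not code from this commit. The class name `SeededNoiseDemo` and the constant `DEFAULT_SEED` are hypothetical, and the `@Nullable` annotation is dropped so the example compiles with only the JDK; a `null` argument stands in for an omitted seed.

```java
import java.util.Arrays;
import java.util.Random;

/** Standalone sketch; hypothetical class, not part of SciJava Ops. */
class SeededNoiseDemo {

    /** Assumed default seed, mirroring the constant used in the diff. */
    static final long DEFAULT_SEED = 0xdeadbeefL;

    /** Adds noise in [-0.5, 0.5) to each element of data, in place. */
    static void addNoise(double[] data, Long seed) {
        // A null seed stands in for an omitted optional parameter
        long s = (seed == null) ? DEFAULT_SEED : seed;
        Random r = new Random(s);
        for (int i = 0; i < data.length; i++) {
            data[i] += r.nextDouble() - 0.5;
        }
    }

    public static void main(String[] args) {
        double[] first = {1.0, 2.0, 3.0};
        double[] second = {1.0, 2.0, 3.0};
        addNoise(first, null);
        addNoise(second, null);
        // Same inputs and same seed give element-wise identical results
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

Because `java.util.Random` produces the same sequence for the same seed, two runs over equal inputs yield equal outputs. Callers who want different noise can still pass their own seed and record it alongside their results.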

scijava-ops-api/src/main/java/org/scijava/ops/api/package-info.java

Lines changed: 9 additions & 6 deletions
@@ -64,10 +64,12 @@
  * An Op is an algorithm adhering to the following traits:
  * </p>
  * <ol>
- * <li>Ops are stateless and deterministic - with no internal state, calling an
- * Op two times on the same inputs will produce the same output.</li>
  * <li>Ops are named - this name conveys an Op's purpose, and allows us to find
  * all Ops implementing a particular operation</li>
+ * <li>Ops have a number of input and output parameters, each defined by a
+ * type</li>
+ * <li>Ops adhere to a {@link java.lang.FunctionalInterface}, defining how they
+ * operate</li>
  * </ol>
  * <p>
  * Using the name and the combination of input and output parameters, we can
@@ -151,10 +153,11 @@
  * and SciJava Ops will take care to call the correct Op based on the concrete
  * inputs provided.</li>
  * <li>Result-equivalence, and therefore reproducibility, in Ops, is guaranteed
- * within an OpEnvironment and a set of input objects, but not just for Op
- * calls. This allows us to ensure reproducible pipelines, but also allows us to
- * introduce new Ops into the pipeline or to run pipelines on different inputs
- * without changing the pipeline itself.</li>
+ * within an OpEnvironment and a set of input objects, when <b>deterministic</b>
+ * algorithms are used. Cognizance of algorithm determinism allows users to
+ * create reproducible pipelines; however, determinism is <b>not</b> a
+ * requirement for Ops, as it would preclude many valuable algorithms from
+ * becoming Ops.</li>
  * </ol>
  */
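
Reviewer note on the revised trait list: the three traits (a name, typed input and output parameters, and a functional interface) can be pictured with plain JDK types. This is a hypothetical illustration only and does not use or imitate the SciJava Ops API; the class `OpTraitsSketch`, the map `OPS`, the method `run`, and the names `math.add` and `math.mul` are all assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical illustration of the listed traits; not the SciJava Ops API. */
class OpTraitsSketch {

    /** Toy lookup table keyed by name (an assumption made for this sketch). */
    static final Map<String, BiFunction<Double, Double, Double>> OPS = new HashMap<>();

    static {
        // Name: "math.add"; inputs: two Doubles; output: one Double;
        // functional interface: java.util.function.BiFunction
        OPS.put("math.add", (a, b) -> a + b);
        OPS.put("math.mul", (a, b) -> a * b);
    }

    /** Looks up an entry by name and applies it to the given inputs. */
    static double run(String name, double a, double b) {
        BiFunction<Double, Double, Double> op = OPS.get(name);
        if (op == null) {
            throw new IllegalArgumentException("No entry named " + name);
        }
        return op.apply(a, b);
    }

    public static void main(String[] args) {
        System.out.println(run("math.add", 2.0, 3.0)); // prints "5.0"
        System.out.println(run("math.mul", 2.0, 3.0)); // prints "6.0"
    }
}
```

Both entries here happen to be deterministic, so repeated calls with the same inputs return the same value. Nothing in the three traits enforces that, which matches the relaxed wording in this commit.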
