Workspace Exception with ParallelWrapper and Custom Iterator #10221

@MintakaB

Description

When training a BiLSTM-based time series forecasting model using DL4J's ParallelWrapper and a custom iterator (or even standard iterators), we encounter one of the following runtime exceptions:

org.nd4j.linalg.exception.ND4JIllegalStateException: Workspace [ADSI_ITER-...]: Can't borrow from borrowed workspace
or
java.lang.IllegalStateException: Feed forward to layer (training): array (INPUT) workspace validation failed (layer 1 - layer name "layer1" - class: org.deeplearning4j.nn.layers.recurrent.BidirectionalLayer) - array is defined in incorrect workspace

DL4J Version: 1.0.0-M2.1
ND4J Version: 1.0.0-M2.1
Java Version: 21
OS: Linux
Backend: CPU only, no GPU acceleration.

final MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
    .updater(new Adam(LEARNING_RATE))
    .list()
    .layer(new Bidirectional(Bidirectional.Mode.CONCAT, new LSTM.Builder()
            .nIn(NUM_FEATURES)
            .nOut(64)
            .activation(Activation.TANH)
            .l2(1e-4)
            .dropOut(0.2)
            .build()))
    // nIn = 128 because Bidirectional.Mode.CONCAT doubles the previous layer's nOut (2 * 64)
    .layer(new Bidirectional(Bidirectional.Mode.CONCAT, new LSTM.Builder()
            .nIn(128)
            .nOut(64)
            .activation(Activation.TANH)
            .l2(1e-4)
            .dropOut(0.2)
            .build()))
    .layer(new Bidirectional(Bidirectional.Mode.CONCAT, new LSTM.Builder()
            .nIn(128)
            .nOut(64)
            .activation(Activation.TANH)
            .l2(1e-4)
            .dropOut(0.2)
            .build()))
    .layer(new GlobalPoolingLayer.Builder().poolingType(PoolingType.AVG).build())
    .layer(new DenseLayer.Builder()
            .nIn(128)
            .nOut(64)
            .activation(Activation.RELU)
            .dropOut(0.2)
            .l2(1e-4)
            .build())
    .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MEAN_ABSOLUTE_ERROR)
            .activation(Activation.IDENTITY)
            .nIn(64)
            .nOut(FORECAST_HORIZON)
            .l2(1e-4)
            .build())
    .build();
final ParallelWrapper networkWrapper = new ParallelWrapper.Builder<MultiLayerNetwork>(this.network)
        .workers(2)
        .prefetchBuffer(12)
        .build();

networkWrapper.fit(trainIter);
Observations:
  1. The custom RollingWindowDataSetIterator creates new INDArrays per batch and keeps no references outside the batch scope.
  2. Training fails with workspace exceptions as soon as ParallelWrapper is used, even with a minimal worker count (tested with 2, 4, 8, and 12 workers).
  3. The issue appears only when combining BiLSTM layers with ParallelWrapper.
  4. The problem is reproducible with both custom and standard iterators.
  5. No references to INDArrays or DataSets are kept outside the batch scope.
  6. A regular stack of (unidirectional) LSTM layers trains without any issues.
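For reference, one mitigation we considered (a sketch only, not a fix for the underlying bug) is disabling ND4J workspaces on the network configuration via the `trainingWorkspaceMode` / `inferenceWorkspaceMode` builder options. This trades the memory efficiency that workspaces provide for plain garbage-collected allocation, and may sidestep the workspace validation error; whether it does so with ParallelWrapper is an assumption we have not verified:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.WorkspaceMode;

// Sketch: same layer stack as above, but with workspaces disabled entirely.
final MultiLayerConfiguration confNoWorkspaces = new NeuralNetConfiguration.Builder()
        .trainingWorkspaceMode(WorkspaceMode.NONE)   // no workspaces during fit()
        .inferenceWorkspaceMode(WorkspaceMode.NONE)  // no workspaces during output()/evaluate()
        .list()
        // ... identical layer configuration as in the snippet above ...
        .build();
```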

I hope this is enough to figure out the issue.
Thank you.
