Commit 5e4b303

more info about batchsizes
1 parent ded03dc commit 5e4b303

1 file changed

Lines changed: 12 additions & 2 deletions

doc/gettingstarted.txt

@@ -398,8 +398,18 @@ With large :math:`B`, time is wasted in reducing the variance of the gradient
 estimator, that time would be better spent on additional gradient steps.
 An optimal :math:`B` is model-, dataset-, and hardware-dependent, and can be
 anywhere from 1 to maybe several hundreds. In the tutorial we set it to 20,
-but this choice is almost arbitrary (though harmless). All code-blocks
-above show pseudocode of how the algorithm looks like. Implementing such
+but this choice is almost arbitrary (though harmless).
+
+.. note::
+
+    If you are training for a fixed number of epochs, the minibatch size becomes important
+    because it controls the number of updates done to your parameters. Training the same model
+    for 10 epochs using a batch size of 1 yields completely different results compared
+    to training for the same 10 epochs but with a batch size of 20. Keep this in mind when
+    switching between batch sizes and be prepared to tweak all the other parameters according
+    to the batch size used.
+
+All code-blocks above show pseudocode of how the algorithm looks like. Implementing such
 algorithm in Theano can be done as follows :
 
 .. code-block:: python
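
The note added in this commit turns on simple arithmetic: with a fixed number of epochs, the number of parameter updates scales inversely with the batch size. The sketch below is not part of the commit. The helper name ``num_updates`` and the training-set size of 50000 are assumptions chosen for illustration, and it assumes a final partial minibatch still counts as one update.

```python
import math


def num_updates(n_examples, batch_size, n_epochs):
    """Count parameter updates: one update per minibatch, per epoch.

    Assumption: a trailing partial minibatch is kept (hence the ceiling).
    """
    batches_per_epoch = math.ceil(n_examples / batch_size)
    return batches_per_epoch * n_epochs


n_train = 50000  # hypothetical training-set size, for illustration only

for batch_size in (1, 20):
    print(batch_size, num_updates(n_train, batch_size, n_epochs=10))
# prints:
# 1 500000
# 20 25000
```

Under these assumptions, 10 epochs at a batch size of 1 perform 20 times as many updates as 10 epochs at a batch size of 20, which is why other hyperparameters such as the learning rate may need retuning when the batch size changes.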