Commit 1a10903

author
Razvan Pascanu
committed
some bugs fixed
1 parent 160416e commit 1a10903

3 files changed

Lines changed: 30 additions & 15 deletions

code/rbm.py

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ def sample_h_given_v(self, v0_sample):
                 dtype = theano.config.floatX)
         return [h1_mean, h1_sample]
 
-    def propdown(self.hid):
+    def propdown(self, hid):
         '''This function propagates the hidden units activation downwards to
         the visible units'''
         return T.nnet.sigmoid(T.dot(hid,self.W.T) + self.vbias)
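
Note on the `propdown` change: the old signature `def propdown(self.hid):` is not valid Python, because a parameter name must be a plain identifier, so a module containing it cannot be imported. The sketch below is not part of the commit; `parses` is a hypothetical helper and the function bodies are placeholders. It checks both signatures with the built-in `compile`:

```python
# Placeholder sources that differ only in the signature touched by the commit.
old_src = "def propdown(self.hid):\n    return hid\n"
new_src = "def propdown(self, hid):\n    return hid\n"


def parses(src):
    """Return True if `src` compiles as a Python module, False on SyntaxError."""
    try:
        compile(src, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


old_ok = parses(old_src)
new_ok = parses(new_src)
print("old signature parses:", old_ok)
print("new signature parses:", new_ok)
```

Only the second signature should compile, which matches the one-character fix (`.` to `, `) in this hunk.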

doc/rbm.txt

Lines changed: 27 additions & 14 deletions
@@ -58,10 +58,10 @@ loss function as being the negative log-likelihood.
 
 .. math::
     \mathcal{L}(\theta, \mathcal{D}) = \frac{1}{N} \sum_{x^{(i)} \in
-    \mathcal{D}} \log\ p(x^{(i)}).\\
+    \mathcal{D}} \log\ p(x^{(i)})\\
     \ell (\theta, \mathcal{D}) = - \mathcal{L} (\theta, \mathcal{D})
 
-using the stochastic gradient :math:`\frac{\partial - \log p(x^{(i)})}{\partial
+using the stochastic gradient :math:`-\frac{\partial \log p(x^{(i)})}{\partial
 \theta}`, where :math:`\theta` are the parameters of the model.
 
 
@@ -97,7 +97,7 @@ form.
 .. math::
     :label: free_energy_grad
 
-    - \frac{\partial - \log p(x)}{\partial \theta}
+    - \frac{\partial \log p(x)}{\partial \theta}
     &= \frac{\partial \mathcal{F}(x)}{\partial \theta} -
     \sum_{\tilde{x}} p(\tilde{x}) \
     \frac{\partial \mathcal{F}(\tilde{x})}{\partial \theta}.
@@ -124,7 +124,7 @@ denoted as :math:`\mathcal{N}`. The gradient can then be written as:
 .. math::
     :label: bm_grad
 
-    \frac{\partial -\log p(x)}{\partial \theta}
+    - \frac{\partial \log p(x)}{\partial \theta}
     &\approx
     \frac{\partial \mathcal{F}(x)}{\partial \theta} -
     \frac{1}{|\mathcal{N}|}\sum_{\tilde{x} \in \mathcal{N}} \
@@ -213,12 +213,12 @@ following log-likelihood gradients for an RBM with binary units:
 .. math::
     :label: rbm_grad
 
-    \frac {\partial{- \log p(v)}} {\partial W_{ij}} &=
+    - \frac{\partial{ \log p(v)}}{\partial W_{ij}} &=
     E_v[p(h_i|v) \cdot v_j]
-    - v^{(i)}_j \cdot sigm(W_i \cdot v^{(i)} + c_i)
-    \frac {\partial{- \log p(v)}} {\partial c_i} &=
+    - v^{(i)}_j \cdot sigm(W_i \cdot v^{(i)} + c_i) \\
+    -\frac{\partial{ \log p(v)}}{\partial c_i} &=
     E_v[p(h_i|v)] - sigm(W_i \cdot v^{(i)}) \\
-    \frac {\partial{- \log p(v)}} {\partial b_j} &=
+    -\frac{\partial{ \log p(v)}}{\partial b_j} &=
     E_v[p(v_j|h)] - v^{(i)}_j
 
 For a more detailed derivation of these equations, we refer the reader to the
@@ -396,10 +396,8 @@ with Eqs. :eq:`rbm_propup` - :eq:`rbm_propdown`. The code is as follows:
 
 .. code-block:: python
 
-
     def propup(self, vis):
-        ''' This function propagates the visible units activation upwards to
-        the hidden units '''
+        ''' This function propagates the visible units activation upwards to the hidden units '''
         return T.nnet.sigmoid(T.dot(v, self.W) + self.hbias)
 
     def sample_h_given_v(self, v0_sample):
@@ -414,9 +412,8 @@ with Eqs. :eq:`rbm_propup` - :eq:`rbm_propdown`. The code is as follows:
                 dtype = theano.config.floatX)
         return [h1_mean, h1_sample]
 
-    def propdown(self.hid):
-        '''This function propagates the hidden units activation downwards to
-        the visible units'''
+    def propdown(self, hid):
+        '''This function propagates the hidden units activation downwards to the visible units'''
         return T.nnet.sigmoid(T.dot(hid,self.W.T) + self.vbias)
 
     def sample_v_given_h(self, h0_sample):
@@ -724,6 +721,22 @@ been shown to lead to a better generative model ([Tieleman08]_).
 
         print 'Training epoch %d, cost is '%epoch, numpy.mean(mean_cost)
 
+        # Plot filters after each training epoch
+        plotting_start = time.clock()
+        # Construct image from the weight matrix
+        image = PIL.Image.fromarray(tile_raster_images( X = rbm.W.value.T,
+                 img_shape = (28,28),tile_shape = (10,10),
+                 tile_spacing=(1,1)))
+        image.save('filters_at_epoch_%i.png'%epoch)
+        plotting_stop = time.clock()
+        plotting_time += (plotting_stop - plotting_start)
+
+    end_time = time.clock()
+
+    pretraining_time = (end_time - start_time) - plotting_time
+
+    print ('Training took %f minutes' %(pretraining_time/60.))
+
 Once the RBM is trained, we can then use the ``gibbs_vhv`` function to implement
 the Gibbs chain required for sampling. We initialize the Gibbs chain starting
 from test examples (although we could as well pick it from the training set)
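
Note on the timing lines added in the hunk above: they subtract the time spent saving filter images from the total, so the reported figure covers training only. The sketch below is not part of the commit and shows the same bookkeeping in Python 3. `train_one_epoch`, `plot_filters`, and `timed_training` are hypothetical stand-ins, and `time.perf_counter` is swapped in because `time.clock` was removed in Python 3.8.

```python
import time


def train_one_epoch(epoch):
    """Hypothetical stand-in for one epoch of RBM training."""
    return sum(i * i for i in range(10000))


def plot_filters(epoch):
    """Hypothetical stand-in for building and saving the filter image."""
    time.sleep(0.01)


def timed_training(n_epochs):
    """Return (training-only seconds, plotting seconds) for n_epochs."""
    plotting_time = 0.0
    start_time = time.perf_counter()
    for epoch in range(n_epochs):
        train_one_epoch(epoch)
        # Time the plotting step separately so it can be subtracted later.
        plotting_start = time.perf_counter()
        plot_filters(epoch)
        plotting_time += time.perf_counter() - plotting_start
    end_time = time.perf_counter()
    pretraining_time = (end_time - start_time) - plotting_time
    return pretraining_time, plotting_time


pretraining_time, plotting_time = timed_training(3)
print('Training took %f minutes' % (pretraining_time / 60.))
```

The accumulator must be set to zero before the loop. In the diff that initialization of `plotting_time` falls outside the lines shown in this hunk.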

open_issues/6_benchmarking_pybrain.txt

Lines changed: 2 additions & 0 deletions
@@ -53,6 +53,7 @@ Observations :
 
 ** Our thing with batchsize =1 **
 439s for 30 epochs => 14.63s
+our thing with batchsize 600 :0.43s
 
 
 Results :
@@ -70,6 +71,7 @@ Observations :
 iteration 0, with test performance 3.880000 %
 The code for file mlp.py ran for 13.12m expected 1.51m in our buildbot =>
 4.37min / epoch
+out thing with batchsize 20 :1.78min/epoch
 
 
 