There seems to be a segfault in the compute-accuracy utility.

To get started:

::

    cd scripts && ./demo-word.sh

Original README text follows:

This tool provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications and for further research.

This code provides an implementation of the Continuous Bag-of-Words (CBOW) and
the Skip-gram model (SG), as well as several demo scripts.

Given a text corpus, the word2vec tool learns a vector for every word in
the vocabulary using the Continuous Bag-of-Words or the Skip-Gram neural
network architectures. The user should specify the following:

- desired vector dimensionality
- the size of the context window for either the Skip-Gram or the
  Continuous Bag-of-Words model
- training algorithm: hierarchical softmax and/or negative sampling
- threshold for downsampling the frequent words
- number of threads to use
- the format of the output word vector file (text or binary)
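
The settings listed above might map onto a command line roughly as follows. This is a minimal sketch, not the project's documented invocation: the flag names are those of the upstream word2vec tool and may differ in this repository, and the binary path, file names, and all values are illustrative assumptions. The script only prints the assembled command and does not run it.

```shell
#!/bin/sh
# Illustrative values only; none of these are confirmed defaults of this repository.
corpus="text8"        # assumed name of the training corpus file
output="vectors.bin"  # assumed name of the output vector file

size=200      # desired vector dimensionality
window=8      # size of the context window
cbow=1        # 1 = Continuous Bag-of-Words, 0 = Skip-gram
hs=0          # 1 = use hierarchical softmax, 0 = do not
negative=25   # number of negative samples (0 = no negative sampling)
sample=1e-4   # threshold for downsampling frequent words
threads=4     # number of threads to use
binary=1      # 1 = binary output file, 0 = text

# Assemble the command line; the ./word2vec path is an assumption.
cmd="./word2vec -train $corpus -output $output -cbow $cbow -size $size -window $window -hs $hs -negative $negative -sample $sample -threads $threads -binary $binary"

# Print the command for inspection instead of running it.
printf '%s\n' "$cmd"
```

Keeping each setting in its own variable makes it easier to see which of the listed choices a given run changes, for example setting ``cbow=0`` to switch to the Skip-gram model.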

Usually, the other hyper-parameters such as the learning rate do not
need to be tuned for different training sets.

The script demo-word.sh downloads a small (100MB) text corpus from the
web, and trains a small word vector model. After the training is
finished, the user can interactively explore the similarity of the