| Word2Vec | * It captures the position of the words in the text (syntactic) | * It cannot capture context-dependent word meaning (fails to capture polysemy) |
| | * It captures meaning in the words (semantics) | * It cannot represent words that are missing from the training corpus (out-of-vocabulary words) |
| GloVe (Pre-Trained) | * It captures the position of the words in the text (syntactic) | * It cannot capture context-dependent word meaning (fails to capture polysemy) |
| | * It captures meaning in the words (semantics) | * Memory consumption for storage |
| | * Trained on a huge corpus | * It cannot represent words that are missing from the training corpus (out-of-vocabulary words) |
| GloVe (Trained) | * It is straightforward, e.g., to enforce the word vectors to capture sub-linear relationships in the vector space (performs better than Word2Vec) | * Memory consumption for storage |
| | * Lower weight for highly frequent word pairs, so stop words such as “am” and “is” do not dominate training progress | * Needs a huge corpus to learn |
| | | * It cannot represent words that are missing from the training corpus (out-of-vocabulary words) |
| | | * It cannot capture context-dependent word meaning (fails to capture polysemy) |
| FastText | * Works for rare words (their character n-grams are still shared with other words) | * It cannot capture context-dependent word meaning (fails to capture polysemy) |
| | * Handles out-of-vocabulary words with character-level n-grams | * Memory consumption for storage |
| | | * Computationally more expensive than GloVe and Word2Vec |
| Contextualized Word Representations | * It captures the meaning of the word from the text (incorporates context, handling polysemy) | * Memory consumption for storage |
| | * Improves performance notably on downstream tasks | * Computationally more expensive than the other methods |
| | | * Needs another word embedding for all LSTM and feedforward layers |
| | | * It cannot represent words that are missing from the training corpus (out-of-vocabulary words) |
| | | * Works only at the sentence and document level (it cannot work at the individual word level) |
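
The FastText row above attributes its handling of rare and out-of-vocabulary words to character n-grams. The following is a minimal sketch of that idea in plain Python, not the FastText implementation: the helper names (`char_ngrams`, `shared_ngrams`) and the toy vocabulary are hypothetical, and the `<`/`>` boundary markers and the 3 to 6 n-gram range are assumed here to mirror FastText's usual defaults.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the set of character n-grams of `word`, with boundary markers."""
    marked = f"<{word}>"  # mark the start and end of the word
    return {
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    }


# Toy training vocabulary (hypothetical, for illustration only).
vocab = ["playing", "player", "jumped"]

# Every n-gram seen in the vocabulary; a subword model would learn
# one vector per n-gram rather than only one vector per word.
known_ngrams = set().union(*(char_ngrams(w) for w in vocab))


def shared_ngrams(word):
    """N-grams of `word` that also occur in the training vocabulary."""
    return char_ngrams(word) & known_ngrams


oov_word = "jumping"                   # not in the vocabulary
in_word_table = oov_word in vocab      # a word-level lookup table misses it
overlap = shared_ngrams(oov_word)      # but several of its n-grams are known

print(in_word_table)
print(sorted(overlap))
```

A word-level model such as Word2Vec or GloVe has nothing to return for `jumping`, whereas a subword model can compose a vector from the n-grams it shares with `jumped` and `playing`. The larger n-gram table is also one reason for the extra memory and compute listed in the limitations column.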