README.rst: 2 additions, 0 deletions
@@ -374,9 +374,11 @@ Dimensionality Reduction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Principal Component Analysis (PCA)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Principal component analysis (PCA) is one of the most popular techniques in multivariate analysis and dimensionality reduction. PCA identifies a lower-dimensional subspace in which the data approximately lies. It does this by finding new variables (the principal components) that are mutually uncorrelated and ordered by the variance they capture, so that keeping the leading components preserves as much of the data's variability as possible.
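
To make the two properties above concrete (uncorrelated new variables, ordered by variance), here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the code used in this repository: ``pca_project`` is a hypothetical helper, and the data is small synthetic data rather than 20newsgroups.

```python
import numpy as np


def pca_project(X, n_components):
    """Hypothetical helper: project rows of X onto the top principal components."""
    # Center each feature so the SVD describes variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, sorted by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each direction (sample variance, n - 1 denominator).
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    Z = X_centered @ components.T
    return Z, components, explained_variance[:n_components]


# Synthetic data (an assumption for illustration): 200 samples, 5 correlated
# features generated from 2 latent factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

Z, components, variances = pca_project(X, n_components=2)

# Covariance of the projected data: off-diagonal entries are numerically zero
# (uncorrelated components) and the diagonal matches the explained variances.
cov_Z = np.cov(Z, rowvar=False)
print(Z.shape)
print(np.round(cov_Z, 6))
```

The printed covariance matrix should be diagonal up to floating-point error, with the larger variance first. For large sparse tf-idf matrices a dense SVD like this is usually impractical, so treat the sketch as an explanation of the idea rather than a recipe for the text example that follows.
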

Example of PCA on a text dataset (20newsgroups), reducing tf-idf vectors from 75,000 features to 2,000 components:

.. code:: python

    from sklearn.feature_extraction.text import TfidfVectorizer