Add educational UMAP implementation #14690

Open
AyushGarg76 wants to merge 4 commits into TheAlgorithms:master from AyushGarg76:add-umap

Conversation

@AyushGarg76

Describe your change:

Adds an educational implementation of Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction in the machine_learning section.

Features:

  • pairwise distance computation
  • nearest-neighbour graph construction
  • simplified fuzzy membership strengths
  • low-dimensional embedding optimisation
  • doctests and type hints
  • Iris dataset example

The implementation is simplified for educational purposes and follows the repository style used in existing dimensionality reduction algorithms such as PCA and t-SNE.
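
The first three features listed above can be illustrated with a minimal sketch. It is not the code from this pull request: the function names, the fixed `sigma` bandwidth, and the toy data are assumptions made for illustration. It computes pairwise Euclidean distances, picks each point's nearest neighbours, and turns neighbour distances into symmetric fuzzy membership strengths using the fuzzy union `a + b - a * b`.

```python
import numpy as np


def pairwise_distances(data: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of rows (hypothetical helper)."""
    diff = data[:, np.newaxis, :] - data[np.newaxis, :, :]
    return np.sqrt((diff**2).sum(axis=-1))


def membership_strengths(
    distances: np.ndarray, n_neighbors: int, sigma: float = 1.0
) -> np.ndarray:
    """Simplified fuzzy graph; a fixed sigma is an assumption of this sketch."""
    n_samples = distances.shape[0]
    weights = np.zeros((n_samples, n_samples))
    for i in range(n_samples):
        order = np.argsort(distances[i], kind="stable")
        neighbors = order[order != i][:n_neighbors]  # never include i itself
        rho = distances[i, neighbors[0]]  # distance to the closest neighbour
        shifted = np.maximum(distances[i, neighbors] - rho, 0.0)
        weights[i, neighbors] = np.exp(-shifted / sigma)
    # Fuzzy union makes the directed weights symmetric.
    return weights + weights.T - weights * weights.T


points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist = pairwise_distances(points)
graph = membership_strengths(dist, n_neighbors=1)
```

Full UMAP instead searches for a per-point `sigma` so that each point's membership strengths sum to a target derived from `n_neighbors`; the fixed value here only keeps the sketch short.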

Fixes #14689

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Add or change doctests? -- Note: Please avoid changing both code and tests in a single pull request.
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword.

Copilot AI review requested due to automatic review settings May 16, 2026 22:02
algorithms-keeper (bot) added the "documentation" (This PR modified documentation files) and "awaiting reviews" (This PR is ready to be reviewed) labels May 16, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds an educational UMAP dimensionality reduction implementation under machine_learning, aligning with existing examples like t-SNE and PCA.

Changes:

  • Adds UMAP helper functions for distances, nearest neighbors, membership strengths, and embedding optimization.
  • Adds Iris dataset demo output and doctests.
  • Updates DIRECTORY.md and CONTRIBUTING.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| machine_learning/uniform_manifold_approximation_and_projection.py | New educational UMAP implementation and demo. |
| DIRECTORY.md | Adds a directory entry for the new UMAP file. |
| CONTRIBUTING.md | Adds guidance about skipping pre-commit hooks. |
Comments suppressed due to low confidence (1)

machine_learning/uniform_manifold_approximation_and_projection.py:140

  • This assumes the first sorted index is always the point itself. With duplicate samples (zero distance to another row) or any tied zero distances, argsort can put another index first and this slice can include i as its own nearest neighbor, violating the function contract and producing self-edges in the fuzzy graph.
        sorted_indices = np.argsort(distance_matrix[i])
        neighbor_indices[i] = sorted_indices[1 : n_neighbors + 1]
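
One way to address the concern above is to filter out index `i` explicitly rather than relying on its sort position. The following is a minimal sketch under that assumption; the function name and the upper-bound check on `n_neighbors` are hypothetical and not taken from the pull request.

```python
import numpy as np


def find_nearest_neighbors_excluding_self(
    distance_matrix: np.ndarray, n_neighbors: int
) -> np.ndarray:
    """Return the n_neighbors closest indices per row, never the row itself."""
    n_samples = distance_matrix.shape[0]
    if not 1 <= n_neighbors < n_samples:
        raise ValueError("n_neighbors must be in [1, n_samples - 1]")
    neighbor_indices = np.zeros((n_samples, n_neighbors), dtype=int)
    for i in range(n_samples):
        sorted_indices = np.argsort(distance_matrix[i], kind="stable")
        # Drop index i explicitly instead of assuming it is sorted first.
        candidates = sorted_indices[sorted_indices != i]
        neighbor_indices[i] = candidates[:n_neighbors]
    return neighbor_indices


# Rows 0 and 1 are duplicate samples, so their mutual distance is zero.
duplicate_dist = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 2.0], [2.0, 2.0, 0.0]])
neighbors = find_nearest_neighbors_excluding_self(duplicate_dist, n_neighbors=1)
```

With the original slice `sorted_indices[1 : n_neighbors + 1]`, row 1 of this matrix would select itself, because the stable sort places index 0 ahead of index 1 when both distances are zero.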

Comment thread CONTRIBUTING.md
Comment on lines +77 to +80
When you're making local experimental changes and don't want pre-commit hooks to run, you can bypass them using:

```bash
git commit --no-verify -m "Exploratory: Testing XYZ"
```
Comment thread DIRECTORY.md
* [Similarity Search](machine_learning/similarity_search.py)
* [Support Vector Machines](machine_learning/support_vector_machines.py)
* [T Stochastic Neighbour Embedding](machine_learning/t_stochastic_neighbour_embedding.py)
* [Uniform Manifold Approximation And Projection](machine_learning/uniform_manifold_approximation_and_projection.py)
Comment thread machine_learning/uniform_manifold_approximation_and_projection.py
print()
print("Loading Iris dataset …")
features, labels = collect_dataset()
print(f" Input shape : {features.shape} ((150 flowers x 2 dimensions))")
Comment thread CONTRIBUTING.md
Comment on lines +75 to +77
### Skipping Pre-commit Hooks for Exploratory Changes

When you're making local experimental changes and don't want pre-commit hooks to run, you can bypass them using:
Comment on lines +127 to +133

>>> dist = np.array([[0., 1., 2.], [1., 0., 3.], [2., 3., 0.]])
>>> nn = find_nearest_neighbors(dist, n_neighbors=1)
>>> nn.tolist()
[[1], [0], [0]]
"""
n_samples = distance_matrix.shape[0]
Comment on lines +320 to +324
if n_components < 1:
    raise ValueError("n_components must be >= 1")
if n_neighbors < 1:
    raise ValueError("n_neighbors must be >= 1")
if n_epochs < 1:
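
The guards quoted above check lower bounds only. A hedged sketch of how such validation might be gathered into one helper follows; the helper name, the `n_samples` argument, the upper bound on `n_neighbors`, and the `n_epochs` error message are all assumptions, since the excerpt is truncated and the rest of the function is not shown.

```python
def validate_umap_parameters(
    n_samples: int, n_components: int, n_neighbors: int, n_epochs: int
) -> None:
    """Raise ValueError for unusable settings (hypothetical helper)."""
    if n_components < 1:
        raise ValueError("n_components must be >= 1")
    if n_neighbors < 1:
        raise ValueError("n_neighbors must be >= 1")
    # Assumed extra check: a point cannot have more neighbours than there
    # are other samples in the dataset.
    if n_neighbors >= n_samples:
        raise ValueError("n_neighbors must be < n_samples")
    # Assumed message; the original line after this condition is not shown.
    if n_epochs < 1:
        raise ValueError("n_epochs must be >= 1")


validate_umap_parameters(n_samples=150, n_components=2, n_neighbors=15, n_epochs=200)
```

Checking `n_neighbors` against the sample count up front gives a clear error instead of a shape mismatch later, when the neighbour slice comes back shorter than requested.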
algorithms-keeper (bot) added the "tests are failing" (Do not merge until tests pass) label May 16, 2026

Labels

  • awaiting reviews: This PR is ready to be reviewed
  • documentation: This PR modified documentation files
  • tests are failing: Do not merge until tests pass

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add educational UMAP implementation to machine_learning

2 participants