diff --git a/.devcontainer/Dockerfile b/.devcontainer/Dockerfile index a0bd05f47ec8..edee3bc4febb 100644 --- a/.devcontainer/Dockerfile +++ b/.devcontainer/Dockerfile @@ -3,6 +3,5 @@ ARG VARIANT=3.13-bookworm FROM mcr.microsoft.com/vscode/devcontainers/python:${VARIANT} COPY requirements.txt /tmp/pip-tmp/ RUN python3 -m pip install --upgrade pip \ - && python3 -m pip install --no-cache-dir install -r /tmp/pip-tmp/requirements.txt \ - && pipx install pre-commit ruff \ - && pre-commit install + && python3 -m pip install --no-cache-dir -r /tmp/pip-tmp/requirements.txt \ + && pipx install pre-commit ruff diff --git a/.devcontainer/README.md b/.devcontainer/README.md index ec3cdb61de7a..8056578ad3a8 100644 --- a/.devcontainer/README.md +++ b/.devcontainer/README.md @@ -1 +1,42 @@ -https://code.visualstudio.com/docs/devcontainers/tutorial +# Development Container + +This is **Devcontainer** configuration to provide a consistent development environment for all contributors. + +## Features + +- [x] Pre-configured **Python environment** +- [x] Automatic installation of **pre-commit hooks** +- [x] **Ruff** linter ready to check your code +- [x] **Oh My Zsh** with plugins: +- `zsh-autosuggestions` +- `zsh-syntax-highlighting` + +## Usage + +1. Install [**Docker** ](https://www.docker.com/get-started/) and [**Visual Studio Code**](https://code.visualstudio.com/) +2. Install the **Remote - Containers** extension in VS Code + + - Do `CTRL+P`, paste this command and press `Enter` + + ```shell + ext install ms-vscode-remote.remote-containers + ``` +3. Open this repository in VS Code +4. When prompted, click **"Reopen in Container"** +5. 
Wait for the environment to build and initialize + +After setup: + +- `pre-commit` hooks are installed +- `ruff` and other tools are available +- The shell uses Zsh by default + +## Tips + +To manually run checks on all files: + +```bash +pre-commit run --all-files +``` + +> For further information here's [Microsoft tutorial about devcontainers.](https://code.visualstudio.com/docs/devcontainers/tutorial) diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json index e23263f5b9de..4951d5eb268d 100644 --- a/.devcontainer/devcontainer.json +++ b/.devcontainer/devcontainer.json @@ -7,10 +7,12 @@ // Update 'VARIANT' to pick a Python version: 3, 3.11, 3.10, 3.9, 3.8 // Append -bullseye or -buster to pin to an OS version. // Use -bullseye variants on local on arm64/Apple Silicon. - "VARIANT": "3.13-bookworm", + "VARIANT": "3.13-bookworm" } }, + "postCreateCommand": "zsh .devcontainer/post_install", + // Configure tool-specific properties. "customizations": { // Configure properties specific to VS Code. @@ -20,7 +22,8 @@ "python.defaultInterpreterPath": "/usr/local/bin/python", "python.linting.enabled": true, "python.formatting.blackPath": "/usr/local/py-utils/bin/black", - "python.linting.mypyPath": "/usr/local/py-utils/bin/mypy" + "python.linting.mypyPath": "/usr/local/py-utils/bin/mypy", + "terminal.integrated.defaultProfile.linux": "zsh" }, // Add the IDs of extensions you want installed when the container is created. diff --git a/.devcontainer/post_install b/.devcontainer/post_install new file mode 100755 index 000000000000..589ee361f5cb --- /dev/null +++ b/.devcontainer/post_install @@ -0,0 +1,29 @@ +#!/usr/bin/env bash + +echo "Begin post-installation steps..." + +set -e + +echo "Installing pre-commit hooks..." +pre-commit install + +echo "Installing Oh My Zsh plugins..." + +# Install zsh-autosuggestions if not present +if [ ! -d "${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/plugins/zsh-autosuggestions" ]; then + echo "Cloning zsh-autosuggestions..." 
+ git clone https://github.com/zsh-users/zsh-autosuggestions \ + "${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/plugins/zsh-autosuggestions" +fi + +# Install zsh-syntax-highlighting if not present +if [ ! -d "${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting" ]; then + echo "Cloning zsh-syntax-highlighting..." + git clone https://github.com/zsh-users/zsh-syntax-highlighting.git \ + "${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting" +fi + +echo "Configuring plugins in ~/.zshrc..." +sed -i '/^plugins=/c\plugins=(git zsh-autosuggestions zsh-syntax-highlighting)' ~/.zshrc + +echo "Post-installation steps completed successfully. Enjoy!" diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml index 8b83cb41c79a..2bb8e1d69217 100644 --- a/.github/workflows/build.yml +++ b/.github/workflows/build.yml @@ -9,27 +9,31 @@ jobs: build: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 - - uses: astral-sh/setup-uv@v6 + - run: sudo apt-get update && sudo apt-get install -y libhdf5-dev + - uses: actions/checkout@v6 + - uses: astral-sh/setup-uv@v7 with: enable-cache: true cache-dependency-glob: uv.lock - - uses: actions/setup-python@v5 + - uses: actions/setup-python@v6 with: - python-version: 3.13 + python-version: 3.14 allow-prereleases: true - run: uv sync --group=test - name: Run tests # TODO: #8818 Re-enable quantum tests - run: uv run pytest + run: uv run --with=pytest-run-parallel pytest + --iterations=8 --parallel-threads=auto --ignore=computer_vision/cnn_classification.py --ignore=docs/conf.py --ignore=dynamic_programming/k_means_clustering_tensorflow.py + --ignore=machine_learning/local_weighted_learning/local_weighted_learning.py --ignore=machine_learning/lstm/lstm_prediction.py --ignore=neural_network/input_data.py --ignore=project_euler/ --ignore=quantum/q_fourier_transform.py --ignore=scripts/validate_solutions.py + --ignore=web_programming/current_stock_price.py 
--ignore=web_programming/fetch_anime_and_play.py --cov-report=term-missing:skip-covered --cov=. . diff --git a/.github/workflows/devcontainer_ci.yml b/.github/workflows/devcontainer_ci.yml new file mode 100644 index 000000000000..d1b81593866f --- /dev/null +++ b/.github/workflows/devcontainer_ci.yml @@ -0,0 +1,19 @@ +name: Test DevContainer Build + +on: + push: + paths: + - ".devcontainer/**" + pull_request: + paths: + - ".devcontainer/**" + +jobs: + build: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + - uses: devcontainers/ci@v0.3 + with: + push: never + runCmd: "true" diff --git a/.github/workflows/directory_writer.yml b/.github/workflows/directory_writer.yml index 3edb5c91a951..deffbe9e364f 100644 --- a/.github/workflows/directory_writer.yml +++ b/.github/workflows/directory_writer.yml @@ -6,12 +6,13 @@ jobs: directory_writer: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 - - uses: actions/setup-python@v5 + - uses: actions/setup-python@v6 with: - python-version: 3.x + python-version: 3.14 + allow-prereleases: true - name: Write DIRECTORY.md run: | scripts/build_directory_md.py 2>&1 | tee DIRECTORY.md diff --git a/.github/workflows/project_euler.yml b/.github/workflows/project_euler.yml index eaf4150e4eaa..591b2163cc1a 100644 --- a/.github/workflows/project_euler.yml +++ b/.github/workflows/project_euler.yml @@ -14,21 +14,37 @@ jobs: project-euler: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 - - uses: astral-sh/setup-uv@v6 - - uses: actions/setup-python@v5 + - run: + sudo apt-get update && sudo apt-get install -y libtiff5-dev libjpeg8-dev libopenjp2-7-dev + zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk + libharfbuzz-dev libfribidi-dev libxcb1-dev + libxml2-dev libxslt-dev + libhdf5-dev + libopenblas-dev + - uses: actions/checkout@v6 + - uses: astral-sh/setup-uv@v7 + - uses: actions/setup-python@v6 with: - python-version: 3.x 
+ python-version: 3.14 + allow-prereleases: true - run: uv sync --group=euler-validate --group=test - run: uv run pytest --doctest-modules --cov-report=term-missing:skip-covered --cov=project_euler/ project_euler/ validate-solutions: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 - - uses: astral-sh/setup-uv@v6 - - uses: actions/setup-python@v5 + - run: + sudo apt-get update && sudo apt-get install -y libtiff5-dev libjpeg8-dev libopenjp2-7-dev + zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk + libharfbuzz-dev libfribidi-dev libxcb1-dev + libxml2-dev libxslt-dev + libhdf5-dev + libopenblas-dev + - uses: actions/checkout@v6 + - uses: astral-sh/setup-uv@v7 + - uses: actions/setup-python@v6 with: - python-version: 3.x + python-version: 3.14 + allow-prereleases: true - run: uv sync --group=euler-validate --group=test - run: uv run pytest scripts/validate_solutions.py env: diff --git a/.github/workflows/ruff.yml b/.github/workflows/ruff.yml index ec9f0202bd7e..13df19c8d743 100644 --- a/.github/workflows/ruff.yml +++ b/.github/workflows/ruff.yml @@ -11,6 +11,6 @@ jobs: ruff: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 - - uses: astral-sh/setup-uv@v6 + - uses: actions/checkout@v6 + - uses: astral-sh/setup-uv@v7 - run: uvx ruff check --output-format=github . 
diff --git a/.github/workflows/sphinx.yml b/.github/workflows/sphinx.yml index 2010041d80c5..3f00094e0264 100644 --- a/.github/workflows/sphinx.yml +++ b/.github/workflows/sphinx.yml @@ -25,16 +25,23 @@ jobs: build_docs: runs-on: ubuntu-24.04-arm steps: - - uses: actions/checkout@v4 - - uses: astral-sh/setup-uv@v6 - - uses: actions/setup-python@v5 + - run: + sudo apt-get update && sudo apt-get install -y libtiff5-dev libjpeg8-dev libopenjp2-7-dev + zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk + libharfbuzz-dev libfribidi-dev libxcb1-dev + libxml2-dev libxslt-dev + libhdf5-dev + libopenblas-dev + - uses: actions/checkout@v6 + - uses: astral-sh/setup-uv@v7 + - uses: actions/setup-python@v6 with: - python-version: 3.13 + python-version: 3.14 allow-prereleases: true - run: uv sync --group=docs - - uses: actions/configure-pages@v5 + - uses: actions/configure-pages@v6 - run: uv run sphinx-build -c docs . docs/_build/html - - uses: actions/upload-pages-artifact@v3 + - uses: actions/upload-pages-artifact@v5 with: path: docs/_build/html @@ -46,5 +53,5 @@ jobs: needs: build_docs runs-on: ubuntu-latest steps: - - uses: actions/deploy-pages@v4 + - uses: actions/deploy-pages@v5 id: deployment diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 6c1879ab1ac6..adca030fefe0 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,6 +1,9 @@ +ci: + autoupdate_schedule: monthly + repos: - repo: https://github.com/pre-commit/pre-commit-hooks - rev: v5.0.0 + rev: v6.0.0 hooks: - id: check-executables-have-shebangs - id: check-toml @@ -11,25 +14,25 @@ repos: - id: requirements-txt-fixer - repo: https://github.com/MarcoGorelli/auto-walrus - rev: 0.3.4 + rev: 0.4.1 hooks: - id: auto-walrus - repo: https://github.com/astral-sh/ruff-pre-commit - rev: v0.11.11 + rev: v0.15.14 hooks: - - id: ruff + - id: ruff-check - id: ruff-format - repo: https://github.com/codespell-project/codespell - rev: v2.4.1 + rev: v2.4.2 hooks: 
- id: codespell additional_dependencies: - tomli - repo: https://github.com/tox-dev/pyproject-fmt - rev: "v2.6.0" + rev: v2.21.2 hooks: - id: pyproject-fmt @@ -42,23 +45,22 @@ repos: pass_filenames: false - repo: https://github.com/abravalheri/validate-pyproject - rev: v0.24.1 + rev: v0.25 hooks: - id: validate-pyproject - - repo: https://github.com/pre-commit/mirrors-mypy - rev: v1.15.0 - hooks: - - id: mypy - args: - - --explicit-package-bases - - --ignore-missing-imports - - --install-types # See mirrors-mypy README.md - - --non-interactive - additional_dependencies: [types-requests] + # - repo: https://github.com/pre-commit/mirrors-mypy + # rev: v1.20.0 + # hooks: + # - id: mypy + # args: + # - --explicit-package-bases + # - --ignore-missing-imports + # - --install-types + # - --non-interactive - repo: https://github.com/pre-commit/mirrors-prettier - rev: "v4.0.0-alpha.8" + rev: v4.0.0-alpha.8 hooks: - id: prettier types_or: [toml, yaml] diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 3df39f95b784..aa6bff3ad1da 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -99,7 +99,7 @@ We want your work to be readable by others; therefore, we encourage you to note ruff check ``` -- Original code submission require docstrings or comments to describe your work. +- Original code submissions require docstrings or comments to describe your work. - More on docstrings and comments: @@ -159,7 +159,7 @@ We want your work to be readable by others; therefore, we encourage you to note starting_value = int(input("Please enter a starting value: ").strip()) ``` - The use of [Python type hints](https://docs.python.org/3/library/typing.html) is encouraged for function parameters and return values. Our automated testing will run [mypy](http://mypy-lang.org) so run that locally before making your submission. + The use of [Python type hints](https://docs.python.org/3/library/typing.html) is encouraged for function parameters and return values. 
Our automated testing will run [mypy](https://mypy-lang.org) so run that locally before making your submission. ```python def sum_ab(a: int, b: int) -> int: diff --git a/DIRECTORY.md b/DIRECTORY.md index 00f4bb4ef2b2..daf71bab8162 100644 --- a/DIRECTORY.md +++ b/DIRECTORY.md @@ -12,6 +12,7 @@ * [Combination Sum](backtracking/combination_sum.py) * [Crossword Puzzle Solver](backtracking/crossword_puzzle_solver.py) * [Generate Parentheses](backtracking/generate_parentheses.py) + * [Generate Parentheses Iterative](backtracking/generate_parentheses_iterative.py) * [Hamiltonian Cycle](backtracking/hamiltonian_cycle.py) * [Knight Tour](backtracking/knight_tour.py) * [Match Word Pattern](backtracking/match_word_pattern.py) @@ -174,6 +175,7 @@ ## Data Compression * [Burrows Wheeler](data_compression/burrows_wheeler.py) + * [Coordinate Compression](data_compression/coordinate_compression.py) * [Huffman](data_compression/huffman.py) * [Lempel Ziv](data_compression/lempel_ziv.py) * [Lempel Ziv Decompress](data_compression/lempel_ziv_decompress.py) @@ -193,6 +195,7 @@ * [Permutations](data_structures/arrays/permutations.py) * [Prefix Sum](data_structures/arrays/prefix_sum.py) * [Product Sum](data_structures/arrays/product_sum.py) + * [Rotate Array](data_structures/arrays/rotate_array.py) * [Sparse Table](data_structures/arrays/sparse_table.py) * [Sudoku Solver](data_structures/arrays/sudoku_solver.py) * Binary Tree @@ -395,6 +398,7 @@ * [Minimum Squares To Represent A Number](dynamic_programming/minimum_squares_to_represent_a_number.py) * [Minimum Steps To One](dynamic_programming/minimum_steps_to_one.py) * [Minimum Tickets Cost](dynamic_programming/minimum_tickets_cost.py) + * [Narcissistic Number](dynamic_programming/narcissistic_number.py) * [Optimal Binary Search Tree](dynamic_programming/optimal_binary_search_tree.py) * [Palindrome Partitioning](dynamic_programming/palindrome_partitioning.py) * [Range Sum Query](dynamic_programming/range_sum_query.py) @@ -443,6 +447,7 @@ * 
[Present Value](financial/present_value.py) * [Price Plus Tax](financial/price_plus_tax.py) * [Simple Moving Average](financial/simple_moving_average.py) + * [Straight Line Depreciation](financial/straight_line_depreciation.py) * [Time And Half Pay](financial/time_and_half_pay.py) ## Fractals @@ -464,6 +469,13 @@ ## Geometry * [Geometry](geometry/geometry.py) + * [Graham Scan](geometry/graham_scan.py) + * [Jarvis March](geometry/jarvis_march.py) + * [Ramer Douglas Peucker](geometry/ramer_douglas_peucker.py) + * [Segment Intersection](geometry/segment_intersection.py) + * Tests + * [Test Graham Scan](geometry/tests/test_graham_scan.py) + * [Test Jarvis March](geometry/tests/test_jarvis_march.py) ## Graphics * [Bezier Curve](graphics/bezier_curve.py) @@ -513,6 +525,7 @@ * [Graphs Floyd Warshall](graphs/graphs_floyd_warshall.py) * [Greedy Best First](graphs/greedy_best_first.py) * [Greedy Min Vertex Cover](graphs/greedy_min_vertex_cover.py) + * [Johnson](graphs/johnson.py) * [Kahns Algorithm Long](graphs/kahns_algorithm_long.py) * [Kahns Algorithm Topo](graphs/kahns_algorithm_topo.py) * [Karger](graphs/karger.py) @@ -533,6 +546,7 @@ * [Strongly Connected Components](graphs/strongly_connected_components.py) * [Tarjans Scc](graphs/tarjans_scc.py) * Tests + * [Test Johnson](graphs/tests/test_johnson.py) * [Test Min Spanning Tree Kruskal](graphs/tests/test_min_spanning_tree_kruskal.py) * [Test Min Spanning Tree Prim](graphs/tests/test_min_spanning_tree_prim.py) @@ -620,6 +634,7 @@ * [Sequential Minimum Optimization](machine_learning/sequential_minimum_optimization.py) * [Similarity Search](machine_learning/similarity_search.py) * [Support Vector Machines](machine_learning/support_vector_machines.py) + * [T Stochastic Neighbour Embedding](machine_learning/t_stochastic_neighbour_embedding.py) * [Word Frequency Functions](machine_learning/word_frequency_functions.py) * [Xgboost Classifier](machine_learning/xgboost_classifier.py) * [Xgboost 
Regressor](machine_learning/xgboost_regressor.py) @@ -722,6 +737,7 @@ * [Secant Method](maths/numerical_analysis/secant_method.py) * [Simpson Rule](maths/numerical_analysis/simpson_rule.py) * [Square Root](maths/numerical_analysis/square_root.py) + * [Weierstrass Method](maths/numerical_analysis/weierstrass_method.py) * [Odd Sieve](maths/odd_sieve.py) * [Perfect Cube](maths/perfect_cube.py) * [Perfect Number](maths/perfect_number.py) @@ -790,6 +806,7 @@ * [Sumset](maths/sumset.py) * [Sylvester Sequence](maths/sylvester_sequence.py) * [Tanh](maths/tanh.py) + * [Test Factorial](maths/test_factorial.py) * [Test Prime Check](maths/test_prime_check.py) * [Three Sum](maths/three_sum.py) * [Trapezoidal Rule](maths/trapezoidal_rule.py) @@ -873,6 +890,7 @@ * [Quine](other/quine.py) * [Scoring Algorithm](other/scoring_algorithm.py) * [Sdes](other/sdes.py) + * [Sliding Window Maximum](other/sliding_window_maximum.py) * [Tower Of Hanoi](other/tower_of_hanoi.py) * [Word Search](other/word_search.py) @@ -954,6 +972,7 @@ * [Sol1](project_euler/problem_009/sol1.py) * [Sol2](project_euler/problem_009/sol2.py) * [Sol3](project_euler/problem_009/sol3.py) + * [Sol4](project_euler/problem_009/sol4.py) * Problem 010 * [Sol1](project_euler/problem_010/sol1.py) * [Sol2](project_euler/problem_010/sol2.py) @@ -971,6 +990,7 @@ * [Sol2](project_euler/problem_014/sol2.py) * Problem 015 * [Sol1](project_euler/problem_015/sol1.py) + * [Sol2](project_euler/problem_015/sol2.py) * Problem 016 * [Sol1](project_euler/problem_016/sol1.py) * [Sol2](project_euler/problem_016/sol2.py) @@ -1264,6 +1284,7 @@ * [Comb Sort](sorts/comb_sort.py) * [Counting Sort](sorts/counting_sort.py) * [Cycle Sort](sorts/cycle_sort.py) + * [Cyclic Sort](sorts/cyclic_sort.py) * [Double Sort](sorts/double_sort.py) * [Dutch National Flag Sort](sorts/dutch_national_flag_sort.py) * [Exchange Sort](sorts/exchange_sort.py) @@ -1294,6 +1315,7 @@ * [Shell Sort](sorts/shell_sort.py) * [Shrink Shell Sort](sorts/shrink_shell_sort.py) * 
[Slowsort](sorts/slowsort.py) + * [Stalin Sort](sorts/stalin_sort.py) * [Stooge Sort](sorts/stooge_sort.py) * [Strand Sort](sorts/strand_sort.py) * [Tim Sort](sorts/tim_sort.py) diff --git a/README.md b/README.md index d8eba4e016fa..182d36a8d905 100644 --- a/README.md +++ b/README.md @@ -4,6 +4,7 @@

The Algorithms - Python
+
@@ -19,6 +20,7 @@
Gitter chat
+
@@ -27,23 +29,24 @@
pre-commit
-
- code style: black
+
+ code style: black
+
- All algorithms implemented in Python - for education
+ All algorithms implemented in Python - for education 📚
Implementations are for learning purposes only. They may be less efficient than the implementations in the Python standard library. Use them at your discretion. -## Getting Started +## 🚀 Getting Started -Read through our [Contribution Guidelines](CONTRIBUTING.md) before you contribute. +📋 Read through our [Contribution Guidelines](CONTRIBUTING.md) before you contribute. -## Community Channels +## 🌐 Community Channels We are on [Discord](https://the-algorithms.com/discord) and [Gitter](https://gitter.im/TheAlgorithms/community)! Community channels are a great way for you to ask questions and get help. Please join us! -## List of Algorithms +## 📜 List of Algorithms See our [directory](DIRECTORY.md) for easier navigation and a better overview of the project. diff --git a/backtracking/coloring.py b/backtracking/coloring.py index f10cdbcf9d26..abfdf16f1342 100644 --- a/backtracking/coloring.py +++ b/backtracking/coloring.py @@ -104,6 +104,14 @@ def color(graph: list[list[int]], max_colors: int) -> list[int]: >>> max_colors = 2 >>> color(graph, max_colors) [] + >>> color([], 2) # empty graph + [] + >>> color([[0]], 1) # single node, 1 color + [0] + >>> color([[0, 1], [1, 0]], 1) # 2 nodes, 1 color (impossible) + [] + >>> color([[0, 1], [1, 0]], 2) # 2 nodes, 2 colors (possible) + [0, 1] """ colored_vertices = [-1] * len(graph) diff --git a/backtracking/combination_sum.py b/backtracking/combination_sum.py index 3c6ed81f44f0..3d954f11d2c5 100644 --- a/backtracking/combination_sum.py +++ b/backtracking/combination_sum.py @@ -47,8 +47,18 @@ def combination_sum(candidates: list, target: int) -> list: >>> combination_sum([-8, 2.3, 0], 1) Traceback (most recent call last): ... - RecursionError: maximum recursion depth exceeded + ValueError: All elements in candidates must be non-negative + >>> combination_sum([], 1) + Traceback (most recent call last): + ... 
+ ValueError: Candidates list should not be empty """ + if not candidates: + raise ValueError("Candidates list should not be empty") + + if any(x < 0 for x in candidates): + raise ValueError("All elements in candidates must be non-negative") + path = [] # type: list[int] answer = [] # type: list[int] backtrack(candidates, path, answer, target, 0) diff --git a/backtracking/generate_parentheses.py b/backtracking/generate_parentheses.py index 18c21e2a9b51..5094f4b08619 100644 --- a/backtracking/generate_parentheses.py +++ b/backtracking/generate_parentheses.py @@ -64,6 +64,10 @@ def generate_parenthesis(n: int) -> list[str]: Example 2: >>> generate_parenthesis(1) ['()'] + + Example 3: + >>> generate_parenthesis(0) + [''] """ result: list[str] = [] diff --git a/backtracking/generate_parentheses_iterative.py b/backtracking/generate_parentheses_iterative.py new file mode 100644 index 000000000000..84c032f52dc4 --- /dev/null +++ b/backtracking/generate_parentheses_iterative.py @@ -0,0 +1,67 @@ +def generate_parentheses_iterative(length: int) -> list[str]: + """ + Generate all valid combinations of parentheses (Iterative Approach). + + The algorithm works as follows: + 1. Initialize an empty list to store the combinations. + 2. Initialize a stack to keep track of partial combinations. + 3. Start with empty string and push it on stack along with + the counts of '(' and ')'. + 4. While the stack is not empty: + a. Pop a partial combination and its open and close counts from the stack. + b. If the combination length is equal to 2*length, add it to the result. + c. If open count < length, push new combination with added '(' on stack. + d. If close count < open count, push new combination with added ')' on stack. + 5. Return the result containing all valid combinations. 
+ + Args: + length: The desired length of the parentheses combinations + + Returns: + A list of strings representing valid combinations of parentheses + + Time Complexity: + O(2^(2*length)) + + Space Complexity: + O(2^(2*length)) + + >>> generate_parentheses_iterative(3) + ['()()()', '()(())', '(())()', '(()())', '((()))'] + >>> generate_parentheses_iterative(2) + ['()()', '(())'] + >>> generate_parentheses_iterative(1) + ['()'] + >>> generate_parentheses_iterative(0) + [''] + """ + if length == 0: + return [""] + + result: list[str] = [] + stack: list[tuple[str, int, int]] = [] + + # Each element in stack is a tuple (current_combination, open_count, close_count) + stack.append(("", 0, 0)) + + while stack: + current_combination, open_count, close_count = stack.pop() + + if len(current_combination) == 2 * length: + result.append(current_combination) + continue + + if open_count < length: + stack.append((current_combination + "(", open_count + 1, close_count)) + + if close_count < open_count: + stack.append((current_combination + ")", open_count, close_count + 1)) + + return result + + +if __name__ == "__main__": + import doctest + + doctest.testmod() + print(generate_parentheses_iterative(3)) diff --git a/backtracking/n_queens.py b/backtracking/n_queens.py index d10181f319b3..6fac93aa77d6 100644 --- a/backtracking/n_queens.py +++ b/backtracking/n_queens.py @@ -33,6 +33,14 @@ def is_safe(board: list[list[int]], row: int, column: int) -> bool: False >>> is_safe([[0, 0, 1], [0, 0, 0], [0, 0, 0]], 1, 1) False + >>> is_safe([[1, 0, 0], [0, 0, 0], [0, 0, 0]], 1, 2) + True + >>> is_safe([[1, 0, 0], [0, 0, 0], [0, 0, 0]], 2, 1) + True + >>> is_safe([[0, 0, 0], [1, 0, 0], [0, 0, 0]], 0, 2) + True + >>> is_safe([[0, 0, 0], [1, 0, 0], [0, 0, 0]], 2, 2) + True """ n = len(board) # Size of the board diff --git a/backtracking/word_break.py b/backtracking/word_break.py index 1f2ab073f499..2e874a02b61c 100644 --- a/backtracking/word_break.py +++ b/backtracking/word_break.py @@ 
-66,6 +66,9 @@ def word_break(input_string: str, word_dict: set[str]) -> bool: >>> word_break("catsandog", {"cats", "dog", "sand", "and", "cat"}) False + + >>> word_break("applepenapple", {}) + False """ return backtrack(input_string, word_dict, 0) diff --git a/bit_manipulation/reverse_bits.py b/bit_manipulation/reverse_bits.py index 74b4f2563234..4a0b2ff7047a 100644 --- a/bit_manipulation/reverse_bits.py +++ b/bit_manipulation/reverse_bits.py @@ -1,6 +1,6 @@ def get_reverse_bit_string(number: int) -> str: """ - return the bit string of an integer + Return the reverse bit string of a 32 bit integer >>> get_reverse_bit_string(9) '10010000000000000000000000000000' @@ -8,76 +8,76 @@ def get_reverse_bit_string(number: int) -> str: '11010100000000000000000000000000' >>> get_reverse_bit_string(2873) '10011100110100000000000000000000' + >>> get_reverse_bit_string(2550136832) + '00000000000000000000000000011001' >>> get_reverse_bit_string("this is not a number") Traceback (most recent call last): ... - TypeError: operation can not be conducted on a object of type str + TypeError: operation can not be conducted on an object of type str """ if not isinstance(number, int): msg = ( - "operation can not be conducted on a object of type " + "operation can not be conducted on an object of type " f"{type(number).__name__}" ) raise TypeError(msg) bit_string = "" for _ in range(32): bit_string += str(number % 2) - number = number >> 1 + number >>= 1 return bit_string -def reverse_bit(number: int) -> str: +def reverse_bit(number: int) -> int: """ - Take in an 32 bit integer, reverse its bits, - return a string of reverse bits - - result of a reverse_bit and operation on the integer provided. 
+ Take in a 32 bit integer, reverse its bits, return a 32 bit integer result >>> reverse_bit(25) - '00000000000000000000000000011001' + 2550136832 >>> reverse_bit(37) - '00000000000000000000000000100101' + 2751463424 >>> reverse_bit(21) - '00000000000000000000000000010101' + 2818572288 >>> reverse_bit(58) - '00000000000000000000000000111010' + 1543503872 >>> reverse_bit(0) - '00000000000000000000000000000000' + 0 >>> reverse_bit(256) - '00000000000000000000000100000000' + 8388608 + >>> reverse_bit(2550136832) + 25 >>> reverse_bit(-1) Traceback (most recent call last): ... - ValueError: the value of input must be positive + ValueError: The value of input must be non-negative >>> reverse_bit(1.1) Traceback (most recent call last): ... - TypeError: Input value must be a 'int' type + TypeError: Input value must be an 'int' type >>> reverse_bit("0") Traceback (most recent call last): ... - TypeError: '<' not supported between instances of 'str' and 'int' + TypeError: Input value must be an 'int' type """ + if not isinstance(number, int): + raise TypeError("Input value must be an 'int' type") if number < 0: - raise ValueError("the value of input must be positive") - elif isinstance(number, float): - raise TypeError("Input value must be a 'int' type") - elif isinstance(number, str): - raise TypeError("'<' not supported between instances of 'str' and 'int'") + raise ValueError("The value of input must be non-negative") + result = 0 - # iterator over [1 to 32],since we are dealing with 32 bit integer - for _ in range(1, 33): + # iterator over [0 to 31], since we are dealing with a 32 bit integer + for _ in range(32): # left shift the bits by unity - result = result << 1 + result <<= 1 # get the end bit - end_bit = number % 2 + end_bit = number & 1 # right shift the bits by unity - number = number >> 1 - # add that bit to our ans - result = result | end_bit - return get_reverse_bit_string(result) + number >>= 1 + # add that bit to our answer + result |= end_bit + return 
result if __name__ == "__main__": diff --git a/blockchain/README.md b/blockchain/README.md index b5fab7b36eaa..ecd784fc2c7d 100644 --- a/blockchain/README.md +++ b/blockchain/README.md @@ -1,8 +1,8 @@ # Blockchain -A Blockchain is a type of **distributed ledger** technology (DLT) that consists of growing list of records, called **blocks**, that are securely linked together using **cryptography**. +A Blockchain is a type of **distributed ledger** technology (DLT) that consists of a growing list of records, called **blocks**, that are securely linked together using **cryptography**. -Let's breakdown the terminologies in the above definition. We find below terminologies, +Let's break down the terminologies in the above definition. We find below terminologies, - Digital Ledger Technology (DLT) - Blocks @@ -10,35 +10,35 @@ Let's breakdown the terminologies in the above definition. We find below termino ## Digital Ledger Technology - It is otherwise called as distributed ledger technology. It is simply the opposite of centralized database. Firstly, what is a **ledger**? A ledger is a book or collection of accounts that records account transactions. +Blockchain is also called distributed ledger technology. It is simply the opposite of a centralized database. Firstly, what is a **ledger**? A ledger is a book or collection of accounts that records account transactions. - *Why is Blockchain addressed as digital ledger if it can record more than account transactions? What other transaction details and information can it hold?* +*Why is Blockchain addressed as a digital ledger if it can record more than account transactions? What other transaction details and information can it hold?* -Digital Ledger Technology is just a ledger which is shared among multiple nodes. This way there exist no need for central authority to hold the info. Okay, how is it differentiated from central database and what are their benefits? 
+Digital Ledger Technology is just a ledger that is shared among multiple nodes. This way there exists no need for a central authority to hold the info. Okay, how is it differentiated from a central database and what are their benefits? -There is an organization which has 4 branches whose data are stored in a centralized database. So even if one branch needs any data from ledger they need an approval from database in charge. And if one hacks the central database he gets to tamper and control all the data. +Suppose that there is an organization that has 4 branches whose data are stored in a centralized database. So even if one branch needs any data from the ledger it needs approval from the database in charge. And if one hacks the central database he gets to tamper and control all the data. -Now lets assume every branch has a copy of the ledger and then once anything is added to the ledger by anyone branch it is gonna automatically reflect in all other ledgers available in other branch. This is done using Peer-to-peer network. +Now let's assume every branch has a copy of the ledger and then once anything is added to the ledger by any branch it is gonna automatically reflect in all other ledgers available in other branches. This is done using a peer-to-peer network. -So this means even if information is tampered in one branch we can find out. If one branch is hacked we can be alerted ,so we can safeguard other branches. Now, assume these branches as computers or nodes and the ledger is a transaction record or digital receipt. If one ledger is hacked in a node we can detect since there will be a mismatch in comparison with other node information. So this is the concept of Digital Ledger Technology. +This means that even if information is tampered with in one branch we can find out. If one branch is hacked we can be alerted, so we can safeguard other branches. Now, assume these branches as computers or nodes and the ledger is a transaction record or digital receipt. 
If one ledger is hacked in a node we can detect since there will be a mismatch in comparison with other node information. So this is the concept of Digital Ledger Technology. *Is it required for all nodes to have access to all information in other nodes? Wouldn't this require enormous storage space in each node?* ## Blocks -In short a block is nothing but collections of records with a labelled header. These are connected cryptographically. Once a new block is added to a chain, the previous block is connected, more precisely said as locked and hence, will remain unaltered. We can understand this concept once we get a clear understanding of working mechanism of blockchain. +In short, a block is nothing but a collection of records with a labelled header. These are connected cryptographically. Once a new block is added to a chain, the previous block is connected, more precisely said as locked, and hence will remain unaltered. We can understand this concept once we get a clear understanding of the working mechanism of blockchain. ## Cryptography -It is the practice and study of secure communication techniques in the midst of adversarial behavior. More broadly, cryptography is the creation and analysis of protocols that prevent third parties or the general public from accessing private messages. +Cryptography is the practice and study of secure communication techniques amid adversarial behavior. More broadly, cryptography is the creation and analysis of protocols that prevent third parties or the general public from accessing private messages. *Which cryptography technology is most widely used in blockchain and why?* -So, in general, blockchain technology is a distributed record holder which records the information about ownership of an asset. To define precisely, +So, in general, blockchain technology is a distributed record holder that records the information about ownership of an asset. 
To define precisely, > Blockchain is a distributed, immutable ledger that makes it easier to record transactions and track assets in a corporate network. An asset could be tangible (such as a house, car, cash, or land) or intangible (such as a business) (intellectual property, patents, copyrights, branding). A blockchain network can track and sell almost anything of value, lowering risk and costs for everyone involved. -So this is all about introduction to blockchain technology. To learn more about the topic refer below links.... +So this is all about the introduction to blockchain technology. To learn more about the topic refer below links.... * * * diff --git a/boolean_algebra/imply_gate.py b/boolean_algebra/imply_gate.py index b64ebaceb306..3d71ff12f8d9 100644 --- a/boolean_algebra/imply_gate.py +++ b/boolean_algebra/imply_gate.py @@ -33,6 +33,58 @@ def imply_gate(input_1: int, input_2: int) -> int: return int(input_1 == 0 or input_2 == 1) +def recursive_imply_list(input_list: list[int]) -> int: + """ + Recursively calculates the implication of a list. + Strictly the implication is applied consecutively left to right: + ( (a -> b) -> c ) -> d ... + + >>> recursive_imply_list([]) + Traceback (most recent call last): + ... + ValueError: Input list must contain at least two elements + >>> recursive_imply_list([0]) + Traceback (most recent call last): + ... + ValueError: Input list must contain at least two elements + >>> recursive_imply_list([1]) + Traceback (most recent call last): + ... 
+ ValueError: Input list must contain at least two elements + >>> recursive_imply_list([0, 0]) + 1 + >>> recursive_imply_list([0, 1]) + 1 + >>> recursive_imply_list([1, 0]) + 0 + >>> recursive_imply_list([1, 1]) + 1 + >>> recursive_imply_list([0, 0, 0]) + 0 + >>> recursive_imply_list([0, 0, 1]) + 1 + >>> recursive_imply_list([0, 1, 0]) + 0 + >>> recursive_imply_list([0, 1, 1]) + 1 + >>> recursive_imply_list([1, 0, 0]) + 1 + >>> recursive_imply_list([1, 0, 1]) + 1 + >>> recursive_imply_list([1, 1, 0]) + 0 + >>> recursive_imply_list([1, 1, 1]) + 1 + """ + if len(input_list) < 2: + raise ValueError("Input list must contain at least two elements") + first_implication = imply_gate(input_list[0], input_list[1]) + if len(input_list) == 2: + return first_implication + new_list = [first_implication, *input_list[2:]] + return recursive_imply_list(new_list) + + if __name__ == "__main__": import doctest diff --git a/ciphers/caesar_cipher.py b/ciphers/caesar_cipher.py index 1cf4d67cbaed..ef5f49313ee7 100644 --- a/ciphers/caesar_cipher.py +++ b/ciphers/caesar_cipher.py @@ -45,7 +45,7 @@ def encrypt(input_string: str, key: int, alphabet: str | None = None) -> str: And our shift is ``2`` We can then encode the message, one letter at a time. ``H`` would become ``J``, - since ``J`` is two letters away, and so on. If the shift is ever two large, or + since ``J`` is two letters away, and so on. If the shift is ever too large, or our letter is at the end of the alphabet, we just start at the beginning (``Z`` would shift to ``a`` then ``b`` and so on). diff --git a/ciphers/gronsfeld_cipher.py b/ciphers/gronsfeld_cipher.py index 8fbeab4307fc..a72b141bd502 100644 --- a/ciphers/gronsfeld_cipher.py +++ b/ciphers/gronsfeld_cipher.py @@ -20,7 +20,7 @@ def gronsfeld(text: str, key: str) -> str: >>> gronsfeld('yes, ¥€$ - _!@#%?', '') Traceback (most recent call last): ... 
- ZeroDivisionError: integer modulo by zero + ZeroDivisionError: division by zero """ ascii_len = len(ascii_uppercase) key_len = len(key) diff --git a/ciphers/hill_cipher.py b/ciphers/hill_cipher.py index 33b2529f017b..c690d29bd113 100644 --- a/ciphers/hill_cipher.py +++ b/ciphers/hill_cipher.py @@ -78,8 +78,10 @@ def replace_digits(self, num: int) -> str: 'T' >>> hill_cipher.replace_digits(26) '0' + >>> hill_cipher.replace_digits(26.1) + '0' """ - return self.key_string[round(num)] + return self.key_string[int(num)] def check_determinant(self) -> None: """ diff --git a/data_compression/coordinate_compression.py b/data_compression/coordinate_compression.py new file mode 100644 index 000000000000..9c4ad9a99ac3 --- /dev/null +++ b/data_compression/coordinate_compression.py @@ -0,0 +1,132 @@ +""" +Assumption: + - The values to compress are assumed to be comparable, + values can be sorted and compared with '<' and '>' operators. +""" + + +class CoordinateCompressor: + """ + A class for coordinate compression. + + This class allows you to compress and decompress a list of values. + + Mapping: + In addition to compression and decompression, this class maintains a mapping + between original values and their compressed counterparts using two data + structures: a dictionary `coordinate_map` and a list `reverse_map`: + - `coordinate_map`: A dictionary that maps original values to their compressed + coordinates. Keys are original values, and values are compressed coordinates. + - `reverse_map`: A list used for reverse mapping, where each index corresponds + to a compressed coordinate, and the value at that index is the original value. + + Example of mapping: + Original: 10, Compressed: 0 + Original: 52, Compressed: 1 + Original: 83, Compressed: 2 + Original: 100, Compressed: 3 + + This mapping allows for efficient compression and decompression of values within + the list. 
+ """ + + def __init__(self, arr: list[int | float | str]) -> None: + """ + Initialize the CoordinateCompressor with a list. + + Args: + arr: The list of values to be compressed. + + >>> arr = [100, 10, 52, 83] + >>> cc = CoordinateCompressor(arr) + >>> cc.compress(100) + 3 + >>> cc.compress(52) + 1 + >>> cc.decompress(1) + 52 + """ + + # A dictionary to store compressed coordinates + self.coordinate_map: dict[int | float | str, int] = {} + + # A list to store reverse mapping + self.reverse_map: list[int | float | str] = [-1] * len(arr) + + self.arr = sorted(arr) # The input list + self.n = len(arr) # The length of the input list + self.compress_coordinates() + + def compress_coordinates(self) -> None: + """ + Compress the coordinates in the input list. + + >>> arr = [100, 10, 52, 83] + >>> cc = CoordinateCompressor(arr) + >>> cc.coordinate_map[83] + 2 + >>> cc.coordinate_map[80] # Value not in the original list + Traceback (most recent call last): + ... + KeyError: 80 + >>> cc.reverse_map[2] + 83 + """ + key = 0 + for val in self.arr: + if val not in self.coordinate_map: + self.coordinate_map[val] = key + self.reverse_map[key] = val + key += 1 + + def compress(self, original: float | str) -> int: + """ + Compress a single value. + + Args: + original: The value to compress. + + Returns: + The compressed integer, or -1 if not found in the original list. + + >>> arr = [100, 10, 52, 83] + >>> cc = CoordinateCompressor(arr) + >>> cc.compress(100) + 3 + >>> cc.compress(7) # Value not in the original list + -1 + """ + return self.coordinate_map.get(original, -1) + + def decompress(self, num: int) -> int | float | str: + """ + Decompress a single integer. + + Args: + num: The compressed integer to decompress. + + Returns: + The original value. 
+ + >>> arr = [100, 10, 52, 83] + >>> cc = CoordinateCompressor(arr) + >>> cc.decompress(0) + 10 + >>> cc.decompress(5) # Compressed coordinate out of range + -1 + """ + return self.reverse_map[num] if 0 <= num < len(self.reverse_map) else -1 + + +if __name__ == "__main__": + from doctest import testmod + + testmod() + + arr: list[int | float | str] = [100, 10, 52, 83] + cc = CoordinateCompressor(arr) + + for original in arr: + compressed = cc.compress(original) + decompressed = cc.decompress(compressed) + print(f"Original: {decompressed}, Compressed: {compressed}") diff --git a/data_structures/arrays/rotate_array.py b/data_structures/arrays/rotate_array.py new file mode 100644 index 000000000000..d5ce4b4078b3 --- /dev/null +++ b/data_structures/arrays/rotate_array.py @@ -0,0 +1,80 @@ +def rotate_array(arr: list[int], steps: int) -> list[int]: + """ + Rotates a list to the right by steps positions. + + Parameters: + arr (List[int]): The list of integers to rotate. + steps (int): Number of positions to rotate. Can be negative for left rotation. + + Returns: + List[int]: Rotated list. + + Examples: + >>> rotate_array([1, 2, 3, 4, 5], 2) + [4, 5, 1, 2, 3] + >>> rotate_array([1, 2, 3, 4, 5], -2) + [3, 4, 5, 1, 2] + >>> rotate_array([1, 2, 3, 4, 5], 7) + [4, 5, 1, 2, 3] + >>> rotate_array([], 3) + [] + """ + + n = len(arr) + if n == 0: + return arr + + steps = steps % n + + if steps < 0: + steps += n + + def reverse(start: int, end: int) -> None: + """ + Reverses a portion of the list in place from index start to end. + + Parameters: + start (int): Starting index of the portion to reverse. + end (int): Ending index of the portion to reverse. + + Returns: + None + + Examples: + >>> example = [1, 2, 3, 4, 5] + >>> def reverse_test(arr, start, end): + ... while start < end: + ... arr[start], arr[end] = arr[end], arr[start] + ... start += 1 + ... 
end -= 1 + >>> reverse_test(example, 0, 2) + >>> example + [3, 2, 1, 4, 5] + >>> reverse_test(example, 2, 4) + >>> example + [3, 2, 5, 4, 1] + """ + + while start < end: + arr[start], arr[end] = arr[end], arr[start] + start += 1 + end -= 1 + + reverse(0, n - 1) + reverse(0, steps - 1) + reverse(steps, n - 1) + + return arr + + +if __name__ == "__main__": + examples = [ + ([1, 2, 3, 4, 5], 2), + ([1, 2, 3, 4, 5], -2), + ([1, 2, 3, 4, 5], 7), + ([], 3), + ] + + for arr, steps in examples: + rotated = rotate_array(arr.copy(), steps) + print(f"Rotate {arr} by {steps}: {rotated}") diff --git a/data_structures/arrays/sudoku_solver.py b/data_structures/arrays/sudoku_solver.py index 4c722f12fd6e..d2fa43bbf298 100644 --- a/data_structures/arrays/sudoku_solver.py +++ b/data_structures/arrays/sudoku_solver.py @@ -11,6 +11,19 @@ def cross(items_a, items_b): """ Cross product of elements in A and elements in B. + + >>> cross('AB', '12') + ['A1', 'A2', 'B1', 'B2'] + >>> cross('ABC', '123') + ['A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3'] + >>> cross('ABC', '1234') + ['A1', 'A2', 'A3', 'A4', 'B1', 'B2', 'B3', 'B4', 'C1', 'C2', 'C3', 'C4'] + >>> cross('', '12') + [] + >>> cross('A', '') + [] + >>> cross('', '') + [] """ return [a + b for a in items_a for b in items_b] @@ -149,7 +162,7 @@ def search(values): if all(len(values[s]) == 1 for s in squares): return values ## Solved! 
## Chose the unfilled square s with the fewest possibilities - n, s = min((len(values[s]), s) for s in squares if len(values[s]) > 1) + _n, s = min((len(values[s]), s) for s in squares if len(values[s]) > 1) return some(search(assign(values.copy(), s, d)) for d in values[s]) diff --git a/data_structures/binary_tree/binary_tree_path_sum.py b/data_structures/binary_tree/binary_tree_path_sum.py index a3fe9ca7a7e2..8477690c777a 100644 --- a/data_structures/binary_tree/binary_tree_path_sum.py +++ b/data_structures/binary_tree/binary_tree_path_sum.py @@ -50,6 +50,26 @@ class BinaryTreePathSum: >>> tree.right.right = Node(10) >>> BinaryTreePathSum().path_sum(tree, 8) 2 + >>> BinaryTreePathSum().path_sum(None, 0) + 0 + >>> BinaryTreePathSum().path_sum(tree, 0) + 0 + + The second tree looks like this + 0 + / \ + 5 5 + + >>> tree2 = Node(0) + >>> tree2.left = Node(5) + >>> tree2.right = Node(5) + + >>> BinaryTreePathSum().path_sum(tree2, 5) + 4 + >>> BinaryTreePathSum().path_sum(tree2, -1) + 0 + >>> BinaryTreePathSum().path_sum(tree2, 0) + 1 """ target: int diff --git a/data_structures/binary_tree/non_recursive_segment_tree.py b/data_structures/binary_tree/non_recursive_segment_tree.py index ca0d5c111c4f..7d1c965fab50 100644 --- a/data_structures/binary_tree/non_recursive_segment_tree.py +++ b/data_structures/binary_tree/non_recursive_segment_tree.py @@ -39,12 +39,12 @@ from __future__ import annotations from collections.abc import Callable -from typing import Any, Generic, TypeVar +from typing import Any, TypeVar T = TypeVar("T") -class SegmentTree(Generic[T]): +class SegmentTree[T]: def __init__(self, arr: list[T], fnc: Callable[[T, T], T]) -> None: """ Segment Tree constructor, it works just with commutative combiner. 
diff --git a/data_structures/hashing/hash_map.py b/data_structures/hashing/hash_map.py index 8c56c327a492..0d99e578b73e 100644 --- a/data_structures/hashing/hash_map.py +++ b/data_structures/hashing/hash_map.py @@ -10,14 +10,14 @@ from collections.abc import Iterator, MutableMapping from dataclasses import dataclass -from typing import Generic, TypeVar +from typing import TypeVar KEY = TypeVar("KEY") VAL = TypeVar("VAL") @dataclass(slots=True) -class _Item(Generic[KEY, VAL]): +class _Item[KEY, VAL]: key: KEY val: VAL diff --git a/data_structures/hashing/hash_table_with_linked_list.py b/data_structures/hashing/hash_table_with_linked_list.py index f404c5251246..c8dffa30b8e8 100644 --- a/data_structures/hashing/hash_table_with_linked_list.py +++ b/data_structures/hashing/hash_table_with_linked_list.py @@ -8,7 +8,7 @@ def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) def _set_value(self, key, data): - self.values[key] = deque([]) if self.values[key] is None else self.values[key] + self.values[key] = deque() if self.values[key] is None else self.values[key] self.values[key].appendleft(data) self._keys[key] = self.values[key] diff --git a/data_structures/heap/heap.py b/data_structures/heap/heap.py index 7b15e69f13ca..41ef0ddd1005 100644 --- a/data_structures/heap/heap.py +++ b/data_structures/heap/heap.py @@ -2,7 +2,7 @@ from abc import abstractmethod from collections.abc import Iterable -from typing import Generic, Protocol, TypeVar +from typing import Protocol, TypeVar class Comparable(Protocol): @@ -22,7 +22,7 @@ def __eq__(self: T, other: object) -> bool: T = TypeVar("T", bound=Comparable) -class Heap(Generic[T]): +class Heap[T: Comparable]: """A Max Heap Implementation >>> unsorted = [103, 9, 1, 7, 11, 15, 25, 201, 209, 107, 5] diff --git a/data_structures/heap/randomized_heap.py b/data_structures/heap/randomized_heap.py index 12888c1f4089..9668d3ac2a45 100644 --- a/data_structures/heap/randomized_heap.py +++ 
b/data_structures/heap/randomized_heap.py @@ -4,12 +4,12 @@ import random from collections.abc import Iterable -from typing import Any, Generic, TypeVar +from typing import Any, TypeVar T = TypeVar("T", bound=bool) -class RandomizedHeapNode(Generic[T]): +class RandomizedHeapNode[T: bool]: """ One node of the randomized heap. Contains the value and references to two children. @@ -73,7 +73,7 @@ def merge( return root1 -class RandomizedHeap(Generic[T]): +class RandomizedHeap[T: bool]: """ A data structure that allows inserting a new value and to pop the smallest values. Both operations take O(logN) time where N is the size of the diff --git a/data_structures/heap/skew_heap.py b/data_structures/heap/skew_heap.py index 0839db711cb1..4e82569444fa 100644 --- a/data_structures/heap/skew_heap.py +++ b/data_structures/heap/skew_heap.py @@ -3,12 +3,12 @@ from __future__ import annotations from collections.abc import Iterable, Iterator -from typing import Any, Generic, TypeVar +from typing import Any, TypeVar T = TypeVar("T", bound=bool) -class SkewNode(Generic[T]): +class SkewNode[T: bool]: """ One node of the skew heap. Contains the value and references to two children. @@ -87,7 +87,7 @@ def merge( return result -class SkewHeap(Generic[T]): +class SkewHeap[T: bool]: """ A data structure that allows inserting a new value and to pop the smallest values. Both operations take O(logN) time where N is the size of the diff --git a/data_structures/linked_list/from_sequence.py b/data_structures/linked_list/from_sequence.py index fa43f4d10e08..b16b2258c1f1 100644 --- a/data_structures/linked_list/from_sequence.py +++ b/data_structures/linked_list/from_sequence.py @@ -1,5 +1,7 @@ -# Recursive Program to create a Linked List from a sequence and -# print a string representation of it. +""" +Recursive Program to create a Linked List from a sequence and +print a string representation of it. 
+""" class Node: @@ -18,13 +20,32 @@ def __repr__(self): return string_rep -def make_linked_list(elements_list): - """Creates a Linked List from the elements of the given sequence - (list/tuple) and returns the head of the Linked List.""" +def make_linked_list(elements_list: list | tuple) -> Node: + """ + Creates a Linked List from the elements of the given sequence + (list/tuple) and returns the head of the Linked List. + + >>> make_linked_list([]) + Traceback (most recent call last): + ... + ValueError: The Elements List is empty + >>> make_linked_list(()) + Traceback (most recent call last): + ... + ValueError: The Elements List is empty + >>> make_linked_list([1]) + <1> ---> + >>> make_linked_list((1,)) + <1> ---> + >>> make_linked_list([1, 3, 5, 32, 44, 12, 43]) + <1> ---> <3> ---> <5> ---> <32> ---> <44> ---> <12> ---> <43> ---> + >>> make_linked_list((1, 3, 5, 32, 44, 12, 43)) + <1> ---> <3> ---> <5> ---> <32> ---> <44> ---> <12> ---> <43> ---> + """ # if elements_list is empty if not elements_list: - raise Exception("The Elements List is empty") + raise ValueError("The Elements List is empty") # Set first element as Head head = Node(elements_list[0]) @@ -34,11 +55,3 @@ def make_linked_list(elements_list): current.next = Node(data) current = current.next return head - - -list_data = [1, 3, 5, 32, 44, 12, 43] -print(f"List: {list_data}") -print("Creating Linked List from List.") -linked_list = make_linked_list(list_data) -print("Linked List:") -print(linked_list) diff --git a/data_structures/linked_list/skip_list.py b/data_structures/linked_list/skip_list.py index 13e9a94a8698..f21ca70bbc82 100644 --- a/data_structures/linked_list/skip_list.py +++ b/data_structures/linked_list/skip_list.py @@ -7,13 +7,13 @@ from itertools import pairwise from random import random -from typing import Generic, TypeVar +from typing import TypeVar KT = TypeVar("KT") VT = TypeVar("VT") -class Node(Generic[KT, VT]): +class Node[KT, VT]: def __init__(self, key: KT | str = "root", 
value: VT | None = None): self.key = key self.value = value @@ -49,7 +49,7 @@ def level(self) -> int: return len(self.forward) -class SkipList(Generic[KT, VT]): +class SkipList[KT, VT]: def __init__(self, p: float = 0.5, max_level: int = 16): self.head: Node[KT, VT] = Node[KT, VT]() self.level = 0 diff --git a/data_structures/queues/circular_queue.py b/data_structures/queues/circular_queue.py index efbf1efdc42d..e9cb2cac4fd8 100644 --- a/data_structures/queues/circular_queue.py +++ b/data_structures/queues/circular_queue.py @@ -17,7 +17,7 @@ def __len__(self) -> int: >>> len(cq) 0 >>> cq.enqueue("A") # doctest: +ELLIPSIS - >>> cq.array ['A', None, None, None, None] >>> len(cq) @@ -51,17 +51,24 @@ def enqueue(self, data): """ This function inserts an element at the end of the queue using self.rear value as an index. + >>> cq = CircularQueue(5) >>> cq.enqueue("A") # doctest: +ELLIPSIS - >>> (cq.size, cq.first()) (1, 'A') >>> cq.enqueue("B") # doctest: +ELLIPSIS - >>> cq.array ['A', 'B', None, None, None] >>> (cq.size, cq.first()) (2, 'A') + >>> cq.enqueue("C").enqueue("D").enqueue("E") # doctest: +ELLIPSIS + + >>> cq.enqueue("F") + Traceback (most recent call last): + ... 
+ Exception: QUEUE IS FULL """ if self.size >= self.n: raise Exception("QUEUE IS FULL") @@ -75,6 +82,7 @@ def dequeue(self): """ This function removes an element from the queue using on self.front value as an index and returns it + >>> cq = CircularQueue(5) >>> cq.dequeue() Traceback (most recent call last): diff --git a/data_structures/queues/queue_by_list.py b/data_structures/queues/queue_by_list.py index 4b05be9fd08e..182cc4147c47 100644 --- a/data_structures/queues/queue_by_list.py +++ b/data_structures/queues/queue_by_list.py @@ -1,13 +1,10 @@ """Queue represented by a Python list""" from collections.abc import Iterable -from typing import Generic, TypeVar -_T = TypeVar("_T") - -class QueueByList(Generic[_T]): - def __init__(self, iterable: Iterable[_T] | None = None) -> None: +class QueueByList[T]: + def __init__(self, iterable: Iterable[T] | None = None) -> None: """ >>> QueueByList() Queue(()) @@ -16,7 +13,7 @@ def __init__(self, iterable: Iterable[_T] | None = None) -> None: >>> QueueByList((i**2 for i in range(1, 4))) Queue((1, 4, 9)) """ - self.entries: list[_T] = list(iterable or []) + self.entries: list[T] = list(iterable or []) def __len__(self) -> int: """ @@ -58,7 +55,7 @@ def __repr__(self) -> str: return f"Queue({tuple(self.entries)})" - def put(self, item: _T) -> None: + def put(self, item: T) -> None: """Put `item` to the Queue >>> queue = QueueByList() @@ -72,7 +69,7 @@ def put(self, item: _T) -> None: self.entries.append(item) - def get(self) -> _T: + def get(self) -> T: """ Get `item` from the Queue @@ -118,7 +115,7 @@ def rotate(self, rotation: int) -> None: for _ in range(rotation): put(get(0)) - def get_front(self) -> _T: + def get_front(self) -> T: """Get the front item from the Queue >>> queue = QueueByList((10, 20, 30)) diff --git a/data_structures/queues/queue_by_two_stacks.py b/data_structures/queues/queue_by_two_stacks.py index cd62f155a63b..f9e302ffcedd 100644 --- a/data_structures/queues/queue_by_two_stacks.py +++ 
b/data_structures/queues/queue_by_two_stacks.py @@ -1,13 +1,10 @@ """Queue implementation using two stacks""" from collections.abc import Iterable -from typing import Generic, TypeVar -_T = TypeVar("_T") - -class QueueByTwoStacks(Generic[_T]): - def __init__(self, iterable: Iterable[_T] | None = None) -> None: +class QueueByTwoStacks[T]: + def __init__(self, iterable: Iterable[T] | None = None) -> None: """ >>> QueueByTwoStacks() Queue(()) @@ -16,8 +13,8 @@ def __init__(self, iterable: Iterable[_T] | None = None) -> None: >>> QueueByTwoStacks((i**2 for i in range(1, 4))) Queue((1, 4, 9)) """ - self._stack1: list[_T] = list(iterable or []) - self._stack2: list[_T] = [] + self._stack1: list[T] = list(iterable or []) + self._stack2: list[T] = [] def __len__(self) -> int: """ @@ -59,7 +56,7 @@ def __repr__(self) -> str: """ return f"Queue({tuple(self._stack2[::-1] + self._stack1)})" - def put(self, item: _T) -> None: + def put(self, item: T) -> None: """ Put `item` into the Queue @@ -74,7 +71,7 @@ def put(self, item: _T) -> None: self._stack1.append(item) - def get(self) -> _T: + def get(self) -> T: """ Get `item` from the Queue diff --git a/data_structures/stacks/stack.py b/data_structures/stacks/stack.py index 93698f5aa116..3ffa32d4167f 100644 --- a/data_structures/stacks/stack.py +++ b/data_structures/stacks/stack.py @@ -1,6 +1,6 @@ from __future__ import annotations -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") @@ -13,7 +13,7 @@ class StackUnderflowError(BaseException): pass -class Stack(Generic[T]): +class Stack[T]: """A stack is an abstract data type that serves as a collection of elements with two principal operations: push() and pop(). 
push() adds an element to the top of the stack, and pop() removes an element from the top diff --git a/data_structures/stacks/stack_with_doubly_linked_list.py b/data_structures/stacks/stack_with_doubly_linked_list.py index 50c5236e073c..ad01cd7eb6aa 100644 --- a/data_structures/stacks/stack_with_doubly_linked_list.py +++ b/data_structures/stacks/stack_with_doubly_linked_list.py @@ -3,19 +3,19 @@ from __future__ import annotations -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") -class Node(Generic[T]): +class Node[T]: def __init__(self, data: T): self.data = data # Assign data self.next: Node[T] | None = None # Initialize next as null self.prev: Node[T] | None = None # Initialize prev as null -class Stack(Generic[T]): +class Stack[T]: """ >>> stack = Stack() >>> stack.is_empty() diff --git a/data_structures/stacks/stack_with_singly_linked_list.py b/data_structures/stacks/stack_with_singly_linked_list.py index 8e77c2b967ef..57a68679eee1 100644 --- a/data_structures/stacks/stack_with_singly_linked_list.py +++ b/data_structures/stacks/stack_with_singly_linked_list.py @@ -3,12 +3,12 @@ from __future__ import annotations from collections.abc import Iterator -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") -class Node(Generic[T]): +class Node[T]: def __init__(self, data: T): self.data = data self.next: Node[T] | None = None @@ -17,7 +17,7 @@ def __str__(self) -> str: return f"{self.data}" -class LinkedStack(Generic[T]): +class LinkedStack[T]: """ Linked List Stack implementing push (to top), pop (from top) and is_empty diff --git a/data_structures/stacks/stock_span_problem.py b/data_structures/stacks/stock_span_problem.py index 5efe58d25798..74c2636784e2 100644 --- a/data_structures/stacks/stock_span_problem.py +++ b/data_structures/stacks/stock_span_problem.py @@ -8,8 +8,29 @@ """ -def calculation_span(price, s): +def calculate_span(price: list[int]) -> list[int]: + """ + Calculate the span values for a 
given list of stock prices. + Args: + price: List of stock prices. + Returns: + List of span values. + + >>> calculate_span([10, 4, 5, 90, 120, 80]) + [1, 1, 2, 4, 5, 1] + >>> calculate_span([100, 50, 60, 70, 80, 90]) + [1, 1, 2, 3, 4, 5] + >>> calculate_span([5, 4, 3, 2, 1]) + [1, 1, 1, 1, 1] + >>> calculate_span([1, 2, 3, 4, 5]) + [1, 2, 3, 4, 5] + >>> calculate_span([10, 20, 30, 40, 50]) + [1, 2, 3, 4, 5] + >>> calculate_span([100, 80, 60, 70, 60, 75, 85]) + [1, 1, 1, 2, 1, 4, 6] + """ n = len(price) + s = [0] * n # Create a stack and push index of fist element to it st = [] st.append(0) @@ -21,18 +42,20 @@ def calculation_span(price, s): for i in range(1, n): # Pop elements from stack while stack is not # empty and top of stack is smaller than price[i] - while len(st) > 0 and price[st[0]] <= price[i]: + while len(st) > 0 and price[st[-1]] <= price[i]: st.pop() # If stack becomes empty, then price[i] is greater # than all elements on left of it, i.e. price[0], # price[1], ..price[i-1]. 
Else the price[i] is # greater than elements after top of stack - s[i] = i + 1 if len(st) <= 0 else (i - st[0]) + s[i] = i + 1 if len(st) <= 0 else (i - st[-1]) # Push this element to stack st.append(i) + return s + # A utility function to print elements of array def print_array(arr, n): @@ -42,10 +65,9 @@ def print_array(arr, n): # Driver program to test above function price = [10, 4, 5, 90, 120, 80] -S = [0 for i in range(len(price) + 1)] -# Fill the span values in array S[] -calculation_span(price, S) +# Calculate the span values +S = calculate_span(price) # Print the calculated span values print_array(S, len(price)) diff --git a/data_structures/trie/radix_tree.py b/data_structures/trie/radix_tree.py index caf566a6ce30..bd2306befa79 100644 --- a/data_structures/trie/radix_tree.py +++ b/data_structures/trie/radix_tree.py @@ -115,7 +115,7 @@ def find(self, word: str) -> bool: if not incoming_node: return False else: - matching_string, remaining_prefix, remaining_word = incoming_node.match( + _matching_string, remaining_prefix, remaining_word = incoming_node.match( word ) # If there is remaining prefix, the word can't be on the tree @@ -144,7 +144,7 @@ def delete(self, word: str) -> bool: if not incoming_node: return False else: - matching_string, remaining_prefix, remaining_word = incoming_node.match( + _matching_string, remaining_prefix, remaining_word = incoming_node.match( word ) # If there is remaining prefix, the word can't be on the tree diff --git a/digital_image_processing/filters/local_binary_pattern.py b/digital_image_processing/filters/local_binary_pattern.py index 861369ba6a32..ac54ecce755c 100644 --- a/digital_image_processing/filters/local_binary_pattern.py +++ b/digital_image_processing/filters/local_binary_pattern.py @@ -19,7 +19,7 @@ def get_neighbors_pixel( try: return int(image[x_coordinate][y_coordinate] >= center) - except (IndexError, TypeError): + except IndexError, TypeError: return 0 diff --git a/divide_and_conquer/convex_hull.py 
b/divide_and_conquer/convex_hull.py index 93f6daf1f88c..b1ab33cc9415 100644 --- a/divide_and_conquer/convex_hull.py +++ b/divide_and_conquer/convex_hull.py @@ -124,7 +124,7 @@ def _construct_points( else: try: points.append(Point(p[0], p[1])) - except (IndexError, TypeError): + except IndexError, TypeError: print( f"Ignoring deformed point {p}. All points" " must have at least 2 coordinates." diff --git a/dynamic_programming/catalan_numbers.py b/dynamic_programming/catalan_numbers.py index 7b74f2763d43..a62abe36d670 100644 --- a/dynamic_programming/catalan_numbers.py +++ b/dynamic_programming/catalan_numbers.py @@ -71,7 +71,7 @@ def catalan_numbers(upper_limit: int) -> "list[int]": print(f"The Catalan numbers from 0 through {N} are:") print(catalan_numbers(N)) print("Try another upper limit for the sequence: ", end="") - except (NameError, ValueError): + except NameError, ValueError: print("\n********* Invalid input, goodbye! ************\n") import doctest diff --git a/dynamic_programming/narcissistic_number.py b/dynamic_programming/narcissistic_number.py new file mode 100644 index 000000000000..dc1c6f5a5660 --- /dev/null +++ b/dynamic_programming/narcissistic_number.py @@ -0,0 +1,103 @@ +""" +Find all narcissistic numbers up to a given limit using dynamic programming. + +A narcissistic number (also known as an Armstrong number or plus perfect number) +is a number that is the sum of its own digits each raised to the power of the +number of digits. + +For example, 153 is a narcissistic number because 153 = 1^3 + 5^3 + 3^3. + +This implementation uses dynamic programming with memoization to efficiently +compute digit powers and find all narcissistic numbers up to a specified limit. + +The DP optimization caches digit^power calculations. When searching through many +numbers, the same digit power calculations occur repeatedly (e.g., 153, 351, 135 +all need 1^3, 5^3, 3^3). Memoization avoids these redundant calculations. 
+ +Examples of narcissistic numbers: + Single digit: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 + Three digit: 153, 370, 371, 407 + Four digit: 1634, 8208, 9474 + Five digit: 54748, 92727, 93084 + +Reference: https://en.wikipedia.org/wiki/Narcissistic_number +""" + + +def find_narcissistic_numbers(limit: int) -> list[int]: + """ + Find all narcissistic numbers up to the given limit using dynamic programming. + + This function uses memoization to cache digit power calculations, avoiding + redundant computations across different numbers with the same digit count. + + Args: + limit: The upper bound for searching narcissistic numbers (exclusive) + + Returns: + list[int]: A sorted list of all narcissistic numbers below the limit + + Examples: + >>> find_narcissistic_numbers(10) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + >>> find_narcissistic_numbers(160) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 153] + >>> find_narcissistic_numbers(400) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 153, 370, 371] + >>> find_narcissistic_numbers(1000) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 153, 370, 371, 407] + >>> find_narcissistic_numbers(10000) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 153, 370, 371, 407, 1634, 8208, 9474] + >>> find_narcissistic_numbers(1) + [0] + >>> find_narcissistic_numbers(0) + [] + """ + if limit <= 0: + return [] + + narcissistic_nums = [] + + # Memoization: cache[(power, digit)] = digit^power + # This avoids recalculating the same power for different numbers + power_cache: dict[tuple[int, int], int] = {} + + def get_digit_power(digit: int, power: int) -> int: + """Get digit^power using memoization (DP optimization).""" + if (power, digit) not in power_cache: + power_cache[(power, digit)] = digit**power + return power_cache[(power, digit)] + + # Check each number up to the limit + for number in range(limit): + # Count digits + num_digits = len(str(number)) + + # Calculate sum of powered digits using memoized powers + remaining = number + digit_sum = 0 + while remaining > 0: + digit = remaining % 10 + digit_sum += 
get_digit_power(digit, num_digits) + remaining //= 10 + + # Check if narcissistic + if digit_sum == number: + narcissistic_nums.append(number) + + return narcissistic_nums + + +if __name__ == "__main__": + import doctest + + doctest.testmod() + + # Demonstrate the dynamic programming approach + print("Finding all narcissistic numbers up to 10000:") + print("(Using memoization to cache digit power calculations)") + print() + + narcissistic_numbers = find_narcissistic_numbers(10000) + print(f"Found {len(narcissistic_numbers)} narcissistic numbers:") + print(narcissistic_numbers) diff --git a/dynamic_programming/word_break.py b/dynamic_programming/word_break.py index 4d7ac869080c..c4ba2d7aa976 100644 --- a/dynamic_programming/word_break.py +++ b/dynamic_programming/word_break.py @@ -90,7 +90,7 @@ def is_breakable(index: int) -> bool: if index == len_string: return True - trie_node = trie + trie_node: Any = trie for i in range(index, len_string): trie_node = trie_node.get(string[i], None) diff --git a/geodesy/lamberts_ellipsoidal_distance.py b/geodesy/lamberts_ellipsoidal_distance.py index 4805674e51ab..a5c43c5656e9 100644 --- a/geodesy/lamberts_ellipsoidal_distance.py +++ b/geodesy/lamberts_ellipsoidal_distance.py @@ -32,6 +32,26 @@ def lamberts_ellipsoidal_distance( Returns: geographical distance between two points in metres + >>> lamberts_ellipsoidal_distance(100, 0, 0, 0) + Traceback (most recent call last): + ... + ValueError: Latitude must be between -90 and 90 degrees + + >>> lamberts_ellipsoidal_distance(0, 0, -100, 0) + Traceback (most recent call last): + ... + ValueError: Latitude must be between -90 and 90 degrees + + >>> lamberts_ellipsoidal_distance(0, 200, 0, 0) + Traceback (most recent call last): + ... + ValueError: Longitude must be between -180 and 180 degrees + + >>> lamberts_ellipsoidal_distance(0, 0, 0, -200) + Traceback (most recent call last): + ... 
+ ValueError: Longitude must be between -180 and 180 degrees + >>> from collections import namedtuple >>> point_2d = namedtuple("point_2d", "lat lon") >>> SAN_FRANCISCO = point_2d(37.774856, -122.424227) @@ -46,6 +66,14 @@ def lamberts_ellipsoidal_distance( '9,737,326 meters' """ + # Validate latitude values + if not -90 <= lat1 <= 90 or not -90 <= lat2 <= 90: + raise ValueError("Latitude must be between -90 and 90 degrees") + + # Validate longitude values + if not -180 <= lon1 <= 180 or not -180 <= lon2 <= 180: + raise ValueError("Longitude must be between -180 and 180 degrees") + # CONSTANTS per WGS84 https://en.wikipedia.org/wiki/World_Geodetic_System # Distance in metres(m) # Equation Parameters diff --git a/geometry/graham_scan.py b/geometry/graham_scan.py new file mode 100644 index 000000000000..a48391dfbc5d --- /dev/null +++ b/geometry/graham_scan.py @@ -0,0 +1,246 @@ +""" +Graham Scan algorithm for finding the convex hull of a set of points. + +The Graham scan is a method of computing the convex hull of a finite set of points +in the plane with time complexity O(n log n). It is named after Ronald Graham, who +published the original algorithm in 1972. + +The algorithm finds all vertices of the convex hull ordered along its boundary. +It uses a stack to efficiently identify and remove points that would create +non-convex angles. + +References: +- https://en.wikipedia.org/wiki/Graham_scan +- Graham, R.L. (1972). "An Efficient Algorithm for Determining the Convex Hull of a + Finite Planar Set" +""" + +from __future__ import annotations + +from collections.abc import Sequence +from dataclasses import dataclass +from typing import TypeVar + +T = TypeVar("T", bound="Point") + + +@dataclass +class Point: + """ + A point in 2D space. + + >>> Point(0, 0) + Point(x=0.0, y=0.0) + >>> Point(1.5, 2.5) + Point(x=1.5, y=2.5) + """ + + x: float + y: float + + def __init__(self, x_coordinate: float, y_coordinate: float) -> None: + """ + Initialize a 2D point. 
+ + Args: + x_coordinate: The x-coordinate (horizontal position) of the point + y_coordinate: The y-coordinate (vertical position) of the point + """ + self.x = float(x_coordinate) + self.y = float(y_coordinate) + + def __eq__(self, other: object) -> bool: + """ + Check if two points are equal. + + >>> Point(1, 2) == Point(1, 2) + True + >>> Point(1, 2) == Point(2, 1) + False + """ + if not isinstance(other, Point): + return NotImplemented + return self.x == other.x and self.y == other.y + + def __lt__(self, other: Point) -> bool: + """ + Compare two points for sorting (bottom-most, then left-most). + + >>> Point(1, 2) < Point(1, 3) + True + >>> Point(1, 2) < Point(2, 2) + True + >>> Point(2, 2) < Point(1, 2) + False + """ + if self.y == other.y: + return self.x < other.x + return self.y < other.y + + def euclidean_distance(self, other: Point) -> float: + """ + Calculate Euclidean distance between two points. + + >>> Point(0, 0).euclidean_distance(Point(3, 4)) + 5.0 + >>> Point(1, 1).euclidean_distance(Point(4, 5)) + 5.0 + """ + return ((self.x - other.x) ** 2 + (self.y - other.y) ** 2) ** 0.5 + + def consecutive_orientation(self, point_a: Point, point_b: Point) -> float: + """ + Calculate the cross product of vectors (self -> point_a) and + (point_a -> point_b). + + Returns: + - Positive value: counter-clockwise turn + - Negative value: clockwise turn + - Zero: collinear points + + >>> Point(0, 0).consecutive_orientation(Point(1, 0), Point(1, 1)) + 1.0 + >>> Point(0, 0).consecutive_orientation(Point(1, 0), Point(1, -1)) + -1.0 + >>> Point(0, 0).consecutive_orientation(Point(1, 0), Point(2, 0)) + 0.0 + """ + return (point_a.x - self.x) * (point_b.y - point_a.y) - (point_a.y - self.y) * ( + point_b.x - point_a.x + ) + + +def graham_scan(points: Sequence[Point]) -> list[Point]: + """ + Find the convex hull of a set of points using the Graham scan algorithm. + + The algorithm works as follows: + 1. Find the bottom-most point (or left-most in case of tie) + 2. 
Sort all other points by polar angle with respect to the bottom-most point + 3. Process points in order, maintaining a stack of hull candidates + 4. Remove points that would create a clockwise turn + + Args: + points: A sequence of Point objects + + Returns: + A list of Point objects representing the convex hull in counter-clockwise order. + Returns an empty list if there are fewer than 3 distinct points or if all + points are collinear. + + Time Complexity: O(n log n) due to sorting + Space Complexity: O(n) for the output hull + + >>> graham_scan([]) + [] + >>> graham_scan([Point(0, 0)]) + [] + >>> graham_scan([Point(0, 0), Point(1, 1)]) + [] + >>> hull = graham_scan([Point(0, 0), Point(1, 0), Point(0.5, 1)]) + >>> len(hull) + 3 + >>> Point(0, 0) in hull and Point(1, 0) in hull and Point(0.5, 1) in hull + True + """ + if len(points) <= 2: + return [] + + # Find the bottom-most point (left-most in case of tie) + min_point = min(points) + + # Remove the min_point from the list + points_list = [p for p in points if p != min_point] + if not points_list: + # Edge case where all points are the same + return [] + + def polar_angle_key(point: Point) -> tuple[float, float, float]: + """ + Key function for sorting points by polar angle relative to min_point. + + Points are sorted counter-clockwise. When two points have the same angle, + the farther point comes first (we'll remove duplicates later). 
+ """ + # We use a dummy third point (min_point itself) to calculate relative angles + # Instead, we'll compute the angle between points + dx = point.x - min_point.x + dy = point.y - min_point.y + + # Use atan2 for angle, but we can also use cross product for comparison + # For sorting, we compare orientations between consecutive points + distance = min_point.euclidean_distance(point) + return (dx, dy, -distance) # Negative distance to sort farther points first + + # Sort by polar angle using a comparison based on cross product + def compare_points(point_a: Point, point_b: Point) -> int: + """Compare two points by polar angle relative to min_point.""" + orientation = min_point.consecutive_orientation(point_a, point_b) + if orientation < 0.0: + return 1 # point_a comes after point_b (clockwise) + elif orientation > 0.0: + return -1 # point_a comes before point_b (counter-clockwise) + else: + # Collinear: farther point should come first + dist_a = min_point.euclidean_distance(point_a) + dist_b = min_point.euclidean_distance(point_b) + if dist_b < dist_a: + return -1 + elif dist_b > dist_a: + return 1 + else: + return 0 + + from functools import cmp_to_key + + points_list.sort(key=cmp_to_key(compare_points)) + + # Build the convex hull + convex_hull: list[Point] = [min_point, points_list[0]] + + for point in points_list[1:]: + # Skip consecutive points with the same angle (collinear with min_point) + if min_point.consecutive_orientation(point, convex_hull[-1]) == 0.0: + continue + + # Remove points that create a clockwise turn (or are collinear) + while len(convex_hull) >= 2: + orientation = convex_hull[-2].consecutive_orientation( + convex_hull[-1], point + ) + if orientation <= 0.0: + convex_hull.pop() + else: + break + + convex_hull.append(point) + + # Need at least 3 points for a valid convex hull + if len(convex_hull) <= 2: + return [] + + return convex_hull + + +if __name__ == "__main__": + import doctest + + doctest.testmod() + + # Example usage + points = [ + 
Point(0, 0), + Point(1, 0), + Point(2, 0), + Point(2, 1), + Point(2, 2), + Point(1, 2), + Point(0, 2), + Point(0, 1), + Point(1, 1), # Interior point + ] + + hull = graham_scan(points) + print("Convex hull vertices:") + for point in hull: + print(f" ({point.x}, {point.y})") diff --git a/geometry/jarvis_march.py b/geometry/jarvis_march.py new file mode 100644 index 000000000000..55a0872ff60e --- /dev/null +++ b/geometry/jarvis_march.py @@ -0,0 +1,187 @@ +""" +Jarvis March (Gift Wrapping) algorithm for finding the convex hull of a set of points. + +The convex hull is the smallest convex polygon that contains all the points. + +Time Complexity: O(n*h) where n is the number of points and h is the number of +hull points. +Space Complexity: O(h) where h is the number of hull points. + +USAGE: + -> Import this file into your project. + -> Use the jarvis_march() function to find the convex hull of a set of points. + -> Parameters: + -> points: A list of Point objects representing 2D coordinates + +REFERENCES: + -> Wikipedia reference: https://en.wikipedia.org/wiki/Gift_wrapping_algorithm + -> GeeksforGeeks: + https://www.geeksforgeeks.org/convex-hull-set-1-jarviss-algorithm-or-wrapping/ +""" + +from __future__ import annotations + + +class Point: + """Represents a 2D point with x and y coordinates.""" + + def __init__(self, x_coordinate: float, y_coordinate: float) -> None: + self.x = x_coordinate + self.y = y_coordinate + + def __eq__(self, other: object) -> bool: + if not isinstance(other, Point): + return NotImplemented + return self.x == other.x and self.y == other.y + + def __repr__(self) -> str: + return f"Point({self.x}, {self.y})" + + def __hash__(self) -> int: + return hash((self.x, self.y)) + + +def _cross_product(origin: Point, point_a: Point, point_b: Point) -> float: + """ + Calculate the cross product of vectors OA and OB. 
+ + Returns: + > 0: Counter-clockwise turn (left turn) + = 0: Collinear + < 0: Clockwise turn (right turn) + """ + return (point_a.x - origin.x) * (point_b.y - origin.y) - (point_a.y - origin.y) * ( + point_b.x - origin.x + ) + + +def _is_point_on_segment(p1: Point, p2: Point, point: Point) -> bool: + """Check if a point lies on the line segment between p1 and p2.""" + # Check if point is collinear with segment endpoints + cross = (point.y - p1.y) * (p2.x - p1.x) - (point.x - p1.x) * (p2.y - p1.y) + + if abs(cross) > 1e-9: + return False + + # Check if point is within the bounding box of the segment + return min(p1.x, p2.x) <= point.x <= max(p1.x, p2.x) and min( + p1.y, p2.y + ) <= point.y <= max(p1.y, p2.y) + + +def _find_leftmost_point(points: list[Point]) -> int: + """Find index of leftmost point (and bottom-most in case of tie).""" + left_idx = 0 + for i in range(1, len(points)): + if points[i].x < points[left_idx].x or ( + points[i].x == points[left_idx].x and points[i].y < points[left_idx].y + ): + left_idx = i + return left_idx + + +def _find_next_hull_point(points: list[Point], current_idx: int) -> int: + """Find the next point on the convex hull.""" + next_idx = (current_idx + 1) % len(points) + # Ensure next_idx is not the same as current_idx + while next_idx == current_idx: + next_idx = (next_idx + 1) % len(points) + + for i in range(len(points)): + if i == current_idx: + continue + cross = _cross_product(points[current_idx], points[i], points[next_idx]) + if cross > 0: + next_idx = i + + return next_idx + + +def _is_valid_polygon(hull: list[Point]) -> bool: + """Check if hull forms a valid polygon (has at least one non-collinear turn).""" + for i in range(len(hull)): + p1 = hull[i] + p2 = hull[(i + 1) % len(hull)] + p3 = hull[(i + 2) % len(hull)] + if abs(_cross_product(p1, p2, p3)) > 1e-9: + return True + return False + + +def _add_point_to_hull(hull: list[Point], point: Point) -> None: + """Add a point to hull, removing collinear intermediate 
points.""" + last = len(hull) - 1 + if len(hull) > 1 and _is_point_on_segment(hull[last - 1], hull[last], point): + hull[last] = Point(point.x, point.y) + else: + hull.append(Point(point.x, point.y)) + + +def jarvis_march(points: list[Point]) -> list[Point]: + """ + Find the convex hull of a set of points using the Jarvis March algorithm. + + The algorithm starts with the leftmost point and wraps around the set of + points, selecting the most counter-clockwise point at each step. + + Args: + points: List of Point objects representing 2D coordinates + + Returns: + List of Points that form the convex hull in counter-clockwise order. + Returns empty list if there are fewer than 3 non-collinear points. + """ + if len(points) <= 2: + return [] + + # Remove duplicate points to avoid infinite loops + unique_points = list(set(points)) + + if len(unique_points) <= 2: + return [] + + convex_hull: list[Point] = [] + + # Find the leftmost point + left_point_idx = _find_leftmost_point(unique_points) + convex_hull.append( + Point(unique_points[left_point_idx].x, unique_points[left_point_idx].y) + ) + + current_idx = left_point_idx + while True: + # Find the next counter-clockwise point + next_idx = _find_next_hull_point(unique_points, current_idx) + + if next_idx == left_point_idx: + break + + if next_idx == current_idx: + break + + current_idx = next_idx + _add_point_to_hull(convex_hull, unique_points[current_idx]) + + # Check for degenerate cases + if len(convex_hull) <= 2: + return [] + + # Check if last point is collinear with first and second-to-last + last = len(convex_hull) - 1 + if _is_point_on_segment(convex_hull[last - 1], convex_hull[last], convex_hull[0]): + convex_hull.pop() + if len(convex_hull) == 2: + return [] + + # Verify the hull forms a valid polygon + if not _is_valid_polygon(convex_hull): + return [] + + return convex_hull + + +if __name__ == "__main__": + # Example usage + points = [Point(0, 0), Point(1, 1), Point(0, 1), Point(1, 0), Point(0.5, 0.5)] + 
hull = jarvis_march(points) + print(f"Convex hull: {hull}") diff --git a/geometry/ramer_douglas_peucker.py b/geometry/ramer_douglas_peucker.py new file mode 100644 index 000000000000..a03bbb2e5086 --- /dev/null +++ b/geometry/ramer_douglas_peucker.py @@ -0,0 +1,184 @@ +""" +Ramer-Douglas-Peucker polyline simplification algorithm. + +Given a sequence of 2-D points and a tolerance epsilon, the algorithm +reduces the number of points while preserving the overall shape of the curve. + +Time complexity: O(n log n) average, O(n²) worst case +Space complexity: O(n) + +References: + https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm +""" + +from __future__ import annotations + +import math + + +def _euclidean_distance( + point_a: tuple[float, float], + point_b: tuple[float, float], +) -> float: + """Return the Euclidean distance between two 2-D points. + + >>> _euclidean_distance((0.0, 0.0), (3.0, 4.0)) + 5.0 + >>> _euclidean_distance((1.0, 1.0), (1.0, 1.0)) + 0.0 + """ + return math.hypot(point_b[0] - point_a[0], point_b[1] - point_a[1]) + + +def _perpendicular_distance( + point: tuple[float, float], + line_start: tuple[float, float], + line_end: tuple[float, float], +) -> float: + """Return the distance from *point* to the line **segment** between + *line_start* and *line_end*. + + When the perpendicular projection of *point* onto the infinite line falls + within the segment, this equals the perpendicular distance to that line. + When the projection falls outside the segment, the distance to the nearest + endpoint is returned instead (projection clamped to [0, 1]). + + This is the correct distance measure for the Ramer-Douglas-Peucker + algorithm: using the infinite-line distance can incorrectly discard points + whose projection lies beyond a segment endpoint. 
+ + >>> _perpendicular_distance((4.0, 0.0), (0.0, 0.0), (0.0, 3.0)) + 4.0 + >>> # order of line_start and line_end does not affect the result + >>> _perpendicular_distance((4.0, 0.0), (0.0, 3.0), (0.0, 0.0)) + 4.0 + >>> _perpendicular_distance((4.0, 1.0), (0.0, 1.0), (0.0, 4.0)) + 4.0 + >>> _perpendicular_distance((2.0, 1.0), (-2.0, 1.0), (-2.0, 4.0)) + 4.0 + >>> # projection falls outside the segment; distance to nearest endpoint + >>> round(_perpendicular_distance((0.0, 2.0), (1.0, 0.0), (3.0, 0.0)), 6) + 2.236068 + """ + px, py = point + ax, ay = line_start + bx, by = line_end + dx, dy = bx - ax, by - ay + seg_len_sq = dx * dx + dy * dy + if seg_len_sq == 0.0: + # line_start and line_end coincide; fall back to point-to-point distance + return _euclidean_distance(point, line_start) + # Project point onto the segment line, then clamp t to [0, 1] so the + # nearest point is always on the segment rather than the infinite line. + t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq)) + nearest_x = ax + t * dx + nearest_y = ay + t * dy + return math.hypot(px - nearest_x, py - nearest_y) + + +def ramer_douglas_peucker( + pts: list[tuple[float, float]], + epsilon: float, +) -> list[tuple[float, float]]: + """Simplify a polyline using the Ramer-Douglas-Peucker algorithm. + + Given a sequence of 2-D points and a maximum allowable deviation + *epsilon* (>= 0), returns a simplified list of points such that no + discarded point is farther than *epsilon* from the simplified polyline. + + Parameters + ---------- + pts: + Ordered sequence of ``(x, y)`` points describing the polyline. + epsilon: + Maximum allowable distance of any discarded point from the + simplified polyline. Must be non-negative. + + Returns + ------- + list[tuple[float, float]] + Simplified list of ``(x, y)`` points. The first and last points of + *pts* are always preserved. + + Raises + ------ + ValueError + If *epsilon* is negative. 
+ + References + ---------- + https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm + + Examples + -------- + >>> ramer_douglas_peucker([], epsilon=1.0) + [] + >>> ramer_douglas_peucker([(0.0, 0.0)], epsilon=1.0) + [(0.0, 0.0)] + >>> ramer_douglas_peucker([(0.0, 0.0), (1.0, 0.0)], epsilon=1.0) + [(0.0, 0.0), (1.0, 0.0)] + >>> # middle point is within epsilon - it is discarded + >>> ramer_douglas_peucker([(0.0, 0.0), (1.0, 0.1), (2.0, 0.0)], epsilon=0.5) + [(0.0, 0.0), (2.0, 0.0)] + >>> # middle point exceeds epsilon - it is kept + >>> ramer_douglas_peucker([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)], epsilon=0.5) + [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)] + >>> ramer_douglas_peucker([(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)], epsilon=-1.0) + Traceback (most recent call last): + ... + ValueError: epsilon must be non-negative, got -1.0 + """ + if epsilon < 0: + msg = f"epsilon must be non-negative, got {epsilon!r}" + raise ValueError(msg) + + if len(pts) < 3: + return list(pts) + + # --------------------------------------------------------------------------- + # Iterative, stack-based implementation. + # + # The naive recursive approach copies sublists at every level via slicing + # (pts[:max_index+1] / pts[max_index:]), which is O(n) per call: O(n log n) + # total copying for well-balanced splits and O(n²) in the worst case. An + # explicit stack operating on index ranges avoids all copying and also + # eliminates the risk of hitting Python's recursion limit for long polylines. + # --------------------------------------------------------------------------- + n = len(pts) + + # keep[i] is True when pts[i] must appear in the output. + keep: list[bool] = [False] * n + keep[0] = True + keep[-1] = True + + # Stack of (start_index, end_index) pairs still to be examined. + stack: list[tuple[int, int]] = [(0, n - 1)] + + while stack: + start, end = stack.pop() + if end - start < 2: + # No interior points between start and end; nothing to split further.
+ continue + + # Find the interior point with the greatest distance to the segment. + max_dist = 0.0 + max_index = start + for i in range(start + 1, end): + dist = _perpendicular_distance(pts[i], pts[start], pts[end]) + if dist > max_dist: + max_dist = dist + max_index = i + + if max_dist > epsilon: + keep[max_index] = True + stack.append((start, max_index)) + stack.append((max_index, end)) + # else: all interior points are within epsilon; discard them all. + + return [pts[i] for i in range(n) if keep[i]] + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/geometry/segment_intersection.py b/geometry/segment_intersection.py new file mode 100644 index 000000000000..e2e2e10f1e4d --- /dev/null +++ b/geometry/segment_intersection.py @@ -0,0 +1,112 @@ +""" +Given two line segments, determine whether they intersect. + +This is based on the algorithm described in Introduction to Algorithms +(CLRS), Chapter 33. + +Reference: + - https://en.wikipedia.org/wiki/Line%E2%80%93line_intersection + - https://en.wikipedia.org/wiki/Orientation_(geometry) +""" + +from __future__ import annotations + +from typing import NamedTuple + + +class Point(NamedTuple): + """A point in 2D space. + + >>> Point(0, 0) + Point(x=0, y=0) + >>> Point(1, -3) + Point(x=1, y=-3) + """ + + x: float + y: float + + +def direction(pivot: Point, target: Point, query: Point) -> float: + """Return the cross product of vectors (pivot->query) and (pivot->target). 
+ + The sign of the result encodes the orientation of the ordered triple + (pivot, target, query): + - Negative -> counter-clockwise (left turn) + - Positive -> clockwise (right turn) + - Zero -> collinear + + >>> direction(Point(0, 0), Point(1, 0), Point(0, 1)) + -1 + >>> direction(Point(0, 0), Point(0, 1), Point(1, 0)) + 1 + >>> direction(Point(0, 0), Point(1, 1), Point(2, 2)) + 0 + """ + return (query.x - pivot.x) * (target.y - pivot.y) - (target.x - pivot.x) * ( + query.y - pivot.y + ) + + +def on_segment(seg_start: Point, seg_end: Point, point: Point) -> bool: + """Check whether *point*, known to be collinear with the segment, lies on it. + + >>> on_segment(Point(0, 0), Point(4, 4), Point(2, 2)) + True + >>> on_segment(Point(0, 0), Point(4, 4), Point(5, 5)) + False + >>> on_segment(Point(0, 0), Point(4, 0), Point(2, 0)) + True + """ + return min(seg_start.x, seg_end.x) <= point.x <= max( + seg_start.x, seg_end.x + ) and min(seg_start.y, seg_end.y) <= point.y <= max(seg_start.y, seg_end.y) + + +def segments_intersect(p1: Point, p2: Point, p3: Point, p4: Point) -> bool: + """Return True if line segment p1p2 intersects line segment p3p4. + + Uses the CLRS cross-product / orientation method. Handles both the + general case (proper crossing) and degenerate cases where one endpoint + lies exactly on the other segment. 
+ + >>> segments_intersect(Point(0, 0), Point(2, 2), Point(0, 2), Point(2, 0)) + True + >>> segments_intersect(Point(0, 0), Point(2, 2), Point(1, 1), Point(3, 3)) + True + >>> segments_intersect(Point(0, 0), Point(1, 0), Point(2, 0), Point(3, 0)) + False + >>> segments_intersect(Point(0, 0), Point(1, 1), Point(1, 0), Point(2, 1)) + False + >>> segments_intersect(Point(0, 0), Point(1, 1), Point(0, 1), Point(0, 2)) + False + >>> segments_intersect(Point(0, 0), Point(1, 0), Point(1, 0), Point(2, 0)) + True + """ + d1 = direction(p3, p4, p1) + d2 = direction(p3, p4, p2) + d3 = direction(p1, p2, p3) + d4 = direction(p1, p2, p4) + + if ((d1 < 0 < d2) or (d2 < 0 < d1)) and ((d3 < 0 < d4) or (d4 < 0 < d3)): + return True + + if d1 == 0 and on_segment(p3, p4, p1): + return True + if d2 == 0 and on_segment(p3, p4, p2): + return True + if d3 == 0 and on_segment(p1, p2, p3): + return True + return d4 == 0 and on_segment(p1, p2, p4) + + +if __name__ == "__main__": + import doctest + + doctest.testmod() + + print("Enter four points as 'x y' pairs (one per line):") + points = [Point(*map(float, input().split())) for _ in range(4)] + p1, p2, p3, p4 = points + result = segments_intersect(p1, p2, p3, p4) + print(1 if result else 0) diff --git a/geometry/tests/__init__.py b/geometry/tests/__init__.py new file mode 100644 index 000000000000..e69de29bb2d1 diff --git a/geometry/tests/test_graham_scan.py b/geometry/tests/test_graham_scan.py new file mode 100644 index 000000000000..d9a573289ce9 --- /dev/null +++ b/geometry/tests/test_graham_scan.py @@ -0,0 +1,266 @@ +""" +Tests for the Graham scan convex hull algorithm. 
+""" + +from geometry.graham_scan import Point, graham_scan + + +def test_empty_points() -> None: + """Test with no points.""" + assert graham_scan([]) == [] + + +def test_single_point() -> None: + """Test with a single point.""" + assert graham_scan([Point(0, 0)]) == [] + + +def test_two_points() -> None: + """Test with two points.""" + assert graham_scan([Point(0, 0), Point(1, 1)]) == [] + + +def test_duplicate_points() -> None: + """Test with all duplicate points.""" + p = Point(0, 0) + points = [p, Point(0, 0), Point(0, 0), Point(0, 0), Point(0, 0)] + assert graham_scan(points) == [] + + +def test_collinear_points() -> None: + """Test with all points on the same line.""" + points = [ + Point(1, 0), + Point(2, 0), + Point(3, 0), + Point(4, 0), + Point(5, 0), + ] + assert graham_scan(points) == [] + + +def test_triangle() -> None: + """Test with a triangle (3 points).""" + p1 = Point(1, 1) + p2 = Point(2, 1) + p3 = Point(1.5, 2) + points = [p1, p2, p3] + hull = graham_scan(points) + + assert len(hull) == 3 + assert p1 in hull + assert p2 in hull + assert p3 in hull + + +def test_rectangle() -> None: + """Test with a rectangle (4 points).""" + p1 = Point(1, 1) + p2 = Point(2, 1) + p3 = Point(2, 2) + p4 = Point(1, 2) + points = [p1, p2, p3, p4] + hull = graham_scan(points) + + assert len(hull) == 4 + assert all(p in hull for p in points) + + +def test_triangle_with_interior_points() -> None: + """Test triangle with points inside.""" + p1 = Point(1, 1) + p2 = Point(2, 1) + p3 = Point(1.5, 2) + p4 = Point(1.5, 1.5) # Interior + p5 = Point(1.2, 1.3) # Interior + p6 = Point(1.8, 1.2) # Interior + p7 = Point(1.5, 1.9) # Interior + + hull_points = [p1, p2, p3] + interior_points = [p4, p5, p6, p7] + all_points = hull_points + interior_points + + hull = graham_scan(all_points) + + # All hull points should be in the result + for p in hull_points: + assert p in hull + + # No interior points should be in the result + for p in interior_points: + assert p not in hull + + +def 
test_rectangle_with_interior_points() -> None: + """Test rectangle with points inside.""" + p1 = Point(1, 1) + p2 = Point(2, 1) + p3 = Point(2, 2) + p4 = Point(1, 2) + p5 = Point(1.5, 1.5) # Interior + p6 = Point(1.2, 1.3) # Interior + p7 = Point(1.8, 1.2) # Interior + p8 = Point(1.9, 1.7) # Interior + p9 = Point(1.4, 1.9) # Interior + + hull_points = [p1, p2, p3, p4] + interior_points = [p5, p6, p7, p8, p9] + all_points = hull_points + interior_points + + hull = graham_scan(all_points) + + # All hull points should be in the result + for p in hull_points: + assert p in hull + + # No interior points should be in the result + for p in interior_points: + assert p not in hull + + +def test_star_shape() -> None: + """Test with a star shape where only tips are on the convex hull.""" + # Tips of the star (on convex hull) + p1 = Point(-5, 6) + p2 = Point(-11, 0) + p3 = Point(-9, -8) + p4 = Point(4, 4) + p5 = Point(6, -7) + + # Interior points (not on convex hull) + p6 = Point(-7, -2) + p7 = Point(-2, -4) + p8 = Point(0, 1) + p9 = Point(1, 0) + p10 = Point(-6, 1) + + hull_points = [p1, p2, p3, p4, p5] + interior_points = [p6, p7, p8, p9, p10] + all_points = hull_points + interior_points + + hull = graham_scan(all_points) + + # All hull points should be in the result + for p in hull_points: + assert p in hull + + # No interior points should be in the result + for p in interior_points: + assert p not in hull + + +def test_rectangle_with_collinear_points() -> None: + """Test rectangle with points on the edges (collinear with vertices).""" + p1 = Point(1, 1) + p2 = Point(2, 1) + p3 = Point(2, 2) + p4 = Point(1, 2) + p5 = Point(1.5, 1) # On edge p1-p2 + p6 = Point(1, 1.5) # On edge p1-p4 + p7 = Point(2, 1.5) # On edge p2-p3 + p8 = Point(1.5, 2) # On edge p3-p4 + + hull_points = [p1, p2, p3, p4] + edge_points = [p5, p6, p7, p8] + all_points = hull_points + edge_points + + hull = graham_scan(all_points) + + # All corner points should be in the result + for p in hull_points: + 
assert p in hull + + # Edge points should not be in the result (only corners) + for p in edge_points: + assert p not in hull + + +def test_point_equality() -> None: + """Test Point equality.""" + p1 = Point(1, 2) + p2 = Point(1, 2) + p3 = Point(2, 1) + + assert p1 == p2 + assert p1 != p3 + + +def test_point_comparison() -> None: + """Test Point comparison for sorting.""" + p1 = Point(1, 2) + p2 = Point(1, 3) + p3 = Point(2, 2) + + assert p1 < p2 # Lower y value + assert p1 < p3 # Same y, lower x + assert not p2 < p1 + + +def test_euclidean_distance() -> None: + """Test Euclidean distance calculation.""" + p1 = Point(0, 0) + p2 = Point(3, 4) + + assert p1.euclidean_distance(p2) == 5.0 + + +def test_consecutive_orientation() -> None: + """Test orientation calculation.""" + p1 = Point(0, 0) + p2 = Point(1, 0) + p3_ccw = Point(1, 1) # Counter-clockwise + p3_cw = Point(1, -1) # Clockwise + p3_collinear = Point(2, 0) # Collinear + + assert p1.consecutive_orientation(p2, p3_ccw) > 0 # Counter-clockwise + assert p1.consecutive_orientation(p2, p3_cw) < 0 # Clockwise + assert p1.consecutive_orientation(p2, p3_collinear) == 0 # Collinear + + +def test_large_hull() -> None: + """Test with a larger set of points.""" + # Create a circle of points + import math + + points = [] + for i in range(20): + angle = 2 * math.pi * i / 20 + x = math.cos(angle) + y = math.sin(angle) + points.append(Point(x, y)) + + # Add some interior points + points.append(Point(0, 0)) + points.append(Point(0.5, 0.5)) + points.append(Point(-0.3, 0.2)) + + hull = graham_scan(points) + + # The hull should contain the circle points but not the interior points + assert len(hull) >= 3 + assert Point(0, 0) not in hull + assert Point(0.5, 0.5) not in hull + assert Point(-0.3, 0.2) not in hull + + +def test_random_order() -> None: + """Test that point order doesn't affect the result.""" + p1 = Point(0, 0) + p2 = Point(4, 0) + p3 = Point(4, 3) + p4 = Point(0, 3) + p5 = Point(2, 1.5) # Interior + + # Try different 
orderings + order1 = [p1, p2, p3, p4, p5] + order2 = [p5, p4, p3, p2, p1] + order3 = [p3, p5, p1, p4, p2] + + hull1 = graham_scan(order1) + hull2 = graham_scan(order2) + hull3 = graham_scan(order3) + + # All should have the same points (though possibly in different order) + assert len(hull1) == len(hull2) == len(hull3) == 4 + assert {(p.x, p.y) for p in hull1} == {(p.x, p.y) for p in hull2} + assert {(p.x, p.y) for p in hull2} == {(p.x, p.y) for p in hull3} diff --git a/geometry/tests/test_jarvis_march.py b/geometry/tests/test_jarvis_march.py new file mode 100644 index 000000000000..6e7defe414a3 --- /dev/null +++ b/geometry/tests/test_jarvis_march.py @@ -0,0 +1,115 @@ +""" +Unit tests for Jarvis March (Gift Wrapping) algorithm. +""" + +from geometry.jarvis_march import Point, jarvis_march + + +class TestPoint: + """Tests for the Point class.""" + + def test_point_creation(self) -> None: + """Test Point initialization.""" + p = Point(1.0, 2.0) + assert p.x == 1.0 + assert p.y == 2.0 + + def test_point_equality(self) -> None: + """Test Point equality comparison.""" + p1 = Point(1.0, 2.0) + p2 = Point(1.0, 2.0) + p3 = Point(2.0, 1.0) + assert p1 == p2 + assert p1 != p3 + + def test_point_repr(self) -> None: + """Test Point string representation.""" + p = Point(1.5, 2.5) + assert repr(p) == "Point(1.5, 2.5)" + + def test_point_hash(self) -> None: + """Test Point hashing.""" + p1 = Point(1.0, 2.0) + p2 = Point(1.0, 2.0) + assert hash(p1) == hash(p2) + + +class TestJarvisMarch: + """Tests for the jarvis_march function.""" + + def test_triangle(self) -> None: + """Test convex hull of a triangle.""" + p1, p2, p3 = Point(1, 1), Point(2, 1), Point(1.5, 2) + hull = jarvis_march([p1, p2, p3]) + assert len(hull) == 3 + assert all(p in hull for p in [p1, p2, p3]) + + def test_collinear_points(self) -> None: + """Test that collinear points return empty hull.""" + points = [Point(i, 0) for i in range(5)] + hull = jarvis_march(points) + assert hull == [] + + def 
test_rectangle_with_interior_point(self) -> None: + """Test rectangle with interior point - interior point excluded.""" + p1, p2 = Point(1, 1), Point(2, 1) + p3, p4 = Point(2, 2), Point(1, 2) + p5 = Point(1.5, 1.5) + hull = jarvis_march([p1, p2, p3, p4, p5]) + assert len(hull) == 4 + assert p5 not in hull + + def test_star_shape(self) -> None: + """Test star shape - only tips are in hull.""" + tips = [ + Point(-5, 6), + Point(-11, 0), + Point(-9, -8), + Point(4, 4), + Point(6, -7), + ] + interior = [Point(-7, -2), Point(-2, -4), Point(0, 1)] + hull = jarvis_march(tips + interior) + assert len(hull) == 5 + assert all(p in hull for p in tips) + assert not any(p in hull for p in interior) + + def test_empty_list(self) -> None: + """Test empty list returns empty hull.""" + assert jarvis_march([]) == [] + + def test_single_point(self) -> None: + """Test single point returns empty hull.""" + assert jarvis_march([Point(0, 0)]) == [] + + def test_two_points(self) -> None: + """Test two points return empty hull.""" + assert jarvis_march([Point(0, 0), Point(1, 1)]) == [] + + def test_square(self) -> None: + """Test convex hull of a square.""" + p1, p2 = Point(0, 0), Point(1, 0) + p3, p4 = Point(1, 1), Point(0, 1) + hull = jarvis_march([p1, p2, p3, p4]) + assert len(hull) == 4 + assert all(p in hull for p in [p1, p2, p3, p4]) + + def test_duplicate_points(self) -> None: + """Test handling of duplicate points.""" + p1, p2, p3 = Point(0, 0), Point(1, 0), Point(0, 1) + points = [p1, p2, p3, p1, p2] # Include duplicates + hull = jarvis_march(points) + assert len(hull) == 3 + + def test_pentagon(self) -> None: + """Test convex hull of a pentagon.""" + points = [ + Point(0, 1), + Point(1, 2), + Point(2, 1), + Point(1.5, 0), + Point(0.5, 0), + ] + hull = jarvis_march(points) + assert len(hull) == 5 + assert all(p in hull for p in points) diff --git a/graphs/breadth_first_search_shortest_path_2.py b/graphs/breadth_first_search_shortest_path_2.py index 4f9b6e65bdf3..efba9b7b6ae6 
100644 --- a/graphs/breadth_first_search_shortest_path_2.py +++ b/graphs/breadth_first_search_shortest_path_2.py @@ -1,10 +1,12 @@ -"""Breadth-first search shortest path implementations. +"""Breadth-first search the shortest path implementations. doctest: -python -m doctest -v bfs_shortest_path.py +python -m doctest -v breadth_first_search_shortest_path_2.py Manual test: -python bfs_shortest_path.py +python breadth_first_search_shortest_path_2.py """ +from collections import deque + demo_graph = { "A": ["B", "C", "E"], "B": ["A", "D", "E"], @@ -17,7 +19,7 @@ def bfs_shortest_path(graph: dict, start, goal) -> list[str]: - """Find shortest path between `start` and `goal` nodes. + """Find the shortest path between `start` and `goal` nodes. Args: graph (dict): node/list of neighboring nodes key/value pairs. start: start node. @@ -36,7 +38,7 @@ def bfs_shortest_path(graph: dict, start, goal) -> list[str]: # keep track of explored nodes explored = set() # keep track of all the paths to be checked - queue = [[start]] + queue = deque([[start]]) # return path if start is goal if start == goal: @@ -45,7 +47,7 @@ def bfs_shortest_path(graph: dict, start, goal) -> list[str]: # keeps looping until all possible paths have been checked while queue: # pop the first path from the queue - path = queue.pop(0) + path = queue.popleft() # get the last node from the path node = path[-1] if node not in explored: @@ -68,13 +70,13 @@ def bfs_shortest_path(graph: dict, start, goal) -> list[str]: def bfs_shortest_path_distance(graph: dict, start, target) -> int: - """Find shortest path distance between `start` and `target` nodes. + """Find the shortest path distance between `start` and `target` nodes. Args: graph: node/list of neighboring nodes key/value pairs. start: node to start search from. target: node to search for. Returns: - Number of edges in shortest path between `start` and `target` nodes. + Number of edges in the shortest path between `start` and `target` nodes. 
-1 if no path exists. Example: >>> bfs_shortest_path_distance(demo_graph, "G", "D") @@ -88,12 +90,12 @@ def bfs_shortest_path_distance(graph: dict, start, target) -> int: return -1 if start == target: return 0 - queue = [start] + queue = deque([start]) visited = set(start) # Keep tab on distances from `start` node. dist = {start: 0, target: -1} while queue: - node = queue.pop(0) + node = queue.popleft() if node == target: dist[target] = ( dist[node] if dist[target] == -1 else min(dist[target], dist[node]) diff --git a/graphs/check_bipatrite.py b/graphs/check_bipatrite.py index 213f3f9480b5..897c78850d58 100644 --- a/graphs/check_bipatrite.py +++ b/graphs/check_bipatrite.py @@ -1,7 +1,7 @@ from collections import defaultdict, deque -def is_bipartite_dfs(graph: defaultdict[int, list[int]]) -> bool: +def is_bipartite_dfs(graph: dict[int, list[int]]) -> bool: """ Check if a graph is bipartite using depth-first search (DFS). @@ -16,12 +16,9 @@ def is_bipartite_dfs(graph: defaultdict[int, list[int]]) -> bool: Examples: - >>> # FIXME: This test should pass. - >>> is_bipartite_dfs(defaultdict(list, {0: [1, 2], 1: [0, 3], 2: [0, 4]})) - Traceback (most recent call last): - ... - RuntimeError: dictionary changed size during iteration - >>> is_bipartite_dfs(defaultdict(list, {0: [1, 2], 1: [0, 3], 2: [0, 1]})) + >>> is_bipartite_dfs({0: [1, 2], 1: [0, 3], 2: [0, 4]}) + True + >>> is_bipartite_dfs({0: [1, 2], 1: [0, 3], 2: [0, 1]}) False >>> is_bipartite_dfs({}) True @@ -34,36 +31,26 @@ def is_bipartite_dfs(graph: defaultdict[int, list[int]]) -> bool: >>> is_bipartite_dfs({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2], 4: [0]}) False >>> is_bipartite_dfs({7: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2], 4: [0]}) - Traceback (most recent call last): - ... - KeyError: 0 + False >>> # FIXME: This test should fails with KeyError: 4. 
>>> is_bipartite_dfs({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2], 9: [0]}) False >>> is_bipartite_dfs({0: [-1, 3], 1: [0, -2]}) - Traceback (most recent call last): - ... - KeyError: -1 + False >>> is_bipartite_dfs({-1: [0, 2], 0: [-1, 1], 1: [0, 2], 2: [-1, 1]}) True >>> is_bipartite_dfs({0.9: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}) - Traceback (most recent call last): - ... - KeyError: 0 + True >>> # FIXME: This test should fails with >>> # TypeError: list indices must be integers or... >>> is_bipartite_dfs({0: [1.0, 3.0], 1.0: [0, 2.0], 2.0: [1.0, 3.0], 3.0: [0, 2.0]}) True >>> is_bipartite_dfs({"a": [1, 3], "b": [0, 2], "c": [1, 3], "d": [0, 2]}) - Traceback (most recent call last): - ... - KeyError: 1 + True >>> is_bipartite_dfs({0: ["b", "d"], 1: ["a", "c"], 2: ["b", "d"], 3: ["a", "c"]}) - Traceback (most recent call last): - ... - KeyError: 'b' + True """ def depth_first_search(node: int, color: int) -> bool: @@ -80,6 +67,8 @@ def depth_first_search(node: int, color: int) -> bool: """ if visited[node] == -1: visited[node] = color + if node not in graph: + return True for neighbor in graph[node]: if not depth_first_search(neighbor, 1 - color): return False @@ -92,7 +81,7 @@ def depth_first_search(node: int, color: int) -> bool: return True -def is_bipartite_bfs(graph: defaultdict[int, list[int]]) -> bool: +def is_bipartite_bfs(graph: dict[int, list[int]]) -> bool: """ Check if a graph is bipartite using a breadth-first search (BFS). @@ -107,12 +96,9 @@ def is_bipartite_bfs(graph: defaultdict[int, list[int]]) -> bool: Examples: - >>> # FIXME: This test should pass. - >>> is_bipartite_bfs(defaultdict(list, {0: [1, 2], 1: [0, 3], 2: [0, 4]})) - Traceback (most recent call last): - ... 
- RuntimeError: dictionary changed size during iteration - >>> is_bipartite_bfs(defaultdict(list, {0: [1, 2], 1: [0, 2], 2: [0, 1]})) + >>> is_bipartite_bfs({0: [1, 2], 1: [0, 3], 2: [0, 4]}) + True + >>> is_bipartite_bfs({0: [1, 2], 1: [0, 2], 2: [0, 1]}) False >>> is_bipartite_bfs({}) True @@ -125,36 +111,26 @@ def is_bipartite_bfs(graph: defaultdict[int, list[int]]) -> bool: >>> is_bipartite_bfs({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2], 4: [0]}) False >>> is_bipartite_bfs({7: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2], 4: [0]}) - Traceback (most recent call last): - ... - KeyError: 0 + False >>> # FIXME: This test should fails with KeyError: 4. >>> is_bipartite_bfs({0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2], 9: [0]}) False >>> is_bipartite_bfs({0: [-1, 3], 1: [0, -2]}) - Traceback (most recent call last): - ... - KeyError: -1 + False >>> is_bipartite_bfs({-1: [0, 2], 0: [-1, 1], 1: [0, 2], 2: [-1, 1]}) True >>> is_bipartite_bfs({0.9: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}) - Traceback (most recent call last): - ... - KeyError: 0 + True >>> # FIXME: This test should fails with >>> # TypeError: list indices must be integers or... >>> is_bipartite_bfs({0: [1.0, 3.0], 1.0: [0, 2.0], 2.0: [1.0, 3.0], 3.0: [0, 2.0]}) True >>> is_bipartite_bfs({"a": [1, 3], "b": [0, 2], "c": [1, 3], "d": [0, 2]}) - Traceback (most recent call last): - ... - KeyError: 1 + True >>> is_bipartite_bfs({0: ["b", "d"], 1: ["a", "c"], 2: ["b", "d"], 3: ["a", "c"]}) - Traceback (most recent call last): - ... 
- KeyError: 'b' + True """ visited: defaultdict[int, int] = defaultdict(lambda: -1) for node in graph: @@ -164,6 +140,8 @@ def is_bipartite_bfs(graph: defaultdict[int, list[int]]) -> bool: visited[node] = 0 while queue: curr_node = queue.popleft() + if curr_node not in graph: + continue for neighbor in graph[curr_node]: if visited[neighbor] == -1: visited[neighbor] = 1 - visited[curr_node] @@ -173,7 +151,7 @@ def is_bipartite_bfs(graph: defaultdict[int, list[int]]) -> bool: return True -if __name__ == "__main": +if __name__ == "__main__": import doctest result = doctest.testmod() diff --git a/graphs/dijkstra_algorithm.py b/graphs/dijkstra_algorithm.py index 51412b790bac..60646862fca8 100644 --- a/graphs/dijkstra_algorithm.py +++ b/graphs/dijkstra_algorithm.py @@ -52,45 +52,33 @@ def min_heapify(self, idx): >>> priority_queue_test.array = [(5, 'A'), (10, 'B'), (15, 'C')] >>> priority_queue_test.min_heapify(0) - Traceback (most recent call last): - ... - TypeError: 'list' object is not callable >>> priority_queue_test.array [(5, 'A'), (10, 'B'), (15, 'C')] >>> priority_queue_test.array = [(10, 'A'), (5, 'B'), (15, 'C')] >>> priority_queue_test.min_heapify(0) - Traceback (most recent call last): - ... - TypeError: 'list' object is not callable >>> priority_queue_test.array - [(10, 'A'), (5, 'B'), (15, 'C')] + [(5, 'B'), (10, 'A'), (15, 'C')] >>> priority_queue_test.array = [(10, 'A'), (15, 'B'), (5, 'C')] >>> priority_queue_test.min_heapify(0) - Traceback (most recent call last): - ... - TypeError: 'list' object is not callable >>> priority_queue_test.array - [(10, 'A'), (15, 'B'), (5, 'C')] + [(5, 'C'), (15, 'B'), (10, 'A')] >>> priority_queue_test.array = [(10, 'A'), (5, 'B')] >>> priority_queue_test.cur_size = len(priority_queue_test.array) >>> priority_queue_test.pos = {'A': 0, 'B': 1} >>> priority_queue_test.min_heapify(0) - Traceback (most recent call last): - ... 
- TypeError: 'list' object is not callable >>> priority_queue_test.array - [(10, 'A'), (5, 'B')] + [(5, 'B'), (10, 'A')] """ lc = self.left(idx) rc = self.right(idx) - if lc < self.cur_size and self.array(lc)[0] < self.array[idx][0]: + if lc < self.cur_size and self.array[lc][0] < self.array[idx][0]: smallest = lc else: smallest = idx - if rc < self.cur_size and self.array(rc)[0] < self.array[smallest][0]: + if rc < self.cur_size and self.array[rc][0] < self.array[smallest][0]: smallest = rc if smallest != idx: self.swap(idx, smallest) @@ -130,12 +118,12 @@ def extract_min(self): >>> priority_queue_test.extract_min() 'C' >>> priority_queue_test.array[0] - (15, 'B') + (10, 'A') """ min_node = self.array[0][1] self.array[0] = self.array[self.cur_size - 1] self.cur_size -= 1 - self.min_heapify(1) + self.min_heapify(0) del self.pos[min_node] return min_node diff --git a/graphs/graph_adjacency_list.py b/graphs/graph_adjacency_list.py index abc75311cd60..34014d69dfb8 100644 --- a/graphs/graph_adjacency_list.py +++ b/graphs/graph_adjacency_list.py @@ -21,14 +21,14 @@ import random import unittest from pprint import pformat -from typing import Generic, TypeVar +from typing import TypeVar import pytest T = TypeVar("T") -class GraphAdjacencyList(Generic[T]): +class GraphAdjacencyList[T]: def __init__( self, vertices: list[T], edges: list[list[T]], directed: bool = True ) -> None: @@ -61,6 +61,15 @@ def add_vertex(self, vertex: T) -> None: """ Adds a vertex to the graph. If the given vertex already exists, a ValueError will be thrown. + + >>> g = GraphAdjacencyList(vertices=[], edges=[], directed=False) + >>> g.add_vertex("A") + >>> g.adj_list + {'A': []} + >>> g.add_vertex("A") + Traceback (most recent call last): + ... + ValueError: Incorrect input: A is already in the graph. """ if self.contains_vertex(vertex): msg = f"Incorrect input: {vertex} is already in the graph." 
@@ -448,7 +457,7 @@ def test_remove_edge(self) -> None: ( undirected_graph, directed_graph, - random_vertices, + _random_vertices, random_edges, ) = self.__generate_graphs(20, 0, 100, 4) @@ -502,7 +511,7 @@ def test_add_vertex_exception_check(self) -> None: undirected_graph, directed_graph, random_vertices, - random_edges, + _random_edges, ) = self.__generate_graphs(20, 0, 100, 4) for vertex in random_vertices: @@ -516,7 +525,7 @@ def test_remove_vertex_exception_check(self) -> None: undirected_graph, directed_graph, random_vertices, - random_edges, + _random_edges, ) = self.__generate_graphs(20, 0, 100, 4) for i in range(101): @@ -530,7 +539,7 @@ def test_add_edge_exception_check(self) -> None: ( undirected_graph, directed_graph, - random_vertices, + _random_vertices, random_edges, ) = self.__generate_graphs(20, 0, 100, 4) @@ -569,7 +578,7 @@ def test_contains_edge_exception_check(self) -> None: undirected_graph, directed_graph, random_vertices, - random_edges, + _random_edges, ) = self.__generate_graphs(20, 0, 100, 4) for vertex in random_vertices: diff --git a/graphs/graph_adjacency_matrix.py b/graphs/graph_adjacency_matrix.py index 568c84166e4b..6dca0fbbcf05 100644 --- a/graphs/graph_adjacency_matrix.py +++ b/graphs/graph_adjacency_matrix.py @@ -21,14 +21,14 @@ import random import unittest from pprint import pformat -from typing import Generic, TypeVar +from typing import TypeVar import pytest T = TypeVar("T") -class GraphAdjacencyMatrix(Generic[T]): +class GraphAdjacencyMatrix[T]: def __init__( self, vertices: list[T], edges: list[list[T]], directed: bool = True ) -> None: @@ -469,7 +469,7 @@ def test_remove_edge(self) -> None: ( undirected_graph, directed_graph, - random_vertices, + _random_vertices, random_edges, ) = self.__generate_graphs(20, 0, 100, 4) @@ -523,7 +523,7 @@ def test_add_vertex_exception_check(self) -> None: undirected_graph, directed_graph, random_vertices, - random_edges, + _random_edges, ) = self.__generate_graphs(20, 0, 100, 4) for 
vertex in random_vertices: @@ -537,7 +537,7 @@ def test_remove_vertex_exception_check(self) -> None: undirected_graph, directed_graph, random_vertices, - random_edges, + _random_edges, ) = self.__generate_graphs(20, 0, 100, 4) for i in range(101): @@ -551,7 +551,7 @@ def test_add_edge_exception_check(self) -> None: ( undirected_graph, directed_graph, - random_vertices, + _random_vertices, random_edges, ) = self.__generate_graphs(20, 0, 100, 4) @@ -590,7 +590,7 @@ def test_contains_edge_exception_check(self) -> None: undirected_graph, directed_graph, random_vertices, - random_edges, + _random_edges, ) = self.__generate_graphs(20, 0, 100, 4) for vertex in random_vertices: diff --git a/graphs/graph_list.py b/graphs/graph_list.py index 6563cbb76132..6b63590bf55a 100644 --- a/graphs/graph_list.py +++ b/graphs/graph_list.py @@ -6,12 +6,12 @@ from __future__ import annotations from pprint import pformat -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") -class GraphAdjacencyList(Generic[T]): +class GraphAdjacencyList[T]: """ Adjacency List type Graph Data Structure that accounts for directed and undirected Graphs. Initialize graph object indicating whether it's directed or undirected. diff --git a/graphs/johnson.py b/graphs/johnson.py new file mode 100644 index 000000000000..6306ab5f8654 --- /dev/null +++ b/graphs/johnson.py @@ -0,0 +1,118 @@ +import heapq +from collections.abc import Hashable + +Node = Hashable +edge = tuple[Node, Node, float] +adjacency = dict[Node, list[tuple[Node, float]]] + + +def _collect_nodes_and_edges(graph: adjacency) -> tuple[list[Node], list[edge]]: + nodes = set() + edges: list[edge] = [] + for u, neighbors in graph.items(): + nodes.add(u) + for v, w in neighbors: + nodes.add(v) + edges.append((u, v, w)) + return list(nodes), edges + + +def _bellman_ford(nodes: list[Node], edges: list[edge]) -> dict[Node, float]: + """ + Bellman-Ford relaxation to compute potentials h[v] for all vertices. 
+ Raises ValueError if a negative weight cycle exists. + """ + dist: dict[Node, float] = dict.fromkeys(nodes, 0.0) + n = len(nodes) + + for _ in range(n - 1): + updated = False + for u, v, w in edges: + if dist[u] + w < dist[v]: + dist[v] = dist[u] + w + updated = True + if not updated: + break + else: + for u, v, w in edges: + if dist[u] + w < dist[v]: + raise ValueError("Negative weight cycle detected") + return dist + + +def _dijkstra( + start: Node, + nodes: list[Node], + graph: adjacency, + potentials: dict[Node, float], +) -> dict[Node, float]: + """ + Dijkstra over reweighted graph, using potentials h to make weights non-negative. + Returns distances from start in the reweighted space. + """ + inf = float("inf") + dist: dict[Node, float] = dict.fromkeys(nodes, inf) + dist[start] = 0.0 + heap: list[tuple[float, Node]] = [(0.0, start)] + + while heap: + d_u, u = heapq.heappop(heap) + if d_u > dist[u]: + continue + for v, w in graph.get(u, []): + w_prime = w + potentials[u] - potentials[v] + if w_prime < 0: + raise ValueError( + "Negative edge weight after reweighting: numeric error" + ) + new_dist = d_u + w_prime + if new_dist < dist[v]: + dist[v] = new_dist + heapq.heappush(heap, (new_dist, v)) + return dist + + +def johnson(graph: adjacency) -> dict[Node, dict[Node, float]]: + """ + Compute all-pairs shortest paths using Johnson's algorithm. + + Reference: + https://en.wikipedia.org/wiki/Johnson%27s_algorithm + + Args: + graph: adjacency list {u: [(v, weight), ...], ...} + + Returns: + dict of dicts: dist[u][v] = shortest distance from u to v + + Raises: + ValueError: if a negative weight cycle is detected + + Example: + >>> g = { + ... 0: [(1, 3), (2, 8), (4, -4)], + ... 1: [(3, 1), (4, 7)], + ... 2: [(1, 4)], + ... 3: [(0, 2), (2, -5)], + ... 4: [(3, 6)], + ... 
} + >>> round(johnson(g)[0][3], 2) + 2.0 + """ + nodes, edges = _collect_nodes_and_edges(graph) + potentials = _bellman_ford(nodes, edges) + + all_pairs: dict[Node, dict[Node, float]] = {} + inf = float("inf") + for s in nodes: + dist_reweighted = _dijkstra(s, nodes, graph, potentials) + dists_orig: dict[Node, float] = {} + for v in nodes: + d_prime = dist_reweighted[v] + if d_prime < inf: + dists_orig[v] = d_prime - potentials[s] + potentials[v] + else: + dists_orig[v] = inf + all_pairs[s] = dists_orig + + return all_pairs diff --git a/graphs/minimum_spanning_tree_kruskal2.py b/graphs/minimum_spanning_tree_kruskal2.py index 0ddb43ce8e6e..1f6d7255683b 100644 --- a/graphs/minimum_spanning_tree_kruskal2.py +++ b/graphs/minimum_spanning_tree_kruskal2.py @@ -1,11 +1,11 @@ from __future__ import annotations -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") -class DisjointSetTreeNode(Generic[T]): +class DisjointSetTreeNode[T]: # Disjoint Set Node to store the parent and rank def __init__(self, data: T) -> None: self.data = data @@ -13,7 +13,7 @@ def __init__(self, data: T) -> None: self.rank = 0 -class DisjointSetTree(Generic[T]): +class DisjointSetTree[T]: # Disjoint Set DataStructure def __init__(self) -> None: # map from node name to the node object @@ -46,7 +46,7 @@ def union(self, data1: T, data2: T) -> None: self.link(self.find_set(data1), self.find_set(data2)) -class GraphUndirectedWeighted(Generic[T]): +class GraphUndirectedWeighted[T]: def __init__(self) -> None: # connections: map from the node to the neighbouring nodes (with weights) self.connections: dict[T, dict[T, int]] = {} diff --git a/graphs/minimum_spanning_tree_prims2.py b/graphs/minimum_spanning_tree_prims2.py index 6870cc80f844..d22111289a48 100644 --- a/graphs/minimum_spanning_tree_prims2.py +++ b/graphs/minimum_spanning_tree_prims2.py @@ -10,7 +10,7 @@ from __future__ import annotations from sys import maxsize -from typing import Generic, TypeVar +from typing import 
TypeVar T = TypeVar("T") @@ -47,7 +47,7 @@ def get_child_right_position(position: int) -> int: return (2 * position) + 2 -class MinPriorityQueue(Generic[T]): +class MinPriorityQueue[T]: """ Minimum Priority Queue Class @@ -184,7 +184,7 @@ def _swap_nodes(self, node1_pos: int, node2_pos: int) -> None: self.position_map[node2_elem] = node1_pos -class GraphUndirectedWeighted(Generic[T]): +class GraphUndirectedWeighted[T]: """ Graph Undirected Weighted Class @@ -217,7 +217,7 @@ def add_edge(self, node1: T, node2: T, weight: int) -> None: self.connections[node2][node1] = weight -def prims_algo( +def prims_algo[T]( graph: GraphUndirectedWeighted[T], ) -> tuple[dict[T, int], dict[T, T | None]]: """ diff --git a/graphs/tests/test_johnson.py b/graphs/tests/test_johnson.py new file mode 100644 index 000000000000..e149aac85d0f --- /dev/null +++ b/graphs/tests/test_johnson.py @@ -0,0 +1,24 @@ +import math + +import pytest + +from graphs.johnson import johnson + + +def test_johnson_basic(): + g = { + 0: [(1, 3), (2, 8), (4, -4)], + 1: [(3, 1), (4, 7)], + 2: [(1, 4)], + 3: [(0, 2), (2, -5)], + 4: [(3, 6)], + } + dist = johnson(g) + assert math.isclose(dist[0][3], 2.0, abs_tol=1e-9) + assert math.isclose(dist[3][2], -5.0, abs_tol=1e-9) + + +def test_johnson_negative_cycle(): + g2 = {0: [(1, 1)], 1: [(0, -3)]} + with pytest.raises(ValueError): + johnson(g2) diff --git a/hashes/hamming_code.py b/hashes/hamming_code.py index b3095852ac51..fead26cf7536 100644 --- a/hashes/hamming_code.py +++ b/hashes/hamming_code.py @@ -118,7 +118,6 @@ def emitter_converter(size_par, data): data_ord.append(None) # Calculates parity - qtd_bp = 0 # parity bit counter for bp in range(1, size_par + 1): # Bit counter one for a given parity cont_bo = 0 @@ -133,8 +132,6 @@ def emitter_converter(size_par, data): cont_bo += 1 parity.append(cont_bo % 2) - qtd_bp += 1 - # Mount the message cont_bp = 0 # parity bit counter for x in range(size_par + len(data)): @@ -208,7 +205,6 @@ def receptor_converter(size_par, 
data): data_ord.append(None) # Calculates parity - qtd_bp = 0 # parity bit counter for bp in range(1, size_par + 1): # Bit counter one for a certain parity cont_bo = 0 @@ -222,8 +218,6 @@ def receptor_converter(size_par, data): cont_bo += 1 parity.append(str(cont_bo % 2)) - qtd_bp += 1 - # Mount the message cont_bp = 0 # Parity bit counter for x in range(size_par + len(data_output)): diff --git a/knapsack/README.md b/knapsack/README.md index f31e5f591412..686ea929255a 100644 --- a/knapsack/README.md +++ b/knapsack/README.md @@ -1,4 +1,4 @@ -# A naive recursive implementation of 0-1 Knapsack Problem +# A recursive implementation of 0-N Knapsack Problem This overview is taken from: diff --git a/knapsack/knapsack.py b/knapsack/knapsack.py index bb507be1ba3c..0648773c919f 100644 --- a/knapsack/knapsack.py +++ b/knapsack/knapsack.py @@ -1,14 +1,23 @@ -"""A naive recursive implementation of 0-1 Knapsack Problem +"""A recursive implementation of 0-N Knapsack Problem https://en.wikipedia.org/wiki/Knapsack_problem """ from __future__ import annotations +from functools import lru_cache -def knapsack(capacity: int, weights: list[int], values: list[int], counter: int) -> int: + +def knapsack( + capacity: int, + weights: list[int], + values: list[int], + counter: int, + allow_repetition=False, +) -> int: """ Returns the maximum value that can be put in a knapsack of a capacity cap, - whereby each weight w has a specific value val. + whereby each weight w has a specific value val + with option to allow repetitive selection of items >>> cap = 50 >>> val = [60, 100, 120] @@ -17,28 +26,40 @@ def knapsack(capacity: int, weights: list[int], values: list[int], counter: int) >>> knapsack(cap, w, val, c) 220 - The result is 220 cause the values of 100 and 120 got the weight of 50 + Given the repetition is NOT allowed, + the result is 220 cause the values of 100 and 120 got the weight of 50 which is the limit of the capacity. 
+ >>> knapsack(cap, w, val, c, True) + 300 + + Given the repetition is allowed, + the result is 300 cause the values of 60*5 (pick 5 times) + got the weight of 10*5 which is the limit of the capacity. """ - # Base Case - if counter == 0 or capacity == 0: - return 0 - - # If weight of the nth item is more than Knapsack of capacity, - # then this item cannot be included in the optimal solution, - # else return the maximum of two cases: - # (1) nth item included - # (2) not included - if weights[counter - 1] > capacity: - return knapsack(capacity, weights, values, counter - 1) - else: - left_capacity = capacity - weights[counter - 1] - new_value_included = values[counter - 1] + knapsack( - left_capacity, weights, values, counter - 1 - ) - without_new_value = knapsack(capacity, weights, values, counter - 1) - return max(new_value_included, without_new_value) + @lru_cache + def knapsack_recur(capacity: int, counter: int) -> int: + # Base Case + if counter == 0 or capacity == 0: + return 0 + + # If weight of the nth item is more than Knapsack of capacity, + # then this item cannot be included in the optimal solution, + # else return the maximum of two cases: + # (1) nth item included only once (0-1), if allow_repetition is False + # nth item included one or more times (0-N), if allow_repetition is True + # (2) not included + if weights[counter - 1] > capacity: + return knapsack_recur(capacity, counter - 1) + else: + left_capacity = capacity - weights[counter - 1] + new_value_included = values[counter - 1] + knapsack_recur( + left_capacity, counter - 1 if not allow_repetition else counter + ) + without_new_value = knapsack_recur(capacity, counter - 1) + return max(new_value_included, without_new_value) + + return knapsack_recur(capacity, counter) if __name__ == "__main__": diff --git a/knapsack/tests/test_greedy_knapsack.py b/knapsack/tests/test_greedy_knapsack.py index e6a40084109e..7ebaddd3c99e 100644 --- a/knapsack/tests/test_greedy_knapsack.py +++ 
b/knapsack/tests/test_greedy_knapsack.py @@ -28,7 +28,7 @@ def test_negative_max_weight(self): # profit = [10, 20, 30, 40, 50, 60] # weight = [2, 4, 6, 8, 10, 12] # max_weight = -15 - pytest.raises(ValueError, match="max_weight must greater than zero.") + pytest.raises(ValueError, match=r"max_weight must greater than zero.") def test_negative_profit_value(self): """ @@ -38,7 +38,7 @@ def test_negative_profit_value(self): # profit = [10, -20, 30, 40, 50, 60] # weight = [2, 4, 6, 8, 10, 12] # max_weight = 15 - pytest.raises(ValueError, match="Weight can not be negative.") + pytest.raises(ValueError, match=r"Weight can not be negative.") def test_negative_weight_value(self): """ @@ -48,7 +48,7 @@ def test_negative_weight_value(self): # profit = [10, 20, 30, 40, 50, 60] # weight = [2, -4, 6, -8, 10, 12] # max_weight = 15 - pytest.raises(ValueError, match="Profit can not be negative.") + pytest.raises(ValueError, match=r"Profit can not be negative.") def test_null_max_weight(self): """ @@ -58,7 +58,7 @@ def test_null_max_weight(self): # profit = [10, 20, 30, 40, 50, 60] # weight = [2, 4, 6, 8, 10, 12] # max_weight = null - pytest.raises(ValueError, match="max_weight must greater than zero.") + pytest.raises(ValueError, match=r"max_weight must greater than zero.") def test_unequal_list_length(self): """ @@ -68,7 +68,9 @@ def test_unequal_list_length(self): # profit = [10, 20, 30, 40, 50] # weight = [2, 4, 6, 8, 10, 12] # max_weight = 100 - pytest.raises(IndexError, match="The length of profit and weight must be same.") + pytest.raises( + IndexError, match=r"The length of profit and weight must be same." 
+ ) if __name__ == "__main__": diff --git a/knapsack/tests/test_knapsack.py b/knapsack/tests/test_knapsack.py index 7bfb8780627b..80378aae4579 100644 --- a/knapsack/tests/test_knapsack.py +++ b/knapsack/tests/test_knapsack.py @@ -30,7 +30,7 @@ def test_base_case(self): def test_easy_case(self): """ - test for the base case + test for the easy case """ cap = 3 val = [1, 2, 3] @@ -48,6 +48,16 @@ def test_knapsack(self): c = len(val) assert k.knapsack(cap, w, val, c) == 220 + def test_knapsack_repetition(self): + """ + test for the knapsack repetition + """ + cap = 50 + val = [60, 100, 120] + w = [10, 20, 30] + c = len(val) + assert k.knapsack(cap, w, val, c, True) == 300 + if __name__ == "__main__": unittest.main() diff --git a/linear_algebra/gaussian_elimination.py b/linear_algebra/gaussian_elimination.py index 6f4075b710fd..cf816940b0d1 100644 --- a/linear_algebra/gaussian_elimination.py +++ b/linear_algebra/gaussian_elimination.py @@ -33,7 +33,7 @@ def retroactive_resolution( [ 0.5]]) """ - rows, columns = np.shape(coefficients) + rows, _columns = np.shape(coefficients) x: NDArray[float64] = np.zeros((rows, 1), dtype=float) for row in reversed(range(rows)): diff --git a/linear_algebra/jacobi_iteration_method.py b/linear_algebra/jacobi_iteration_method.py index 2cc9c103018b..0f9fcde7af6c 100644 --- a/linear_algebra/jacobi_iteration_method.py +++ b/linear_algebra/jacobi_iteration_method.py @@ -112,7 +112,7 @@ def jacobi_iteration_method( (coefficient_matrix, constant_matrix), axis=1 ) - rows, cols = table.shape + rows, _cols = table.shape strictly_diagonally_dominant(table) @@ -149,7 +149,7 @@ def jacobi_iteration_method( # Here we get 'i_col' - these are the column numbers, for each row # without diagonal elements, except for the last column. 
- i_row, i_col = np.where(masks) + _i_row, i_col = np.where(masks) ind = i_col.reshape(-1, rows - 1) #'i_col' is converted to a two-dimensional list 'ind', which will be diff --git a/machine_learning/apriori_algorithm.py b/machine_learning/apriori_algorithm.py index 09a89ac236bd..5c3e2baba2c2 100644 --- a/machine_learning/apriori_algorithm.py +++ b/machine_learning/apriori_algorithm.py @@ -11,6 +11,7 @@ Examples: https://www.kaggle.com/code/earthian/apriori-association-rules-mining """ +from collections import Counter from itertools import combinations @@ -44,11 +45,16 @@ def prune(itemset: list, candidates: list, length: int) -> list: >>> prune(itemset, candidates, 3) [] """ + itemset_counter = Counter(tuple(item) for item in itemset) pruned = [] for candidate in candidates: is_subsequence = True for item in candidate: - if item not in itemset or itemset.count(item) < length - 1: + item_tuple = tuple(item) + if ( + item_tuple not in itemset_counter + or itemset_counter[item_tuple] < length - 1 + ): is_subsequence = False break if is_subsequence: diff --git a/machine_learning/decision_tree.py b/machine_learning/decision_tree.py index 72970431c3fc..b4df64796bb1 100644 --- a/machine_learning/decision_tree.py +++ b/machine_learning/decision_tree.py @@ -146,14 +146,13 @@ def predict(self, x): """ if self.prediction is not None: return self.prediction - elif self.left or self.right is not None: + elif self.left is not None and self.right is not None: if x >= self.decision_boundary: return self.right.predict(x) else: return self.left.predict(x) else: - print("Error: Decision tree not yet trained") - return None + raise ValueError("Decision tree not yet trained") class TestDecisionTree: @@ -201,4 +200,4 @@ def main(): main() import doctest - doctest.testmod(name="mean_squarred_error", verbose=True) + doctest.testmod(name="mean_squared_error", verbose=True) diff --git a/machine_learning/k_means_clust.py b/machine_learning/k_means_clust.py index a926362fc18b..a55153628f9c 
100644 --- a/machine_learning/k_means_clust.py +++ b/machine_learning/k_means_clust.py @@ -37,7 +37,13 @@ heterogeneity, k ) - 5. Transfers Dataframe into excel format it must have feature called + 5. Plot the labeled 3D data points with centroids. + plot_kmeans( + X, + centroids, + cluster_assignment + ) + 6. Transfers Dataframe into excel format it must have feature called 'Clust' with k means clustering numbers in it. """ @@ -126,6 +132,19 @@ def plot_heterogeneity(heterogeneity, k): plt.show() +def plot_kmeans(data, centroids, cluster_assignment): + ax = plt.axes(projection="3d") + ax.scatter(data[:, 0], data[:, 1], data[:, 2], c=cluster_assignment, cmap="viridis") + ax.scatter( + centroids[:, 0], centroids[:, 1], centroids[:, 2], c="red", s=100, marker="x" + ) + ax.set_xlabel("X") + ax.set_ylabel("Y") + ax.set_zlabel("Z") + ax.set_title("3D K-Means Clustering Visualization") + plt.show() + + def kmeans( data, k, initial_centroids, maxiter=500, record_heterogeneity=None, verbose=False ): @@ -193,6 +212,7 @@ def kmeans( verbose=True, ) plot_heterogeneity(heterogeneity, k) + plot_kmeans(dataset["data"], centroids, cluster_assignment) def report_generator( diff --git a/machine_learning/linear_discriminant_analysis.py b/machine_learning/linear_discriminant_analysis.py index 8528ccbbae51..de2d1de46ba1 100644 --- a/machine_learning/linear_discriminant_analysis.py +++ b/machine_learning/linear_discriminant_analysis.py @@ -47,7 +47,6 @@ from math import log from os import name, system from random import gauss, seed -from typing import TypeVar # Make a training dataset drawn from a gaussian distribution @@ -249,10 +248,7 @@ def accuracy(actual_y: list, predicted_y: list) -> float: return (correct / len(actual_y)) * 100 -num = TypeVar("num") - - -def valid_input( +def valid_input[num]( input_type: Callable[[object], num], # Usually float or int input_msg: str, err_msg: str, diff --git a/machine_learning/polynomial_regression.py b/machine_learning/polynomial_regression.py 
index 212f40bea197..f52177df1292 100644 --- a/machine_learning/polynomial_regression.py +++ b/machine_learning/polynomial_regression.py @@ -93,7 +93,7 @@ def _design_matrix(data: np.ndarray, degree: int) -> np.ndarray: ... ValueError: Data must have dimensions N x 1 """ - rows, *remaining = data.shape + _rows, *remaining = data.shape if remaining: raise ValueError("Data must have dimensions N x 1") diff --git a/machine_learning/principle_component_analysis.py b/machine_learning/principle_component_analysis.py index 46ccdb968494..174500d89620 100644 --- a/machine_learning/principle_component_analysis.py +++ b/machine_learning/principle_component_analysis.py @@ -65,7 +65,7 @@ def main() -> None: """ Driver function to execute PCA and display results. """ - data_x, data_y = collect_dataset() + data_x, _data_y = collect_dataset() # Number of principal components to retain n_components = 2 diff --git a/machine_learning/t_stochastic_neighbour_embedding.py b/machine_learning/t_stochastic_neighbour_embedding.py new file mode 100644 index 000000000000..d6f630149087 --- /dev/null +++ b/machine_learning/t_stochastic_neighbour_embedding.py @@ -0,0 +1,178 @@ +""" +t-distributed stochastic neighbor embedding (t-SNE) + +For more details, see: +https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding +""" + +import doctest + +import numpy as np +from numpy import ndarray +from sklearn.datasets import load_iris + + +def collect_dataset() -> tuple[ndarray, ndarray]: + """ + Load the Iris dataset and return features and labels. + + Returns: + tuple[ndarray, ndarray]: Feature matrix and target labels. 
+ + >>> features, targets = collect_dataset() + >>> features.shape + (150, 4) + >>> targets.shape + (150,) + """ + iris_dataset = load_iris() + return np.array(iris_dataset.data), np.array(iris_dataset.target) + + +def compute_pairwise_affinities(data_matrix: ndarray, sigma: float = 1.0) -> ndarray: + """ + Compute high-dimensional affinities (P matrix) using a Gaussian kernel. + + Args: + data_matrix: Input data of shape (n_samples, n_features). + sigma: Gaussian kernel bandwidth. + + Returns: + ndarray: Symmetrized probability matrix. + + >>> x = np.array([[0.0, 0.0], [1.0, 0.0]]) + >>> probabilities = compute_pairwise_affinities(x) + >>> float(round(probabilities[0, 1], 3)) + 0.25 + """ + n_samples = data_matrix.shape[0] + squared_sum = np.sum(np.square(data_matrix), axis=1) + squared_distance = np.add( + np.add(-2 * np.dot(data_matrix, data_matrix.T), squared_sum).T, squared_sum + ) + + affinity_matrix = np.exp(-squared_distance / (2 * sigma**2)) + np.fill_diagonal(affinity_matrix, 0) + + affinity_matrix /= np.sum(affinity_matrix) + return (affinity_matrix + affinity_matrix.T) / (2 * n_samples) + + +def compute_low_dim_affinities(embedding_matrix: ndarray) -> tuple[ndarray, ndarray]: + """ + Compute low-dimensional affinities (Q matrix) using a Student-t distribution. + + Args: + embedding_matrix: Low-dimensional embedding of shape (n_samples, n_components). + + Returns: + tuple[ndarray, ndarray]: (Q probability matrix, numerator matrix). 
+ + >>> y = np.array([[0.0, 0.0], [1.0, 0.0]]) + >>> q_matrix, numerators = compute_low_dim_affinities(y) + >>> q_matrix.shape + (2, 2) + """ + squared_sum = np.sum(np.square(embedding_matrix), axis=1) + numerator_matrix = 1 / ( + 1 + + np.add( + np.add(-2 * np.dot(embedding_matrix, embedding_matrix.T), squared_sum).T, + squared_sum, + ) + ) + np.fill_diagonal(numerator_matrix, 0) + + q_matrix = numerator_matrix / np.sum(numerator_matrix) + return q_matrix, numerator_matrix + + +def apply_tsne( + data_matrix: ndarray, + n_components: int = 2, + learning_rate: float = 200.0, + n_iter: int = 500, +) -> ndarray: + """ + Apply t-SNE for dimensionality reduction. + + Args: + data_matrix: Original dataset (features). + n_components: Target dimension (2D or 3D). + learning_rate: Step size for gradient descent. + n_iter: Number of iterations. + + Returns: + ndarray: Low-dimensional embedding of the data. + + >>> features, _ = collect_dataset() + >>> embedding = apply_tsne(features, n_components=2, n_iter=50) + >>> embedding.shape + (150, 2) + """ + if n_components < 1 or n_iter < 1: + raise ValueError("n_components and n_iter must be >= 1") + + n_samples = data_matrix.shape[0] + rng = np.random.default_rng() + embedding = rng.standard_normal((n_samples, n_components)) * 1e-4 + + high_dim_affinities = compute_pairwise_affinities(data_matrix) + high_dim_affinities = np.maximum(high_dim_affinities, 1e-12) + + embedding_increment = np.zeros_like(embedding) + momentum = 0.5 + + for iteration in range(n_iter): + low_dim_affinities, numerator_matrix = compute_low_dim_affinities(embedding) + low_dim_affinities = np.maximum(low_dim_affinities, 1e-12) + + affinity_diff = high_dim_affinities - low_dim_affinities + + gradient = 4 * ( + np.dot((affinity_diff * numerator_matrix), embedding) + - np.multiply( + np.sum(affinity_diff * numerator_matrix, axis=1)[:, np.newaxis], + embedding, + ) + ) + + embedding_increment = momentum * embedding_increment - learning_rate * gradient + 
embedding += embedding_increment + + if iteration == int(n_iter / 4): + momentum = 0.8 + + return embedding + + +def main() -> None: + """ + Run t-SNE on the Iris dataset and display the first 5 embeddings. + + >>> main() # doctest: +ELLIPSIS + t-SNE embedding (first 5 points): + [[... + """ + features, _labels = collect_dataset() + embedding = apply_tsne(features, n_components=2, n_iter=300) + + if not isinstance(embedding, np.ndarray): + raise TypeError("t-SNE embedding must be an ndarray") + + print("t-SNE embedding (first 5 points):") + print(embedding[:5]) + + # Optional visualization (Ruff/mypy compliant) + + # import matplotlib.pyplot as plt + # plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="viridis") + # plt.title("t-SNE Visualization of the Iris Dataset") + # plt.xlabel("Dimension 1") + # plt.ylabel("Dimension 2") + # plt.show() + + +if __name__ == "__main__": + doctest.testmod() + main() diff --git a/machine_learning/xgboost_classifier.py b/machine_learning/xgboost_classifier.py index 1da933cf690f..e845480074b9 100644 --- a/machine_learning/xgboost_classifier.py +++ b/machine_learning/xgboost_classifier.py @@ -42,8 +42,6 @@ def xgboost(features: np.ndarray, target: np.ndarray) -> XGBClassifier: def main() -> None: """ - >>> main() - Url for the algorithm: https://xgboost.readthedocs.io/en/stable/ Iris type dataset is used to demonstrate algorithm. 
diff --git a/maths/area.py b/maths/area.py index 31a654206977..e14cc0aa7195 100644 --- a/maths/area.py +++ b/maths/area.py @@ -552,7 +552,6 @@ def area_reg_polygon(sides: int, length: float) -> float: length of a side" ) return (sides * length**2) / (4 * tan(pi / sides)) - return (sides * length**2) / (4 * tan(pi / sides)) if __name__ == "__main__": diff --git a/maths/chinese_remainder_theorem.py b/maths/chinese_remainder_theorem.py index 18af63d106e8..b7a7712ae917 100644 --- a/maths/chinese_remainder_theorem.py +++ b/maths/chinese_remainder_theorem.py @@ -65,7 +65,7 @@ def invert_modulo(a: int, n: int) -> int: 1 """ - (b, x) = extended_euclid(a, n) + (b, _x) = extended_euclid(a, n) if b < 0: b = (b % n + n) % n return b diff --git a/maths/factorial.py b/maths/factorial.py index aaf90f384bb9..2b8b68764d89 100644 --- a/maths/factorial.py +++ b/maths/factorial.py @@ -41,22 +41,22 @@ def factorial_recursive(n: int) -> int: https://en.wikipedia.org/wiki/Factorial >>> import math - >>> all(factorial(i) == math.factorial(i) for i in range(20)) + >>> all(factorial_recursive(i) == math.factorial(i) for i in range(20)) True - >>> factorial(0.1) + >>> factorial_recursive(0.1) Traceback (most recent call last): ... - ValueError: factorial() only accepts integral values - >>> factorial(-1) + ValueError: factorial_recursive() only accepts integral values + >>> factorial_recursive(-1) Traceback (most recent call last): ... 
- ValueError: factorial() not defined for negative values + ValueError: factorial_recursive() not defined for negative values """ if not isinstance(n, int): - raise ValueError("factorial() only accepts integral values") + raise ValueError("factorial_recursive() only accepts integral values") if n < 0: - raise ValueError("factorial() not defined for negative values") - return 1 if n in {0, 1} else n * factorial(n - 1) + raise ValueError("factorial_recursive() not defined for negative values") + return 1 if n in {0, 1} else n * factorial_recursive(n - 1) if __name__ == "__main__": diff --git a/maths/fibonacci.py b/maths/fibonacci.py index 24b2d7ae449e..595233cf8446 100644 --- a/maths/fibonacci.py +++ b/maths/fibonacci.py @@ -91,15 +91,15 @@ def fib_iterative(n: int) -> list[int]: def fib_recursive(n: int) -> list[int]: """ Calculates the first n (0-indexed) Fibonacci numbers using recursion - >>> fib_iterative(0) + >>> fib_recursive(0) [0] - >>> fib_iterative(1) + >>> fib_recursive(1) [0, 1] - >>> fib_iterative(5) + >>> fib_recursive(5) [0, 1, 1, 2, 3, 5] - >>> fib_iterative(10) + >>> fib_recursive(10) [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55] - >>> fib_iterative(-1) + >>> fib_recursive(-1) Traceback (most recent call last): ... ValueError: n is negative @@ -119,7 +119,7 @@ def fib_recursive_term(i: int) -> int: >>> fib_recursive_term(-1) Traceback (most recent call last): ... 
- Exception: n is negative + ValueError: n is negative """ if i < 0: raise ValueError("n is negative") @@ -135,15 +135,15 @@ def fib_recursive_term(i: int) -> int: def fib_recursive_cached(n: int) -> list[int]: """ Calculates the first n (0-indexed) Fibonacci numbers using recursion - >>> fib_iterative(0) + >>> fib_recursive_cached(0) [0] - >>> fib_iterative(1) + >>> fib_recursive_cached(1) [0, 1] - >>> fib_iterative(5) + >>> fib_recursive_cached(5) [0, 1, 1, 2, 3, 5] - >>> fib_iterative(10) + >>> fib_recursive_cached(10) [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55] - >>> fib_iterative(-1) + >>> fib_recursive_cached(-1) Traceback (most recent call last): ... ValueError: n is negative @@ -176,14 +176,14 @@ def fib_memoization(n: int) -> list[int]: [0, 1, 1, 2, 3, 5] >>> fib_memoization(10) [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55] - >>> fib_iterative(-1) + >>> fib_memoization(-1) Traceback (most recent call last): ... ValueError: n is negative """ if n < 0: raise ValueError("n is negative") - # Cache must be outside recursuive function + # Cache must be outside recursive function # other it will reset every time it calls itself. cache: dict[int, int] = {0: 0, 1: 1, 2: 1} # Prefilled cache diff --git a/maths/greatest_common_divisor.py b/maths/greatest_common_divisor.py index a2174a8eb74a..ce0abc664cf9 100644 --- a/maths/greatest_common_divisor.py +++ b/maths/greatest_common_divisor.py @@ -30,6 +30,8 @@ def greatest_common_divisor(a: int, b: int) -> int: 3 >>> greatest_common_divisor(-3, -9) 3 + >>> greatest_common_divisor(0, 0) + 0 """ return abs(b) if a == 0 else greatest_common_divisor(b % a, a) @@ -50,6 +52,8 @@ def gcd_by_iterative(x: int, y: int) -> int: 1 >>> gcd_by_iterative(11, 37) 1 + >>> gcd_by_iterative(0, 0) + 0 """ while y: # --> when y=0 then loop will terminate and return x as final GCD. 
x, y = y, x % y @@ -69,7 +73,7 @@ def main(): f"{greatest_common_divisor(num_1, num_2)}" ) print(f"By iterative gcd({num_1}, {num_2}) = {gcd_by_iterative(num_1, num_2)}") - except (IndexError, UnboundLocalError, ValueError): + except IndexError, UnboundLocalError, ValueError: print("Wrong input") diff --git a/maths/largest_of_very_large_numbers.py b/maths/largest_of_very_large_numbers.py index edee50371e02..e38ab2edb932 100644 --- a/maths/largest_of_very_large_numbers.py +++ b/maths/largest_of_very_large_numbers.py @@ -15,7 +15,7 @@ def res(x, y): >>> res(-1, 5) Traceback (most recent call last): ... - ValueError: math domain error + ValueError: expected a positive input """ if 0 not in (x, y): # We use the relation x^y = y*log10(x), where 10 is the base. diff --git a/maths/matrix_exponentiation.py b/maths/matrix_exponentiation.py index 7cdac9d34674..15b0c96e0f07 100644 --- a/maths/matrix_exponentiation.py +++ b/maths/matrix_exponentiation.py @@ -11,7 +11,7 @@ class Matrix: - def __init__(self, arg): + def __init__(self, arg: list[list] | int) -> None: if isinstance(arg, list): # Initializes a matrix identical to the one provided. 
self.t = arg self.n = len(arg) @@ -19,7 +19,7 @@ def __init__(self, arg): self.n = arg self.t = [[0 for _ in range(self.n)] for _ in range(self.n)] - def __mul__(self, b): + def __mul__(self, b: Matrix) -> Matrix: matrix = Matrix(self.n) for i in range(self.n): for j in range(self.n): @@ -28,7 +28,7 @@ def __mul__(self, b): return matrix -def modular_exponentiation(a, b): +def modular_exponentiation(a: Matrix, b: int) -> Matrix: matrix = Matrix([[1, 0], [0, 1]]) while b > 0: if b & 1: @@ -38,7 +38,7 @@ def modular_exponentiation(a, b): return matrix -def fibonacci_with_matrix_exponentiation(n, f1, f2): +def fibonacci_with_matrix_exponentiation(n: int, f1: int, f2: int) -> int: """ Returns the nth number of the Fibonacci sequence that starts with f1 and f2 @@ -64,7 +64,7 @@ def fibonacci_with_matrix_exponentiation(n, f1, f2): return f2 * matrix.t[0][0] + f1 * matrix.t[0][1] -def simple_fibonacci(n, f1, f2): +def simple_fibonacci(n: int, f1: int, f2: int) -> int: """ Returns the nth number of the Fibonacci sequence that starts with f1 and f2 @@ -95,7 +95,7 @@ def simple_fibonacci(n, f1, f2): return f2 -def matrix_exponentiation_time(): +def matrix_exponentiation_time() -> float: setup = """ from random import randint from __main__ import fibonacci_with_matrix_exponentiation @@ -106,7 +106,7 @@ def matrix_exponentiation_time(): return exec_time -def simple_fibonacci_time(): +def simple_fibonacci_time() -> float: setup = """ from random import randint from __main__ import simple_fibonacci @@ -119,7 +119,7 @@ def simple_fibonacci_time(): return exec_time -def main(): +def main() -> None: matrix_exponentiation_time() simple_fibonacci_time() diff --git a/maths/modular_division.py b/maths/modular_division.py index 2f8f4479b27d..ed4ae6ae8ce3 100644 --- a/maths/modular_division.py +++ b/maths/modular_division.py @@ -28,10 +28,14 @@ def modular_division(a: int, b: int, n: int) -> int: 4 """ - assert n > 1 - assert a > 0 - assert greatest_common_divisor(a, n) == 1 - (d, t, s) 
= extended_gcd(n, a) # Implemented below + if n <= 1: + raise ValueError("Modulus n must be greater than 1") + if a <= 0: + raise ValueError("Divisor a must be a positive integer") + if greatest_common_divisor(a, n) != 1: + raise ValueError("a and n must be coprime (gcd(a, n) = 1)") + + (_d, _t, s) = extended_gcd(n, a) # Implemented below x = (b * s) % n return x @@ -47,7 +51,7 @@ def invert_modulo(a: int, n: int) -> int: 1 """ - (b, x) = extended_euclid(a, n) # Implemented below + (b, _x) = extended_euclid(a, n) # Implemented below if b < 0: b = (b % n + n) % n return b diff --git a/maths/monte_carlo.py b/maths/monte_carlo.py index d174a0b188a2..5eb176238ffb 100644 --- a/maths/monte_carlo.py +++ b/maths/monte_carlo.py @@ -8,7 +8,7 @@ from statistics import mean -def pi_estimator(iterations: int): +def pi_estimator(iterations: int) -> None: """ An implementation of the Monte Carlo method used to find pi. 1. Draw a 2x2 square centred at (0,0). diff --git a/maths/numerical_analysis/weierstrass_method.py b/maths/numerical_analysis/weierstrass_method.py new file mode 100644 index 000000000000..b5a767af3a86 --- /dev/null +++ b/maths/numerical_analysis/weierstrass_method.py @@ -0,0 +1,97 @@ +from collections.abc import Callable + +import numpy as np + + +def weierstrass_method( + polynomial: Callable[[np.ndarray], np.ndarray], + degree: int, + roots: np.ndarray | None = None, + max_iter: int = 100, +) -> np.ndarray: + """ + Approximates all complex roots of a polynomial using the + Weierstrass (Durand-Kerner) method. + Args: + polynomial: A function that takes a NumPy array of complex numbers and returns + the polynomial values at those points. + degree: Degree of the polynomial (number of roots to find). Must be ≥ 1. + roots: Optional initial guess as a NumPy array of complex numbers. + Must have length equal to 'degree'. + If None, perturbed complex roots of unity are used. + max_iter: Number of iterations to perform (default: 100). 
+ + Returns: + np.ndarray: Array of approximated complex roots. + + Raises: + ValueError: If degree < 1, or if initial roots length doesn't match the degree. + + Note: + - Root updates are clipped to prevent numerical overflow. + + Example: + >>> import numpy as np + >>> def check(poly, degree, expected): + ... roots = weierstrass_method(poly, degree) + ... return np.allclose(np.sort(roots), np.sort(expected)) + + >>> check( + ... lambda x: x**2 - 1, + ... 2, + ... np.array([-1, 1])) + True + + >>> check( + ... lambda x: x**3 - 4.5*x**2 + 5.75*x - 1.875, + ... 3, + ... np.array([1.5, 0.5, 2.5]) + ... ) + True + + See Also: + https://en.wikipedia.org/wiki/Durand%E2%80%93Kerner_method + """ + + if degree < 1: + raise ValueError("Degree of the polynomial must be at least 1.") + + if roots is None: + # Use perturbed complex roots of unity as initial guesses + rng = np.random.default_rng() + roots = np.array( + [ + np.exp(2j * np.pi * i / degree) * (1 + 1e-3 * rng.random()) + for i in range(degree) + ], + dtype=np.complex128, + ) + + else: + roots = np.asarray(roots, dtype=np.complex128) + if roots.shape[0] != degree: + raise ValueError( + "Length of initial roots must match the degree of the polynomial." 
+ ) + + for _ in range(max_iter): + # Construct the product denominator for each root + denominator = np.array([root - roots for root in roots], dtype=np.complex128) + np.fill_diagonal(denominator, 1.0) # Avoid zero in diagonal + denominator = np.prod(denominator, axis=1) + + # Evaluate polynomial at each root + numerator = polynomial(roots).astype(np.complex128) + + # Compute update and clip to prevent overflow + delta = numerator / denominator + delta = np.clip(delta, -1e10, 1e10) + roots -= delta + + return roots + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/maths/prime_factors.py b/maths/prime_factors.py index 47abcf10e618..6eff57d12d17 100644 --- a/maths/prime_factors.py +++ b/maths/prime_factors.py @@ -47,6 +47,46 @@ def prime_factors(n: int) -> list[int]: return factors +def unique_prime_factors(n: int) -> list[int]: + """ + Returns unique prime factors of n as a list. + + >>> unique_prime_factors(0) + [] + >>> unique_prime_factors(100) + [2, 5] + >>> unique_prime_factors(2560) + [2, 5] + >>> unique_prime_factors(10**-2) + [] + >>> unique_prime_factors(0.02) + [] + >>> unique_prime_factors(10**241) + [2, 5] + >>> unique_prime_factors(10**-354) + [] + >>> unique_prime_factors('hello') + Traceback (most recent call last): + ... + TypeError: '<=' not supported between instances of 'int' and 'str' + >>> unique_prime_factors([1,2,'hello']) + Traceback (most recent call last): + ... 
+ TypeError: '<=' not supported between instances of 'int' and 'list' + """ + i = 2 + factors = [] + while i * i <= n: + if not n % i: + while not n % i: + n //= i + factors.append(i) + i += 1 + if n > 1: + factors.append(n) + return factors + + if __name__ == "__main__": import doctest diff --git a/maths/radix2_fft.py b/maths/radix2_fft.py index ccd5cdcc0160..5efbccc7a17d 100644 --- a/maths/radix2_fft.py +++ b/maths/radix2_fft.py @@ -39,14 +39,14 @@ class FFT: >>> x = FFT(A, B) Print product - >>> x.product # 2x + 3x^2 + 8x^3 + 4x^4 + 6x^5 + >>> x.product # 2x + 3x^2 + 8x^3 + 6x^4 + 8x^5 [(-0-0j), (2+0j), (3-0j), (8-0j), (6+0j), (8+0j)] __str__ test >>> print(x) - A = 0*x^0 + 1*x^1 + 2*x^0 + 3*x^2 - B = 0*x^2 + 1*x^3 + 2*x^4 - A*B = 0*x^(-0-0j) + 1*x^(2+0j) + 2*x^(3-0j) + 3*x^(8-0j) + 4*x^(6+0j) + 5*x^(8+0j) + A = 0*x^0 + 1*x^1 + 0*x^2 + 2*x^3 + B = 2*x^0 + 3*x^1 + 4*x^2 + A*B = (-0-0j)*x^0 + (2+0j)*x^1 + (3-0j)*x^2 + (8-0j)*x^3 + (6+0j)*x^4 + (8+0j)*x^5 """ def __init__(self, poly_a=None, poly_b=None): @@ -159,13 +159,13 @@ def __multiply(self): # Overwrite __str__ for print(); Shows A, B and A*B def __str__(self): a = "A = " + " + ".join( - f"{coef}*x^{i}" for coef, i in enumerate(self.polyA[: self.len_A]) + f"{coef}*x^{i}" for i, coef in enumerate(self.polyA[: self.len_A]) ) b = "B = " + " + ".join( - f"{coef}*x^{i}" for coef, i in enumerate(self.polyB[: self.len_B]) + f"{coef}*x^{i}" for i, coef in enumerate(self.polyB[: self.len_B]) ) c = "A*B = " + " + ".join( - f"{coef}*x^{i}" for coef, i in enumerate(self.product) + f"{coef}*x^{i}" for i, coef in enumerate(self.product) ) return f"{a}\n{b}\n{c}" diff --git a/maths/special_numbers/proth_number.py b/maths/special_numbers/proth_number.py index 47747ed260f7..b9b827b6a5a2 100644 --- a/maths/special_numbers/proth_number.py +++ b/maths/special_numbers/proth_number.py @@ -59,6 +59,50 @@ def proth(number: int) -> int: return proth_list[number - 1] +def is_proth_number(number: int) -> bool: + """ + :param number: 
positive integer number + :return: true if number is a Proth number, false otherwise + >>> is_proth_number(1) + False + >>> is_proth_number(2) + False + >>> is_proth_number(3) + True + >>> is_proth_number(4) + False + >>> is_proth_number(5) + True + >>> is_proth_number(34) + False + >>> is_proth_number(-1) + Traceback (most recent call last): + ... + ValueError: Input value of [number=-1] must be > 0 + >>> is_proth_number(6.0) + Traceback (most recent call last): + ... + TypeError: Input value of [number=6.0] must be an integer + """ + if not isinstance(number, int): + message = f"Input value of [{number=}] must be an integer" + raise TypeError(message) + + if number <= 0: + message = f"Input value of [{number=}] must be > 0" + raise ValueError(message) + + if number == 1: + return False + + number -= 1 + n = 0 + while number % 2 == 0: + n += 1 + number //= 2 + return number < 2**n + + if __name__ == "__main__": import doctest @@ -73,3 +117,9 @@ def proth(number: int) -> int: continue print(f"The {number}th Proth number: {value}") + + for number in [1, 2, 3, 4, 5, 9, 13, 49, 57, 193, 241, 163, 201]: + if is_proth_number(number): + print(f"{number} is a Proth number") + else: + print(f"{number} is not a Proth number") diff --git a/maths/test_factorial.py b/maths/test_factorial.py new file mode 100644 index 000000000000..1795ebba194f --- /dev/null +++ b/maths/test_factorial.py @@ -0,0 +1,43 @@ +# /// script +# requires-python = ">=3.13" +# dependencies = [ +# "pytest", +# ] +# /// + +import pytest + +from maths.factorial import factorial, factorial_recursive + + +@pytest.mark.parametrize("function", [factorial, factorial_recursive]) +def test_zero(function): + assert function(0) == 1 + + +@pytest.mark.parametrize("function", [factorial, factorial_recursive]) +def test_positive_integers(function): + assert function(1) == 1 + assert function(5) == 120 + assert function(7) == 5040 + + +@pytest.mark.parametrize("function", [factorial, factorial_recursive]) +def 
test_large_number(function): + assert function(10) == 3628800 + + +@pytest.mark.parametrize("function", [factorial, factorial_recursive]) +def test_negative_number(function): + with pytest.raises(ValueError): + function(-3) + + +@pytest.mark.parametrize("function", [factorial, factorial_recursive]) +def test_float_number(function): + with pytest.raises(ValueError): + function(1.5) + + +if __name__ == "__main__": + pytest.main(["-v", __file__]) diff --git a/maths/volume.py b/maths/volume.py index 08bdf72b013b..1715c9c300d5 100644 --- a/maths/volume.py +++ b/maths/volume.py @@ -555,7 +555,7 @@ def main(): print(f"Torus: {vol_torus(2, 2) = }") # ~= 157.9 print(f"Conical Frustum: {vol_conical_frustum(2, 2, 4) = }") # ~= 58.6 print(f"Spherical cap: {vol_spherical_cap(1, 2) = }") # ~= 5.24 - print(f"Spheres intersetion: {vol_spheres_intersect(2, 2, 1) = }") # ~= 21.21 + print(f"Spheres intersection: {vol_spheres_intersect(2, 2, 1) = }") # ~= 21.21 print(f"Spheres union: {vol_spheres_union(2, 2, 1) = }") # ~= 45.81 print( f"Hollow Circular Cylinder: {vol_hollow_circular_cylinder(1, 2, 3) = }" diff --git a/matrix/matrix_class.py b/matrix/matrix_class.py index a5940a38e836..dee9247282f9 100644 --- a/matrix/matrix_class.py +++ b/matrix/matrix_class.py @@ -260,7 +260,7 @@ def add_row(self, row: list[int], position: int | None = None) -> None: if position is None: self.rows.append(row) else: - self.rows = self.rows[0:position] + [row] + self.rows[position:] + self.rows = [*self.rows[0:position], row, *self.rows[position:]] def add_column(self, column: list[int], position: int | None = None) -> None: type_error = TypeError( @@ -279,7 +279,7 @@ def add_column(self, column: list[int], position: int | None = None) -> None: self.rows = [self.rows[i] + [column[i]] for i in range(self.num_rows)] else: self.rows = [ - self.rows[i][0:position] + [column[i]] + self.rows[i][position:] + [*self.rows[i][0:position], column[i], *self.rows[i][position:]] for i in range(self.num_rows) ] diff 
--git a/neural_network/convolution_neural_network.py b/neural_network/convolution_neural_network.py index d4ac360a98de..6b1aa50c7981 100644 --- a/neural_network/convolution_neural_network.py +++ b/neural_network/convolution_neural_network.py @@ -317,7 +317,7 @@ def predict(self, datas_test): print((" - - Shape: Test_Data ", np.shape(datas_test))) for p in range(len(datas_test)): data_test = np.asmatrix(datas_test[p]) - data_focus1, data_conved1 = self.convolute( + _data_focus1, data_conved1 = self.convolute( data_test, self.conv1, self.w_conv1, @@ -339,7 +339,7 @@ def predict(self, datas_test): def convolution(self, data): # return the data of image after convoluting process so we can check it out data_test = np.asmatrix(data) - data_focus1, data_conved1 = self.convolute( + _data_focus1, data_conved1 = self.convolute( data_test, self.conv1, self.w_conv1, diff --git a/other/least_recently_used.py b/other/least_recently_used.py index cb692bb1b1c0..d96960868488 100644 --- a/other/least_recently_used.py +++ b/other/least_recently_used.py @@ -2,12 +2,12 @@ import sys from collections import deque -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") -class LRUCache(Generic[T]): +class LRUCache[T]: """ Page Replacement Algorithm, Least Recently Used (LRU) Caching. 
diff --git a/other/lfu_cache.py b/other/lfu_cache.py index 5a143c739b9d..6eaacff2966a 100644 --- a/other/lfu_cache.py +++ b/other/lfu_cache.py @@ -1,13 +1,13 @@ from __future__ import annotations from collections.abc import Callable -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") U = TypeVar("U") -class DoubleLinkedListNode(Generic[T, U]): +class DoubleLinkedListNode[T, U]: """ Double Linked List Node built specifically for LFU Cache @@ -30,7 +30,7 @@ def __repr__(self) -> str: ) -class DoubleLinkedList(Generic[T, U]): +class DoubleLinkedList[T, U]: """ Double Linked List built specifically for LFU Cache @@ -161,7 +161,7 @@ def remove( return node -class LFUCache(Generic[T, U]): +class LFUCache[T, U]: """ LFU Cache to store a given capacity of data. Can be used as a stand-alone object or as a function decorator. diff --git a/other/lru_cache.py b/other/lru_cache.py index 4f0c843c86cc..058b03b021bc 100644 --- a/other/lru_cache.py +++ b/other/lru_cache.py @@ -1,13 +1,13 @@ from __future__ import annotations from collections.abc import Callable -from typing import Generic, TypeVar +from typing import TypeVar T = TypeVar("T") U = TypeVar("U") -class DoubleLinkedListNode(Generic[T, U]): +class DoubleLinkedListNode[T, U]: """ Double Linked List Node built specifically for LRU Cache @@ -28,7 +28,7 @@ def __repr__(self) -> str: ) -class DoubleLinkedList(Generic[T, U]): +class DoubleLinkedList[T, U]: """ Double Linked List built specifically for LRU Cache @@ -143,7 +143,7 @@ def remove( return node -class LRUCache(Generic[T, U]): +class LRUCache[T, U]: """ LRU Cache to store a given capacity of data. Can be used as a stand-alone object or as a function decorator. 
diff --git a/other/sliding_window_maximum.py b/other/sliding_window_maximum.py new file mode 100644 index 000000000000..1c2c3c8e37e6 --- /dev/null +++ b/other/sliding_window_maximum.py @@ -0,0 +1,58 @@ +from collections import deque + + +def sliding_window_maximum(numbers: list[int], window_size: int) -> list[int]: + """ + Return a list containing the maximum of each sliding window of size window_size. + + This implementation uses a monotonic deque to achieve O(n) time complexity. + + Args: + numbers: List of integers representing the input array. + window_size: Size of the sliding window (must be positive). + + Returns: + List of maximum values for each valid window. + + Raises: + ValueError: If window_size is not a positive integer. + + Time Complexity: O(n) - each element is added and removed at most once + Space Complexity: O(k) - deque stores at most window_size indices + + Examples: + >>> sliding_window_maximum([1, 3, -1, -3, 5, 3, 6, 7], 3) + [3, 3, 5, 5, 6, 7] + >>> sliding_window_maximum([9, 11], 2) + [11] + >>> sliding_window_maximum([], 3) + [] + >>> sliding_window_maximum([4, 2, 12, 3], 1) + [4, 2, 12, 3] + >>> sliding_window_maximum([1], 1) + [1] + """ + if window_size <= 0: + raise ValueError("Window size must be a positive integer") + if not numbers: + return [] + + result: list[int] = [] + index_deque: deque[int] = deque() + + for current_index, current_value in enumerate(numbers): + # Remove the element which is out of this window + if index_deque and index_deque[0] == current_index - window_size: + index_deque.popleft() + + # Remove useless elements (smaller than current) from back + while index_deque and numbers[index_deque[-1]] < current_value: + index_deque.pop() + + index_deque.append(current_index) + + # Start adding to result once we have a full window + if current_index >= window_size - 1: + result.append(numbers[index_deque[0]]) + + return result diff --git a/project_euler/problem_002/sol4.py b/project_euler/problem_002/sol4.py index 
a13d34fd760e..3341aa1d4569 100644 --- a/project_euler/problem_002/sol4.py +++ b/project_euler/problem_002/sol4.py @@ -56,7 +56,7 @@ def solution(n: int = 4000000) -> int: try: n = int(n) - except (TypeError, ValueError): + except TypeError, ValueError: raise TypeError("Parameter n must be int or castable to int.") if n <= 0: raise ValueError("Parameter n must be greater than or equal to one.") diff --git a/project_euler/problem_003/sol1.py b/project_euler/problem_003/sol1.py index d1c0e61cf1a6..dbf9a84f68bb 100644 --- a/project_euler/problem_003/sol1.py +++ b/project_euler/problem_003/sol1.py @@ -80,7 +80,7 @@ def solution(n: int = 600851475143) -> int: try: n = int(n) - except (TypeError, ValueError): + except TypeError, ValueError: raise TypeError("Parameter n must be int or castable to int.") if n <= 0: raise ValueError("Parameter n must be greater than or equal to one.") diff --git a/project_euler/problem_003/sol2.py b/project_euler/problem_003/sol2.py index 0af0daceed06..4c4f88220514 100644 --- a/project_euler/problem_003/sol2.py +++ b/project_euler/problem_003/sol2.py @@ -44,7 +44,7 @@ def solution(n: int = 600851475143) -> int: try: n = int(n) - except (TypeError, ValueError): + except TypeError, ValueError: raise TypeError("Parameter n must be int or castable to int.") if n <= 0: raise ValueError("Parameter n must be greater than or equal to one.") diff --git a/project_euler/problem_003/sol3.py b/project_euler/problem_003/sol3.py index e13a0eb74ec1..1a454b618f75 100644 --- a/project_euler/problem_003/sol3.py +++ b/project_euler/problem_003/sol3.py @@ -44,7 +44,7 @@ def solution(n: int = 600851475143) -> int: try: n = int(n) - except (TypeError, ValueError): + except TypeError, ValueError: raise TypeError("Parameter n must be int or castable to int.") if n <= 0: raise ValueError("Parameter n must be greater than or equal to one.") diff --git a/project_euler/problem_005/sol1.py b/project_euler/problem_005/sol1.py index 01cbd0e15ff7..f889c420c61d 100644 --- 
a/project_euler/problem_005/sol1.py +++ b/project_euler/problem_005/sol1.py @@ -47,7 +47,7 @@ def solution(n: int = 20) -> int: try: n = int(n) - except (TypeError, ValueError): + except TypeError, ValueError: raise TypeError("Parameter n must be int or castable to int.") if n <= 0: raise ValueError("Parameter n must be greater than or equal to one.") diff --git a/project_euler/problem_007/sol2.py b/project_euler/problem_007/sol2.py index fd99453c1100..d63b2f2d86ec 100644 --- a/project_euler/problem_007/sol2.py +++ b/project_euler/problem_007/sol2.py @@ -87,7 +87,7 @@ def solution(nth: int = 10001) -> int: try: nth = int(nth) - except (TypeError, ValueError): + except TypeError, ValueError: raise TypeError("Parameter nth must be int or castable to int.") from None if nth <= 0: raise ValueError("Parameter nth must be greater than or equal to one.") diff --git a/project_euler/problem_009/sol4.py b/project_euler/problem_009/sol4.py new file mode 100644 index 000000000000..a07d40ccb54d --- /dev/null +++ b/project_euler/problem_009/sol4.py @@ -0,0 +1,60 @@ +""" +Project Euler Problem 9: https://projecteuler.net/problem=9 + +Special Pythagorean triplet + +A Pythagorean triplet is a set of three natural numbers, a < b < c, for which, + + a^2 + b^2 = c^2. + +For example, 3^2 + 4^2 = 9 + 16 = 25 = 5^2. + +There exists exactly one Pythagorean triplet for which a + b + c = 1000. +Find the product abc. + +References: + - https://en.wikipedia.org/wiki/Pythagorean_triple +""" + + +def get_squares(n: int) -> list[int]: + """ + >>> get_squares(0) + [] + >>> get_squares(1) + [0] + >>> get_squares(2) + [0, 1] + >>> get_squares(3) + [0, 1, 4] + >>> get_squares(4) + [0, 1, 4, 9] + """ + return [number * number for number in range(n)] + + +def solution(n: int = 1000) -> int: + """ + Precomputing squares and checking if a^2 + b^2 is the square by set look-up. 
+ + >>> solution(12) + 60 + >>> solution(36) + 1620 + """ + + squares = get_squares(n) + squares_set = set(squares) + for a in range(1, n // 3): + for b in range(a + 1, (n - a) // 2 + 1): + if ( + squares[a] + squares[b] in squares_set + and squares[n - a - b] == squares[a] + squares[b] + ): + return a * b * (n - a - b) + + return -1 + + +if __name__ == "__main__": + print(f"{solution() = }") diff --git a/project_euler/problem_015/sol2.py b/project_euler/problem_015/sol2.py new file mode 100644 index 000000000000..903095e144ec --- /dev/null +++ b/project_euler/problem_015/sol2.py @@ -0,0 +1,32 @@ +""" +Problem 15: https://projecteuler.net/problem=15 + +Starting in the top left corner of a 2x2 grid, and only being able to move to +the right and down, there are exactly 6 routes to the bottom right corner. +How many such routes are there through a 20x20 grid? +""" + + +def solution(n: int = 20) -> int: + """ + Solve by explicitly counting the paths with dynamic programming. + + >>> solution(6) + 924 + >>> solution(2) + 6 + >>> solution(1) + 2 + """ + + counts = [[1 for _ in range(n + 1)] for _ in range(n + 1)] + + for i in range(1, n + 1): + for j in range(1, n + 1): + counts[i][j] = counts[i - 1][j] + counts[i][j - 1] + + return counts[n][n] + + +if __name__ == "__main__": + print(solution()) diff --git a/project_euler/problem_073/sol1.py b/project_euler/problem_073/sol1.py index 2b66b7d8769b..c39110252ccd 100644 --- a/project_euler/problem_073/sol1.py +++ b/project_euler/problem_073/sol1.py @@ -36,7 +36,12 @@ def solution(max_d: int = 12_000) -> int: fractions_number = 0 for d in range(max_d + 1): - for n in range(d // 3 + 1, (d + 1) // 2): + n_start = d // 3 + 1 + n_step = 1 + if d % 2 == 0: + n_start += 1 - n_start % 2 + n_step = 2 + for n in range(n_start, (d + 1) // 2, n_step): if gcd(n, d) == 1: fractions_number += 1 return fractions_number diff --git a/project_euler/problem_551/sol1.py b/project_euler/problem_551/sol1.py index 100e9d41dd31..e13cf77a776d 100644 
--- a/project_euler/problem_551/sol1.py +++ b/project_euler/problem_551/sol1.py @@ -185,7 +185,7 @@ def solution(n: int = 10**15) -> int: i = 1 dn = 0 while True: - diff, terms_jumped = next_term(digits, 20, i + dn, n) + _diff, terms_jumped = next_term(digits, 20, i + dn, n) dn += terms_jumped if dn == n - i: break diff --git a/pyproject.toml b/pyproject.toml index 2ead5cd51ae8..34e099a46435 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -3,25 +3,27 @@ name = "thealgorithms-python" version = "0.0.1" description = "TheAlgorithms in Python" authors = [ { name = "TheAlgorithms Contributors" } ] -requires-python = ">=3.13" +requires-python = ">=3.14" classifiers = [ "Programming Language :: Python :: 3 :: Only", - "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", ] dependencies = [ "beautifulsoup4>=4.12.3", + "cython>=3.1.2", "fake-useragent>=1.5.1", "httpx>=0.28.1", "imageio>=2.36.1", "keras>=3.7", - "lxml>=5.3", + "lxml>=6", "matplotlib>=3.9.3", "numpy>=2.1.3", "opencv-python>=4.10.0.84", "pandas>=2.2.3", - "pillow>=11", + "pillow>=11.3", "rich>=13.9.4", "scikit-learn>=1.5.2", + "scipy>=1.16.2", "sphinx-pyproject>=0.3", "statsmodels>=0.14.4", "sympy>=1.13.3", @@ -32,10 +34,9 @@ dependencies = [ [dependency-groups] test = [ - "pytest>=8.3.4", + "pytest>=8.4.1", "pytest-cov>=6", ] - docs = [ "myst-parser>=4", "sphinx-autoapi>=3.4", @@ -47,8 +48,7 @@ euler-validate = [ ] [tool.ruff] -target-version = "py313" - +target-version = "py314" output-format = "full" lint.select = [ # https://beta.ruff.rs/docs/rules @@ -108,22 +108,24 @@ lint.ignore = [ # `ruff rule S101` for a description of that rule "B904", # Within an `except` clause, raise exceptions with `raise ... 
from err` -- FIX ME "B905", # `zip()` without an explicit `strict=` parameter -- FIX ME - "EM101", # Exception must not use a string literal, assign to variable first + "EM101", # Exception must not use a string literal, assign to a variable first "EXE001", # Shebang is present but file is not executable -- DO NOT FIX "G004", # Logging statement uses f-string "ISC001", # Conflicts with ruff format -- DO NOT FIX + "PLC0415", # import-outside-top-level -- DO NOT FIX "PLC1901", # `{}` can be simplified to `{}` as an empty string is falsey "PLW060", # Using global for `{name}` but no assignment is done -- DO NOT FIX + "PLW1641", # eq-without-hash "PLW2901", # PLW2901: Redefined loop variable -- FIX ME "PT011", # `pytest.raises(Exception)` is too broad, set the `match` parameter or use a more specific exception "PT018", # Assertion should be broken down into multiple parts + "PT028", # pytest-parameter-with-default-argument "S101", # Use of `assert` detected -- DO NOT FIX "S311", # Standard pseudo-random generators are not suitable for cryptographic purposes -- FIX ME "SIM905", # Consider using a list literal instead of `str.split` -- DO NOT FIX "SLF001", # Private member accessed: `_Iterator` -- FIX ME - "UP038", # Use `X | Y` in `{}` call instead of `(X, Y)` -- DO NOT FIX + "UP037", # FIX ME ] - lint.per-file-ignores."data_structures/hashing/tests/test_hash_map.py" = [ "BLE001", ] @@ -145,37 +147,43 @@ lint.per-file-ignores."project_euler/problem_099/sol1.py" = [ lint.per-file-ignores."sorts/external_sort.py" = [ "SIM115", ] -lint.mccabe.max-complexity = 17 # default: 10 +lint.mccabe.max-complexity = 17 # default: 10 lint.pylint.allow-magic-value-types = [ "float", "int", "str", ] -lint.pylint.max-args = 10 # default: 5 -lint.pylint.max-branches = 20 # default: 12 -lint.pylint.max-returns = 8 # default: 6 -lint.pylint.max-statements = 88 # default: 50 +lint.pylint.max-args = 10 # default: 5 +lint.pylint.max-branches = 20 # default: 12 +lint.pylint.max-returns = 8 # 
default: 6 +lint.pylint.max-statements = 88 # default: 50 [tool.codespell] ignore-words-list = "3rt,abd,aer,ans,bitap,crate,damon,fo,followings,hist,iff,kwanza,manuel,mater,secant,som,sur,tim,toi,zar" -skip = "./.*,*.json,*.lock,ciphers/prehistoric_men.txt,project_euler/problem_022/p022_names.txt,pyproject.toml,strings/dictionary.txt,strings/words.txt" +skip = """\ + ./.*,*.json,*.lock,ciphers/prehistoric_men.txt,project_euler/problem_022/p022_names.txt,pyproject.toml,strings/dictio\ + nary.txt,strings/words.txt\ + """ + +[tool.mypy] +python_version = "3.14" -[tool.pytest.ini_options] -markers = [ +[tool.pytest] +ini_options.markers = [ "mat_ops: mark a test as utilizing matrix operations.", ] -addopts = [ +ini_options.addopts = [ "--durations=10", "--doctest-modules", "--showlocals", ] -[tool.coverage.report] -omit = [ +[tool.coverage] +report.omit = [ ".env/*", "project_euler/*", ] -sort = "Cover" +report.sort = "Cover" [tool.sphinx-pyproject] copyright = "2014, TheAlgorithms" @@ -256,7 +264,6 @@ myst_fence_as_directive = [ "include", ] templates_path = [ "_templates" ] -[tool.sphinx-pyproject.source_suffix] -".rst" = "restructuredtext" +source_suffix.".rst" = "restructuredtext" # ".txt" = "markdown" -".md" = "markdown" +source_suffix.".md" = "markdown" diff --git a/requirements.txt b/requirements.txt deleted file mode 100644 index 66b5d8a6b94e..000000000000 --- a/requirements.txt +++ /dev/null @@ -1,19 +0,0 @@ -beautifulsoup4 -fake-useragent -httpx -imageio -keras -lxml -matplotlib -numpy -opencv-python -pandas -pillow -rich -scikit-learn -sphinx-pyproject -statsmodels -sympy -tweepy -typing_extensions -xgboost diff --git a/scheduling/multi_level_feedback_queue.py b/scheduling/multi_level_feedback_queue.py index abee3c85c5a5..58ba2afa0e67 100644 --- a/scheduling/multi_level_feedback_queue.py +++ b/scheduling/multi_level_feedback_queue.py @@ -255,7 +255,7 @@ def multi_level_feedback_queue(self) -> deque[Process]: # all queues except last one have round_robin 
algorithm for i in range(self.number_of_queues - 1): - finished, self.ready_queue = self.round_robin( + _finished, self.ready_queue = self.round_robin( self.ready_queue, self.time_slices[i] ) # the last queue has first_come_first_served algorithm diff --git a/scripts/README.md b/scripts/README.md new file mode 100644 index 000000000000..92ebf3a7e8ba --- /dev/null +++ b/scripts/README.md @@ -0,0 +1,27 @@ +Dealing with the onslaught of Hacktoberfest +* https://hacktoberfest.com + +Each year, October brings a swarm of new contributors participating in Hacktoberfest. This event has its pros and cons, but it presents a monumental workload for the few active maintainers of this repo. The maintainer workload is further impacted by a new version of CPython being released in the first week of each October. + +To help make our algorithms more valuable to visitors, our CONTRIBUTING.md file outlines several strict requirements, such as tests, type hints, descriptive names, functions, and/or classes. Maintainers reviewing pull requests should try to encourage improvements to meet these goals, but when the workload becomes overwhelming (esp. in October), pull requests that do not meet these goals should be closed. + +Below are a few [`gh`](https://cli.github.com) scripts that should close pull requests that do not match the definition of an acceptable algorithm as defined in CONTRIBUTING.md. I tend to run these scripts in the following order. + +* close_pull_requests_with_require_descriptive_names.sh +* close_pull_requests_with_require_tests.sh +* close_pull_requests_with_require_type_hints.sh +* close_pull_requests_with_failing_tests.sh +* close_pull_requests_with_awaiting_changes.sh +* find_git_conflicts.sh + +### Run on 14 Oct 2025: 107 of 541 (19.77%) pull requests closed. 
+ +Script run | Open pull requests | Pull requests closed +--- | --- | --- +None | 541 | 0 +require_descriptive_names | 515 | 26 +require_tests | 498 | 17 +require_type_hints | 496 | 2 +failing_tests | 438 | ___58___ +awaiting_changes | 434 | 4 +git_conflicts | [ broken ] | 0 diff --git a/scripts/build_directory_md.py b/scripts/build_directory_md.py index aa95b95db4b5..bdad7686c7e3 100755 --- a/scripts/build_directory_md.py +++ b/scripts/build_directory_md.py @@ -18,8 +18,20 @@ def good_file_paths(top_dir: str = ".") -> Iterator[str]: yield os.path.join(dir_path, filename).lstrip("./") -def md_prefix(i): - return f"{i * ' '}*" if i else "\n##" +def md_prefix(indent: int) -> str: + """ + Markdown prefix based on indent for bullet points + + >>> md_prefix(0) + '\\n##' + >>> md_prefix(1) + ' *' + >>> md_prefix(2) + ' *' + >>> md_prefix(3) + ' *' + """ + return f"{indent * ' '}*" if indent else "\n##" def print_path(old_path: str, new_path: str) -> str: diff --git a/searches/binary_search.py b/searches/binary_search.py index 2e66b672d5b4..bec87b3c5aec 100644 --- a/searches/binary_search.py +++ b/searches/binary_search.py @@ -10,9 +10,8 @@ python3 binary_search.py """ -from __future__ import annotations - import bisect +from itertools import pairwise def bisect_left( @@ -198,7 +197,7 @@ def binary_search(sorted_collection: list[int], item: int) -> int: >>> binary_search([0, 5, 7, 10, 15], 6) -1 """ - if list(sorted_collection) != sorted(sorted_collection): + if any(a > b for a, b in pairwise(sorted_collection)): raise ValueError("sorted_collection must be sorted in ascending order") left = 0 right = len(sorted_collection) - 1 @@ -243,6 +242,81 @@ def binary_search_std_lib(sorted_collection: list[int], item: int) -> int: return -1 +def binary_search_with_duplicates(sorted_collection: list[int], item: int) -> list[int]: + """Pure implementation of a binary search algorithm in Python that supports + duplicates. 
+ + Resources used: + https://stackoverflow.com/questions/13197552/using-binary-search-with-sorted-array-with-duplicates + + The collection must be sorted in ascending order; otherwise the result will be + unpredictable. If the target appears multiple times, this function returns a + list of all indexes where the target occurs. If the target is not found, + this function returns an empty list. + + :param sorted_collection: some ascending sorted collection with comparable items + :param item: item value to search for + :return: a list of indexes where the item is found (empty list if not found) + + Examples: + >>> binary_search_with_duplicates([0, 5, 7, 10, 15], 0) + [0] + >>> binary_search_with_duplicates([0, 5, 7, 10, 15], 15) + [4] + >>> binary_search_with_duplicates([1, 2, 2, 2, 3], 2) + [1, 2, 3] + >>> binary_search_with_duplicates([1, 2, 2, 2, 3], 4) + [] + """ + if list(sorted_collection) != sorted(sorted_collection): + raise ValueError("sorted_collection must be sorted in ascending order") + + def lower_bound(sorted_collection: list[int], item: int) -> int: + """ + Returns the index of the first element greater than or equal to the item. + + :param sorted_collection: The sorted list to search. + :param item: The item to find the lower bound for. + :return: The index where the item can be inserted while maintaining order. + """ + left = 0 + right = len(sorted_collection) + while left < right: + midpoint = left + (right - left) // 2 + current_item = sorted_collection[midpoint] + if current_item < item: + left = midpoint + 1 + else: + right = midpoint + return left + + def upper_bound(sorted_collection: list[int], item: int) -> int: + """ + Returns the index of the first element strictly greater than the item. + + :param sorted_collection: The sorted list to search. + :param item: The item to find the upper bound for. + :return: The index where the item can be inserted after all existing instances. 
+ """ + left = 0 + right = len(sorted_collection) + while left < right: + midpoint = left + (right - left) // 2 + current_item = sorted_collection[midpoint] + if current_item <= item: + left = midpoint + 1 + else: + right = midpoint + return left + + left = lower_bound(sorted_collection, item) + right = upper_bound(sorted_collection, item) + + if left == len(sorted_collection) or sorted_collection[left] != item: + return [] + return list(range(left, right)) + + def binary_search_by_recursion( sorted_collection: list[int], item: int, left: int = 0, right: int = -1 ) -> int: diff --git a/searches/jump_search.py b/searches/jump_search.py index e72d85e8a868..437faf306bb2 100644 --- a/searches/jump_search.py +++ b/searches/jump_search.py @@ -10,17 +10,14 @@ import math from collections.abc import Sequence -from typing import Any, Protocol, TypeVar +from typing import Any, Protocol class Comparable(Protocol): def __lt__(self, other: Any, /) -> bool: ... -T = TypeVar("T", bound=Comparable) - - -def jump_search(arr: Sequence[T], item: T) -> int: +def jump_search[T: Comparable](arr: Sequence[T], item: T) -> int: """ Python implementation of the jump search algorithm. Return the index if the `item` is found, otherwise return -1. diff --git a/searches/linear_search.py b/searches/linear_search.py index ba6e81d6bae4..8adb4a7015f0 100644 --- a/searches/linear_search.py +++ b/searches/linear_search.py @@ -1,5 +1,5 @@ """ -This is pure Python implementation of linear search algorithm +This is a pure Python implementation of the linear search algorithm. 
For doctests run following command: python3 -m doctest -v linear_search.py @@ -12,8 +12,8 @@ def linear_search(sequence: list, target: int) -> int: """A pure Python implementation of a linear search algorithm - :param sequence: a collection with comparable items (as sorted items not required - in Linear Search) + :param sequence: a collection with comparable items (sorting is not required for + linear search) :param target: item value to search :return: index of found item or -1 if item is not found diff --git a/sorts/binary_insertion_sort.py b/sorts/binary_insertion_sort.py index 50653a99e7ce..b928316a849d 100644 --- a/sorts/binary_insertion_sort.py +++ b/sorts/binary_insertion_sort.py @@ -56,7 +56,7 @@ def binary_insertion_sort(collection: list) -> list: return collection -if __name__ == "__main": +if __name__ == "__main__": user_input = input("Enter numbers separated by a comma:\n").strip() try: unsorted = [int(item) for item in user_input.split(",")] diff --git a/sorts/bogo_sort.py b/sorts/bogo_sort.py index 9c133f0d8a55..70785140ee5c 100644 --- a/sorts/bogo_sort.py +++ b/sorts/bogo_sort.py @@ -16,7 +16,7 @@ import random -def bogo_sort(collection): +def bogo_sort(collection: list) -> list: """Pure implementation of the bogosort algorithm in Python :param collection: some mutable ordered collection with heterogeneous comparable items inside @@ -30,7 +30,7 @@ def bogo_sort(collection): [-45, -5, -2] """ - def is_sorted(collection): + def is_sorted(collection: list) -> bool: for i in range(len(collection) - 1): if collection[i] > collection[i + 1]: return False diff --git a/sorts/bubble_sort.py b/sorts/bubble_sort.py index 9ec3d5384f38..4d658a4a12e4 100644 --- a/sorts/bubble_sort.py +++ b/sorts/bubble_sort.py @@ -6,7 +6,7 @@ def bubble_sort_iterative(collection: list[Any]) -> list[Any]: :param collection: some mutable ordered collection with heterogeneous comparable items inside - :return: the same collection ordered by ascending + :return: the same collection 
ordered in ascending order Examples: >>> bubble_sort_iterative([0, 5, 2, 3, 2]) @@ -17,6 +17,12 @@ def bubble_sort_iterative(collection: list[Any]) -> list[Any]: [-45, -5, -2] >>> bubble_sort_iterative([-23, 0, 6, -4, 34]) [-23, -4, 0, 6, 34] + >>> bubble_sort_iterative([1, 2, 3, 4]) + [1, 2, 3, 4] + >>> bubble_sort_iterative([3, 3, 3, 3]) + [3, 3, 3, 3] + >>> bubble_sort_iterative([56]) + [56] >>> bubble_sort_iterative([0, 5, 2, 3, 2]) == sorted([0, 5, 2, 3, 2]) True >>> bubble_sort_iterative([]) == sorted([]) @@ -63,7 +69,7 @@ def bubble_sort_recursive(collection: list[Any]) -> list[Any]: Examples: >>> bubble_sort_recursive([0, 5, 2, 3, 2]) [0, 2, 2, 3, 5] - >>> bubble_sort_iterative([]) + >>> bubble_sort_recursive([]) [] >>> bubble_sort_recursive([-2, -45, -5]) [-45, -5, -2] diff --git a/sorts/bucket_sort.py b/sorts/bucket_sort.py index 1c1320a58a7d..893c7ff3a23a 100644 --- a/sorts/bucket_sort.py +++ b/sorts/bucket_sort.py @@ -51,12 +51,35 @@ def bucket_sort(my_list: list, bucket_count: int = 10) -> list: >>> collection = random.sample(range(-50, 50), 50) >>> bucket_sort(collection) == sorted(collection) True + >>> data = [1, 2, 2, 1, 1, 3] + >>> bucket_sort(data) == sorted(data) + True + >>> data = [5, 5, 5, 5, 5] + >>> bucket_sort(data) == sorted(data) + True + >>> data = [1000, -1000, 500, -500, 0] + >>> bucket_sort(data) == sorted(data) + True + >>> data = [5.5, 2.2, -1.1, 3.3, 0.0] + >>> bucket_sort(data) == sorted(data) + True + >>> bucket_sort([1]) == [1] + True + >>> data = [-1.1, -1.5, -3.4, 2.5, 3.6, -3.3] + >>> bucket_sort(data) == sorted(data) + True + >>> data = [9, 2, 7, 1, 5] + >>> bucket_sort(data) == sorted(data) + True """ if len(my_list) == 0 or bucket_count <= 0: return [] min_value, max_value = min(my_list), max(my_list) + if min_value == max_value: + return my_list + bucket_size = (max_value - min_value) / bucket_count buckets: list[list] = [[] for _ in range(bucket_count)] @@ -73,3 +96,6 @@ def bucket_sort(my_list: list, bucket_count: int 
= 10) -> list: testmod() assert bucket_sort([4, 5, 3, 2, 1]) == [1, 2, 3, 4, 5] assert bucket_sort([0, 1, -10, 15, 2, -2]) == [-10, -2, 0, 1, 2, 15] + assert bucket_sort([1.1, 1.2, -1.2, 0, 2.4]) == [-1.2, 0, 1.1, 1.2, 2.4] + assert bucket_sort([5, 5, 5, 5, 5]) == [5, 5, 5, 5, 5] + assert bucket_sort([-5, -1, -6, -2]) == [-6, -5, -2, -1] diff --git a/sorts/comb_sort.py b/sorts/comb_sort.py index 3c8b1e99a454..94ad8f533328 100644 --- a/sorts/comb_sort.py +++ b/sorts/comb_sort.py @@ -5,8 +5,7 @@ Comb sort improves on bubble sort algorithm. In bubble sort, distance (or gap) between two compared elements is always one. Comb sort improvement is that gap can be much more than 1, in order to prevent slowing -down by small values -at the end of a list. +down by small values at the end of a list. More info on: https://en.wikipedia.org/wiki/Comb_sort diff --git a/sorts/cyclic_sort.py b/sorts/cyclic_sort.py new file mode 100644 index 000000000000..9e81291548d4 --- /dev/null +++ b/sorts/cyclic_sort.py @@ -0,0 +1,55 @@ +""" +This is a pure Python implementation of the Cyclic Sort algorithm. + +For doctests run following command: +python -m doctest -v cyclic_sort.py +or +python3 -m doctest -v cyclic_sort.py +For manual testing run: +python cyclic_sort.py +or +python3 cyclic_sort.py +""" + + +def cyclic_sort(nums: list[int]) -> list[int]: + """ + Sorts the input list of n integers from 1 to n in-place + using the Cyclic Sort algorithm. + + :param nums: List of n integers from 1 to n to be sorted. + :return: The same list sorted in ascending order. + + Time complexity: O(n), where n is the number of integers in the list. 
+ + Examples: + >>> cyclic_sort([]) + [] + >>> cyclic_sort([3, 5, 2, 1, 4]) + [1, 2, 3, 4, 5] + """ + + # Perform cyclic sort + index = 0 + while index < len(nums): + # Calculate the correct index for the current element + correct_index = nums[index] - 1 + # If the current element is not at its correct position, + # swap it with the element at its correct index + if index != correct_index: + nums[index], nums[correct_index] = nums[correct_index], nums[index] + else: + # If the current element is already in its correct position, + # move to the next element + index += 1 + + return nums + + +if __name__ == "__main__": + import doctest + + doctest.testmod() + user_input = input("Enter numbers separated by a comma:\n").strip() + unsorted = [int(item) for item in user_input.split(",")] + print(*cyclic_sort(unsorted), sep=",") diff --git a/sorts/insertion_sort.py b/sorts/insertion_sort.py index 46b263d84a33..2e39be255df7 100644 --- a/sorts/insertion_sort.py +++ b/sorts/insertion_sort.py @@ -24,7 +24,7 @@ def __lt__(self, other: Any, /) -> bool: ... T = TypeVar("T", bound=Comparable) -def insertion_sort(collection: MutableSequence[T]) -> MutableSequence[T]: +def insertion_sort[T: Comparable](collection: MutableSequence[T]) -> MutableSequence[T]: """A pure Python implementation of the insertion sort algorithm :param collection: some mutable ordered collection with heterogeneous diff --git a/sorts/merge_sort.py b/sorts/merge_sort.py index 0628b848b794..11c202788035 100644 --- a/sorts/merge_sort.py +++ b/sorts/merge_sort.py @@ -18,6 +18,7 @@ def merge_sort(collection: list) -> list: :return: The same collection ordered in ascending order. 
Time Complexity: O(n log n) + Space Complexity: O(n) Examples: >>> merge_sort([0, 5, 3, 2, 2]) diff --git a/sorts/pigeonhole_sort.py b/sorts/pigeonhole_sort.py index bfa9bb11b8a6..7fbc6188cfb4 100644 --- a/sorts/pigeonhole_sort.py +++ b/sorts/pigeonhole_sort.py @@ -10,7 +10,11 @@ def pigeonhole_sort(a): >>> pigeonhole_sort(a) # a destructive sort >>> a == b True + + >>> pigeonhole_sort([]) """ + if not a: + return # size of range of values in the list (ie, number of pigeonholes we need) min_val = min(a) # min() finds the minimum value @@ -38,7 +42,7 @@ def pigeonhole_sort(a): def main(): a = [8, 3, 2, 7, 4, 6, 8] pigeonhole_sort(a) - print("Sorted order is:", " ".join(a)) + print("Sorted order is:", *a) if __name__ == "__main__": diff --git a/sorts/stalin_sort.py b/sorts/stalin_sort.py new file mode 100644 index 000000000000..6dd5708c7f01 --- /dev/null +++ b/sorts/stalin_sort.py @@ -0,0 +1,47 @@ +""" +Stalin Sort algorithm: Removes elements that are out of order. +Elements that are not greater than or equal to the previous element are discarded. +Reference: https://medium.com/@kaweendra/the-ultimate-sorting-algorithm-6513d6968420 +""" + + +def stalin_sort(sequence: list[int]) -> list[int]: + """ + Sorts a list using the Stalin sort algorithm. 
+ + >>> stalin_sort([4, 3, 5, 2, 1, 7]) + [4, 5, 7] + + >>> stalin_sort([1, 2, 3, 4]) + [1, 2, 3, 4] + + >>> stalin_sort([4, 5, 5, 2, 3]) + [4, 5, 5] + + >>> stalin_sort([6, 11, 12, 4, 1, 5]) + [6, 11, 12] + + >>> stalin_sort([5, 0, 4, 3]) + [5] + + >>> stalin_sort([5, 4, 3, 2, 1]) + [5] + + >>> stalin_sort([1, 2, 3, 4, 5]) + [1, 2, 3, 4, 5] + + >>> stalin_sort([1, 2, 8, 7, 6]) + [1, 2, 8] + """ + result = [sequence[0]] + for element in sequence[1:]: + if element >= result[-1]: + result.append(element) + + return result + + +if __name__ == "__main__": + import doctest + + doctest.testmod() diff --git a/sorts/tim_sort.py b/sorts/tim_sort.py index 138f11c71bcc..2eeed88b7399 100644 --- a/sorts/tim_sort.py +++ b/sorts/tim_sort.py @@ -1,4 +1,7 @@ -def binary_search(lst, item, start, end): +from typing import Any + + +def binary_search(lst: list[Any], item: Any, start: int, end: int) -> int: if start == end: return start if lst[start] > item else start + 1 if start > end: @@ -13,18 +16,18 @@ def binary_search(lst, item, start, end): return mid -def insertion_sort(lst): +def insertion_sort(lst: list[Any]) -> list[Any]: length = len(lst) for index in range(1, length): value = lst[index] pos = binary_search(lst, value, 0, index - 1) - lst = lst[:pos] + [value] + lst[pos:index] + lst[index + 1 :] + lst = [*lst[:pos], value, *lst[pos:index], *lst[index + 1 :]] return lst -def merge(left, right): +def merge(left: list[Any], right: list[Any]) -> list[Any]: if not left: return right @@ -37,7 +40,7 @@ def merge(left, right): return [right[0], *merge(left, right[1:])] -def tim_sort(lst): +def tim_sort(lst: list[Any] | tuple[Any, ...] 
| str) -> list[Any]: """ >>> tim_sort("Python") ['P', 'h', 'n', 'o', 't', 'y'] @@ -53,7 +56,7 @@ def tim_sort(lst): length = len(lst) runs, sorted_runs = [], [] new_run = [lst[0]] - sorted_array = [] + sorted_array: list[Any] = [] i = 1 while i < length: if lst[i] < lst[i - 1]: diff --git a/sorts/unknown_sort.py b/sorts/unknown_sort.py index 9fa9d22fb5e0..3545da68ea80 100644 --- a/sorts/unknown_sort.py +++ b/sorts/unknown_sort.py @@ -6,7 +6,7 @@ """ -def merge_sort(collection): +def merge_sort(collection: list) -> list: """Pure implementation of the fastest merge sort algorithm in Python :param collection: some mutable ordered collection with heterogeneous diff --git a/strings/anagrams.py b/strings/anagrams.py index fb9ac0bd1f45..71cc142fb2ad 100644 --- a/strings/anagrams.py +++ b/strings/anagrams.py @@ -6,19 +6,26 @@ def signature(word: str) -> str: - """Return a word sorted + """ + Return a word's frequency-based signature. + >>> signature("test") - 'estt' + 'e1s1t2' >>> signature("this is a test") - ' aehiisssttt' + ' 3a1e1h1i2s3t3' >>> signature("finaltest") - 'aefilnstt' + 'a1e1f1i1l1n1s1t2' """ - return "".join(sorted(word)) + frequencies = collections.Counter(word) + return "".join( + f"{char}{frequency}" for char, frequency in sorted(frequencies.items()) + ) def anagram(my_word: str) -> list[str]: - """Return every anagram of the given word + """ + Return every anagram of the given word from the dictionary. 
+ >>> anagram('test') ['sett', 'stet', 'test'] >>> anagram('this is a test') @@ -40,5 +47,5 @@ def anagram(my_word: str) -> list[str]: all_anagrams = {word: anagram(word) for word in word_list if len(anagram(word)) > 1} with open("anagrams.txt", "w") as file: - file.write("all_anagrams = \n ") + file.write("all_anagrams = \n") file.write(pprint.pformat(all_anagrams)) diff --git a/strings/capitalize.py b/strings/capitalize.py index c0b45e0d9614..628ebffc8852 100644 --- a/strings/capitalize.py +++ b/strings/capitalize.py @@ -1,6 +1,3 @@ -from string import ascii_lowercase, ascii_uppercase - - def capitalize(sentence: str) -> str: """ Capitalizes the first letter of a sentence or word. @@ -19,11 +16,9 @@ def capitalize(sentence: str) -> str: if not sentence: return "" - # Create a dictionary that maps lowercase letters to uppercase letters # Capitalize the first character if it's a lowercase letter # Concatenate the capitalized character with the rest of the string - lower_to_upper = dict(zip(ascii_lowercase, ascii_uppercase)) - return lower_to_upper.get(sentence[0], sentence[0]) + sentence[1:] + return sentence[0].upper() + sentence[1:] if __name__ == "__main__": diff --git a/strings/count_vowels.py b/strings/count_vowels.py index 8a52b331c81b..e222d80590ff 100644 --- a/strings/count_vowels.py +++ b/strings/count_vowels.py @@ -22,7 +22,7 @@ def count_vowels(s: str) -> int: 1 """ if not isinstance(s, str): - raise ValueError("Input must be a string") + raise TypeError("Input must be a string") vowels = "aeiouAEIOU" return sum(1 for char in s if char in vowels) diff --git a/strings/edit_distance.py b/strings/edit_distance.py index e842c8555c8e..77ed23037937 100644 --- a/strings/edit_distance.py +++ b/strings/edit_distance.py @@ -14,6 +14,20 @@ def edit_distance(source: str, target: str) -> int: >>> edit_distance("GATTIC", "GALTIC") 1 + >>> edit_distance("NUM3", "HUM2") + 2 + >>> edit_distance("cap", "CAP") + 3 + >>> edit_distance("Cat", "") + 3 + >>> 
edit_distance("cat", "cat") + 0 + >>> edit_distance("", "123456789") + 9 + >>> edit_distance("Be@uty", "Beautyyyy!") + 5 + >>> edit_distance("lstring", "lsstring") + 1 """ if len(source) == 0: return len(target) diff --git a/strings/palindrome.py b/strings/palindrome.py index bfdb3ddcf396..4df5639b0c49 100644 --- a/strings/palindrome.py +++ b/strings/palindrome.py @@ -11,16 +11,18 @@ "BB": True, "ABC": False, "amanaplanacanalpanama": True, # "a man a plan a canal panama" + "abcdba": False, + "AB": False, } # Ensure our test data is valid -assert all((key == key[::-1]) is value for key, value in test_data.items()) +assert all((key == key[::-1]) == value for key, value in test_data.items()) def is_palindrome(s: str) -> bool: """ Return True if s is a palindrome otherwise return False. - >>> all(is_palindrome(key) is value for key, value in test_data.items()) + >>> all(is_palindrome(key) == value for key, value in test_data.items()) True """ @@ -39,7 +41,7 @@ def is_palindrome_traversal(s: str) -> bool: """ Return True if s is a palindrome otherwise return False. - >>> all(is_palindrome_traversal(key) is value for key, value in test_data.items()) + >>> all(is_palindrome_traversal(key) == value for key, value in test_data.items()) True """ end = len(s) // 2 @@ -58,10 +60,10 @@ def is_palindrome_recursive(s: str) -> bool: """ Return True if s is a palindrome otherwise return False. - >>> all(is_palindrome_recursive(key) is value for key, value in test_data.items()) + >>> all(is_palindrome_recursive(key) == value for key, value in test_data.items()) True """ - if len(s) <= 2: + if len(s) <= 1: return True if s[0] == s[len(s) - 1]: return is_palindrome_recursive(s[1:-1]) @@ -73,14 +75,14 @@ def is_palindrome_slice(s: str) -> bool: """ Return True if s is a palindrome otherwise return False. 
- >>> all(is_palindrome_slice(key) is value for key, value in test_data.items()) + >>> all(is_palindrome_slice(key) == value for key, value in test_data.items()) True """ return s == s[::-1] def benchmark_function(name: str) -> None: - stmt = f"all({name}(key) is value for key, value in test_data.items())" + stmt = f"all({name}(key) == value for key, value in test_data.items())" setup = f"from __main__ import test_data, {name}" number = 500000 result = timeit(stmt=stmt, setup=setup, number=number) @@ -89,8 +91,8 @@ def benchmark_function(name: str) -> None: if __name__ == "__main__": for key, value in test_data.items(): - assert is_palindrome(key) is is_palindrome_recursive(key) - assert is_palindrome(key) is is_palindrome_slice(key) + assert is_palindrome(key) == is_palindrome_recursive(key) + assert is_palindrome(key) == is_palindrome_slice(key) print(f"{key:21} {value}") print("a man a plan a canal panama") diff --git a/strings/reverse_letters.py b/strings/reverse_letters.py index 4f73f816b382..cd1b7832d066 100644 --- a/strings/reverse_letters.py +++ b/strings/reverse_letters.py @@ -1,7 +1,7 @@ def reverse_letters(sentence: str, length: int = 0) -> str: """ Reverse all words that are longer than the given length of characters in a sentence. - If unspecified, length is taken as 0 + If ``length`` is not specified, it defaults to 0. 
>>> reverse_letters("Hey wollef sroirraw", 3) 'Hey fellow warriors' @@ -13,7 +13,7 @@ def reverse_letters(sentence: str, length: int = 0) -> str: 'racecar' """ return " ".join( - "".join(word[::-1]) if len(word) > length else word for word in sentence.split() + word[::-1] if len(word) > length else word for word in sentence.split() ) diff --git a/strings/reverse_words.py b/strings/reverse_words.py index 504c1c2089dd..bee237f2a2d9 100644 --- a/strings/reverse_words.py +++ b/strings/reverse_words.py @@ -1,12 +1,14 @@ -def reverse_words(input_str: str) -> str: - """ - Reverses words in a given string +def reverse_words(sentence: str) -> str: + """Reverse the order of words in a given string. + + Extra whitespace between words is ignored. + >>> reverse_words("I love Python") 'Python love I' >>> reverse_words("I Love Python") 'Python Love I' """ - return " ".join(input_str.split()[::-1]) + return " ".join(sentence.split()[::-1]) if __name__ == "__main__": diff --git a/web_programming/covid_stats_via_xpath.py b/web_programming/covid_stats_via_xpath.py index f7db51b63169..88a248610441 100644 --- a/web_programming/covid_stats_via_xpath.py +++ b/web_programming/covid_stats_via_xpath.py @@ -1,7 +1,8 @@ """ -This is to show simple COVID19 info fetching from worldometers site using lxml -* The main motivation to use lxml in place of bs4 is that it is faster and therefore -more convenient to use in Python web projects (e.g. Django or Flask-based) +This script demonstrates fetching simple COVID-19 statistics from the +Worldometers archive site using lxml. lxml is chosen over BeautifulSoup +for its speed and convenience in Python web projects (such as Django or +Flask). 
""" # /// script @@ -19,19 +20,40 @@ class CovidData(NamedTuple): - cases: int - deaths: int - recovered: int + cases: str + deaths: str + recovered: str -def covid_stats(url: str = "https://www.worldometers.info/coronavirus/") -> CovidData: +def covid_stats( + url: str = ( + "https://web.archive.org/web/20250825095350/" + "https://www.worldometers.info/coronavirus/" + ), +) -> CovidData: xpath_str = '//div[@class = "maincounter-number"]/span/text()' - return CovidData( - *html.fromstring(httpx.get(url, timeout=10).content).xpath(xpath_str) + try: + response = httpx.get(url, timeout=10).raise_for_status() + except httpx.TimeoutException: + print( + "Request timed out. Please check your network connection " + "or try again later." + ) + return CovidData("N/A", "N/A", "N/A") + except httpx.HTTPStatusError as e: + print(f"HTTP error occurred: {e}") + return CovidData("N/A", "N/A", "N/A") + data = html.fromstring(response.content).xpath(xpath_str) + if len(data) != 3: + print("Unexpected data format. The page structure may have changed.") + data = "N/A", "N/A", "N/A" + return CovidData(*data) + + +if __name__ == "__main__": + fmt = ( + "Total COVID-19 cases in the world: {}\n" + "Total deaths due to COVID-19 in the world: {}\n" + "Total COVID-19 patients recovered in the world: {}" ) - - -fmt = """Total COVID-19 cases in the world: {} -Total deaths due to COVID-19 in the world: {} -Total COVID-19 patients recovered in the world: {}""" -print(fmt.format(*covid_stats())) + print(fmt.format(*covid_stats())) diff --git a/web_programming/current_stock_price.py b/web_programming/current_stock_price.py index 16b0b6772a9c..531da949ea50 100644 --- a/web_programming/current_stock_price.py +++ b/web_programming/current_stock_price.py @@ -23,7 +23,7 @@ def stock_price(symbol: str = "AAPL") -> str: """ >>> stock_price("EEEE") - '- ' + 'No tag with the specified data-testid attribute found.' 
>>> isinstance(float(stock_price("GOOG")),float) True """ diff --git a/web_programming/fetch_well_rx_price.py b/web_programming/fetch_well_rx_price.py index 93be2a9235d9..680d7444bd1c 100644 --- a/web_programming/fetch_well_rx_price.py +++ b/web_programming/fetch_well_rx_price.py @@ -5,12 +5,10 @@ """ -from urllib.error import HTTPError - +import httpx from bs4 import BeautifulSoup -from requests import exceptions, get -BASE_URL = "https://www.wellrx.com/prescriptions/{0}/{1}/?freshSearch=true" +BASE_URL = "https://www.wellrx.com/prescriptions/{}/{}/?freshSearch=true" def fetch_pharmacy_and_price_list(drug_name: str, zip_code: str) -> list | None: @@ -18,8 +16,8 @@ def fetch_pharmacy_and_price_list(drug_name: str, zip_code: str) -> list | None: This function will take input of drug name and zipcode, then request to the BASE_URL site. - Get the page data and scrape it to the generate the - list of lowest prices for the prescription drug. + Get the page data and scrape it to generate the + list of the lowest prices for the prescription drug. Args: drug_name (str): [Drug name] @@ -28,12 +26,12 @@ def fetch_pharmacy_and_price_list(drug_name: str, zip_code: str) -> list | None: Returns: list: [List of pharmacy name and price] - >>> fetch_pharmacy_and_price_list(None, None) - - >>> fetch_pharmacy_and_price_list(None, 30303) - - >>> fetch_pharmacy_and_price_list("eliquis", None) - + >>> print(fetch_pharmacy_and_price_list(None, None)) + None + >>> print(fetch_pharmacy_and_price_list(None, 30303)) + None + >>> print(fetch_pharmacy_and_price_list("eliquis", None)) + None """ try: @@ -42,10 +40,7 @@ def fetch_pharmacy_and_price_list(drug_name: str, zip_code: str) -> list | None: return None request_url = BASE_URL.format(drug_name, zip_code) - response = get(request_url, timeout=10) - - # Is the response ok? 
- response.raise_for_status() + response = httpx.get(request_url, timeout=10).raise_for_status() # Scrape the data using bs4 soup = BeautifulSoup(response.text, "html.parser") @@ -53,14 +48,14 @@ def fetch_pharmacy_and_price_list(drug_name: str, zip_code: str) -> list | None: # This list will store the name and price. pharmacy_price_list = [] - # Fetch all the grids that contains the items. + # Fetch all the grids that contain the items. grid_list = soup.find_all("div", {"class": "grid-x pharmCard"}) if grid_list and len(grid_list) > 0: for grid in grid_list: # Get the pharmacy price. pharmacy_name = grid.find("p", {"class": "list-title"}).text - # Get price of the drug. + # Get the price of the drug. price = grid.find("span", {"p", "price price-large"}).text pharmacy_price_list.append( @@ -72,7 +67,7 @@ def fetch_pharmacy_and_price_list(drug_name: str, zip_code: str) -> list | None: return pharmacy_price_list - except (HTTPError, exceptions.RequestException, ValueError): + except httpx.HTTPError, ValueError: return None diff --git a/web_programming/instagram_crawler.py b/web_programming/instagram_crawler.py index 68271c1c4643..0b91db01ca09 100644 --- a/web_programming/instagram_crawler.py +++ b/web_programming/instagram_crawler.py @@ -53,7 +53,7 @@ def get_json(self) -> dict: scripts = BeautifulSoup(html, "html.parser").find_all("script") try: return extract_user_profile(scripts[4]) - except (json.decoder.JSONDecodeError, KeyError): + except json.decoder.JSONDecodeError, KeyError: return extract_user_profile(scripts[3]) def __repr__(self) -> str: