momonga-ml/gower-express

Gower Express

Fast Gower distance for mixed data in Python

Gower Express provides Gower distance utilities for mixed numerical and categorical data, with optional sklearn integration and accelerator paths for larger workloads.

Install

pip install gower_exp

Optional extras:

pip install "gower_exp[gpu]"
pip install "gower_exp[sklearn]"
pip install "gower_exp[gpu,sklearn]"

Quick Start

import pandas as pd
import gower_exp as gower

data = pd.DataFrame(
    {
        "age": [25, 30, 35, 40],
        "category": ["A", "B", "A", "C"],
        "salary": [50_000, 60_000, 55_000, 65_000],
        "city": ["NYC", "LA", "NYC", "Chicago"],
    }
)

distances = gower.gower_matrix(data)
similar = gower.gower_topn(data.iloc[0:1], data, n=3)
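To make the numbers in the distance matrix easier to interpret, here is a minimal reference sketch of the textbook Gower distance in plain NumPy and pandas. It does not call gower_exp. The helper name naive_gower is hypothetical, and it is an assumption that the library's output matches this definition exactly (dtype, missing-value handling, and feature weighting may differ). Numeric columns contribute the absolute difference divided by the column range, categorical columns contribute 0 for a match and 1 for a mismatch, and the result is the mean over all columns.

```python
import numpy as np
import pandas as pd


def naive_gower(df: pd.DataFrame) -> np.ndarray:
    """Hypothetical reference helper: unweighted Gower distance matrix."""
    n = len(df)
    total = np.zeros((n, n), dtype=float)
    for name in df.columns:
        col = df[name]
        if pd.api.types.is_numeric_dtype(col):
            values = col.to_numpy(dtype=float)
            value_range = values.max() - values.min()
            diff = np.abs(values[:, None] - values[None, :])
            # A constant numeric column contributes zero distance everywhere.
            part = diff / value_range if value_range > 0 else np.zeros((n, n))
        else:
            values = col.astype(object).to_numpy()
            # Categorical columns: 0 when equal, 1 when different.
            part = (values[:, None] != values[None, :]).astype(float)
        total += part
    # Average the per-column dissimilarities.
    return total / df.shape[1]


data = pd.DataFrame(
    {
        "age": [25, 30, 35, 40],
        "category": ["A", "B", "A", "C"],
        "salary": [50_000, 60_000, 55_000, 65_000],
        "city": ["NYC", "LA", "NYC", "Chicago"],
    }
)

distances = naive_gower(data)
# Row 0 holds the distances from the first record: 0.0, 0.75, 0.25, 1.0

# A naive top-N lookup: indices of the 3 rows closest to row 0.
nearest = np.argsort(distances[0], kind="stable")[:3]
# nearest is [0, 2, 1]: the row itself, then row 2, then row 1
```

Row 2 is the closest neighbour of row 0 because it shares both categorical values and differs only in age and salary. The return format of gower.gower_topn is not shown in this README, so the argsort lookup above only illustrates the idea of a top-N search, not the library's actual output.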

Features

  • Pairwise Gower distance for mixed-type datasets
  • Top-N similarity search helpers
  • Optional sklearn-compatible interfaces
  • Optional accelerator paths using numba, vectorized code, and GPU support when available

Project Layout

  • gower_exp/: package source
  • tests/: pytest suite
  • benchmarks/: performance scripts
  • docs/: user and contributor documentation
  • examples/: notebooks and examples

Development

Set up a contributor environment with uv:

uv sync --all-extras --dev
pre-commit install

Run the standard checks before opening a pull request:

uv run ruff check .
uv run ruff format --check .
uv run bandit -r gower_exp/ -c pyproject.toml
uv run pytest tests/ --cov=gower_exp --cov-report=term-missing

Short imperative commit messages work best, for example "fix sklearn edge case" or "bump test coverage". If a change affects performance-sensitive code, include benchmark notes in the pull request summary.

See docs/development.md for the full contributor workflow.

Releases

PyPI publishing is triggered by publishing a GitHub Release from a vX.Y.Z tag. Pushing a tag alone does not publish a package.

The expected maintainer flow is:

  1. Update version in pyproject.toml.
  2. Run "just check" and "uv build".
  3. Commit the version bump.
  4. Create and push tag vX.Y.Z.
  5. Publish a GitHub Release from that tag to trigger .github/workflows/publish.yml.

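Documentation

User and contributor documentation lives in docs/. See docs/development.md for the contributor workflow.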

License

MIT. See LICENSE.