Fast Gower distance for mixed data in Python
Gower Express provides Gower distance utilities for mixed numerical and categorical data, with optional sklearn integration and accelerator paths for larger workloads.
pip install gower_exp
Optional extras:
pip install gower_exp[gpu]
pip install gower_exp[sklearn]
pip install gower_exp[gpu,sklearn]
import pandas as pd
import gower_exp as gower
data = pd.DataFrame(
    {
        "age": [25, 30, 35, 40],
        "category": ["A", "B", "A", "C"],
        "salary": [50_000, 60_000, 55_000, 65_000],
        "city": ["NYC", "LA", "NYC", "Chicago"],
    }
)
distances = gower.gower_matrix(data)
similar = gower.gower_topn(data.iloc[0:1], data, n=3)
- Pairwise Gower distance for mixed-type datasets
- Top-N similarity search helpers
- Optional sklearn-compatible interfaces
- Optional accelerator paths using numba, vectorized code, and GPU support when available
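To make the first feature concrete, here is a minimal pure-Python sketch of the textbook Gower distance: numeric columns contribute a range-normalized absolute difference, categorical columns contribute 0 on a match and 1 otherwise, and the result is the unweighted mean over columns. It is an illustration only, not gower_exp's implementation; the helper names `column_ranges` and `gower_distance` are hypothetical, and the library may treat weights, missing values, or type detection differently.

```python
def column_ranges(rows):
    """Return (max - min) for each numeric column, or None for categorical ones."""
    ranges = []
    for col in zip(*rows):
        if all(isinstance(v, (int, float)) for v in col):
            ranges.append(max(col) - min(col))
        else:
            ranges.append(None)
    return ranges


def gower_distance(a, b, ranges):
    """Unweighted textbook Gower distance between two records."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical column: 0 if the values match, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numeric column: absolute difference scaled by the column range.
            total += abs(x - y) / r
        # A constant numeric column (r == 0) contributes nothing.
    return total / len(ranges)


# Same records as the quick-start DataFrame: age, category, salary, city.
rows = [
    (25, "A", 50_000, "NYC"),
    (30, "B", 60_000, "LA"),
    (35, "A", 55_000, "NYC"),
    (40, "C", 65_000, "Chicago"),
]
ranges = column_ranges(rows)

for i, row in enumerate(rows):
    print(i, round(gower_distance(rows[0], row, ranges), 4))
```

Under this definition, row 2 is the closest neighbour of row 0 (about 0.25) because the two share both categorical values, while row 3 differs maximally on every column and sits at distance 1.0.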
- gower_exp/: package source
- tests/: pytest suite
- benchmarks/: performance scripts
- docs/: user and contributor documentation
- examples/: notebooks and examples
Set up a contributor environment with uv:
uv sync --all-extras --dev
pre-commit install
Run the standard checks before opening a pull request:
uv run ruff check .
uv run ruff format --check .
uv run bandit -r gower_exp/ -c pyproject.toml
uv run pytest tests/ --cov=gower_exp --cov-report=term-missing
Short imperative commit messages work best, for example "fix sklearn edge case" or "bump test coverage". If a change affects performance-sensitive code, include benchmark notes in the pull request summary.
See docs/development.md for the full contributor workflow.
PyPI publishing is triggered by publishing a GitHub Release from a vX.Y.Z tag. Pushing a tag alone does not publish a package.
The expected maintainer flow is:
- Update version in pyproject.toml.
- Run just check and uv build.
- Commit the version bump.
- Create and push tag vX.Y.Z.
- Publish a GitHub Release from that tag to trigger .github/workflows/publish.yml.
MIT. See LICENSE.