feat(hosted-keys): add Hunter.io and People Data Labs hosted key support by TheodoreSpeaks · Pull Request #4742 · simstudioai/sim

TheodoreSpeaks · 2026-05-26T19:40:57Z

Summary

Add hosted key support to all Hunter.io and People Data Labs tools — Sim supplies a key from a round-robin env pool when the user hasn't set one via BYOK
Register `hunter` and `peopledatalabs` as BYOK providers (type union, contract enum, settings UI)
Meter usage via per-tool `pricing.getCost`: Hunter search $0.015 / verify $0.0075, PDL $0.28/credit; free endpoints (Hunter discover/email-count/companies-find, PDL cleaners + autocomplete) billed $0
Per-workspace rate limits; PDL search/bulk add a 600 credits/min cap dimension
Hide the API Key field on hosted Sim; document `HUNTER_API_KEY_` / `PEOPLEDATALABS_API_KEY_` env vars
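The per-tool pricing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the three constants use the prices quoted in the summary, while the function names and output shapes are assumptions made for the example. It also assumes a Hunter search tool consumes one search credit per call that returns a result.

```typescript
// Prices quoted in the PR summary (USD per credit).
const HUNTER_SEARCH_CREDIT_USD = 0.015
const HUNTER_VERIFICATION_CREDIT_USD = 0.0075
const PDL_CREDIT_USD = 0.28

// Hypothetical helper: charge one search credit only when an email was found.
function hunterEmailFinderCost(output: { email?: string }): number {
  const found = typeof output.email === 'string' && output.email.length > 0
  return found ? HUNTER_SEARCH_CREDIT_USD : 0
}

// Hypothetical helper: the verifier is billed on every call, whatever the result.
function hunterEmailVerifierCost(): number {
  return HUNTER_VERIFICATION_CREDIT_USD
}

// Hypothetical helper: PDL cost scales linearly with credits consumed.
function pdlCost(credits: number): number {
  return credits * PDL_CREDIT_USD
}

// Free endpoints (discover, email-count, cleaners, autocomplete) bill nothing.
function freeEndpointCost(): number {
  return 0
}
```

Because the costs are plain floating-point numbers, comparing summed charges should use a tolerance rather than strict equality.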

Type of Change

New feature

Testing

Tested manually. `bun run lint`, `bun run check:api-validation:strict`, `tsc --noEmit`, and the 92 tool/rate-limiter tests all pass.

Checklist

Code follows project style guidelines
Self-reviewed my changes
Tests added/updated and passing
No new warnings introduced
I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

vercel · 2026-05-26T19:41:03Z

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	May 26, 2026 7:47pm

cursor · 2026-05-26T19:41:07Z

PR Summary

Medium Risk
Introduces paid third-party usage metering and rate limits; misconfigured getCost logic could under- or over-charge hosted usage, though the pattern matches existing hosted tools.

Overview
Adds hosted API key support for Hunter.io and People Data Labs so Sim can supply keys from env pools when workspaces do not use BYOK.

Hunter: all six tools get hosting config (HUNTER_API_KEY_* round-robin, byokProviderId: hunter). Billing uses response-derived credits: search tools charge when results exist (~$0.015/credit), the verifier always charges (~$0.0075); discover, email count, and companies find are $0.

PDL: all eleven tools are wired similarly (PEOPLEDATALABS_API_KEY_*, ~$0.28/credit). Enrich/identify charge on match; search/bulk charge per returned or matched record; cleaners and autocomplete are $0. Search and bulk add a 600 credits/min rate-limit dimension via the new countBulkMatched helper.

Product surface: hunter and peopledatalabs join BYOK types, API contract, and workspace settings UI. Hunter and PDL blocks hide the API key field when hosted (hideWhenHosted). .env.example documents the new key pools.
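As a rough illustration of what a round-robin env key pool could look like, here is a minimal sketch. The helper names (`collectKeys`, `createRoundRobin`) and the numbered variable names in the usage note are assumptions for the example; the PR text only states that keys come from `HUNTER_API_KEY_*` and `PEOPLEDATALABS_API_KEY_*` pools, not how Sim selects among them.

```typescript
// Hypothetical helper: gather every non-empty env value whose name starts with the prefix.
function collectKeys(env: Record<string, string | undefined>, prefix: string): string[] {
  return Object.keys(env)
    .filter((name) => name.startsWith(prefix) && Boolean(env[name]))
    .sort() // stable ordering so rotation is deterministic
    .map((name) => env[name] as string)
}

// Hypothetical helper: return a function that hands out keys in rotation.
function createRoundRobin(keys: string[]): () => string {
  if (keys.length === 0) {
    throw new Error('No hosted keys configured for this provider')
  }
  let next = 0
  return () => {
    const key = keys[next]
    next = (next + 1) % keys.length
    return key
  }
}
```

With an environment containing `HUNTER_API_KEY_1` and `HUNTER_API_KEY_2` (assumed names), `createRoundRobin(collectKeys(env, 'HUNTER_API_KEY_'))` would alternate between the two values on successive calls.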

Reviewed by Cursor Bugbot for commit ecd9ef0. Bugbot is set up for automated code reviews on this repo.

greptile-apps · 2026-05-26T19:46:53Z

Greptile Summary

This PR adds Sim-hosted API key support for all Hunter.io and People Data Labs tools, registering both as BYOK providers and wiring per-tool pricing and rate-limit configs that meter usage when Sim supplies the key.

Hunter tools: domain_search, email_finder, and email_verifier get hosted-key configs with custom getCost functions that reflect Hunter's credit model (search credit charged only on non-empty results; verifier always charges half a credit); discover, email-count, and companies-find are marked free.
PDL tools: All 9 tools get hosted-key configs; bulk enrich uses a shared countBulkMatched utility (graceful on missing data), while search/identify tools charge per returned record with a 600-credits/min dimension cap; cleaners and autocomplete are free.
Infrastructure: The migration CI step switches from bunx drizzle-kit migrate to a custom bun run ./scripts/migrate.ts that sets statement_timeout = 0 before running, allowing long-running schema changes without hitting server-level query timeouts.

Confidence Score: 4/5

The core hosted-key plumbing is consistent with existing providers, BYOK contract and type union are correctly extended, and the pricing logic matches documented Hunter/PDL billing models. The getCost/extractUsage inconsistency is safe today but fragile if transformResponse is ever refactored.

The implementation is well-structured and follows established patterns. The getCost functions that throw on missing array fields (person_search, company_search, domain_search, person_identify) are inconsistent with their paired extractUsage lambdas that return 0 gracefully — if output fields are ever absent, a successful API call would surface an error to the user. The bulk enrich tools handle this correctly via countBulkMatched.

apps/sim/tools/peopledatalabs/person_search.ts, company_search.ts, person_identify.ts, and apps/sim/tools/hunter/domain_search.ts — the inconsistency between getCost throwing and extractUsage returning 0 is worth aligning before this pattern spreads to more tools.

Important Files Changed

Filename	Overview
apps/sim/tools/hunter/domain_search.ts	Adds hosting/pricing/rate-limit config for Hunter domain search; getCost throws on missing emails array (defensive but inconsistent with graceful extractUsage patterns).
apps/sim/tools/hunter/email_finder.ts	Adds hosting config; pricing checks output.email string presence — correctly charges 1 credit only when an email is returned.
apps/sim/tools/hunter/email_verifier.ts	Adds flat per-request pricing (always charges one verification credit regardless of result, matching Hunter's billing model).
apps/sim/tools/peopledatalabs/person_search.ts	Adds hosting config with custom rate-limit dimensions (600 credits/min); getCost throws on missing results while extractUsage returns 0 — inconsistent error handling.
apps/sim/tools/peopledatalabs/company_search.ts	Same hosting pattern as person_search; same getCost/extractUsage inconsistency for missing results.
apps/sim/tools/peopledatalabs/bulk_person_enrich.ts	Uses countBulkMatched utility in both getCost and extractUsage — consistent graceful handling when results are missing.
apps/sim/tools/peopledatalabs/bulk_company_enrich.ts	Same pattern as bulk_person_enrich; correctly uses countBulkMatched for both cost and dimension tracking.
apps/sim/tools/peopledatalabs/person_identify.ts	Adds hosting config; getCost throws if matches is not an array, though transformResponse always returns an array ([] on 404).
apps/sim/tools/peopledatalabs/utils.ts	Adds countBulkMatched helper that returns 0 gracefully for non-array results — good defensive design used by bulk tools.
packages/db/scripts/migrate.ts	Adds SET statement_timeout = 0 before migrations to override any server-level query timeout; removes the per-statement safety net but is a standard pattern for long-running migrations.
apps/sim/lib/api/contracts/byok-keys.ts	Correctly adds 'hunter' and 'peopledatalabs' to the byokProviderIdSchema enum.
apps/sim/tools/types.ts	Extends BYOKProviderId union type with 'hunter' and 'peopledatalabs' — in sync with the contract enum.
apps/sim/app/workspace/[workspaceId]/settings/components/byok/byok.tsx	Adds Hunter and People Data Labs entries to the PROVIDERS config with correct icons, descriptions, and placeholder text.
.github/workflows/migrations.yml	Migration command updated from bunx drizzle-kit migrate to bun run ./scripts/migrate.ts — uses the custom script that disables statement_timeout.
apps/sim/tools/hunter/types.ts	Adds HUNTER_API_KEY_PREFIX, HUNTER_SEARCH_CREDIT_USD, and HUNTER_VERIFICATION_CREDIT_USD constants with clear pricing rationale comments.
apps/sim/tools/peopledatalabs/types.ts	Adds PEOPLEDATALABS_API_KEY_PREFIX and PDL_CREDIT_USD constants with well-documented pricing source.
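The table notes that the `BYOKProviderId` union and the contract enum must stay in sync. One common way to guarantee that, shown here purely as a sketch and not as what the PR does, is to derive both from a single constant list. Only the two provider IDs come from the PR; `BYOK_PROVIDER_IDS` and `isByokProviderId` are hypothetical names, and the real list contains other providers not shown here.

```typescript
// Single source of truth for the provider IDs added in this PR (other providers omitted).
const BYOK_PROVIDER_IDS = ['hunter', 'peopledatalabs'] as const

// The union type is derived from the list, so it cannot drift from it.
type BYOKProviderId = (typeof BYOK_PROVIDER_IDS)[number]

// Hypothetical runtime guard standing in for a contract-level enum check.
function isByokProviderId(value: string): value is BYOKProviderId {
  return (BYOK_PROVIDER_IDS as readonly string[]).includes(value)
}
```

Adding a provider then means editing one array, and both the compile-time type and the runtime check pick it up.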

Sequence Diagram

sequenceDiagram
    participant User
    participant ToolExecutor as Tool Executor (index.ts)
    participant HKRL as HostedKeyRateLimiter
    participant API as Hunter.io / PDL API

    User->>ToolExecutor: Execute tool (no BYOK key)
    ToolExecutor->>HKRL: acquireKey(provider, envKeyPrefix, config, workspaceId)
    HKRL-->>ToolExecutor: key (round-robin selected)
    ToolExecutor->>API: HTTP request (hosted key injected)
    API-->>ToolExecutor: Response
    ToolExecutor->>ToolExecutor: transformResponse → output
    ToolExecutor->>ToolExecutor: applyHostedKeyCostToResult()
    ToolExecutor->>ToolExecutor: pricing.getCost(params, output) → cost
    ToolExecutor->>HKRL: reportUsage (custom dimensions only)
    ToolExecutor-->>User: output + cost metadata
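%% The reportUsage step above feeds custom dimensions such as the 600 credits/min cap.
%% A minimal TypeScript sketch of one way such a budget could be tracked follows.
%% It is an assumption-laden illustration: CreditWindow is a hypothetical class, and
%% the fixed-window strategy is a stand-in, not necessarily what HostedKeyRateLimiter does.

```typescript
// Hypothetical fixed-window credit budget (e.g. 600 credits per 60_000 ms).
class CreditWindow {
  private used = 0
  private windowStart = Number.NEGATIVE_INFINITY
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // True while the current window still has unspent credits.
  hasBudget(now: number): boolean {
    this.roll(now)
    return this.used < this.limit
  }

  // Record credits consumed by a completed call (usage is known only after the response).
  report(credits: number, now: number): void {
    this.roll(now)
    this.used += credits
  }

  // Start a fresh window once the previous one has elapsed.
  private roll(now: number): void {
    if (now - this.windowStart >= this.windowMs) {
      this.windowStart = now
      this.used = 0
    }
  }
}
```

%% Because usage is reported after the call, a single large response can overshoot
%% the cap; the next call is then blocked until the window rolls over.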

Reviews (1): Last reviewed commit: "Merge branch 'staging' into feat/hosted-..."

greptile-apps · 2026-05-26T19:47:00Z

+        if (!Array.isArray(results)) {
+          throw new Error('PDL person search response missing results, cannot determine cost')
+        }


Asymmetric error handling between getCost and extractUsage

getCost throws when results is not an array, but the paired extractUsage lambda in the same rateLimit.dimensions block returns 0 gracefully for the exact same condition. If output.results were ever missing (e.g., after a future transformResponse refactor), extractUsage would silently record 0 credits while getCost would throw — causing the tool to return an error to the caller even though the upstream API call already succeeded. The same inconsistency appears in company_search.ts and, in slightly different form, in domain_search.ts (emails) and person_identify.ts (matches). countBulkMatched in the bulk tools avoids this by returning 0 rather than throwing.
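One way to align the two paths, sketched here under assumptions, is to route both `getCost` and `extractUsage` through the same counting helper so they cannot disagree. `countRecords` and the `personSearchHosting` object shape are hypothetical; the helper is modeled on the described behavior of `countBulkMatched` (returning 0 for non-array input), not on its actual source.

```typescript
const PDL_CREDIT_USD = 0.28

interface SearchOutput {
  results?: unknown
}

// Hypothetical shared helper: a missing or non-array field counts as zero records.
function countRecords(value: unknown): number {
  return Array.isArray(value) ? value.length : 0
}

// Hypothetical config shape: both callbacks derive from the same count.
const personSearchHosting = {
  getCost: (output: SearchOutput): number => countRecords(output.results) * PDL_CREDIT_USD,
  extractUsage: (output: SearchOutput): number => countRecords(output.results),
}
```

The opposite choice, having both callbacks throw on a malformed response, would be equally consistent; the point is only that the two should fail or succeed together.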

TheodoreSpeaks added 3 commits May 21, 2026 19:27

fix(db): disable statement_timeout for migrations

6e98ef4

fix(ci): route migration workflow through guarded migrate.ts

1ca4dba

feat(hosted-keys): add Hunter.io and People Data Labs hosted key support

09e3a10

Merge branch 'staging' into feat/hosted-key-enrichments

ecd9ef0

vercel Bot deployed to Preview May 26, 2026 19:44

greptile-apps Bot reviewed May 26, 2026

vercel Bot deployed to Preview May 26, 2026 19:47

TheodoreSpeaks wants to merge 4 commits into staging from feat/hosted-key-enrichments
