fix: Prompt studio index and run issues by chandrasekharan-zipstack · Pull Request #1 · Zipstack/unstract

chandrasekharan-zipstack · 2024-02-25T17:00:34Z

What

Updated platform key retrieval logic in PromptIdeBaseTool

Why

Noticed issues with indexing a file and running of prompts due

How

...

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

Notes on Testing

Tried indexing and running a prompt on a file

Screenshots

Checklist

I have read and understood the Contribution Guidelines.

nehabagdia · 2024-02-26T02:34:43Z

@jaseemjaskp I am not merging this yet as I want you to confirm if org_id is really mandatory and if the approach for multi-tenancy sounds OK.
Pulling in the change makes my prompt studio work OK locally.

jaseemjaskp · 2024-02-26T03:44:55Z

@jaseemjaskp I am not merging this yet as I want you to confirm if org_id is really mandatory and if the approach for multi-tenancy sounds OK. Pulling in the change makes my prompt studio work OK locally.

@nehabagdia I think it is impossible to get platform key without org_id.

fix: Prompt studio index and run issues

1. Remove deprecated acquire_slot() method (PR comment #1) - Method marked as deprecated, not used anywhere in codebase - check_and_acquire() is the only method used (atomic operation) - Dead code removal improves maintainability 2. Add detailed comment explaining TTL vs manual cleanup (PR comment #2) - Clarifies why _cleanup_expired_entries() is needed even with TTL - Redis ZSET TTL expires entire key, not individual entries - Manual cleanup (ZREMRANGEBYSCORE) removes stale entries within ZSET - Both mechanisms work together for accurate rate limiting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

…he and per-org locks (#1649) * UN-2972: Implement API deployment rate limiting with Django cache and per-org locks This commit implements a comprehensive rate limiting system for API deployments with atomic operations, caching, and management commands. Features: - Organization-level and global concurrent request limits - Per-organization Redis distributed locks to prevent race conditions - Django cache integration for org limits (~95% reduction in DB queries) - Automatic cache invalidation on limit updates - Centralized constants to eliminate magic strings - Management commands for easy administration Technical Details: - Uses Redis ZSET for tracking active executions with TTL-based cleanup - Per-org locks ensure atomic check-and-acquire operations - Global limit enforced with eventual consistency (acceptable tolerance) - Rate limiter fails open on Redis errors for system reliability - Retry-After header in 429 responses for proper client backoff New Files: - backend/api_v2/rate_limit_constants.py: Centralized keys and constants - backend/api_v2/rate_limiter.py: Core rate limiting logic - backend/api_v2/migrations/0003_add_organization_rate_limit.py: DB migration - backend/api_v2/management/commands/set_org_rate_limit.py: Set org limits - backend/api_v2/management/commands/get_org_rate_limit.py: View org limits - backend/api_v2/management/commands/list_org_rate_limits.py: List all limits Modified Files: - backend/api_v2/models.py: OrganizationRateLimit model with auto cache clearing - backend/api_v2/deployment_helper.py: Use atomic check_and_acquire - backend/api_v2/admin.py: Admin interface for OrganizationRateLimit - backend/api_v2/exceptions.py: RateLimitExceeded exception - backend/backend/exceptions.py: Add Retry-After header to 429 responses - backend/workflow_manager/workflow_v2/models/execution.py: Auto-release on completion - backend/backend/settings/base.py: Rate limiting configuration - backend/sample.env: Configuration documentation Configuration (environment variables): - API_DEPLOYMENT_DEFAULT_RATE_LIMIT: Default per-org limit (default: 5) - API_DEPLOYMENT_GLOBAL_RATE_LIMIT: System-wide limit (default: 50) - API_DEPLOYMENT_RATE_LIMIT_TTL_HOURS: ZSET entry TTL (default: 6 hours) - API_DEPLOYMENT_RATE_LIMIT_CACHE_TTL: Cache TTL (default: 1 hour) - API_DEPLOYMENT_RATE_LIMIT_LOCK_TIMEOUT: Lock timeout (default: 2 seconds) - API_DEPLOYMENT_RATE_LIMIT_LOCK_BLOCKING_TIMEOUT: Lock wait (default: 5 seconds) - API_DEPLOYMENT_RATE_LIMIT_RETRY_AFTER: Retry-After value (default: 300 seconds) Management Commands: - python manage.py set_org_rate_limit <org-id-or-name> <limit> - python manage.py get_org_rate_limit <org-id-or-name> [--clear-cache] - python manage.py list_org_rate_limits [--with-usage] Performance Impact: - DB queries for org limits: ~95% reduction (cached) - Rate limit check latency: 10-20ms (acceptable tradeoff for no race conditions) - Lock contention: Minimal (per-org, not global) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-2972 [FIX] Move signal imports to top of models.py to fix Ruff E402 Fix Ruff linting errors by moving Django signal imports to the top of the file according to PEP 8 conventions. Changes: - Moved `from django.db.models.signals import post_delete` to line 7 - Moved `from django.dispatch import receiver` to line 8 - Removed duplicate imports from line 268-269 (bottom of file) - Signal handler function remains at bottom unchanged Fixes Ruff errors: - E402: Module level import not at top of file (line 266) - E402: Module level import not at top of file (line 267) Also installed pre-commit hooks locally for future commits. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix organization lookup in rate limit management commands The Organization.organization_id field is a CharField (not UUIDField), storing custom string IDs like "org_qijtoAkJNhznYhNt". The previous implementation attempted UUID parsing which failed for these values. Changes: - Remove UUID validation that assumes organization_id is a UUID - Try organization_id lookup first (handles both UUID and string formats) - Fall back to name lookup if organization_id lookup fails - Update help text to remove "UUID" reference This allows commands to work with both: - Organization ID: python manage.py set_org_rate_limit org_qijtoAkJNhznYhNt 10 - Organization name: python manage.py set_org_rate_limit zipstack 10 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add delete_org_rate_limit management command and improve error logging New command allows administrators to remove custom organization rate limits, reverting organizations back to the system default limit. Features: - Delete custom rate limit for an organization (by ID or name) - Confirmation prompt (can be skipped with --force) - Shows default limit that will be used after deletion - Warns if current usage exceeds default - Automatic cache clearing via post_delete signal Also improved error logging in WorkflowExecution to use logger.exception() instead of logger.error() for better stack trace visibility. Usage: python manage.py delete_org_rate_limit org_qijtoAkJNhznYhNt python manage.py delete_org_rate_limit zipstack --force 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add cache clearing command and improve cache TTL behavior New Features: 1. Management command to clear organization rate limit cache - Clear specific org: --org-id <org_id> - Clear all with custom limits: (default) - Clear ALL orgs: --all - Uses Redis pattern deletion (rate_limit:cache:org_limit:*) for performance, falls back to individual deletion 2. Reduced cache TTL from 1 hour to 10 minutes - Allows faster pickup of default limit changes from ENV - Still provides significant DB query reduction 3. Implement cache TTL refresh on every read - TTL is extended by 10 minutes on every cache hit - Frequently-used orgs stay cached indefinitely - Inactive orgs expire after 10 minutes - LRU-like behavior without explicit LRU cache Usage: # Clear specific organization cache python manage.py clear_org_rate_limit_cache --org-id org_qijtoAkJNhznYhNt # Clear cache for all orgs with custom limits python manage.py clear_org_rate_limit_cache # Clear cache for ALL organizations (uses pattern deletion if Redis) python manage.py clear_org_rate_limit_cache --all This is useful when changing API_DEPLOYMENT_DEFAULT_RATE_LIMIT to ensure organizations pick up the new default value immediately. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add comprehensive rate limiting documentation and update default limits Documentation: - Created docs/API_DEPLOYMENT_RATE_LIMITING.md with: - Architecture overview (per-org + global limits, Redis locks, cache) - All 7 ENV variables with detailed explanations - All 5 management commands with usage examples - Usage scenarios and best practices - Troubleshooting guide - Performance characteristics and security considerations - Confirms rate limiting ONLY affects API deployments (not ETL/tasks) Default Limit Updates (3 locations): - Organization limit: 5 → 20 concurrent requests - Global limit: 50 → 100 concurrent requests - Updated in: 1. backend/api_v2/rate_limit_constants.py (RateLimitDefaults) 2. backend/sample.env (environment variable defaults) 3. backend/backend/settings/base.py (Django setting defaults) Also updated cache TTL default in base.py: 3600s → 600s (10 minutes) to match the change made in previous commit. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Use centralized constants and remove Retry-After functionality Issues Fixed: 1. RateLimitExceeded was not using centralized RateLimitMessages constants 2. Response format documentation didn't match actual drf-standardized-errors format 3. Retry-After header and retry_after_seconds were not useful (hard to predict accurately) Changes: - RateLimitExceeded now uses RateLimitMessages.get_org_limit_exceeded_message() and get_global_limit_exceeded_message() for consistent error messages - Removed retry_after_seconds parameter from RateLimitExceeded.__init__() - Removed retry_after_seconds from rate_limiter.py return dicts - Removed retry_after_seconds from deployment_helper.py exception call - Removed dead Retry-After header code from backend/exceptions.py - Removed API_DEPLOYMENT_RATE_LIMIT_RETRY_AFTER from: - sample.env - settings/base.py - rate_limit_constants.py (DEFAULT_RETRY_AFTER_SECONDS) - Updated documentation to show actual drf-standardized-errors response format: { "type": "client_error", "errors": [{"code": "error", "detail": "...", "attr": null}] } - Removed all Retry-After references from documentation Clients should implement their own retry logic with exponential backoff. Rate limits are released when active requests complete. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix late imports and comparison bug in rate limit management commands - Move APIDeploymentRateLimiter import to module top (PEP 8 compliance) - Fix off-by-one warning bug: change > to >= to match rate limiter blocking logic - Affects: delete_org_rate_limit, set_org_rate_limit, get_org_rate_limit, list_org_rate_limits The rate limiter blocks when usage >= limit, so warnings must also use >= for consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Refactor rate limiting to view layer with dual release paths **Problem:** Review bot identified critical bug where early failures (DB, file staging, queue dispatch) would leak rate limit slots for 6 hours, potentially exhausting org/global quotas. **Root Cause:** - Rate limiting was in deployment_helper.py - Helper has internal try-catch that swallows exceptions and returns error response - Exceptions that don't propagate can't be caught by outer handlers - No slot release on these failures → orphaned slots **Solution:** Move rate limiting to view layer with dual release strategy: 1. **View Layer (api_deployment_views.py)** - Acquires rate limit slot before calling execute_workflow() - Wraps execute_workflow() in try-catch - Releases slot if exception propagates (early setup failures) 2. **Helper Layer (deployment_helper.py)** - Accepts optional execution_id parameter - Existing exception handler now also releases slot - Handles failures in async dispatch that don't propagate 3. **Signal (workflow completion)** - Unchanged - releases slot when async job completes successfully **Coverage:** ✅ Lines 197-241 failures (Tag creation, WorkflowExecution, file staging) → View catches & releases ✅ Lines 243-289 failures (Async dispatch, config checks) → Helper catches & releases ✅ Async job completion → Signal releases ✅ No double-release (each path has one release point) ✅ No orphaned slots on any error path **Files Changed:** - api_deployment_views.py: Add rate limit check + blanket exception handler - deployment_helper.py: Accept execution_id param, add slot release to exception handler - API_DEPLOYMENT_RATE_LIMITING.md: Update architecture diagram and code locations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Use logger.exception() to capture full traceback in rate limit exception handler Changed logger.error() to logger.exception() in api_deployment_views.py exception handler to automatically include the full stack trace. This provides better debugging information when workflow execution fails during rate limit protected operations. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Update documentation with generic organization IDs * Address PR review comments: remove dead code and clarify TTL behavior 1. Remove deprecated acquire_slot() method (PR comment #1) - Method marked as deprecated, not used anywhere in codebase - check_and_acquire() is the only method used (atomic operation) - Dead code removal improves maintainability 2. Add detailed comment explaining TTL vs manual cleanup (PR comment #2) - Clarifies why _cleanup_expired_entries() is needed even with TTL - Redis ZSET TTL expires entire key, not individual entries - Manual cleanup (ZREMRANGEBYSCORE) removes stale entries within ZSET - Both mechanisms work together for accurate rate limiting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Updated platform key retrieval in PromptIdeBaseTool

b7d06cf

chandrasekharan-zipstack requested review from harini-venkataraman, jaseemjaskp, muhammad-ali-e and nehabagdia February 25, 2024 17:00

chandrasekharan-zipstack self-assigned this Feb 25, 2024

nehabagdia approved these changes Feb 26, 2024

View reviewed changes

jaseemjaskp approved these changes Feb 26, 2024

View reviewed changes

jaseemjaskp merged commit c64479c into main Feb 26, 2024

jaseemjaskp deleted the fix/prompt-studio-index-and-run branch February 26, 2024 07:25

pk-zipstack pushed a commit that referenced this pull request Aug 20, 2025

Merge pull request #1 from Zipstack/fix/prompt-studio-index-and-run

ae61b91

fix: Prompt studio index and run issues

This was referenced Oct 30, 2025

UN-2930 [MISC] Add plugin infrastructure to unstract-core #1618

Merged

UN-2930 [FIX] Track subscription usage only for successful runs #1620

Merged

vishnuszipstack mentioned this pull request Feb 16, 2026

Rewrite README.md based on approved Confluence spec #1795

Merged

6 tasks

muhammad-ali-e mentioned this pull request Mar 5, 2026

[MISC] Add --active-only filter to backfill_metrics command #1824

Merged

greptile-apps bot mentioned this pull request Mar 17, 2026

UN-3318 [FEAT] Platform API key implementation #1860

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Prompt studio index and run issues#1

fix: Prompt studio index and run issues#1
jaseemjaskp merged 1 commit intomainfrom
fix/prompt-studio-index-and-run

chandrasekharan-zipstack commented Feb 25, 2024

Uh oh!

nehabagdia commented Feb 26, 2024

Uh oh!

jaseemjaskp commented Feb 26, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

chandrasekharan-zipstack commented Feb 25, 2024

What

Why

How

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

Notes on Testing

Screenshots

Checklist

Uh oh!

nehabagdia commented Feb 26, 2024

Uh oh!

jaseemjaskp commented Feb 26, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants