Skip to content

aws-samples/sample-badgers

🚧 This repository is under active development. Watch the repo, monitor branches and issues, and check the Changelog for the latest updates.

🧭 Navigation:
🔵 Home | Vision LLM Theory | Local Testing | Deployment UI | Deployment | CDK Stacks | Runtime | S3 Files | Lambda Analyzers | Prompting System


🦡 BADGERS

Broad Agentic Document Generative Extraction & Recognition System

BADGERS transforms document processing through vision-enabled AI and deep layout analysis. Unlike traditional text extraction tools, BADGERS understands document structure and meaning by recognizing visual hierarchies, reading patterns, and contextual relationships between elements.

🤔 Why BADGERS?

Traditional document processing tools extract text but lose context. They can't distinguish a header from body text, understand table relationships, or recognize that a diagram explains the adjacent paragraph. BADGERS solves this by:

  • 🏗️ Preserving semantic structure - Maintains document hierarchy and element relationships
  • 👁️ Understanding visual context - Recognizes how layout conveys meaning
  • 📚 Processing diverse content - Handles 21+ element types from handwriting to equations
  • 🤖 Automating complex workflows - Orchestrates multiple specialized analyzers via an AI agent

Use cases: research acceleration, compliance automation, content management, accessibility remediation.

📸 Screenshots

Local Testing — Home Local Testing — Chat
Home Page Chat Interface
Deployment UI — Stacks Deployment UI — Config Editor
Stack Information Config Editor

⚙️ How It Works

┌─────────────────────────────────────────────────────────────────────────────┐
│                           AgentCore Runtime                                 │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │  PDF Analysis Agent (Strands)                                       │   │
│   │  - Claude Sonnet 4.5 with Extended Thinking                         │   │
│   │  - Session state management                                         │   │
│   │  - MCP tool orchestration                                           │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           AgentCore Gateway                                 │
│   - MCP Protocol (2025-03-26)                                               │
│   - Cognito JWT Authentication                                              │
│   - Semantic tool search                                                    │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                   ┌──────────────────┼──────────────────┐
                   │                  │                  │
                   ▼                  ▼                  ▼
            ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
            │   Lambda    │    │   Lambda    │    │   Lambda    │
            │  Analyzer   │    │  Analyzer   │    │  Analyzer   │
            │ (25 tools)  │    │             │    │             │
            └─────────────┘    └─────────────┘    └─────────────┘
                   │                  │                  │
                   └──────────────────┼──────────────────┘
                                      ▼
                               ┌─────────────┐
                               │   Bedrock   │
                               │   Claude    │
                               └─────────────┘
  1. 📄 User submits a document with analysis instructions
  2. 🧠 Strands Agent (running in AgentCore Runtime) interprets the request
  3. 🔧 Agent selects tools from a library of specialized analyzers via MCP Gateway
  4. Lambda analyzers (standardized and domain-specific functions, including container-based) process document elements using Claude vision models
  5. 📊 Results aggregate with preserved structure and semantic relationships

🛠️ Tech Stack

Component Technology
🤖 Agent Framework Strands Agents
🏠 Agent Hosting Amazon Bedrock AgentCore Runtime
🚪 Tool Gateway Amazon Bedrock AgentCore Gateway (MCP Protocol)
🧠 Foundation Model Claude Sonnet 4.5 (via Amazon Bedrock)
⚡ Compute AWS Lambda (modular analyzer functions, including container-based)
📦 Storage Amazon S3 (configs, prompts, outputs)
🔐 Auth Amazon Cognito (OAuth 2.0 client credentials)
🏗️ IaC AWS CDK (Python)
📈 Observability CloudWatch Logs, X-Ray
📊 Cost Tracking Bedrock Application Inference Profiles

🔬 Analyzers

Analyzer Purpose
📸 pdf_to_images_converter Convert PDF pages to images
🏷️ classify_pdf_content Classify document content type
📝 full_text_analyzer Extract all text content
📊 table_analyzer Extract and structure tables
📈 charts_analyzer Analyze charts and graphs
🔀 diagram_analyzer Process diagrams and flowcharts
📐 layout_analyzer Document structure analysis
accessibility_analyzer Generate accessibility metadata (part of remediation)
🏥 decision_tree_analyzer Medical/clinical document analysis
🔬 scientific_analyzer Scientific paper analysis
✍️ handwriting_analyzer Handwritten text recognition
💻 code_block_analyzer Extract code snippets
🗂️ metadata_generic_analyzer Generic metadata extraction
🗂️ metadata_mads_analyzer MADS metadata format extraction
🗂️ metadata_mods_analyzer MODS metadata format extraction
🔑 keyword_topic_analyzer Extract keywords and topics
🔧 remediation_analyzer PDF accessibility remediation (container, content stream tagging + structure tree builder)
📄 page_analyzer Single page content analysis
🧱 elements_analyzer Document element detection
🧱 robust_elements_analyzer Enhanced element detection with fallbacks
👁️ general_visual_analysis_analyzer General-purpose visual content analysis
✏️ editorial_analyzer Editorial content and markup analysis
🗺️ war_map_analyzer Historical war map analysis
🎓 edu_transcript_analyzer Educational transcript analysis
🔗 correlation_analyzer Correlate multi-analyzer results per page
🖼️ image_enhancer Image enhancement and preprocessing

🚀 Deployment

Prerequisites

Quick Start

cd deployment
./deploy_from_scratch.sh

This deploys 10 CloudFormation stacks:

  1. 📦 S3 (config + output buckets)
  2. 🔐 Cognito (OAuth authentication)
  3. 👤 IAM (execution roles)
  4. 🐳 ECR (container registry)
  5. ⚡ Lambda (25 analyzer functions)
  6. 🚪 Gateway (MCP endpoint)
  7. 🧠 Memory (session persistence)
  8. 📊 Inference Profiles (cost tracking)
  9. 🏃 Runtime (Strands agent container)
  10. 🧩 Custom Analyzers (optional, wizard-created)

Manual Steps

See deployment/DEPLOYMENT_README.md for step-by-step instructions.

Cleanup

cd deployment
./destroy.sh

📁 Project Structure

├── deployment/
│   ├── app.py                 # CDK app entry point
│   ├── stacks/                # CDK stack definitions
│   ├── lambdas/code/          # Analyzer Lambda functions
│   ├── runtime/               # AgentCore Runtime container
│   ├── s3_files/              # Prompts, schemas, manifests
│   └── badgers-foundation/    # Shared analyzer framework
├── local_testing/             # Local dev/testing UI (React + Express)
│   ├── src/                   # React components (chat, wizard, editor, pricing, etc.)
│   └── server/                # Express API server
└── pyproject.toml

🔍 Technical Deep Dive

📦 Lambda Layers

BADGERS uses Lambda layers shared across analyzer functions:

🏗️ Foundation Layer (layer.zip)

  • Built via deployment/lambdas/build_foundation_layer.sh
  • Contains the analyzer framework (7 Python modules)
  • Includes dependencies: boto3, botocore
  • Includes core system prompts used by all analyzers
layer/python/
├── foundation/
│   ├── analyzer_foundation.py    # 🎯 Main orchestration class
│   ├── bedrock_client.py         # 🔄 Bedrock API with retry/fallback
│   ├── configuration_manager.py  # ⚙️ Config loading/validation
│   ├── image_processor.py        # 🖼️ Image optimization
│   ├── message_chain_builder.py  # 💬 Claude message formatting
│   ├── prompt_loader.py          # 📜 Prompt file loading (local/S3)
│   └── response_processor.py     # 📤 Response extraction
├── config/
│   └── config.py
└── prompts/core_system_prompts/
    └── *.xml

📄 Poppler Layer (poppler-qpdf-layer.zip)

  • PDF rendering library for pdf_to_images_converter
  • Built via deployment/lambdas/build_poppler_qdf_layer.sh

🔬 How an Analyzer Works

Each analyzer follows the same pattern using AnalyzerFoundation:

# Lambda handler (simplified)
def lambda_handler(event, context):
    # 1️⃣ Load config from S3 manifest
    config = load_manifest_from_s3(bucket, "full_text_analyzer")

    # 2️⃣ Initialize foundation with S3-aware prompt loader
    analyzer = AnalyzerFoundation(...)

    # 3️⃣ Run analysis pipeline
    result = analyzer.analyze(image_data)

    # 4️⃣ Save result to S3 and return
    save_result_to_s3(result, session_id)
    return {"result": result}

The analyze() method orchestrates:

  1. 🖼️ Image processing - Resize/optimize for Claude's vision API
  2. 📜 Prompt loading - Combine wrapper + analyzer prompts from S3
  3. 💬 Message building - Format for Bedrock Converse API
  4. Dynamic token estimation - Score image complexity and set token budget (when enabled)
  5. 🤖 Model invocation - Call Claude with retry/fallback logic
  6. Response processing - Extract and validate result

📜 Prompting System

Prompts are modular XML files composed at runtime:

s3://config-bucket/
├── core_system_prompts/
│   ├── prompt_system_wrapper.xml   # 🎁 Main template with placeholders
│   ├── core_rules/rules.xml        # 📏 Shared rules for all analyzers
│   └── error_handling/*.xml        # ⚠️ Error response templates
├── prompts/{analyzer_name}/
│   ├── {analyzer}_job_role.xml     # 👤 Role definition
│   ├── {analyzer}_context.xml      # 🌍 Domain context
│   ├── {analyzer}_rules.xml        # 📏 Analyzer-specific rules
│   ├── {analyzer}_tasks.xml        # ✅ Task instructions
│   └── {analyzer}_format.xml       # 📋 Output format spec
└── wrappers/
    └── prompt_system_wrapper.xml

The PromptLoader composes the final system prompt:

<!-- prompt_system_wrapper.xml -->
<system_prompt>
    {core_rules}           <!-- 📏 Injected from core_rules/rules.xml -->
    {composed_prompt}      <!-- 🧩 Injected from analyzer prompt files -->
    {error_handler_general}
    {error_handler_not_found}
</system_prompt>

Placeholders like [[PIXEL_WIDTH]] and [[PIXEL_HEIGHT]] are replaced with actual image dimensions at runtime.

⚙️ Configuration System

Each analyzer has a manifest file in S3:

// s3://config-bucket/manifests/full_text_analyzer.json
{
    "tool": {
        "name": "analyze_full_text_tool",
        "description": "Extracts text content maintaining reading order...",
        "inputSchema": {
            "type": "object",
            "properties": {
                "image_path": { "type": "string" },
                "session_id": { "type": "string" },
                "audit_mode": { "type": "boolean" }
            },
            "required": ["image_path", "session_id"]
        }
    },
    "analyzer": {
        "name": "full_text_analyzer",
        "enhancement_eligible": true,
        "model_selections": {
            "primary": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
            "fallback_list": [
                "us.anthropic.claude-haiku-4-5-20251001-v1:0",
                "us.amazon.nova-premier-v1:0"
            ]
        },
        "max_retries": 3,
        "prompt_files": [
            "full_text_job_role.xml",
            "full_text_context.xml",
            "full_text_rules.xml",
            "full_text_tasks_extraction.xml",
            "full_text_format.xml"
        ],
        "max_examples": 0,
        "analysis_text": "full text content",
        "expected_output_tokens": 6000,
        "output_extension": "xml"
    }
}

Key configuration features:

  • 🔄 Model fallback chain - Primary model with ordered fallbacks
  • 🔁 Retry logic - Configurable retry count per analyzer
  • 🧩 Prompt composition - List of XML files to combine
  • 📋 Tool schema - MCP-compatible input schema for Gateway
  • 🖼️ Enhancement eligible - Flag indicating analyzer benefits from image preprocessing (used by image_enhancer tool)

Global settings (from environment or defaults):

{
    "max_tokens": 8000,
    "temperature": 0.1,
    "max_image_size": 20971520,  # 20MB
    "max_dimension": 2048,
    "jpeg_quality": 85,
    "throttle_delay": 1.0,
    "aws_region": "us-west-2"
}

⚡ Dynamic Token Estimation

When enabled, BADGERS estimates the optimal max_tokens per image based on visual complexity, reducing cost on simple documents and avoiding truncation on dense ones. The scorer runs on the already-processed image bytes — no extra I/O.

Four metrics are combined into a complexity score: text pixel ratio, grayscale entropy, edge density, and color standard deviation. The score maps to a token budget (8K / 12K / 16K / 24K).

Enabling: Toggle "Dynamic Token Estimation" in the chat UI, or set the Lambda environment variable DYNAMIC_TOKENS_ENABLED=true.

Tuning: Add a dynamic_tokens block to an analyzer manifest to customize weights and thresholds:

"dynamic_tokens": {
    "weights": {
        "text_ratio": 0.2,
        "entropy": 0.3,
        "edge_density": 0.3,
        "color_std": 0.2
    },
    "thresholds": [
        {"max_score": 0.20, "max_tokens": 8000},
        {"max_score": 0.30, "max_tokens": 12000},
        {"max_score": 0.45, "max_tokens": 16000},
        {"max_score": 1.00, "max_tokens": 24000}
    ]
}

Observability: When active, logs report the estimated budget, actual token usage, and utilization percentage for calibration.

📊 Inference Profiles for Cost Tracking

BADGERS uses Application Inference Profiles to enable cost allocation and usage monitoring. The system maps model IDs to profile ARNs at runtime:

┌─────────────────────────────────────────────────────────────────────────────┐
│                        Inference Profile Flow                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  1. CDK deploys InferenceProfilesStack                                      │
│     └─> Creates ApplicationInferenceProfile for each model                  │
│         • badgers-claude-sonnet-{id}  (US)                               │
│         • badgers-claude-haiku-{id}   (US)                               │
│         • badgers-claude-opus-{id}    (US)                               │
│         • badgers-nova-premier-{id}   (US)                               │
│                                                                             │
│  2. Runtime receives profile ARNs as environment variables                  │
│     └─> CLAUDE_SONNET_PROFILE_ARN, CLAUDE_HAIKU_PROFILE_ARN, etc.           │
│                                                                             │
│  3. At invocation, bedrock_client.py maps model_id → profile ARN            │
│     └─> "us.anthropic.claude-sonnet-4-5-*" → $CLAUDE_SONNET_PROFILE_ARN    │
│                                                                             │
│  4. Bedrock invoked with profile ARN (enables cost tracking)                │
│     └─> Falls back to model ID if no profile configured                     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Model ID to environment variable mapping:

Model Pattern Environment Variable
*claude-sonnet-4-5* CLAUDE_SONNET_PROFILE_ARN
*claude-haiku-4-5* CLAUDE_HAIKU_PROFILE_ARN
*claude-opus-4-6* CLAUDE_OPUS_PROFILE_ARN
*nova-premier* NOVA_PREMIER_PROFILE_ARN

➕ Adding a New Analyzer

Option 1: Use the Wizard (Recommended)

cd local_testing
npm run dev

The Analyzer Creation Wizard is available as the 🧙 Create Analyzer tab in the Local Testing UI.

Option 2: Manual Creation

  1. 📜 Create prompt files in deployment/s3_files/prompts/{analyzer_name}/
  2. 📋 Create manifest in deployment/s3_files/manifests/{analyzer_name}.json
  3. 📐 Create schema in deployment/s3_files/schemas/{analyzer_name}.json
  4. ⚡ Create Lambda code in deployment/lambdas/code/{analyzer_name}/lambda_handler.py
  5. 📝 Register in deployment/stacks/lambda_stack.py
  6. 🚀 Redeploy: cdk deploy badgers-lambda badgers-gateway

🔧 Troubleshooting

Service Control Policy (SCP) Blocks Cross-Region Inference

If your AWS organization uses strict SCPs that deny cross-region Bedrock operations, you may see:

AccessDeniedException: ... is not authorized to perform: bedrock:InvokeModelWithResponseStream
on resource: arn:aws:bedrock:::foundation-model/anthropic.claude-* with an explicit deny
in a service control policy

BADGERS defaults to regional (us.anthropic.*) inference profiles which avoid cross-region routing. If you previously deployed with global.anthropic.* profiles, redeploy after pulling the latest code.

Marketplace Subscription Error on First Invocation

After a fresh deployment, the first model invocation may fail with:

AccessDeniedException: Model access is denied due to IAM user or service role is not authorized
to perform the required AWS Marketplace actions (aws-marketplace:ViewSubscriptions,
aws-marketplace:Subscribe)

The IAM stack now includes aws-marketplace:ViewSubscriptions and aws-marketplace:Subscribe permissions. If you see this error on an older deployment, redeploy the IAM stack. As a workaround, manually invoke the model once in the Bedrock console playground to trigger the Marketplace subscription.


Notices

Customers are responsible for making their own independent assessment of the information in this Guidance. This Guidance: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided "as is" without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this Guidance is not part of, nor does it modify, any agreement between AWS and its customers.


Authors

  • Randall Potter

📖 Further Reading

🤖 Amazon Bedrock & Foundation Models

🚀 Amazon Bedrock AgentCore

⚡ AWS Lambda

🔐 Amazon Cognito

📈 Observability

📦 Amazon S3

💻 Amazon Kiro IDE

About

Guidance on deploying a generative AI document analysis with Amazon Bedrock AgentCore. Auto-classifies, enhances, and aggregates multi-type documents using Gestalt-informed vision prompts. Custom analyzer creation wizard. Scripted CDK deployment. Gradio frontend included.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors