Extracting data from images using ICR
Use ICR to extract structured document data from images with local models.
Common use cases include:
- Air-gapped processing environments
- Privacy-sensitive document workflows
- High-volume extraction with predictable runtime cost
- Pipelines that need layout and semantic structure
ICR returns more than plain text. It detects layout and semantic elements such as tables, key-value regions, headings, and equations.
How Nutrient helps
Nutrient Java SDK handles local model loading, layout analysis, and JSON generation.
The SDK handles:
- Deploying and managing local AI models for document layout detection
- Implementing table detection algorithms and cell boundary extraction
- Handling semantic element classification and hierarchical structure parsing
- Calculating bounding boxes and determining reading order
Prerequisites
Before following this guide, ensure you have:
- Java 8 or higher installed
- Nutrient Java SDK added to your project (Maven, Gradle, or manual JAR)
- An image file to process (PNG, JPEG, or other supported formats)
- Basic familiarity with Java try-with-resources statements
For initial SDK setup and dependency configuration, refer to the getting started guide.
Complete implementation
This example extracts structured JSON from an image using the ICR engine:
    package io.nutrient.Sample;
Import the required classes from the SDK:
    import io.nutrient.sdk.Document;
    import io.nutrient.sdk.Vision;
    import io.nutrient.sdk.enums.VisionEngine;
    import io.nutrient.sdk.exceptions.NutrientException;

    import java.io.FileWriter;
    import java.io.IOException;

    public class ExtractDataFromImageIcr {
Create the main method and declare thrown exceptions:
        public static void main(String[] args) throws NutrientException, IOException {
Configuring ICR mode
Open the image and set the vision engine to ICR.
In this sample:
- The document opens in try-with-resources.
- setEngine(VisionEngine.Icr) sets local ICR mode.
- ICR is the default engine, so this method call is optional; it is shown here for illustration.
            try (Document document = Document.open("input_ocr_multiple_languages.png")) {
                // Configure ICR engine for local processing (this is the default)
                document.getSettings().getVisionSettings().setEngine(VisionEngine.Icr);
Creating a vision instance and extracting content
Create a vision instance and call extractContent().
In this sample:
- Vision.set(document) binds extraction to the opened document.
- extractContent() returns structured JSON as a string.
- Processing runs locally when the engine is ICR.
                Vision vision = Vision.set(document);
                String contentJson = vision.extractContent();
Write the JSON string to a file for downstream use.
Use this output for storage, indexing, or custom analysis:
                try (FileWriter writer = new FileWriter("output.json")) {
                    writer.write(contentJson);
                }
            }
        }
    }
Understanding the output
extractContent() returns structured JSON with layout and semantic information.
ICR output includes:
- Document elements — Paragraphs, headings, tables, figures, and equations
- Bounding boxes — Pixel coordinates for detected regions
- Reading order — Element order for content flow reconstruction
- Element classification — Semantic labels such as paragraph, table, and heading
- Hierarchical structure — Parent-child relationships across sections and blocks
Use this JSON for extraction pipelines, structured storage, and search indexing.
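When the JSON is stored for downstream tools, the file encoding matters. FileWriter(String) uses the platform default charset, which is not guaranteed to be UTF-8 on Java versions before 18. The following is a minimal sketch of writing the string with an explicit UTF-8 encoding using only the Java standard library. JsonOutputWriter and writeUtf8 are hypothetical names for illustration, not part of the Nutrient SDK, and the JSON literal is a placeholder rather than real ICR output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of the Nutrient SDK): persists a JSON string as UTF-8.
final class JsonOutputWriter {

    private JsonOutputWriter() {}

    // Writes the JSON text to the target path as UTF-8, replacing any existing file.
    // Missing parent directories are created first.
    static Path writeUtf8(String json, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.write(target, json.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Placeholder standing in for the string returned by vision.extractContent().
        String contentJson = "{\"note\": \"placeholder, not real ICR output\"}";
        Path written = writeUtf8(contentJson, Paths.get("output.json"));
        System.out.println("Wrote " + Files.size(written) + " bytes to " + written);
    }
}
```

In the sample above, the FileWriter block could be swapped for a call such as JsonOutputWriter.writeUtf8(contentJson, Paths.get("output.json")). The sketch stays within Java 8 APIs to match the stated prerequisites.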
Error handling
The Vision API throws VisionException when extraction fails.
Common failure scenarios include:
- The image file can’t be read because of path or permission issues.
- Image data is corrupted or truncated.
- ICR models are missing or inaccessible.
- Available memory is insufficient for model loading.
- Image format or encoding is unsupported.
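Some of these failures, such as a wrong path, missing permissions, or an empty file, can be caught before the SDK is invoked. The following is a rough sketch of a pre-flight check built on java.nio.file only. InputImageCheck and describeProblem are hypothetical names, not SDK APIs, and the check does not detect corrupted image data or unsupported encodings.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical pre-flight check (not part of the Nutrient SDK).
final class InputImageCheck {

    private InputImageCheck() {}

    // Returns an empty Optional when the file looks usable,
    // otherwise a short description of the first problem found.
    static Optional<String> describeProblem(Path image) throws IOException {
        if (!Files.exists(image)) {
            return Optional.of("File not found: " + image);
        }
        if (!Files.isRegularFile(image)) {
            return Optional.of("Not a regular file: " + image);
        }
        if (!Files.isReadable(image)) {
            return Optional.of("File is not readable (check permissions): " + image);
        }
        if (Files.size(image) == 0) {
            return Optional.of("File is empty: " + image);
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the file name used in the sample when no argument is given.
        Path image = Paths.get(args.length > 0 ? args[0] : "input_ocr_multiple_languages.png");
        Optional<String> problem = describeProblem(image);
        System.out.println(problem.orElse("Input looks usable: " + image));
    }
}
```

A check like this only narrows down the cause early. The file can still change between the check and the open call, so exception handling around extraction remains necessary.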
In production code:
- Catch VisionException.
- Return a clear error message.
- Log failure details for debugging.
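The three steps above can be sketched as a small wrapper. So that the example compiles without the SDK on the classpath, the extraction call is passed in as a Callable<String> and the catch clause uses the general Exception type. In real code, the Callable body would be the extractContent() call, and a catch for VisionException would go before the general one. ExtractionRunner and Result are hypothetical names, not SDK classes.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper (not part of the Nutrient SDK): catch, log, and report.
final class ExtractionRunner {

    private static final Logger LOG = Logger.getLogger(ExtractionRunner.class.getName());

    private ExtractionRunner() {}

    // Outcome of one extraction attempt; errorMessage is null on success.
    static final class Result {
        final String json;
        final String errorMessage;

        Result(String json, String errorMessage) {
            this.json = json;
            this.errorMessage = errorMessage;
        }

        boolean succeeded() {
            return errorMessage == null;
        }
    }

    // Runs the extraction, logs the full exception on failure,
    // and returns a short message that is safe to show to a caller.
    static Result run(String inputName, Callable<String> extraction) {
        try {
            return new Result(extraction.call(), null);
        } catch (Exception e) {
            // Assumption: with the SDK available, catch VisionException here first.
            LOG.log(Level.SEVERE, "Extraction failed for " + inputName, e);
            return new Result(null,
                    "Could not extract content from " + inputName + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        // The lambda stands in for a call such as vision.extractContent().
        Result result = run("input_ocr_multiple_languages.png", () -> "{}");
        System.out.println(result.succeeded() ? result.json : result.errorMessage);
    }
}
```

Returning a result object keeps the stack trace in the log and gives the caller only a short message, which suits batch pipelines that need to continue after one failed image.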
Conclusion
Use this workflow for ICR-based extraction:
- Open the image document using try-with-resources for automatic resource cleanup.
- Configure the vision settings with setEngine() to assign VisionEngine.Icr for local AI processing.
- ICR is the default engine, making this configuration optional but useful for explicit control.
- Create a vision instance with Vision.set() to bind content extraction operations to the document.
- Call extractContent() to invoke local AI models for document layout analysis.
- The ICR engine loads AI models, detects semantic elements (tables, equations, headings), and determines reading order.
- The method returns a JSON-formatted string containing complete document structure with bounding boxes in pixel coordinates.
- All processing occurs locally without external API calls, ensuring data privacy and offline capability.
- Write the JSON content to a file using try-with-resources with FileWriter for automatic resource cleanup.
- Handle VisionException errors for robust error recovery in production environments.
- The JSON output enables integration with downstream pipelines, including data extraction, database storage, and search indexing.
- ICR mode is ideal for air-gapped environments, sensitive document processing, and high-volume workflows.
For related image extraction workflows, refer to the Java SDK guides.
Download this ready-to-use sample package to explore the Vision API capabilities with preconfigured ICR settings.