Extracting data from images using ICR

Use ICR to extract structured document data from images with local models.

Common use cases include:

Air-gapped processing environments
Privacy-sensitive document workflows
High-volume extraction with predictable runtime cost
Pipelines that need layout and semantic structure

ICR returns more than plain text. It detects layout and semantic elements such as tables, key-value regions, headings, and equations.

How Nutrient helps

The Nutrient Java SDK handles local model loading, layout analysis, and JSON generation.

The SDK takes care of:

Deploying and managing local AI models for document layout detection
Detecting tables and extracting cell boundaries
Classifying semantic elements and parsing hierarchical structure
Calculating bounding boxes and determining reading order

Prerequisites

Before following this guide, ensure you have:

Java 8 or higher installed
Nutrient Java SDK added to your project (Maven, Gradle, or manual JAR)
An image file to process (PNG, JPEG, or other supported formats)
Basic familiarity with Java try-with-resources statements

For initial SDK setup and dependency configuration, refer to the getting started guide.

Complete implementation

This example extracts structured JSON from an image using the ICR engine:

package io.nutrient.Sample;

Import the required classes from the SDK:

import io.nutrient.sdk.Document;
import io.nutrient.sdk.Vision;
import io.nutrient.sdk.enums.VisionEngine;
import io.nutrient.sdk.exceptions.NutrientException;

import java.io.FileWriter;
import java.io.IOException;

public class ExtractDataFromImageIcr {

Create the main method and declare thrown exceptions:

    public static void main(String[] args) throws NutrientException, IOException {

Configuring ICR mode

Open the image and set the vision engine to ICR.

In this sample:

The document opens in try-with-resources.
setEngine(VisionEngine.Icr) sets local ICR mode.
ICR is the default engine, so this call is optional; it is shown here to make the configuration explicit.

        try (Document document = Document.open("input_ocr_multiple_languages.png")) {
            // Configure ICR engine for local processing (this is the default)
            document.getSettings().getVisionSettings().setEngine(VisionEngine.Icr);

Creating a vision instance and extracting content

Create a vision instance and call extractContent().

In this sample:

Vision.set(document) binds extraction to the opened document.
extractContent() returns structured JSON as a string.
Processing runs locally when the engine is ICR.

            Vision vision = Vision.set(document);
            String contentJson = vision.extractContent();

Write the JSON string to a file so it can be used downstream for storage, indexing, or custom analysis:

            try (FileWriter writer = new FileWriter("output.json")) {
                writer.write(contentJson);
            }
        }
    }
}
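
The FileWriter call in the sample encodes text with the platform default charset on Java versions before 18, so non-ASCII characters in the extracted text may be written differently from one machine to another. The following is a minimal sketch of one alternative that uses only the JDK and writes the string as UTF-8. The class name JsonOutputWriter and the method writeUtf8 are illustrative names, not part of the Nutrient SDK, and the JSON literal is a stand-in for the value returned by extractContent().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper; not part of the Nutrient SDK.
final class JsonOutputWriter {

    private JsonOutputWriter() {
    }

    // Writes the JSON text to the target path as UTF-8,
    // creating the file or replacing any existing content.
    static Path writeUtf8(String json, Path target) throws IOException {
        return Files.write(target, json.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the string returned by vision.extractContent().
        String contentJson = "{\"text\": \"caf\u00e9\"}";
        Path written = writeUtf8(contentJson, Paths.get("output.json"));
        System.out.println("Wrote " + Files.size(written) + " bytes to " + written);
    }
}
```

In the sample, the inner try-with-resources block could be swapped for a single call such as JsonOutputWriter.writeUtf8(contentJson, Paths.get("output.json")), assuming the helper is on the classpath.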

Understanding the output

extractContent() returns structured JSON with layout and semantic information.

ICR output includes:

Document elements — Paragraphs, headings, tables, figures, and equations
Bounding boxes — Pixel coordinates for detected regions
Reading order — Element order for content flow reconstruction
Element classification — Semantic labels such as paragraph, table, and heading
Hierarchical structure — Parent-child relationships across sections and blocks

Use this JSON for extraction pipelines, structured storage, and search indexing.

Error handling

The Vision API throws a VisionException when extraction fails.

Common failure scenarios include:

The image file can’t be read because of path or permission issues.
Image data is corrupted or truncated.
ICR models are missing or inaccessible.
Available memory is insufficient for model loading.
Image format or encoding is unsupported.
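
Some of the failures listed above (a missing, unreadable, or empty file) can be reported before the SDK is invoked. The sketch below shows one possible pre-flight check that uses only the JDK. InputImageCheck and findProblem are hypothetical names, not SDK APIs, and the check can't detect corrupted image data or unsupported encodings; those would still surface when the document is opened.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Illustrative pre-flight check; not part of the Nutrient SDK.
final class InputImageCheck {

    private InputImageCheck() {
    }

    // Returns a description of the first problem found,
    // or an empty Optional if the file looks usable.
    static Optional<String> findProblem(Path image) {
        if (!Files.isRegularFile(image)) {
            return Optional.of("Not a regular file: " + image);
        }
        if (!Files.isReadable(image)) {
            return Optional.of("File is not readable: " + image);
        }
        try {
            if (Files.size(image) == 0) {
                return Optional.of("File is empty: " + image);
            }
        } catch (IOException e) {
            return Optional.of("Cannot read file size: " + image + " (" + e.getMessage() + ")");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Falls back to the file name used in the sample when no argument is given.
        Path image = Paths.get(args.length > 0 ? args[0] : "input_ocr_multiple_languages.png");
        Optional<String> problem = findProblem(image);
        System.out.println(problem.orElse("Input looks usable: " + image));
    }
}
```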

In production code:

Catch VisionException.
Return a clear error message.
Log failure details for debugging.
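
The three steps above can be sketched as a small wrapper. To stay compilable without the SDK on the classpath, this sketch catches Exception through a hypothetical ExtractionTask interface; in real code, the catch clause would name VisionException instead. SafeExtraction, ExtractionTask, and Result are illustrative names and aren't part of the Nutrient SDK.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative wrapper; none of these names come from the Nutrient SDK.
final class SafeExtraction {

    private static final Logger LOGGER = Logger.getLogger(SafeExtraction.class.getName());

    private SafeExtraction() {
    }

    // Hypothetical stand-in for the code that opens the document
    // and calls extractContent().
    interface ExtractionTask {
        String run() throws Exception;
    }

    // Outcome of one attempt: either the JSON or a message for the caller.
    static final class Result {
        final String json;         // null when extraction failed
        final String errorMessage; // null when extraction succeeded

        private Result(String json, String errorMessage) {
            this.json = json;
            this.errorMessage = errorMessage;
        }

        boolean succeeded() {
            return errorMessage == null;
        }
    }

    static Result extract(String imagePath, ExtractionTask task) {
        try {
            return new Result(task.run(), null);
        } catch (Exception e) {
            // With the SDK available, catch VisionException here instead.
            // Log the full stack trace for debugging.
            LOGGER.log(Level.SEVERE, "Extraction failed for " + imagePath, e);
            // Return a short, clear message for the caller.
            return new Result(null,
                    "Could not extract content from " + imagePath + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        // Simulates a failure so the error path can be seen without the SDK.
        Result result = extract("input.png", () -> {
            throw new IllegalStateException("simulated model loading failure");
        });
        System.out.println(result.succeeded() ? result.json : result.errorMessage);
    }
}
```

Returning a result object instead of rethrowing is one design choice among several; rethrowing a wrapped exception after logging would fit equally well where callers already handle exceptions.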

Conclusion

Use this workflow for ICR-based extraction:

Open the image document using try-with-resources for automatic resource cleanup.
Configure the vision settings with setEngine() to assign VisionEngine.Icr for local AI processing. ICR is the default engine, so this configuration is optional but makes the choice explicit.
Create a vision instance with Vision.set() to bind content extraction operations to the document.
Call extractContent() to invoke local AI models for document layout analysis. The ICR engine loads AI models, detects semantic elements (tables, equations, headings), and determines reading order. The method returns a JSON-formatted string containing the document structure with bounding boxes in pixel coordinates.
Write the JSON content to a file using try-with-resources with FileWriter for automatic resource cleanup.
Handle VisionException errors so that failures can be reported and recovered from in production environments.

All processing occurs locally without external API calls, which keeps document data on the machine and allows offline use. The JSON output can feed downstream pipelines for data extraction, database storage, and search indexing. ICR mode is well suited to air-gapped environments, sensitive document processing, and high-volume workflows.

For related image extraction workflows, refer to the Java SDK guides.

Download this ready-to-use sample package to explore the Vision API capabilities with preconfigured ICR settings.
