# OCR Models

**Overview**

Our API provides Optical Character Recognition (OCR) for extracting text from documents and images. It accepts PDF documents and common image formats, submitted either by URL or by direct file upload.

#### Select Model

Start by selecting the OCR model that best suits your needs. Each model is optimized for specific document types and languages, ensuring accurate text extraction across various use cases.

#### Input Options

Our OCR API provides three flexible ways to submit documents for processing:

1. **Document URL**
   * Provide a direct URL to a PDF document hosted online
   * The document is fetched and processed remotely, so you do not need to download it first
   * Example: `https://example.com/sample-document.pdf`
2. **Image URL**
   * Submit a URL pointing to an image containing text
   * Supports common image formats (JPEG, PNG, etc.)
   * Example: `https://example.com/sample-image.jpg`
3. **File Upload**
   * Upload PDF documents directly (up to 10 MB)
   * Supported formats for direct upload: PDF and images
   * Uploaded images are automatically processed via Cloudflare storage and converted for OCR
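The three input options can be sketched as a small helper that picks a request shape based on what it is given. This is a minimal illustration only: the field names (`type`, `document_url`, `image_url`, `file_name`, `file_base64`), the base64 encoding of uploads, and the interpretation of "10 MB" as 10 × 1024 × 1024 bytes are assumptions, not the documented request schema.

```python
import base64
from pathlib import Path

# Assumption: the 10 MB upload limit is interpreted as 10 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def build_ocr_input(source: str) -> dict:
    """Build a hypothetical request fragment for one of the three input modes."""
    if source.startswith(("http://", "https://")):
        # Drop any query string before checking the file extension.
        path = source.split("?", 1)[0].lower()
        if path.endswith(".pdf"):
            return {"type": "document_url", "document_url": source}
        # Anything else reachable by URL is treated as an image here.
        return {"type": "image_url", "image_url": source}

    # Otherwise treat the argument as a local file path to upload.
    data = Path(source).read_bytes()
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError(f"{source} is {len(data)} bytes, above the upload limit")
    return {
        "type": "file",
        "file_name": Path(source).name,
        # Hypothetical field: the real API may expect multipart upload instead.
        "file_base64": base64.b64encode(data).decode("ascii"),
    }


print(build_ocr_input("https://example.com/sample-document.pdf")["type"])
print(build_ocr_input("https://example.com/sample-image.jpg")["type"])
```

Checking the file size on the client avoids sending an upload that the service would reject anyway.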

#### Configure Parameters

Fine-tune the OCR process with optional configuration parameters:

* **Language**
  * Specify the primary language of the document for improved accuracy
  * If not specified, the model will attempt to auto-detect the language
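As a rough sketch of how the optional language setting might be applied, the helper below adds it to a request only when the caller supplies one, leaving auto-detection in effect otherwise. The parameter name `language`, the code format (`"en"`), and the payload shape are assumptions made for illustration.

```python
from typing import Optional


def with_language(payload: dict, language: Optional[str] = None) -> dict:
    """Return a copy of the payload, adding a language hint only if one is given."""
    result = dict(payload)  # copy so the caller's dict is not modified
    if language:
        # Hypothetical parameter name; omit it to rely on auto-detection.
        result["language"] = language
    return result


base = {"type": "image_url", "image_url": "https://example.com/sample-image.jpg"}
print(with_language(base))        # no language key: auto-detection applies
print(with_language(base, "en"))  # explicit primary language
```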

#### Processing and Results

Once your document is submitted, our OCR engine processes it and returns:

1. **Extracted Text**
   * Complete text content extracted from the document
   * Preserves formatting where possible
2. **Usage Information**
   * Details about the processing, including model used and processing metrics
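A result with these two parts could be handled along the following lines. The response shape used here (`text` and `usage` keys, plus the `model` and `pages_processed` fields inside `usage`) is hypothetical, and the sample values are made up for the example; check the actual response before relying on any of these names.

```python
from typing import Any, Dict, Tuple


def split_ocr_result(response: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Separate the extracted text from the usage information."""
    text = response.get("text", "")      # hypothetical key for extracted text
    usage = response.get("usage") or {}  # hypothetical key for usage details
    return text, usage


# Illustrative response only; not real output from the service.
sample = {
    "text": "Invoice 123\nTotal: 42.00",
    "usage": {"model": "example-ocr-model", "pages_processed": 1},
}

text, usage = split_ocr_result(sample)
print(text.splitlines()[0])
print(usage.get("model"))
```

Reading fields with `.get` keeps the caller from failing if a response omits the usage section.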

#### Real-World Use Cases

Based on Mistral's OCR capabilities, you can effectively use this API for:

1. **Document Digitization**
   * Convert physical documents to searchable digital text
   * Preserve document structure for better readability
2. **Content Extraction**
   * Extract text from images and scanned documents
   * Maintain formatting for easier content analysis
3. **Data Processing**
   * Extract structured information from forms and tables
   * Process business documents like invoices and receipts
4. **Research and Analysis**
   * Digitize research papers and technical documents
   * Extract information for further analysis

#### Best Practices

For optimal OCR results:

* Ensure documents are clear and well-scanned
* Use high-resolution images when possible
* For multi-language documents, specify the primary language
* Consider document orientation and layout when analyzing results


