Request
multipart/form-data
Request Body
file file (required) - PDF file to extract text from using OCR
Maximum 10MB, PDF format only
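The size and format limits above can be checked on the client before spending an upload. The following is a minimal sketch, not part of the API: `check_pdf_upload` is a hypothetical helper, it assumes "10MB" means 10 * 1024 * 1024 bytes (the documentation does not say whether the limit is binary or decimal), and it uses the leading `%PDF-` signature as a cheap format check. The server remains the authority on what it accepts.

```python
import os
from typing import List

# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes; the API docs do not
# say whether the limit is binary or decimal.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024
PDF_SIGNATURE = b"%PDF-"  # well-formed PDF files begin with this header


def check_pdf_upload(path: str) -> List[str]:
    """Return a list of problems found; an empty list means the file looks uploadable."""
    problems = []
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_UPLOAD_BYTES}-byte limit")
    with open(path, "rb") as handle:
        if handle.read(len(PDF_SIGNATURE)) != PDF_SIGNATURE:
            problems.append("file does not start with the %PDF- signature")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a file in a single pass.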
OCR Technology
Industry-leading accuracy across 50+ languages with advanced text detection and structure analysis.
Request Example
curl --request POST \
  --url https://pdfmage.app/api/v1/document-ocr \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer pk_live_abc123...' \
  --form 'file=@/path/to/document.pdf'
Response
application/json
Response Body
fileName string - Original filename of processed document
processedAt string (ISO 8601) - Processing completion timestamp
documentId number - Unique identifier for this OCR operation
extractedText object - Complete text extraction with spatial data
fullText string - Complete extracted text from all pages
pages array - Comprehensive per-page data
• pageNumber: number - Page number (1-based)
• dimensions: object - Page dimensions (width, height, unit)
• text: string - Extracted text for this page
• confidence: number - Average confidence score (0-1)
• blocks: array - Text blocks with bounding boxes
• paragraphs: array - Paragraphs with bounding boxes
• lines: array - Text lines with bounding boxes
• tokens: array - Individual tokens with bounding boxes
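To draw or crop the elements listed above, their bounding boxes have to be scaled back to page units, since the Element Structure description gives them as normalized to the 0-1 range. The sketch below assumes each vertex is normalized against the page's own `dimensions.width` and `dimensions.height`, which the documentation implies but does not state outright. `to_page_rect` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple


def to_page_rect(
    vertices: List[Dict[str, float]], width: float, height: float
) -> Tuple[float, float, float, float]:
    """Scale normalized vertices to page units and return (left, top, right, bottom)."""
    xs = [v["x"] * width for v in vertices]
    ys = [v["y"] * height for v in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


# Vertices of the first block in the Success Response Example, on its 612 x 792 page.
title_vertices = [
    {"x": 0.35, "y": 0.08},
    {"x": 0.65, "y": 0.08},
    {"x": 0.65, "y": 0.12},
    {"x": 0.35, "y": 0.12},
]
left, top, right, bottom = to_page_rect(title_vertices, 612, 792)
print(round(left, 1), round(top, 1), round(right, 1), round(bottom, 1))
```

Taking the minimum and maximum over the four points yields an axis-aligned rectangle, which also covers slightly rotated text at the cost of a looser fit.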
metadata object - Processing metadata and statistics
pageCount number - Total pages processed
processingTimeMs number - Processing time in milliseconds
language string - Document language (default: 'en')
elementCounts object - Count of detected elements
• blocks: number - Text blocks detected
• paragraphs: number - Paragraphs detected
• lines: number - Text lines detected
• tokens: number - Individual tokens detected
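Once a response is parsed, the per-page `confidence` values described above allow a quick quality check. As a sketch, the hypothetical `low_confidence_pages` helper below flags pages whose average confidence falls under a threshold. The 0.9 default is an arbitrary illustration, not a value recommended by the API.

```python
from typing import Any, Dict, List


def low_confidence_pages(response: Dict[str, Any], threshold: float = 0.9) -> List[int]:
    """Return the 1-based page numbers whose average confidence is below threshold."""
    pages = response["extractedText"]["pages"]
    return [page["pageNumber"] for page in pages if page["confidence"] < threshold]


# Trimmed-down response holding only the fields this helper reads (made-up values).
sample = {
    "extractedText": {
        "pages": [
            {"pageNumber": 1, "confidence": 0.98},
            {"pageNumber": 2, "confidence": 0.71},
        ]
    },
    "metadata": {"pageCount": 2},
}
print(low_confidence_pages(sample))  # prints [2]
```

Pages flagged this way could be queued for manual review or rescanned at a higher resolution before resubmitting.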
Element Structure format - Each text element (block, paragraph, line, token) contains:
text string - The extracted text content
boundingBox object - Normalized coordinates (0-1 range)
• vertices: array - Four corner points [{x, y}]
• normalized: boolean - Always true (coordinates are normalized)
confidence number - OCR confidence score (0-1)
Success Response Example
{
  "fileName": "contract.pdf",
  "processedAt": "2024-01-15T10:30:45.123Z",
  "documentId": 12345,
  "extractedText": {
    "fullText": "Employment Agreement\n\nThis Employment Agreement is entered into between John Doe and Acme Corporation...",
    "pages": [
      {
        "pageNumber": 1,
        "dimensions": {
          "width": 612,
          "height": 792,
          "unit": "px"
        },
        "text": "Employment Agreement\n\nThis Employment Agreement is entered into between John Doe...",
        "confidence": 0.98,
        "blocks": [
          {
            "text": "Employment Agreement",
            "boundingBox": {
              "vertices": [
                {"x": 0.35, "y": 0.08},
                {"x": 0.65, "y": 0.08},
                {"x": 0.65, "y": 0.12},
                {"x": 0.35, "y": 0.12}
              ],
              "normalized": true
            },
            "confidence": 0.99
          },
          {
            "text": "This Employment Agreement is entered into between John Doe and Acme Corporation...",
            "boundingBox": {
              "vertices": [
                {"x": 0.1, "y": 0.15},
                {"x": 0.9, "y": 0.15},
                {"x": 0.9, "y": 0.25},
                {"x": 0.1, "y": 0.25}
              ],
              "normalized": true
            },
            "confidence": 0.97
          }
        ],
        "paragraphs": [...],
        "lines": [...],
        "tokens": [...]
      }
    ]
  },
  "metadata": {
    "pageCount": 2,
    "processingTimeMs": 1247,
    "language": "en",
    "elementCounts": {
      "blocks": 15,
      "paragraphs": 28,
      "lines": 142,
      "tokens": 487
    }
  }
}
Response Headers
HTTP/1.1 200 OK
Content-Type: application/json
X-Credits-Used: 0.02
X-Credits-Remaining: 4.98
X-Credits-Currency: USD
X-Processing-Time: 1247
Error Responses
400 Bad Request - Invalid file format, missing file, or corrupted PDF
401 Unauthorized - Invalid or missing API key
402 Payment Required - Insufficient credit balance
413 Payload Too Large - File exceeds maximum size limit (10MB)
422 Unprocessable Entity - PDF contains no extractable text (blank pages, images only)
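The status codes above fall into three groups: problems with the file, with the credentials, and with the account balance. A minimal sketch of mapping them to exceptions follows. The exception class names and `raise_for_ocr_status` are hypothetical, grouping 400, 413, and 422 under one file-related error is a design choice rather than something the API prescribes, and any other non-2xx status falls back to the base class.

```python
class OcrApiError(Exception):
    """Base class for errors returned by the OCR endpoint (hypothetical name)."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status


class InvalidFileError(OcrApiError):
    """The uploaded file was rejected (400, 413, 422)."""


class AuthError(OcrApiError):
    """The API key was missing or invalid (401)."""


class InsufficientCreditsError(OcrApiError):
    """The account balance is too low (402)."""


# Documented status codes mapped to (exception class, documented description).
STATUS_ERRORS = {
    400: (InvalidFileError, "Invalid file format, missing file, or corrupted PDF"),
    401: (AuthError, "Invalid or missing API key"),
    402: (InsufficientCreditsError, "Insufficient credit balance"),
    413: (InvalidFileError, "File exceeds maximum size limit (10MB)"),
    422: (InvalidFileError, "PDF contains no extractable text"),
}


def raise_for_ocr_status(status: int) -> None:
    """Raise a typed exception for non-2xx statuses; return None on success."""
    if 200 <= status < 300:
        return
    error_class, message = STATUS_ERRORS.get(status, (OcrApiError, "Unexpected status"))
    raise error_class(status, message)
```

Separate classes let calling code retry with a different file on `InvalidFileError` while stopping immediately on `AuthError` or `InsufficientCreditsError`.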
Error Response Example
{
  "error": "Bad Request",
  "message": "Invalid file format",
  "details": {
    "code": "INVALID_FILE_FORMAT",
    "allowedFormats": ["pdf"],
    "receivedFormat": "docx"
  },
  "timestamp": "2024-01-15T10:30:00Z",
  "requestId": "req_abc123"
}
Processing Error Example
{
  "error": "Unprocessable Entity",
  "message": "No extractable text found",
  "details": {
    "code": "NO_TEXT_CONTENT",
    "pageCount": 3,
    "reason": "Document contains only images"
  },
  "timestamp": "2024-01-15T10:30:00Z",
  "requestId": "req_def456"
}
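Both error examples share the same envelope, with a machine-readable `details.code` alongside the human-readable message. Assuming that shape holds for every error (only these two examples are documented), a client might parse it as sketched below. `ApiErrorInfo` and `parse_error_body` are hypothetical names.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ApiErrorInfo:
    error: str
    message: str
    code: Optional[str]
    request_id: Optional[str]
    details: Dict[str, Any] = field(default_factory=dict)


def parse_error_body(body: str) -> ApiErrorInfo:
    """Parse an error response body, tolerating a missing 'details' object."""
    payload = json.loads(body)
    details = payload.get("details") or {}
    return ApiErrorInfo(
        error=payload.get("error", ""),
        message=payload.get("message", ""),
        code=details.get("code"),
        request_id=payload.get("requestId"),
        details=details,
    )


# Body copied from the Processing Error Example.
example_body = """
{
  "error": "Unprocessable Entity",
  "message": "No extractable text found",
  "details": {
    "code": "NO_TEXT_CONTENT",
    "pageCount": 3,
    "reason": "Document contains only images"
  },
  "timestamp": "2024-01-15T10:30:00Z",
  "requestId": "req_def456"
}
"""
info = parse_error_body(example_body)
print(info.code, info.request_id)  # prints NO_TEXT_CONTENT req_def456
```

Branching on `code` rather than on `message` keeps client logic stable if the wording of messages changes, and logging `request_id` gives support a handle for the failed call.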