API response formats#

The /processing/<workflow> endpoint returns extraction results in the form of a JSON response body with the following general shape:

{
  "processing_id": <uuid>,
  "ocr": <see OCR Format> ,
  "extractions": <see Extractions Format>
}

However, not all workflows support both ocr and extractions. If a workflow does not support one of the two result types, the corresponding key is omitted from the response entirely. The result types supported by each workflow are listed in the respective workflow specification; see, e.g., the invoice extraction.
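Because either key may be absent, client code should not index into the response unconditionally. The following is a minimal Python sketch, assuming the response body has already been parsed into a dict; the helper name split_results is hypothetical and not part of the API.

```python
from typing import Any, Optional, Tuple


def split_results(body: dict) -> Tuple[str, Optional[Any], Optional[Any]]:
    """Return (processing_id, ocr, extractions); omitted result types become None."""
    # dict.get returns None when the workflow omitted the key entirely.
    return body["processing_id"], body.get("ocr"), body.get("extractions")


# Example: a workflow that only supports extractions.
body = {
    "processing_id": "181eb796-509b-4b34-bdde-4fddb8f5fb70",
    "extractions": {"schema_version": 1, "document_type": "letter"},
}
processing_id, ocr, extractions = split_results(body)
```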

OCR format#

An OCR result object is defined per document and contains a list of document pages at the top level.

{
  "pages": [
    {
      "fulltext": "The fulltext of your document\nas multiline string.",
      "width": 1000,
      "height": 1414,
      "bboxes": [
        {
          "id": 42,
          "x1": 0,
          "y1": 22,
          "x2": 100,
          "y2": 46,
          "text": "Natif",
          "text_entropy": 0.02
        }, 
        ...
      ]
    },
    ...
  ]
}

pages#

Each entry in pages is defined by a width and a height in pixels and a list of text bounding boxes, bboxes. These text boxes are not sorted or otherwise ordered. Optionally, a fulltext can be computed per page: the fulltext algorithm lays out the text bounding boxes and concatenates their individual text fields into one string.
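Since the boxes arrive in no particular order, a client that needs a rough reading order has to sort them itself. The sketch below is a naive top-to-bottom, left-to-right sort; it is an illustration only and is not the fulltext algorithm used by the API. The function name and the sample box texts are hypothetical.

```python
def naive_reading_order(bboxes: list) -> list:
    """Sort boxes by top edge, then left edge (no line grouping or column detection)."""
    return sorted(bboxes, key=lambda b: (b["y1"], b["x1"]))


page = {
    "width": 1000,
    "height": 1414,
    "bboxes": [
        {"id": 2, "x1": 120, "y1": 22, "x2": 200, "y2": 46, "text": "invoice"},
        {"id": 1, "x1": 0, "y1": 22, "x2": 100, "y2": 46, "text": "Natif"},
        {"id": 3, "x1": 0, "y1": 60, "x2": 90, "y2": 84, "text": "2024"},
    ],
}
ordered_text = " ".join(b["text"] for b in naive_reading_order(page["bboxes"]))
```

Boxes on the same visual line rarely share an identical y1 in practice, so a real implementation would group boxes into lines with some tolerance before sorting.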

bboxes#

Text bounding boxes are rectangles defined by their top-left (x1, y1) and bottom-right (x2, y2) coordinates. In addition, each bounding box has a text field containing the text detected at that position. Each text box is identified by an integer id, which is unique within, and may be referenced within, the same response.
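To look boxes up by id and work with their geometry, something like the following sketch may help. It assumes a page dict shaped like the OCR example above; index_bboxes and bbox_size are hypothetical helper names.

```python
from typing import Dict, Tuple


def index_bboxes(page: dict) -> Dict[int, dict]:
    """Map each bounding box id to its box for fast lookup."""
    return {bbox["id"]: bbox for bbox in page["bboxes"]}


def bbox_size(bbox: dict) -> Tuple[int, int]:
    """Return (width, height) of a box from its corner coordinates."""
    return bbox["x2"] - bbox["x1"], bbox["y2"] - bbox["y1"]


page = {
    "width": 1000,
    "height": 1414,
    "bboxes": [
        {"id": 42, "x1": 0, "y1": 22, "x2": 100, "y2": 46, "text": "Natif", "text_entropy": 0.02},
    ],
}
by_id = index_bboxes(page)
width, height = bbox_size(by_id[42])
```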

Extractions format#

Root shape#

If a workflow supports the extractions response, the resulting object will have the following shape.

Response JSON

{
  "schema_version": 1,
  "document_type": "letter",
  "l": {
    "name": {
      "validation_problem": false,
      "note": "",
      "confidence": 0.99,
      "bbox_refs": [
        {
          "page_num": 1,
          "bbox_id": 1
        }
      ],
      "value": "a string value"
    }
  }
}

The fields schema_version and document_type are always included at the root level. The remaining structure depends on the actual document_type of the extraction and is tree-like.

Hint

See our versioning policy to learn what we consider non-breaking changes.

The schema_version is used to communicate breaking changes to the supported extraction fields for each document type. Within a given version, only backwards-compatible changes will be made to the respective schema. The version used is defined by the workflow identifier and will not change for that identifier. This means that a working implementation against a given workflow identifier will not break over time, unless the whole workflow itself is removed (we currently have no plans to do so for any of our prebuilt workflows).
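A client can guard against being pointed at a different workflow by checking both root-level fields before reading the tree. This is a minimal sketch; check_schema and the expected values passed to it are assumptions chosen for illustration.

```python
def check_schema(extractions: dict, expected_document_type: str, expected_version: int) -> None:
    """Raise ValueError if the extraction does not match what this client was written for."""
    document_type = extractions["document_type"]
    version = extractions["schema_version"]
    if document_type != expected_document_type:
        raise ValueError(f"unexpected document_type: {document_type!r}")
    if version != expected_version:
        raise ValueError(f"unexpected schema_version: {version!r}")


# Passes silently: the extraction matches the expected type and version.
check_schema({"schema_version": 2, "document_type": "invoice"}, "invoice", 2)
```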

Tree-like structured extracted fields#

More complete example for invoice

An invoice extraction might look like this. The {...} abbreviations denote value leaves as described below.

"extractions": {
"schema_version": 2,
"document_type": "invoice",
"customer": {
  "name": {
      "validation_problem": false,
      "note": "",
      "confidence": 0.9968333333333333,
      "bbox_refs": [
        {
          "page_num": 1,
          "bbox_id": 129
        }
      ],
      "value": "John Doe"
    },
  "address": {"value": "Some Street 1, 12345 City, Antarctica", "confidence": 0.99, ...},
  "address_struct": {
    "address_line_1": {...},
    "city": {...},
    "zip": {...}
  },
  "banking_information": []
},
"vendor": {
  "name": {...},
  "address": {...},
  "address_struct": {
    "address_line_1": {...},
    "city": {...},
    "zip": {...},
    "country": {...}
  },
  "vat_id": {...},
  "register_id": {...},
  "banking_information": [
    {
      "validation_problem": false,
      "note": "",
      "confidence": ...,
      "iban": {...},
      "bic": {...}
    }
  ],
  "phone": {...},
  "fax": {...},
  "e_mail": {...}
},
"currency": {...},
"date": {...},
"due_date": {...},
"service_period": {...},
"number": {...},
"order_numbers": [],
"order_confirmation_numbers": [{...}],
"delivery_note_numbers": [{...}],
"payment_methods": [{...}],
"net_amount": {...},
"tax_amount": {...},
"gross_amount": {...},
"tax_calculation": [
  {
    "tax_rate": {...},
    "tax_amount": {...},
    "net_amount": {...},
    "gross_amount": {...}
  }
],
"early_payment_benefit": [],
"line_item": [
  {
    "article_id": {...},
    "description": {...},
    "quantity": {...},
    "unit_price": {...},
    "total_price": {...},
    "delivery_note_number": {...}
  },
  {
    "article_id": {...},
    "description": {...},
    "quantity": {...},
    "unit_price": {...},
    "total_price": {...},
    "delivery_note_number": {...}
  },
  ...
],
"due_payable_amount": {...}
}

Note that elements in the extractions tree are often optional, sometimes even complete sub-objects such as "address_struct", and may be null if they were not present or not found in the uploaded document.
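Because any node along a path may be null, chained indexing such as extractions["customer"]["address_struct"]["city"]["value"] can fail. A defensive accessor could look like the sketch below; leaf_value is a hypothetical helper, not part of the API.

```python
from typing import Any, Optional


def leaf_value(tree: Optional[dict], *path: str) -> Any:
    """Follow path through nested objects and return the leaf's value, or None if anything is missing."""
    node: Any = tree
    for key in path:
        if not isinstance(node, dict):
            return None  # a parent was null or not an object
        node = node.get(key)
    if isinstance(node, dict):
        return node.get("value")
    return None


extractions = {
    "schema_version": 2,
    "document_type": "invoice",
    "customer": {
        "name": {"value": "John Doe", "confidence": 0.99},
        "address_struct": None,  # whole sub-object not found in the document
    },
}
name = leaf_value(extractions, "customer", "name")
city = leaf_value(extractions, "customer", "address_struct", "city")
```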

Extracted values#

If a leaf is present in the extraction, it will always have one of the following structures, depending on its type:

StringExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string

DateExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string

PaymentMethodExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string (possible values: ['money_transfer', 'bank_collection', 'cash', 'card', 'debit', 'amazon', 'paypal', 'credit_card', 'prepayment', 'multiple_options', 'other'])

Supported payment methods. Note: multiple_options is deprecated.

raw_value: string (deprecated)

Value before correction. Get the raw value from bbox_refs instead.

category: string (possible values: ['money_transfer', 'bank_collection', 'cash', 'card', 'debit', 'amazon', 'paypal', 'credit_card', 'prepayment', 'multiple_options', 'other'], deprecated)

Supported payment methods. Note: multiple_options is deprecated.

FloatExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: one of:
- number
- string
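Since a FloatExtraction value may arrive as either a number or a string, clients may want to normalize it before doing arithmetic. The sketch below converts both forms to Decimal and returns None when a string cannot be parsed. The format of string values is not specified here, so plain decimal notation is an assumption, and to_decimal is a hypothetical helper.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional, Union


def to_decimal(value: Union[int, float, str, None]) -> Optional[Decimal]:
    """Convert a FloatExtraction value to Decimal, or None if it is missing or unparsable."""
    if value is None:
        return None
    try:
        # Going through str avoids binary floating-point artifacts for float inputs.
        return Decimal(str(value))
    except InvalidOperation:
        return None


gross = to_decimal(19.99)
net = to_decimal("119.00")
unknown = to_decimal("n/a")
```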

CurrencyExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string (possible values: ['AED', 'AFN', 'ALL', 'AMD', 'ANG', 'AOA', 'ARS', 'AUD', 'AWG', 'AZN', 'BAM', 'BBD', 'BDT', 'BGN', 'BHD', 'BIF', 'BMD', 'BND', 'BOB', 'BOV', 'BRL', 'BSD', 'BTN', 'BWP', 'BYN', 'BZD', 'CAD', 'CDF', 'CHE', 'CHF', 'CHW', 'CLF', 'CLP', 'CNY', 'COP', 'COU', 'CRC', 'CUC', 'CUP', 'CVE', 'CZK', 'DJF', 'DKK', 'DOP', 'DZD', 'EGP', 'ERN', 'ETB', 'EUR', 'FJD', 'FKP', 'GBP', 'GEL', 'GHS', 'GIP', 'GMD', 'GNF', 'GTQ', 'GYD', 'HKD', 'HNL', 'HRK', 'HTG', 'HUF', 'IDR', 'ILS', 'INR', 'IQD', 'IRR', 'ISK', 'JMD', 'JOD', 'JPY', 'KES', 'KGS', 'KHR', 'KMF', 'KPW', 'KRW', 'KWD', 'KYD', 'KZT', 'LAK', 'LBP', 'LKR', 'LRD', 'LSL', 'LYD', 'MAD', 'MDL', 'MGA', 'MKD', 'MMK', 'MNT', 'MOP', 'MRU', 'MUR', 'MVR', 'MWK', 'MXN', 'MXV', 'MYR', 'MZN', 'NAD', 'NGN', 'NIO', 'NOK', 'NPR', 'NZD', 'OMR', 'PAB', 'PEN', 'PGK', 'PHP', 'PKR', 'PLN', 'PYG', 'QAR', 'RON', 'RSD', 'RUB', 'RWF', 'SAR', 'SBD', 'SCR', 'SDG', 'SEK', 'SGD', 'SHP', 'SLL', 'SOS', 'SRD', 'SSP', 'STN', 'SVC', 'SYP', 'SZL', 'THB', 'TJS', 'TMT', 'TND', 'TOP', 'TRY', 'TTD', 'TWD', 'TZS', 'UAH', 'UGX', 'USD', 'USN', 'UYI', 'UYU', 'UYW', 'UZS', 'VES', 'VND', 'VUV', 'WST', 'XAF', 'XAG', 'XAU', 'XBA', 'XBB', 'XBC', 'XBD', 'XCD', 'XDR', 'XOF', 'XPD', 'XPF', 'XPT', 'XSU', 'XTS', 'XUA', 'XXX', 'YER', 'ZAR', 'ZMW', 'ZWL'])

ISO 4217 alphabetic currency codes.

CountryExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string (possible values: ['AD', 'AE', 'AF', 'AG', 'AI', 'AL', 'AM', 'AO', 'AQ', 'AR', 'AS', 'AT', 'AU', 'AW', 'AX', 'AZ', 'BA', 'BB', 'BD', 'BE', 'BF', 'BG', 'BH', 'BI', 'BJ', 'BL', 'BM', 'BN', 'BO', 'BQ', 'BR', 'BS', 'BT', 'BV', 'BW', 'BY', 'BZ', 'CA', 'CC', 'CD', 'CF', 'CG', 'CH', 'CI', 'CK', 'CL', 'CM', 'CN', 'CO', 'CR', 'CU', 'CV', 'CW', 'CX', 'CY', 'CZ', 'DE', 'DJ', 'DK', 'DM', 'DO', 'DZ', 'EC', 'EE', 'EG', 'EH', 'ER', 'ES', 'ET', 'FI', 'FJ', 'FK', 'FM', 'FO', 'FR', 'GA', 'GB', 'GD', 'GE', 'GF', 'GG', 'GH', 'GI', 'GL', 'GM', 'GN', 'GP', 'GQ', 'GR', 'GS', 'GT', 'GU', 'GW', 'GY', 'HK', 'HM', 'HN', 'HR', 'HT', 'HU', 'ID', 'IE', 'IL', 'IM', 'IN', 'IO', 'IQ', 'IR', 'IS', 'IT', 'JE', 'JM', 'JO', 'JP', 'KE', 'KG', 'KH', 'KI', 'KM', 'KN', 'KP', 'KR', 'KW', 'KY', 'KZ', 'LA', 'LB', 'LC', 'LI', 'LK', 'LR', 'LS', 'LT', 'LU', 'LV', 'LY', 'MA', 'MC', 'MD', 'ME', 'MF', 'MG', 'MH', 'MK', 'ML', 'MM', 'MN', 'MO', 'MP', 'MQ', 'MR', 'MS', 'MT', 'MU', 'MV', 'MW', 'MX', 'MY', 'MZ', 'NA', 'NC', 'NE', 'NF', 'NG', 'NI', 'NL', 'NO', 'NP', 'NR', 'NU', 'NZ', 'OM', 'PA', 'PE', 'PF', 'PG', 'PH', 'PK', 'PL', 'PM', 'PN', 'PR', 'PS', 'PT', 'PW', 'PY', 'QA', 'RE', 'RO', 'RS', 'RU', 'RW', 'SA', 'SB', 'SC', 'SD', 'SE', 'SG', 'SH', 'SI', 'SJ', 'SK', 'SL', 'SM', 'SN', 'SO', 'SR', 'SS', 'ST', 'SV', 'SX', 'SY', 'SZ', 'TC', 'TD', 'TF', 'TG', 'TH', 'TJ', 'TK', 'TL', 'TM', 'TN', 'TO', 'TR', 'TT', 'TV', 'TW', 'TZ', 'UA', 'UG', 'UM', 'US', 'UY', 'UZ', 'VA', 'VC', 'VE', 'VG', 'VI', 'VN', 'VU', 'WF', 'WS', 'YE', 'YT', 'ZA', 'ZM', 'ZW'])

ISO 3166-1 two-letter country codes.

SenderCategoryExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string (possible values: ['anwalt', 'bank', 'dienstleister', 'einzelhandel', 'energie', 'finanzamt', 'finanzdienstleister', 'genossenschaft', 'gericht', 'gewerkschaften', 'immobilien', 'inkasso', 'krankenkasse', 'notar', 'oeffentliche_verwaltung', 'staatsanwaltschaft', 'steuerberater', 'sonstiges'])

DocCategoryExtraction

validation_problem: boolean
note: string
confidence: number
bbox_refs: array of Reference
- page_num: integer
- bbox_id: integer
value: string (possible values: ['anfrage', 'angebot', 'benachrichtigung', 'bescheid', 'bestellbestaetigung', 'bussgeld', 'einladung', 'gutschrift', 'konto', 'kuendigung', 'mahnung', 'persoenlich', 'rechnung', 'vertrag', 'werbung', 'sonstiges'])

Example for a single extracted value object (leaf)

{
  "validation_problem": false,
  "note": "",
  "confidence": 0.9968333333333333,
  "bbox_refs": [
    {
      "page_num": 1,
      "bbox_id": 129
    }
  ],
  "value": "whatever"
}
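The bbox_refs of a leaf can be resolved against the OCR result to recover the raw text boxes behind a value. The sketch below assumes that page_num is 1-based (as the examples suggest) and that bbox_id matches the id of a box on that page; referenced_texts is a hypothetical helper.

```python
from typing import List


def referenced_texts(leaf: dict, ocr: dict) -> List[str]:
    """Return the OCR text of every bounding box referenced by a leaf."""
    texts = []
    for ref in leaf.get("bbox_refs", []):
        page = ocr["pages"][ref["page_num"] - 1]  # assumption: page_num starts at 1
        for bbox in page["bboxes"]:
            if bbox["id"] == ref["bbox_id"]:
                texts.append(bbox["text"])
                break
    return texts


ocr = {
    "pages": [
        {
            "width": 1000,
            "height": 1414,
            "bboxes": [
                {"id": 128, "x1": 0, "y1": 22, "x2": 100, "y2": 46, "text": "Name:"},
                {"id": 129, "x1": 110, "y1": 22, "x2": 210, "y2": 46, "text": "whatever"},
            ],
        }
    ]
}
leaf = {
    "validation_problem": False,
    "note": "",
    "confidence": 0.9968333333333333,
    "bbox_refs": [{"page_num": 1, "bbox_id": 129}],
    "value": "whatever",
}
texts = referenced_texts(leaf, ocr)
```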

Classifications format#

A classifications object is defined per document and contains the selected class for the document and a list of all options for classification.

{
  "selected": {
    "value": "...",
    "confidence": 0.99
  },
  "options": [
    {
      "value": "...",
      "confidence": 0.99
    }
  ]
}

selected#

The class with the highest probability for the given document. "value" is a string corresponding to the chosen class and "confidence" is the probability for the given class.

options#

The exhaustive list of candidate classes (including the selected class) with their corresponding probabilities for the given document. "value" and "confidence" are the same as above.
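As an illustration of how selected relates to options, the sketch below picks the highest-confidence option and lists the candidates above a threshold. The class names, the threshold, and the helper names are hypothetical.

```python
from typing import List


def top_option(classification: dict) -> dict:
    """Return the option with the highest confidence."""
    return max(classification["options"], key=lambda o: o["confidence"])


def options_above(classification: dict, threshold: float) -> List[str]:
    """Return the class values whose confidence is at least threshold."""
    return [o["value"] for o in classification["options"] if o["confidence"] >= threshold]


classification = {
    "selected": {"value": "invoice", "confidence": 0.9},
    "options": [
        {"value": "letter", "confidence": 0.08},
        {"value": "invoice", "confidence": 0.9},
        {"value": "other", "confidence": 0.02},
    ],
}
best = top_option(classification)
likely = options_above(classification, 0.05)
```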

Image format#

Images served by our API use JPEG compression. The size of the images varies depending on the document type. For documents in A4 format, the DPI value is 220, which corresponds to a resolution of roughly 2600x1800 pixels (height x width).
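The relationship between paper size, DPI, and pixel dimensions can be checked with a short calculation. The sketch below uses the standard A4 size of 210 mm x 297 mm; the function name is hypothetical, and the exact pixel size of served images may differ slightly from the computed values.

```python
MM_PER_INCH = 25.4


def pixels_at_dpi(width_mm: float, height_mm: float, dpi: int) -> tuple:
    """Return (width_px, height_px) for a page of the given physical size at the given DPI."""
    return round(width_mm / MM_PER_INCH * dpi), round(height_mm / MM_PER_INCH * dpi)


# A4 portrait at 220 DPI.
a4_pixels = pixels_at_dpi(210, 297, 220)
```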

Document Splitting format#

The document splitting object defines the sub documents that our system determined during processing. This information is saved and made available in the format detailed below.

{
  "schema_version": 1,
  "sub_documents": [
    {
      "name": "sub_document_0",
      "pages": [
        1
      ]
    },
    {
      "name": "sub_document_1",
      "pages": [
        2, 3
      ]
    }
  ],
  "split_point_confidences": [
    1.0, 0.1
  ]
}

sub_documents#

List of the sub documents split from the larger document. Each has a "name" and a list of the indices of the pages it contains.

split_point_confidences#

The confidence for each possible page "split" in the document. If the document has n pages, this list has n-1 entries, one per split position.
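To illustrate how n-1 confidences relate to n pages, the sketch below rebuilds page groups by splitting wherever a confidence reaches a threshold. It assumes that entry i refers to the boundary between pages i+1 and i+2 and that pages are numbered from 1; the threshold of 0.5 is arbitrary, and the API's own splitting logic may differ.

```python
from typing import List


def split_pages(n_pages: int, confidences: List[float], threshold: float = 0.5) -> List[List[int]]:
    """Group 1-based page numbers, starting a new group at each confident split point."""
    if len(confidences) != max(n_pages - 1, 0):
        raise ValueError("expected one confidence per gap between consecutive pages")
    if n_pages == 0:
        return []
    groups = [[1]]
    for i, confidence in enumerate(confidences):
        page = i + 2  # the page that follows split position i
        if confidence >= threshold:
            groups.append([page])
        else:
            groups[-1].append(page)
    return groups


groups = split_pages(3, [1.0, 0.1])
```

With the confidences from the example above, this yields the same page grouping as the sub_documents list shown there.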

Split PDFs format#

The split PDF information is generated alongside the document splitting data and indicates where each split PDF can be retrieved. The response is a list of links in the format shown below:

[
  "/processing/results/181eb796-509b-4b34-bdde-4fddb8f5fb70/split-pdfs/0",
  "/processing/results/181eb796-509b-4b34-bdde-4fddb8f5fb70/split-pdfs/1"
]
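The links are relative paths, so a client needs to resolve them against the API host before downloading. The sketch below does this with the standard library; the base URL is a placeholder and absolute_links is a hypothetical helper.

```python
from typing import List
from urllib.parse import urljoin

BASE_URL = "https://api.example.com"  # placeholder, replace with your actual API host


def absolute_links(base_url: str, links: List[str]) -> List[str]:
    """Resolve each relative split-PDF link against the API base URL."""
    return [urljoin(base_url, link) for link in links]


links = [
    "/processing/results/181eb796-509b-4b34-bdde-4fddb8f5fb70/split-pdfs/0",
    "/processing/results/181eb796-509b-4b34-bdde-4fddb8f5fb70/split-pdfs/1",
]
urls = absolute_links(BASE_URL, links)
```

Note that urljoin replaces any path component of the base URL when the link starts with a slash, so a base URL with a path prefix would need to be handled separately.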