Invoice extraction

The Invoice extraction workflow performs OCR and extraction steps specifically suited to invoices.

It is invoked with the workflow identifier invoice_extraction at the processing endpoint, i.e. POST /processing/invoice_extraction.

Supported workflow parameters

Submit these as a configuration object in the parameters form field.

field | type | default
language | one of de (German) or en (English) | unset, i.e. multi-language (at the potential cost of OCR accuracy)
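
As a rough illustration, the request for this workflow could be assembled as shown below. This is a minimal sketch, not the official client: the base URL https://api.example.com is a placeholder, the helper name build_invoice_request is hypothetical, and serializing the configuration object as a JSON string is an assumption. Only the path /processing/invoice_extraction, the parameters form field, and the language values de and en come from this page.

```python
import json
from typing import Optional

BASE_URL = "https://api.example.com"  # placeholder host, not taken from this page
WORKFLOW = "invoice_extraction"
SUPPORTED_LANGUAGES = {"de", "en"}


def build_invoice_request(language: Optional[str] = None) -> dict:
    """Return method, URL and form data for a call to the processing endpoint."""
    if language is not None and language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"language must be one of {sorted(SUPPORTED_LANGUAGES)} or left unset"
        )

    # Leaving language unset selects multi-language processing.
    config = {} if language is None else {"language": language}

    return {
        "method": "POST",
        "url": f"{BASE_URL}/processing/{WORKFLOW}",
        # Assumption: the configuration object travels as a JSON string
        # in the "parameters" form field.
        "data": {"parameters": json.dumps(config)},
    }


request = build_invoice_request("de")
print(request["url"])
print(request["data"]["parameters"])
```

The file upload and authentication are left out on purpose, since this page does not describe them.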

Supported return values

Return values are included in the response JSON automatically, unless otherwise specified via include query parameters.

Credit cost

A Freemium account allows up to 100 pages per month. The cost is 50 credits per page and 75 credits per document.

Note

A document is usually a bundle of 10 pages.
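
The two rates can be combined into a quick estimate. The sketch below assumes the rates are additive (every page costs 50 credits and every document costs a further 75), which this page does not state explicitly. The helper name estimate_credits is hypothetical.

```python
CREDITS_PER_PAGE = 50
CREDITS_PER_DOCUMENT = 75


def estimate_credits(pages: int, documents: int = 1) -> int:
    """Estimate the credit cost, assuming page and document costs add up."""
    if pages < 0 or documents < 0:
        raise ValueError("pages and documents must be non-negative")
    return pages * CREDITS_PER_PAGE + documents * CREDITS_PER_DOCUMENT


# One document of 10 pages: 10 * 50 + 1 * 75 = 575 credits
# (valid only under the additive assumption above).
print(estimate_credits(pages=10, documents=1))
```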

Extraction formats

The value of the extractions key for this workflow has the following form:

For document_type = invoice

All supported fields
  • schema_version: integer (possible values: [2])
  • document_type: string (possible values: ['invoice'])
  • customer: Customer
    • name: StringExtraction
    • address: StringExtraction
    • address_struct: Address
      • address_line_1: StringExtraction
      • address_line_2: StringExtraction
      • city: StringExtraction
      • zip: StringExtraction
      • country: CountryExtraction
    • vat_id: StringExtraction
    • tax_number: StringExtraction
    • eori_number: StringExtraction
    • customer_number: StringExtraction
    • banking_information: array of BankingInformation
      • validation_problem: boolean (deprecated)
      • note: string (deprecated)
      • confidence: number (deprecated)
      • bbox_refs: array of Reference
        • page_num: integer
        • bbox_id: integer
      • iban: StringExtraction
      • bic: StringExtraction
  • vendor: Vendor
    • name: StringExtraction
    • address: StringExtraction
    • address_struct: Address
      • address_line_1: StringExtraction
      • address_line_2: StringExtraction
      • city: StringExtraction
      • zip: StringExtraction
      • country: CountryExtraction
    • vat_id: StringExtraction
    • tax_number: StringExtraction
    • eori_number: StringExtraction
    • register_id: StringExtraction
    • banking_information: array of BankingInformation
      • validation_problem: boolean (deprecated)
      • note: string (deprecated)
      • confidence: number (deprecated)
      • bbox_refs: array of Reference
        • page_num: integer
        • bbox_id: integer
      • iban: StringExtraction
      • bic: StringExtraction
    • phone: StringExtraction
    • fax: StringExtraction
    • url: StringExtraction
    • e_mail: StringExtraction
  • currency: CurrencyExtraction
  • date: DateExtraction
  • due_date: DateExtraction
  • service_period: one of:
    • DateExtraction
    • Period
      • start_date: DateExtraction
      • end_date: DateExtraction
  • number: StringExtraction
  • order_numbers: array of StringExtraction
  • order_confirmation_numbers: array of StringExtraction
  • delivery_note_numbers: array of StringExtraction
  • payment_methods: array of PaymentMethodExtraction
  • net_amount: FloatExtraction
  • tax_amount: FloatExtraction
  • additional_cost: FloatExtraction
  • gross_amount: FloatExtraction
  • tax_calculation: array of TaxCalculation
    • tax_code: StringExtraction
    • tax_rate: FloatExtraction
    • tax_amount: FloatExtraction
    • net_amount: FloatExtraction
    • gross_amount: FloatExtraction
  • early_payment_benefit: array of EarlyPaymentBenefit
    • early_payment_date: DateExtraction
    • discount_percentage: FloatExtraction (only returned if it is written on the document, i.e. it is not calculated)
    • discount_amount: FloatExtraction (deprecated; use new_amount instead)
    • new_amount: FloatExtraction
  • line_item: array of LineItem
    • pos_id: StringExtraction
    • article_id: StringExtraction
    • ean: StringExtraction
    • description: StringExtraction
    • quantity: FloatExtraction
    • unit_of_measure: StringExtraction
    • service_period: one of:
      • DateExtraction
      • Period
        • start_date: DateExtraction
        • end_date: DateExtraction
    • discount: FloatExtraction
    • additional_cost: FloatExtraction
    • tax_rate: FloatExtraction
    • tax_code: StringExtraction
    • unit_price: FloatExtraction
    • total_price: FloatExtraction
    • order_number: StringExtraction
    • order_confirmation_number: StringExtraction
    • delivery_note_number: StringExtraction
  • number_of_line_items: FloatExtraction
  • due_payable_amount: FloatExtraction
  • discount_amount: FloatExtraction
  • payment_reference: StringExtraction
  • barcodes: array of StringExtraction
  • key_value_pairs: array of KeyValuePair
    • key: StringExtraction
    • value: StringExtraction
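
Because service_period can be either a single DateExtraction or a Period, client code has to tell the two forms apart. The following is a minimal sketch, assuming the extractions value has been parsed into Python dicts and that a Period can be recognized by its start_date / end_date keys. The helper name split_service_period is hypothetical, and the sample objects are opaque placeholders rather than real DateExtraction payloads.

```python
from typing import Any, Optional, Tuple


def split_service_period(service_period: Optional[dict]) -> Tuple[Any, Any]:
    """Return (start, end) extraction objects for either service_period form."""
    if service_period is None:
        return (None, None)
    if "start_date" in service_period or "end_date" in service_period:
        # Period form: two separate DateExtraction objects.
        return (service_period.get("start_date"), service_period.get("end_date"))
    # Single DateExtraction: use the same object as start and end.
    return (service_period, service_period)


# Placeholder objects standing in for DateExtraction values.
start = {"placeholder": "start DateExtraction"}
end = {"placeholder": "end DateExtraction"}

print(split_service_period({"start_date": start, "end_date": end}))
print(split_service_period(start))
```

The same helper applies to the service_period of each entry in line_item, since it uses the same one-of form.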

Note

For the structure of each extraction object, see Extracted Values. To access individual processing results or artifacts, see Fetch Processing Results and Artifacts.

Important

The structure of extractions might contain optional paths. See this and this part of the documentation.
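
Since paths may be missing, it can help to read nested fields defensively instead of indexing directly. The helper below is one possible approach, not part of any official client: get_path is a hypothetical name, and the sample data only mimics the shape listed above with placeholder values.

```python
from typing import Any, Sequence, Union

PathStep = Union[str, int]


def get_path(data: Any, path: Sequence[PathStep], default: Any = None) -> Any:
    """Follow dict keys and list indices, returning default if any step is missing."""
    current = data
    for step in path:
        if isinstance(current, dict) and isinstance(step, str):
            if step not in current:
                return default
            current = current[step]
        elif isinstance(current, list) and isinstance(step, int):
            if not -len(current) <= step < len(current):
                return default
            current = current[step]
        else:
            # Covers None values and type mismatches along the path.
            return default
    return default if current is None else current


# Placeholder data in the shape of the extractions value.
extractions = {
    "document_type": "invoice",
    "vendor": {
        "address_struct": None,
        "banking_information": [{"iban": {"placeholder": "IBAN extraction"}}],
    },
}

print(get_path(extractions, ["vendor", "address_struct", "city"]))
print(get_path(extractions, ["vendor", "banking_information", 0, "iban"]))
print(get_path(extractions, ["customer", "name"], default="n/a"))
```

Returning a default rather than raising keeps downstream code free of repeated key checks, at the cost of hiding whether a path was absent or explicitly empty.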