
Train-your-own Extraction Model

Our API Hub offers a wide variety of pre-built extraction workflows. However, your specific business problem may not be included. With our train-your-own extraction capabilities, you can solve any extraction use case outside of our provided models. You have complete control over the entities you want to extract and can get a model up and running in no time.

Building your own extraction model takes four steps in our wizard:

  • Step 1: Define the entities you want to extract from your documents.
  • Step 2: Upload your documents. You can specify individual templates, or upload all your documents in one place if they don't follow specific templates.
  • Step 3: Annotate your documents by selecting an entity on the left and assigning it to the text boxes on the right.
  • Step 4: Train your model and explore the analysis for each entity to see what can be improved.

You can check this blog post to see the steps in detail.

Once training is complete, the workflow returns extractions and OCR results. Your workflow identifier is a UUID that is generated automatically during the creation process. You can call your workflow like any other workflow at the processing endpoint: POST /processing/{your_workflow_identifier}.
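As a rough illustration of calling the processing endpoint, the sketch below builds (but does not send) a POST request for a trained workflow. Only the path POST /processing/{your_workflow_identifier} and the include query parameter come from this page. The base URL, the bearer-token header, the PDF content type, and the helper name build_processing_request are assumptions made for the example, so check them against your account's API reference.

```python
import uuid
import urllib.request
from urllib.parse import urlencode

# Hypothetical base URL; replace it with the one for your account.
BASE_URL = "https://api.example.com"


def build_processing_request(workflow_id, document, api_key, include=None):
    """Build (but do not send) a POST /processing/{workflow_id} request."""
    # The workflow identifier is a UUID, so reject malformed values early.
    workflow_id = str(uuid.UUID(workflow_id))
    url = f"{BASE_URL}/processing/{workflow_id}"
    if include:
        # Assumption: `include` is passed as a repeated query parameter.
        url += "?" + urlencode([("include", name) for name in include])
    return urllib.request.Request(
        url,
        data=document,
        method="POST",
        headers={
            # Assumption: bearer-token auth and a raw PDF request body.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/pdf",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen and parse the response body as JSON. Validating the identifier with uuid.UUID catches copy-paste mistakes before any network call is made.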

Supported return values

These values are included automatically in the response JSON unless otherwise specified via the include query parameters.

Credit cost

A Freemium account allows up to 100 pages per month. The cost is 80 credits per page and 120 credits per document.

Note

A document is usually a bundle of 10 pages.
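To make the pricing concrete, here is a small estimate under one reading of the rates above: that the per-page and per-document charges are added together. That reading is an assumption, as is the helper name estimate_credits, so treat the result as an approximation rather than a billing rule.

```python
CREDITS_PER_PAGE = 80
CREDITS_PER_DOCUMENT = 120


def estimate_credits(pages, documents):
    """Estimate credits, assuming both charges apply and are summed."""
    # Assumption: page and document charges are added together.
    return pages * CREDITS_PER_PAGE + documents * CREDITS_PER_DOCUMENT


# One typical document of 10 pages: 10 * 80 + 1 * 120
print(estimate_credits(pages=10, documents=1))  # 920
```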

Extractions format

The value of the extractions key for this workflow is based on your specifications in the "Define Fields" wizard step.

Note

For a reference of the structure of each of the extractions objects see Extracted Values. Also, for accessing individual processing results or artifacts, have a look at Fetch Processing Results and Artifacts.

Important

The structure of extractions might contain optional paths. See this and this part of the documentation.
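Because some paths in extractions may be absent, client code should read them defensively. The sketch below shows one way to do that with a small helper that walks nested JSON and falls back to a default. The helper name get_path and the sample field names (invoice_number, due_date, value) are hypothetical. Real keys depend on what you configured in the "Define Fields" step and on the structure described in Extracted Values.

```python
def get_path(data, path, default=None):
    """Follow `path` (dict keys or list indexes) through nested JSON data."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # The path is optional or missing: return the fallback instead.
            return default
    return current


# Hypothetical response; real field names come from your own field definitions.
response = {"extractions": {"invoice_number": {"value": "A-17"}}}

print(get_path(response, ["extractions", "invoice_number", "value"]))   # A-17
print(get_path(response, ["extractions", "due_date", "value"], "n/a"))  # n/a
```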