(Re-)Fetching Processing Results and Artifacts#

Although the POST /processing/{workflow} endpoint already returns the processing results in its response, you may need to retrieve individual results of previously submitted documents again later on.

For this purpose, a set of additional GET /processing/results/{processing_id}/... endpoints gives access to each available result of a successful processing request individually.

Hint

The endpoints on this page return single processing results and other workflow artifacts. If you want to re-retrieve the full response as returned by the POST /processing/{workflow_key} endpoint, please have a look at this section.

Common Signature#

Parameters#

All of the available sub-result endpoints require the following common mandatory parameter:

Field Type Description
{processing_id} UUID Passed as part of the URL path; identifies a successful processing request. This is the same processing_id that is returned in the response body of POST /processing/{workflow}.
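
As a minimal sketch of how a client might assemble these paths, the helper below validates the id as a UUID and appends the name of the sub-result. The function name `result_path` and the constant `RESULTS_PREFIX` are illustrative assumptions and not part of the API.

```python
import uuid

# Common prefix of all sub-result endpoints described on this page.
RESULTS_PREFIX = "/processing/results"


def result_path(processing_id: str, sub_result: str) -> str:
    """Build the path of a sub-result endpoint (hypothetical helper)."""
    # Parsing as a UUID rejects malformed ids before any request is sent.
    parsed = uuid.UUID(str(processing_id))
    # Strip stray slashes so that "ocr" and "/ocr" give the same path.
    return f"{RESULTS_PREFIX}/{parsed}/{sub_result.strip('/')}"
```

For example, `result_path(processing_id, "ocr")` yields the path of the OCR endpoint, while a malformed id raises a `ValueError`.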

Responses#

All of the available sub-result endpoints respond with the following HTTP status codes, consistent with POST /processing/{workflow}:

HTTP-Status-Code Description
200 Processing result is available and returned.
202 Processing request is still in progress.
401 Credentials could not be validated.
403 Credentials could not be authorized for this processing request.
404 Processing instance with provided id was not found.
422 Validation Error.
429 Temporary rate limit exceeded or uploaded document too large.
500 Processing request failed to complete.
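
The status codes above suggest a simple polling pattern: repeat the request while the answer is 202 and stop on 200 or on an error. The sketch below assumes a caller-supplied `fetch` callable that performs one GET request and returns a `(status, body)` tuple; the function name, the retry count, and the delay are illustrative assumptions. For simplicity, every code other than 200 and 202 is treated as terminal, although a real client may want to back off and retry on 429.

```python
import time
from typing import Callable, Tuple


def poll_result(
    fetch: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 10,
    delay_seconds: float = 1.0,
) -> bytes:
    """Call fetch() until it stops answering 202, then return the 200 body."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            # Processing result is available and returned.
            return body
        if status != 202:
            # 401, 403, 404, 422, 429 and 500 are treated as terminal here.
            raise RuntimeError(f"request failed with HTTP {status}")
        # 202: still in progress, wait before the next attempt.
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    raise TimeoutError(f"still in progress after {max_attempts} attempts")
```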

Hint

Both the semantics and the response schemas for all HTTP status codes above 200 are identical to those of POST /processing/{workflow}, for maximum consistency. On success (200), the response schema depends on the requested sub-result.

Hint

Please note that the GET /processing/results/{processing_id}/... endpoints described on this page differ slightly from POST /processing/{workflow}: they return HTTP 200 instead of HTTP 201 for the regular success response.
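
A client that shares response handling between submission and retrieval needs to account for this difference. The following sketch shows one way to do so; the helper name `is_success` is a hypothetical choice.

```python
# Regular success codes as described on this page:
# POST /processing/{workflow} answers 201, the GET result endpoints answer 200.
EXPECTED_SUCCESS = {"POST": 201, "GET": 200}


def is_success(method: str, status: int) -> bool:
    """Return True if status is the regular success code for the HTTP method."""
    return EXPECTED_SUCCESS.get(method.upper()) == status
```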

Individually Requestable Results#

OCR#

GET /processing/results/{processing_id}/ocr

Retrieves the OCR processing result.

Response Details#

On success, this endpoint returns an application/json document. For details about its shape, please refer to OCR Format.
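
As an illustration, the OCR result could be requested with Python's standard library as sketched below. The base URL and any authentication headers are assumptions that depend on your deployment and credentials; the function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def ocr_request(
    base_url: str, processing_id: str, headers: Optional[dict] = None
) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the OCR result."""
    url = f"{base_url.rstrip('/')}/processing/results/{processing_id}/ocr"
    # Caller-supplied headers (e.g. credentials) are merged with Accept.
    all_headers = {"Accept": "application/json", **(headers or {})}
    return urllib.request.Request(url, headers=all_headers, method="GET")


def fetch_ocr(base_url: str, processing_id: str, headers: Optional[dict] = None):
    """Send the request and parse the JSON body of a 200 response."""
    request = ocr_request(base_url, processing_id, headers)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        if response.status != 200:
            # For example 202: the processing request is still in progress.
            raise RuntimeError(f"no result yet, HTTP {response.status}")
        return json.loads(response.read().decode("utf-8"))
```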

HOCR#

GET /processing/results/{processing_id}/hocr

Retrieves the OCR information for a given processing instance, formatted as HTML. It is a visual representation of the same information as provided by the OCR result.

Response Details#

On success, this endpoint returns a text/html document.

Extractions#

GET /processing/results/{processing_id}/extractions

Retrieves the extractions processing result. The available extraction formats depend on the workflow that produced them and potentially also on the detected class of the submitted document. See also Extractions Format.

Response Details#

On success, this endpoint returns an application/json document. For details about its shape, please refer to Extractions Format.

List of Generated Page Images#

GET /processing/results/{processing_id}/page-images

Each document, independent of its original upload format, will usually be converted into a standard image format that is suitable for page-by-page processing by our AI. This endpoint retrieves the page-images processing result, which contains a list of paths from which each of these page images can be downloaded. See GET /processing/results/{processing_id}/page-images/{page_num}.

Response Details#

On success, this endpoint returns an application/json document of the following form:

Response JSON
{
  "pages": [
    "/processing/results/{processing_id}/page-images/1",
    "/processing/results/{processing_id}/page-images/2",
    ...
  ]
}
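
Given a response of the form shown above, the page numbers can be read from the last segment of each path. This is a minimal sketch; the function name `page_numbers` is a hypothetical choice.

```python
import json
from typing import List


def page_numbers(response_body: str) -> List[int]:
    """Extract the page numbers from a page-images response body."""
    pages = json.loads(response_body)["pages"]
    # The page number is the last path segment of each entry.
    return [int(path.rstrip("/").rsplit("/", 1)[-1]) for path in pages]
```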

Individual Page Images#

GET /processing/results/{processing_id}/page-images/{page_num}

Retrieves a single page image generated during processing, specified by the given page number. This endpoint corresponds to the download paths returned by GET /processing/results/{processing_id}/page-images.

Additional Parameters#

Field Type Description
{page_num} int Passed as part of the URL path; identifies the page number for which the page image should be retrieved.
width int (optional) Can be passed as a query parameter to scale the image down to the provided width while maintaining the aspect ratio. If omitted while height is provided, the image is scaled with respect to height.
height int (optional) Can be passed as a query parameter to scale the image down to the provided height while maintaining the aspect ratio. If omitted while width is provided, the image is scaled with respect to width.

Hint

If the given width and height don't match the original aspect ratio, the image will be scaled with respect to the larger dimension while keeping the aspect ratio.
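
The optional scaling parameters can be appended as a query string. The sketch below only builds the path and leaves the actual scaling to the server; the function name `page_image_path` is an illustrative assumption.

```python
from typing import Optional
from urllib.parse import urlencode


def page_image_path(
    processing_id: str,
    page_num: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> str:
    """Build the path of a page image, with optional width/height query."""
    path = f"/processing/results/{processing_id}/page-images/{page_num}"
    # Only send the dimensions that were actually provided.
    candidates = (("width", width), ("height", height))
    params = {name: value for name, value in candidates if value is not None}
    return f"{path}?{urlencode(params)}" if params else path
```

Passing only `width=200`, for instance, produces a path ending in `/page-images/1?width=200`.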

Response Details#

On success, this endpoint returns a file of Content-Type: image/*, representing the respective document page, as it has been used for further processing.

Thumbnail#

GET /processing/results/{processing_id}/thumbnail

Retrieves a thumbnail that is automatically generated from the first page image of the document.

Response Details#

On success, this endpoint returns a file of Content-Type: image/*.
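
Because the sub-result endpoints return different content types (application/json, text/html, or image/*), a generic client may want to decode the body based on the Content-Type header. The following is a sketch under the assumption that textual responses are UTF-8 encoded; the function name `decode_body` is hypothetical.

```python
import json


def decode_body(content_type: str, body: bytes):
    """Decode a successful response body according to its Content-Type."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        # e.g. OCR, extractions, page-images list
        return json.loads(body.decode("utf-8"))  # UTF-8 is an assumption
    if media_type == "text/html":
        # e.g. HOCR
        return body.decode("utf-8")  # UTF-8 is an assumption
    if media_type.startswith("image/"):
        # e.g. page images and thumbnail: keep the raw bytes
        return body
    raise ValueError(f"unexpected content type: {content_type!r}")
```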

Document Splitting#

GET /processing/results/{processing_id}/document-splitting

Retrieves the document splitting result. It is only available for workflows that implement document splitting.

Response Details#

On success, this endpoint returns an application/json document. For details about its shape, please refer to Document Splitting Format.

Sub PDFs#

GET /processing/results/{processing_id}/sub-pdfs

Retrieves the sub-pdfs result, which is generated as part of the document splitting process and provides URL maps for downloading the PDFs generated during the split. Like the document splitting result, it is only available for workflows that implement document splitting.

Individual Sub PDFs#

GET /processing/results/{processing_id}/sub-pdfs/{page_num}

Retrieves a sub PDF document generated during processing, specified by the given page number. This endpoint corresponds to the download paths returned by GET /processing/results/{processing_id}/sub-pdfs.

Response Details#

On success, this endpoint returns an application/json document. For details about its shape, please refer to Sub PDF Format.

More to come... 🚀#

Not yet implemented.

We're currently working on building additional results endpoints for more processing results and artifacts, e.g. PDF with OCR overlay! Stay tuned!