Advanced HTTP response handling#
HTTP response codes#
In most cases the POST /processing/{workflow} request responds with an HTTP code 201 - Created.
This success code indicates that the document processing has been completed and the results are delivered immediately
within the response body. However, there are also other possible responses that may deserve advanced handling
within your application.
HTTP Code 202 - Accepted#
This HTTP response code indicates that the document processing has been initiated but did not finish
early enough to provide the results immediately within the same request. In this case the response
body does not contain any processing results, only the two fields processing_id and url:
- processing_id contains a unique identifier for the document processing instance, which continues to be processed in the background. This identifier can be used to retrieve the finished results object later via the GET /processing/results/{processing_id} endpoint.
- url contains the URL pointing to the GET /processing/results/{processing_id} endpoint, already including the aforementioned processing_id for convenience.
A call to the GET /processing/results/{processing_id} endpoint behaves analogously to the
initial POST /processing/{workflow} endpoint:
- If the processing again does not finish within the long polling timeout, it returns the same response again with status code 202.
- If the processing finishes in time, it will return the exact same response as the initial request would have provided for the workflow.
Hint
This endpoint returns the same response format as the POST /processing/{workflow_key} endpoint, i.e.
the relevant processing results packed into a single large JSON object.
However, we also provide endpoints for (re-)fetching individual processing results and additional workflow artifacts.
Implementation examples#
import urllib.parse

import requests

headers = {"Accept": "application/json", "Authorization": "ApiKey " + YOUR_API_KEY_SECRET}
params = {}
url = "https://api.natif.ai/processing/{workflow}?{params}".format(
    workflow=WORKFLOW_IDENTIFIER,
    params=urllib.parse.urlencode(params)
)

# Submit the document; the request blocks until the results are ready or the long polling timeout is hit
with open(FILE_PATH, 'rb') as file:
    response = requests.post(url, headers=headers, files={"file": file})

while response.status_code == 202:  # long polling timed out, processing continues in the background
    processing_id = response.json()['processing_id']
    # Poll the GET /processing/results/{processing_id} endpoint described above
    retry_url = "https://api.natif.ai/processing/results/{processing_id}?{params}".format(
        processing_id=processing_id,
        params=urllib.parse.urlencode(params)
    )
    response = requests.get(retry_url, headers=headers)
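As a variation on the polling loop above, the following sketch uses the url field of the 202 response instead of rebuilding the endpoint path, and stops after a fixed number of polls. The helper names results_url and poll_until_finished, the max_polls limit and the handling of a relative url value are assumptions made for illustration; query parameters are left out for brevity.

```python
from urllib.parse import urljoin

API_BASE = "https://api.natif.ai"  # base URL as used in the example above


def results_url(accepted_body, api_base=API_BASE):
    """Build the polling URL from the JSON body of a 202 response."""
    url = accepted_body.get("url")
    if url:
        # Assumption: `url` may be absolute or relative; urljoin handles both.
        return urljoin(api_base + "/", url)
    # Fall back to the documented results endpoint.
    return "{}/processing/results/{}".format(api_base, accepted_body["processing_id"])


def poll_until_finished(response, http_get, headers, max_polls=20):
    """Follow 202 responses until a different status code is returned.

    `http_get` is any callable that can be called like `requests.get`.
    """
    polls = 0
    while response.status_code == 202:
        if polls >= max_polls:
            raise TimeoutError("still processing after {} polls".format(max_polls))
        polls += 1
        response = http_get(results_url(response.json()), headers=headers)
    return response
```

With the requests library this could be called as poll_until_finished(response, requests.get, headers) right after the initial POST.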
HTTP Code 429 - Too Many Requests#
Hint
For details about the default API rate-limits see this page.
If the endpoint returns this HTTP client error code, one of the specified rate limits
for the selected workflow has been reached and the current request was therefore temporarily rejected.
In this case the response contains a Retry-After header, which indicates the number of seconds before the
next request will be accepted. You may use this information to implement automated request
back-offs in your software, similar to the following examples.
Implementation examples#
import time
import urllib.parse

import requests

headers = {"Accept": "application/json", "Authorization": "ApiKey " + YOUR_API_KEY_SECRET}
params = {}
url = "https://api.natif.ai/processing/{workflow}?{params}".format(
    workflow=WORKFLOW_IDENTIFIER,
    params=urllib.parse.urlencode(params)
)

with open(FILE_PATH, 'rb') as file:
    response = requests.post(url, headers=headers, files={"file": file})
    while response.status_code == 429:  # too many requests, i.e. hit a rate limit: back off for a while and retry
        suggested_wait_seconds = int(response.headers.get('Retry-After', 1))
        time.sleep(suggested_wait_seconds)
        file.seek(0)  # rewind, otherwise the retry would upload an empty file
        response = requests.post(url, headers=headers, files={"file": file})
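The loop above trusts the Retry-After header to be a plain number of seconds. A slightly more defensive sketch of the header parsing is shown below. The helper name retry_after_seconds, the default of 1 second and the cap of 60 seconds are assumptions for illustration and not values defined by the API. The HTTP specification also allows Retry-After to carry an HTTP date, so the sketch accepts both forms.

```python
import datetime
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, default=1.0, cap=60.0, now=None):
    """Turn a Retry-After header value into a bounded wait time in seconds."""
    if header_value is None:
        return default
    text = header_value.strip()
    if text.isdigit():
        # Delay given as a number of seconds, as described above.
        seconds = float(text)
    else:
        # Otherwise try to read the value as an HTTP date.
        try:
            when = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            return default
        if when.tzinfo is None:
            when = when.replace(tzinfo=datetime.timezone.utc)
        now = now or datetime.datetime.now(datetime.timezone.utc)
        seconds = (when - now).total_seconds()
    # Never wait a negative time and never longer than the cap.
    return max(0.0, min(seconds, cap))
```

In the retry loop this would replace the int(...) conversion, e.g. time.sleep(retry_after_seconds(response.headers.get('Retry-After'))).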
HTTP Codes 400, 401 and 403#
Info
If you still use the deprecated token-based authentication, please refer to the token documentation.
In this case something about your request authorization seems to be wrong. Please make sure you use a
valid Authorization: ApiKey <...> header.
Invalid authentication can also be caused by an API key that has been revoked or automatically expired. This can be checked in the API Keys section of the natif.ai API Hub.
HTTP Code 404 - Not Found#
This error may be caused by specifying a non-existent workflow key in the endpoint path, or just by a typo in
the endpoint base URL.
HTTP Codes 30X#
Redirect codes may occur due to a missing or extra trailing slash / at the end of the endpoint URL. If the
HTTP library of your choice does not follow redirects automatically, please try adding or removing
the trailing slash / to/from the endpoint URL. In any case it makes sense to avoid such redirects to reduce
the number of HTTP requests issued by your application.
HTTP Code 422 - Validation Error#
At least one of the parameters passed to the endpoint could not be parsed or validated successfully. Please refer to
the error details in the application/json response to find out which parameter is affected and how.
HTTP Code 50X - Internal Server Error#
Potential Bug
Shoot! Something went wrong on our side! If this happens regularly or systematically, please get in touch and report a bug, describing the process that led to the error code, so we can have a look at it.
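The response codes covered in this section can be summarized in a small dispatch helper. This is only a sketch: the function name classify_status and the action labels are made up for illustration, and status codes that this section does not mention are deliberately reported as "unhandled".

```python
def classify_status(status_code):
    """Map an HTTP status code to the handling suggested in this section."""
    if status_code == 201:
        return "done"                      # results are in the response body
    if status_code == 202:
        return "poll_results"              # fetch results via processing_id / url
    if status_code == 429:
        return "back_off_and_retry"        # wait as indicated by Retry-After
    if status_code in (400, 401, 403):
        return "check_authorization"       # verify the Authorization: ApiKey header
    if status_code == 404:
        return "check_workflow_or_url"     # workflow key or base URL is wrong
    if status_code == 422:
        return "check_parameters"          # see error details in the JSON body
    if 300 <= status_code < 400:
        return "check_trailing_slash"      # add or remove the trailing slash
    if 500 <= status_code < 600:
        return "report_bug_if_persistent"  # server-side problem
    return "unhandled"
```

A caller could branch on the returned label, for example to decide between polling, sleeping, or raising an error.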
Partial extractions objects in the response JSON#
Note
The AI models used for document extraction work probabilistically, which means that it is impossible to guarantee 100% certainty and correctness of extracted results - nonetheless we take care to get as close to 100% as possible.
Oftentimes entire subtrees of the extractions result objects are optional, i.e. they are only included in the
JSON response body if they
- were found in the submitted document (documents in the wild vary a lot, and not all documents of the same kind contain the exact same set of information)
- were found in the submitted document with sufficient certainty
This means that anywhere on the path to a nested value one needs to anticipate and handle null values
when using the /processing/<workflow> response JSON object.
The artificial example below illustrates this with three different extractions variations.
Many programming languages have their own idioms to handle optional values along object paths:
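In Python, one possible idiom is a small helper that walks the path step by step and returns a default as soon as a key is missing or a value is null. The helper name get_path and all field names in the three sample bodies (vendor, name, value) are hypothetical and only serve as an illustration; real workflows define their own extraction schema.

```python
def get_path(obj, *keys, default=None):
    """Follow `keys` through nested dicts and lists, tolerating gaps and nulls."""
    current = obj
    for key in keys:
        if isinstance(current, dict):
            current = current.get(key)
        elif (isinstance(current, list) and isinstance(key, int)
              and -len(current) <= key < len(current)):
            current = current[key]
        else:
            return default
        if current is None:
            # Missing key or explicit null: stop instead of raising an error.
            return default
    return current


# Three hypothetical variations of the same response
complete = {"extractions": {"vendor": {"name": {"value": "ACME Ltd."}}}}
null_subtree = {"extractions": {"vendor": None}}
missing_subtree = {"extractions": {}}

for body in (complete, null_subtree, missing_subtree):
    print(get_path(body, "extractions", "vendor", "name", "value", default="<not found>"))
```

Only the first variation yields the nested value; the other two fall back to the default without raising a KeyError or TypeError.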