Synchronous vs. Asynchronous Usage
A note about "asynchronous processing"
By default, the POST /processing/{workflow} API implements "long-polling": it waits for the document extraction
to finish (up to a maximum wait time) and responds directly with the processing result as soon as it becomes available.
This makes API integration easy in most circumstances, because processing a document requires only a single
HTTP request in the common case. If this "synchronous" approach does not fit your business needs, you can also
use the same API in an "asynchronous" fashion.
Setting the additional query parameter wait_for=0 reduces the maximum wait time of the POST /processing/{workflow}
request to zero, so the endpoint does not wait for the result at all and responds immediately with an
HTTP 202 response. The processing_id returned in this response can then be used to fetch the processing result(s)
asynchronously at any later time via the GET /processing/results/{processing_id} set of endpoints, as described
in the previous section.
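The difference between the two modes comes down to one query parameter, which the following minimal sketch makes explicit. The helper names build_processing_url and is_still_processing, the API_BASE constant, and the workflow name "my_workflow" are hypothetical and used only for illustration; the base URL is the one that appears in this page's own example.

```python
import urllib.parse

# Base URL as it appears in this page's example (assumption: same host for all calls)
API_BASE = "https://api.natif.ai"


def build_processing_url(workflow, wait_for=None):
    """Build the URL for POST /processing/{workflow}.

    wait_for=None leaves the default long-polling ("synchronous") behavior untouched;
    wait_for=0 asks the endpoint to answer immediately ("asynchronous").
    """
    url = "{base}/processing/{workflow}".format(
        base=API_BASE,
        workflow=urllib.parse.quote(workflow, safe=""),
    )
    if wait_for is not None:
        url += "?" + urllib.parse.urlencode({"wait_for": wait_for})
    return url


def is_still_processing(status_code):
    """An HTTP 202 response means the result is not available yet."""
    return status_code == 202


sync_url = build_processing_url("my_workflow")               # hypothetical workflow name
async_url = build_processing_url("my_workflow", wait_for=0)  # immediate HTTP 202 expected
```

Keeping the parameter optional lets the same calling code switch between both modes without touching the rest of the request.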
Asynchronous implementation example:
import time
import urllib.parse
import requests
headers = {"Accept": "application/json", "Authorization": "ApiKey " + YOUR_API_KEY_SECRET}
params = {"wait_for": 0}
post_url = "https://api.natif.ai/processing/{workflow}?{params}".format(
    workflow=WORKFLOW_IDENTIFIER,
    params=urllib.parse.urlencode(params)
)
with open(FILE_PATH, 'rb') as file:
    post_response = requests.post(post_url, headers=headers, files={"file": file})
post_response.raise_for_status()
# we expect an HTTP 202, as the processing will not finish in zero time
processing_id = post_response.json()['processing_id']
...
del params["wait_for"]  # now we iteratively query the results with the default waiting time
# GET /processing/results/{processing_id}, the endpoint named in the text above
fetch_url = "https://api.natif.ai/processing/results/{processing_id}?{params}".format(
    processing_id=processing_id,
    params=urllib.parse.urlencode(params)
)
fetch_response = requests.get(fetch_url, headers=headers)
while fetch_response.status_code == 202:  # retry logic, just in case the processing takes exceptionally long
    fetch_response = requests.get(fetch_url, headers=headers)
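A retry loop that repeats for as long as the server answers with HTTP 202 has no upper bound. The following sketch shows one possible way to cap the number of attempts. The function poll_until_done and its parameters max_attempts and pause_seconds are hypothetical names introduced here, not part of the API, and the default values are arbitrary.

```python
import time


def poll_until_done(fetch, max_attempts=10, pause_seconds=0.0):
    """Call fetch() until it returns a status other than 202, or give up.

    fetch is any zero-argument callable returning an object with a
    status_code attribute, for example:
        lambda: requests.get(fetch_url, headers=headers)
    """
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 202:
            return response  # finished (or failed); the caller inspects the response
        # Optional extra pause between attempts, skipped after the last one
        if pause_seconds and attempt < max_attempts - 1:
            time.sleep(pause_seconds)
    raise TimeoutError(
        "result not available after {} attempts".format(max_attempts)
    )
```

Because the GET request itself waits for the result when no wait_for is given, the extra pause defaults to zero; it mainly helps when polling with wait_for=0.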