Overview of How FormXtract Works

FormXtract is an API-based solution for extracting data from completed forms.

The process uses AI-driven technologies to read the data off the form and employs humans to validate, correct or reject any data to which the machine assigned a low confidence score. The resulting data is up to 99.9% accurate and context-rich, so the customer’s system can ingest the data and perform work without involving any humans in the process.

The animation below shows how FormXtract takes in a submitted form and reads each field to extract not only the name and value of the field, but also additional metadata that can be used to display the data, make business decisions and more.

FormXtract Steps

Fill Out A Form - A customer’s user or client fills out a form and submits it back to the customer for processing.
Submit To FormXtract - The customer’s processing system takes the electronic copy of the submitted form (or scans it to TIF or PDF) and submits it to the FormXtract API.
Match Pages - FormXtract then matches the submitted document to the Quik! Forms Library to determine the company, form, page and fields to extract.
Extract Data - The document is then passed through several AI-driven processes that read the data.
1. Every character read is given a confidence score. When that score is too low, the data is verified, edited or rejected by humans to reach up to 99.9% accuracy.
Store Data - The extracted data is stored in the Quik! Vault, a hyper-secure storage mechanism.
Notify Customer - A webhook then notifies the customer system that their submitted document has been processed.
Retrieve Data - The customer calls another FormXtract API to download the extracted data.
Transform Data - The customer system transforms the data and puts it into the customer’s systems.
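
The confidence check in the Extract Data step can be illustrated with a minimal sketch. This is not FormXtract's actual implementation: the threshold value, the `CharRead` type and the `route_field` function are hypothetical names chosen for illustration, and the real scoring scale and cut-off are not documented here.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the real threshold used by FormXtract is not documented.
REVIEW_THRESHOLD = 0.90


@dataclass
class CharRead:
    """One character read from the form, with an assumed 0.0-1.0 confidence."""
    char: str
    confidence: float


def needs_human_review(reads, threshold=REVIEW_THRESHOLD):
    """Return True if any character's confidence falls below the threshold."""
    return any(r.confidence < threshold for r in reads)


def route_field(name, reads, threshold=REVIEW_THRESHOLD):
    """Build the field value and mark it as accepted or needing human review."""
    value = "".join(r.char for r in reads)
    status = "needs_review" if needs_human_review(reads, threshold) else "accepted"
    return {"name": name, "value": value, "status": status}


# Example: a cleanly printed field and a smudged handwritten one.
clean = [CharRead("4", 0.99), CharRead("2", 0.97)]
smudged = [CharRead("J", 0.98), CharRead("o", 0.55), CharRead("e", 0.93)]

print(route_field("Age", clean))
print(route_field("FirstName", smudged))
```

In this sketch a single low-scoring character is enough to send the field to review; a real system could just as well route only the uncertain characters.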

Human Validation of Data

Human validation is designed so that reviewers never see complete or identifiable data, and it ensures accuracy where machine reading falls short.

When data cannot be reliably read by machines, even with the help of AI (artificial intelligence), FormXtract has a human review it. The person receives only a small portion of the actual data: 2-4 characters of information. The data is split into pieces to ensure that personally identifiable information is not shared with anyone. In addition, the reviewer has no context for the data: they do not know what the data represents or what the value means. The reviewer’s sole job is to look at a value and either:

validate that the machine read the value correctly,
fix the value, or
reject the value as indecipherable.

Data is sent to a human for validation only when it is hard to read. This typically means the data on the form was handwritten, the document has artifacts that obscure the value (e.g. a scan done by a dirty scanner), or the value could not be read for some other reason (e.g. coffee spilled on the document).
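
The splitting and review outcomes described above can be sketched as follows. This is an illustrative assumption, not FormXtract's actual code: the function names, the decision labels and the fixed chunk size are all hypothetical, and the source states only that reviewers see 2-4 characters at a time.

```python
def split_for_review(value, chunk_size=3):
    """Split a value into pieces of 2-4 characters so no reviewer sees it whole.

    The final piece may be shorter than chunk_size.
    """
    if not 2 <= chunk_size <= 4:
        raise ValueError("chunk_size must be between 2 and 4")
    return [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)]


def apply_review(chunk, decision, corrected=None):
    """Apply one reviewer decision: 'validate', 'fix' or 'reject' (hypothetical labels)."""
    if decision == "validate":
        return chunk            # the machine read it correctly
    if decision == "fix":
        if corrected is None:
            raise ValueError("a fix needs a corrected value")
        return corrected        # the reviewer supplies the right characters
    if decision == "reject":
        return None             # indecipherable
    raise ValueError(f"unknown decision: {decision}")


def reassemble(results):
    """Join reviewed pieces; the whole value is rejected if any piece was."""
    if any(r is None for r in results):
        return None
    return "".join(results)


# Example: the machine misread a zero as the letter O in the second piece.
chunks = split_for_review("A1B2C3O4E5", chunk_size=4)
reviewed = [
    apply_review(chunks[0], "validate"),
    apply_review(chunks[1], "fix", corrected="C304"),
    apply_review(chunks[2], "validate"),
]
print(chunks)
print(reassemble(reviewed))
```

Each piece is handled independently here, which mirrors the idea that a reviewer works without knowing what the full value is or which field it belongs to.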