Documents fall into three main types, according to how their data is structured:
Structured data: fixed-format documents like application forms and questionnaires. The layout often includes graphical elements such as boxes, checkmarks, and separators, but their positions are fixed. Here, simple extraction that reads each field from its known position is sufficient.
Semi-structured data: multi-variant documents with flexible layouts. There’s still some visual structure, such as boxes, but the same document type comes in several layout variants. For example, invoices from different vendors each follow their own layout. This data type requires an IDP solution that can quickly learn new formats and field positions.
Unstructured data: documents with plain, natural-language text. In this case, there’s little or no visual organization of the text, and whole blocks of text must be read and understood before information can be extracted. Because this is the most complex data type, it requires segmentation, entity extraction, and large volumes of data samples. Intelligent document solutions thrive on this type of data.
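To make the contrast between the three types concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular IDP product works: the template, field labels, sample texts, and function names are all hypothetical, and the regular expressions stand in for the trained models a real solution would use for semi-structured and unstructured documents.

```python
import re

# --- Structured data: every field sits at a known, fixed position. ---
# Hypothetical template: field -> (line index, start column, end column).
FORM_TEMPLATE = {
    "name": (0, 6, 26),
    "date": (1, 6, 16),
}

def extract_fixed(lines, template):
    """Read each field straight from its fixed position in the form."""
    return {
        field: lines[row][start:end].strip()
        for field, (row, start, end) in template.items()
    }

# --- Semi-structured data: same fields, but labels and positions vary. ---
# Hypothetical label variants; a real system would learn these per vendor.
INVOICE_LABELS = {
    "invoice_number": re.compile(
        r"(?:Invoice\s*(?:No\.?|#|Number)|Inv\.)\s*:?\s*(\S+)", re.I
    ),
    "total": re.compile(
        r"(?:Total|Amount Due)\s*:?\s*\$?([\d,]+\.\d{2})", re.I
    ),
}

def extract_by_label(text, labels):
    """Find each field by searching for its label, wherever it appears."""
    result = {}
    for field, pattern in labels.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result

# --- Unstructured data: segment the text, then look for entities. ---
AMOUNT = re.compile(r"\$[\d,]+(?:\.\d{2})?")

def segment_and_extract(text):
    """Split text into sentences and collect money amounts per sentence.

    The regex is a toy stand-in for a trained entity-extraction model.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(sentence, AMOUNT.findall(sentence)) for sentence in sentences]

# Hypothetical sample inputs, one per document type.
form_lines = ["Name: Jane Doe            ", "Date: 2024-01-15"]
vendor_a = "Invoice No: A-1001\nTotal: $1,250.00"
vendor_b = "Inv. 77812   Amount Due 980.50"
contract = (
    "The tenant shall pay $1,200 monthly. "
    "A deposit of $2,400.00 is due at signing."
)

form_fields = extract_fixed(form_lines, FORM_TEMPLATE)
invoice_a = extract_by_label(vendor_a, INVOICE_LABELS)
invoice_b = extract_by_label(vendor_b, INVOICE_LABELS)
contract_entities = segment_and_extract(contract)
```

The effort grows with each type: the structured case needs only a lookup table of positions, the semi-structured case needs a set of label variants that has to expand with every new vendor, and the unstructured case needs both segmentation and entity recognition, which is where learned models and large sample volumes come in.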