Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner.

Examples of unstructured data:

  • Raw images
  • Text data, without formatting
  • Text data that contains numeric information

Unstructured data is very difficult to analyze with traditional computer programs, and data that cannot be searched and analyzed is of little practical use.

It is commonly estimated that around 80-90% of all potentially usable business information originates in unstructured form.

Examples of data structuring:

  • Raw images (that contain image data) -> can be converted into standardized images that can be identified and classified
  • Raw images (that contain text) -> can be transformed into structured text files
  • Text data (that contains numeric information) -> can be transformed into tables of numerical information
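
As an illustration of the last case, here is a minimal Python sketch that turns free text containing numeric information into table-like rows. The sample text, the field names and the regular expression are all assumptions made up for this example; real documents usually need patterns tailored to their own layout.

```python
import re

# Hypothetical sample text; its sentence layout is an assumption for this sketch.
TEXT = """Invoice 1041 was issued on 2021-03-05 for 250.00 EUR.
Invoice 1042 was issued on 2021-03-09 for 1,120.50 EUR."""

# Named groups become the column names of the resulting table.
PATTERN = re.compile(
    r"Invoice (?P<number>\d+) was issued on (?P<date>\d{4}-\d{2}-\d{2}) "
    r"for (?P<amount>[\d,]+\.\d{2}) (?P<currency>[A-Z]{3})"
)

def extract_rows(text):
    """Return one dict per matching sentence, with numbers parsed."""
    rows = []
    for match in PATTERN.finditer(text):
        rows.append({
            "number": int(match["number"]),
            "date": match["date"],
            # Drop thousands separators before converting to a float.
            "amount": float(match["amount"].replace(",", "")),
            "currency": match["currency"],
        })
    return rows

rows = extract_rows(TEXT)
for row in rows:
    print(row)
```

Each row is a plain dictionary, so it can be written to a CSV file or inserted into a database table without further restructuring.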

What we can do:

  • Optical Character Recognition (OCR): we will take your images and run them through OCR software to produce text
  • OCR correction: we will take the primary text files produced by OCR, validate them against the source images and structure the output
  • Data extraction: we can take your text files with unstructured data and extract the data in a database-ready format
  • Data conversion: we can take your data files and process them to deliver the data in a different format
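
To give a rough idea of what data conversion can look like, the following Python sketch converts CSV text into JSON using only the standard library. The function name and the sample data are hypothetical, and the choice of CSV and JSON is just one possible pair of formats.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Convert CSV text (with a header row) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every value stays a string; type conversion would be a separate step.
    return json.dumps(list(reader), indent=2)

# Hypothetical sample input for this sketch.
SAMPLE = "name,quantity\nbolt,12\nnut,30\n"

converted = csv_to_json(SAMPLE)
print(converted)
```

The header row supplies the field names, so the same function works for any CSV file with a header, whatever its columns are.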

