Lifecycle Discovery

Even in a digital world, a significant amount of data remains locked in electronic and paper-based documents. This makes the data less accessible and less usable than structured data stored in databases or other structured formats that already feed corporate reporting and data analysis. Extracting and verifying this data can strengthen the overall data landscape used for analysis and corporate decision-making.

DocuNECT’s Discovery module provides a powerful platform for extracting and processing this data locked in documents.

Think about how we look at documents. At first glance, we try to determine what type of document it is: an invoice, a loan appraisal, a contract, and so on. Once we have established this, we have more context in which to look at the data. DocuNECT’s Discovery module works the same way, classifying the document type and then indexing the document to extract the data relevant to that type. The module uses different methods to achieve this, including DocuNECT’s Rules Engine and Artificial Intelligence (AI), and where required, the system prompts a human operator to confirm the result.
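
The classify-then-extract idea can be illustrated with a minimal Python sketch. It is not DocuNECT’s implementation: the keyword lists, field patterns, and function names below are hypothetical, and simple keyword counting stands in for whatever rules and AI models the product actually uses.

```python
import re

# Hypothetical keyword lists per document type; a real system would use
# trained models or a rules engine rather than simple keyword counts.
TYPE_KEYWORDS = {
    "invoice": ["invoice", "amount due", "bill to"],
    "contract": ["agreement", "party", "hereby"],
}

# Hypothetical per-type field patterns, applied only after classification.
FIELD_PATTERNS = {
    "invoice": {
        "invoice_number": r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)",
        "amount_due": r"Amount Due\s*:?\s*\$?([\d,]+\.\d{2})",
    },
    "contract": {
        "effective_date": r"Effective Date\s*:?\s*(\d{4}-\d{2}-\d{2})",
    },
}


def classify(text):
    """Return the document type with the most keyword hits, or None."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(kw) for kw in keywords)
        for doc_type, keywords in TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract(text, doc_type):
    """Extract only the fields relevant to the classified type."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.get(doc_type, {}).items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            fields[name] = match.group(1)
    return fields


sample = "INVOICE\nInvoice No: INV-1001\nBill To: Acme Corp\nAmount Due: $1,250.00"
doc_type = classify(sample)
print(doc_type, extract(sample, doc_type))
```

Classifying first keeps the extraction step small: only the patterns for the detected type are tried, which mirrors the way a person reads a document once they know what it is.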

What is Artificial Intelligence?

AI is a much-used term today, but with good reason: the technology provides significant value in both classification and data extraction.

In general, the definition is as follows:

Artificial Intelligence (AI) is a multidisciplinary field of technology that focuses on creating systems capable of performing tasks that typically require human intelligence. These tasks encompass a wide range of activities, including reasoning, problem-solving, learning, understanding natural language, perception, and decision-making. AI systems aim to replicate or simulate human-like cognitive functions to varying degrees, with the ultimate goal of achieving human-level or superhuman performance in specific domains.

“We use DocuNECT’s Discovery module to automatically identify different types of loan documents to increase the efficiency of our capture and indexing process. The web-based interface and routing functionality means that we can easily get the right users involved in the review process.”

Loan Servicing Officer – Mortgage Company

The Building Blocks of DocuNECT’s Discovery Engine

The Discovery Engine and AI provide the following capabilities:

  1. Natural Language Processing (NLP). NLP is a branch of AI focused on enabling machines to understand, interpret, and generate human language. DocuNECT can analyze the document or page text to perform sentiment analysis and text summarization.
  2. Image Analysis. Computer vision involves the development of algorithms and systems that enable machines to interpret and understand visual information from the world, such as images and videos. It’s used in facial recognition, object detection, and document imaging.
  3. Powerful Business Rules. Business rules that verify the extracted information against logic rules or external systems can greatly improve data integrity.
  4. Machine Learning. DocuNECT continually analyzes historical document data and user interactions to increase the level of automation.
  5. Data Chain of Custody. DocuNECT maintains the link between each data element and its exact location in the document, so the data source can always be viewed.
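
As an illustration of the last two ideas working together, the Python sketch below stores each extracted value alongside its location in the document and then applies a simple business rule. The class names, fields, and the rule itself are assumptions made for this example and do not describe DocuNECT’s internal data model.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SourceLocation:
    """Where a value was found: page number plus a bounding box in points."""
    page: int
    left: float
    top: float
    width: float
    height: float


@dataclass(frozen=True)
class ExtractedField:
    """A data element that keeps a link back to its place in the document."""
    name: str
    value: str
    source: SourceLocation


def totals_match(line_items, total):
    """Business rule: line item amounts must add up to the stated total."""
    return sum(Decimal(f.value) for f in line_items) == Decimal(total.value)


# Hypothetical values, as if extracted from page 1 of an invoice.
items = [
    ExtractedField("line_amount", "100.00", SourceLocation(1, 400, 300, 60, 12)),
    ExtractedField("line_amount", "25.50", SourceLocation(1, 400, 320, 60, 12)),
]
total = ExtractedField("total", "125.50", SourceLocation(1, 400, 360, 60, 12))

print(totals_match(items, total))
print(total.source)
```

Because every field carries its own source location, a reviewer who questions a value can be taken straight to the region of the page it came from.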

What Can the Discovery Engine Do?

The engine can classify and extract data from different document structures:

  1. Structured Documents – A document where the fields are in a fixed location. An example would be a government form.
  2. Semi-Structured Documents – Documents that share the same, or similar, data but have a different format. An example would be an invoice or a W-2.
  3. Unstructured Documents – Documents that do not have a consistent structure. An example would be an email, a letter, or a contract.

Whether the source is a TIFF or PDF image, a PDF document, a text file, or a Microsoft Office file, the content can be extracted and analyzed in the following ways:

  • Key-Value Pair Extraction
  • Signature Detection (Digital and Wet Signatures)
  • Sentiment Analysis
  • Image Analysis
  • Barcode Data Extraction
  • Natural Language Querying of Content
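
To make the first item concrete, here is a minimal Python sketch of key-value pair extraction from page text that has already been converted to plain text. It only handles lines shaped like "Label: value"; the pattern and function name are illustrative assumptions, not DocuNECT’s extraction logic, which would also need to cope with tables, multi-line values, and OCR noise.

```python
import re

# Matches single lines shaped like "Label: value". Spaces and tabs are
# matched explicitly so that a match never spills onto the next line.
KEY_VALUE_LINE = re.compile(
    r"^[ \t]*([A-Za-z][\w .#/-]*?)[ \t]*:[ \t]*(\S.*?)[ \t]*$",
    re.MULTILINE,
)


def extract_key_value_pairs(text):
    """Return a dict of label -> value for every 'Label: value' line."""
    return {key: value for key, value in KEY_VALUE_LINE.findall(text)}


# Hypothetical page text from a loan document.
page_text = """Borrower Name: Jane Doe
Loan Number: 000123
Property Address: 12 Main St, Springfield
This line has no label and is ignored.
"""

print(extract_key_value_pairs(page_text))
```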

What If the Engine is Not Sure?

The Discovery engine is designed to automate classification and extraction, but there are situations where the engine is not sure. A low-quality document that prevents useful data from being identified, or data that fails a business rule, can prompt the system to route the document to a user for review.
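
The routing decision described here can be sketched in a few lines of Python. The confidence score, the 0.85 threshold, and the names below are assumptions chosen for illustration; they are not DocuNECT settings.

```python
from dataclasses import dataclass, field

# Hypothetical threshold; the right value depends on the document set.
MIN_CONFIDENCE = 0.85


@dataclass
class ExtractionResult:
    doc_id: str
    doc_type: str
    confidence: float  # 0.0 to 1.0, as reported by the extractor
    failed_rules: list = field(default_factory=list)


def route(result):
    """Return 'auto' when the result can proceed unattended, else 'review'."""
    if result.confidence < MIN_CONFIDENCE:
        return "review"  # e.g. a low-quality scan
    if result.failed_rules:
        return "review"  # e.g. totals that do not add up
    return "auto"


clean = ExtractionResult("doc-1", "invoice", 0.97)
blurry = ExtractionResult("doc-2", "invoice", 0.40)
bad_total = ExtractionResult("doc-3", "invoice", 0.95, ["totals_match"])

for result in (clean, blurry, bad_total):
    print(result.doc_id, route(result))
```

Keeping the two checks separate makes it possible to tell the reviewer why a document was flagged: an unreadable scan needs a different fix than a value that fails a rule.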

Using Workflow to Review/Approve Data

Use DocuNECT’s workflow engine to route documents to different users, either to gather more subjective data from Subject Matter Experts (SMEs) or to obtain data approval before distribution. Accounts Payable is a typical use case: once the invoice data is extracted, the documents and associated information can be routed to managers for approval before being uploaded to the finance system for payment.

It Learns as It Goes…

The more documents the Discovery engine processes, the more data history it has to refine and compare the results. In addition, user reviews are analyzed to identify patterns that can be used to improve the rules.
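
One very simple way to look for such patterns is to count which fields reviewers correct most often. The Python sketch below is only an illustration under that assumption: the review log format and the function name are hypothetical, and DocuNECT’s own analysis is not described in this document.

```python
from collections import Counter

# Hypothetical review log: (field name, extracted value, value after review).
review_log = [
    ("invoice_date", "01/02/2024", "2024-02-01"),
    ("invoice_date", "03/04/2024", "2024-04-03"),
    ("total", "125.50", "125.50"),
    ("vendor", "Acme", "Acme Corp"),
]


def correction_counts(log):
    """Count how often reviewers changed each field's extracted value."""
    return Counter(
        name for name, extracted, reviewed in log if extracted != reviewed
    )


# Fields at the top of this list are candidates for a new or adjusted rule.
print(correction_counts(review_log).most_common())
```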