Brightleaf Solutions

OUR THOUGHTS

About business, contracts…and life

Legacy Contract Data Extraction: Software, Manual or Both?

October 14, 2021 Posted by Bleaf

Contract data extraction, or contract review, is a vital step in the contract management process of any business that wants to thrive. In contract management, review means critically analyzing the information present in a contract. The old way of capturing essential data attributes from legal documents has been transformed, thanks to technology. We hear about Artificial Intelligence (AI), Machine Learning software systems, and other new technologies for extracting data from contracts. But most of these modern-day AI software solutions run on predefined algorithms and patterns. At times they fail to identify errors that occurred while the scanned documents were being converted into text-based OCR documents, which results in incorrect abstraction and migration of data.

What is OCR?

OCR stands for Optical Character Recognition. When you scan a paper document, the resulting scan is a picture, or image. The computer sees only “dots” and does not recognize characters; the image is not treated as data. The picture formats we are used to are .JPG, .TIF, .GIF, and .BMP. These are digital image formats. If one of these digital images contains text of some kind, made up of numerous dots, those dots need to be identified as information: they need to be “recognized” and converted to text. This process is known as Optical Character Recognition, or OCR.
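To make the idea of "dots" becoming text concrete, here is a toy sketch in Python. It is not how a production OCR engine works; the 3x5 glyph shapes, the `TEMPLATES` table, and the `recognize` function are all made up for illustration. It simply picks the known letter whose dot pattern agrees with the scanned pattern in the most positions.

```python
# Toy 3x5 bitmaps: '#' is a dark dot, '.' is a light dot.
# These glyph shapes are hypothetical, chosen only for illustration.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def count_matching_dots(a, b):
    """Count the positions where two bitmaps have the same dot."""
    return sum(
        dot_a == dot_b
        for row_a, row_b in zip(a, b)
        for dot_a, dot_b in zip(row_a, row_b)
    )


def recognize(glyph):
    """Return the template letter that agrees with the glyph in the most dots."""
    return max(TEMPLATES, key=lambda ch: count_matching_dots(glyph, TEMPLATES[ch]))


# A scanned "I" with one dot lost in the second row, as a fuzzy scan might produce.
scanned = ["###", "...", ".#.", ".#.", "###"]
print(recognize(scanned))
```

The scanned glyph is still closest to "I", but the more dots a poor scan loses or adds, the more likely the closest match is the wrong letter.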

Challenges

Poorly scanned documents, handwritten text, calculations, and date fields within contracts can make it a little trickier to extract the information. OCR technologies are commonly used to convert scanned documents into machine-readable formats. But if the documents contain poor-quality data, handwritten text, or fuzzy scans, even the best software fails to convert them with 100% accuracy, which can lead to error-filled extraction.

Here is an example of a poorly OCR-converted document that has to go through extraction:

An example of a poorly scanned and OCRed document

If an extraction engine is configured to extract the above information (i.e., a confidentiality and non-solicitation agreement or clause) and comes across this type of poorly scanned or OCRed document, with incorrect spellings and spaces in the middle of the text, the software may fail to identify and extract the appropriate information.
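The failure described above can be sketched in a few lines of Python. The OCR lines below are invented for illustration (they are not taken from the scanned document), and `find_clause`, `best_similarity`, and the 0.8 threshold are hypothetical choices. The sketch shows that an exact keyword search misses a garbled heading, while an approximate comparison using the standard library's `difflib` still finds it.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and keep only letters, which removes stray OCR spaces."""
    return re.sub(r"[^a-z]", "", text.lower())


def best_similarity(line, keyword):
    """Best similarity between the keyword and any same-length window of the line."""
    hay, needle = normalize(line), normalize(keyword)
    if len(hay) <= len(needle):
        return SequenceMatcher(None, hay, needle).ratio()
    return max(
        SequenceMatcher(None, hay[i:i + len(needle)], needle).ratio()
        for i in range(len(hay) - len(needle) + 1)
    )


def find_clause(lines, keyword, threshold=0.8):
    """Return the lines that approximately contain the keyword."""
    return [line for line in lines if best_similarity(line, keyword) >= threshold]


# Hypothetical OCR output with a stray space and digits read in place of letters.
ocr_lines = [
    "7. C onfidentia1ity and N0n-So1icitation",
    "8. Governing Law",
]

print("confidentiality" in ocr_lines[0].lower())  # exact search misses the heading
print(find_clause(ocr_lines, "confidentiality"))  # approximate search finds it
```

Approximate matching of this kind recovers some headings, but it can also produce false matches, which is one more reason the extracted results still need to be checked by a person.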

Software is just one-third of the solution; you also need a stringent process and a trained team of lawyers to achieve the best quality extraction.

You can’t rely on software alone. Software can be used as a first step to extract voluminous data fast, but a manual review is required to ensure 100% accuracy in the extracted data, especially in cases like the one above.

Technology by itself won’t fix the problem. You need the right people to put a streamlined process in place, and then you bring in technology to support it.

Human review is essential to ensure quality. A team of lawyers should check what the software extracts and fix or fill in whatever the software could not capture, whether because of OCR read errors or because of handwritten attributes such as signatures.
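One way to organize that review step is sketched below in Python. Everything here is an assumption made for illustration: the `ExtractedField` record, the idea that the extraction software reports a confidence score between 0 and 1, and the sample values. The sketch does not skip any field; it only orders the reviewers' queue so that blanks and the least certain values are seen first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedField:
    name: str
    value: Optional[str]  # None means the software could not extract anything
    confidence: float     # 0.0 to 1.0, assumed to come from the extraction software


def review_queue(fields):
    """Order every field for human review: blanks first, then least confident."""
    # False sorts before True, so fields with no value come first.
    return sorted(fields, key=lambda f: (f.value is not None, f.confidence))


# Hypothetical first-pass output from the software.
fields = [
    ExtractedField("counterparty", "Acme Corp", 0.97),
    ExtractedField("effective_date", "2O21-1O-14", 0.55),  # letter O read for zero
    ExtractedField("signatory", None, 0.0),                # handwritten signature
]

for field in review_queue(fields):
    print(field.name, field.value, field.confidence)
```

With this ordering the handwritten signatory and the garbled date reach a lawyer before the field the software was most sure about.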

Artificial Intelligence, Blog, contract abstraction, contract analysis, legal document automation, Meta-Data

Copyright © 2014-2025 by Brightleaf Solutions, Inc., the owner of trademarks on brightleaf, and the tri-page leaf logo. Various trademarks held by their respective owners.