When you use a particular data extraction engine, such as Nanonets OCR or Amazon Textract OCR, it is important to understand how that engine actually works.
In this article, you will learn how to extract data from different document formats using these tools.
This example uses an invoice to show how to extract data with Nanonets OCR. Follow these steps:
1. Set up Nanonets OCR for invoice extraction.
2. Train the model with a sample of at least 50 invoices so that Nanonets OCR can extract data accurately.
Note: If you train the model only on invoices of one particular format, the Nanonets OCR engine may not be able to extract data from invoices of other formats.
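The sample-size and format advice above can be checked before you start training. The following is a minimal sketch in Python and does not call Nanonets or any other OCR service. The function name `check_training_set`, the layout labels, and the file names are hypothetical, and it assumes you can tag each sample invoice with a label for its format.

```python
from collections import Counter

MIN_SAMPLES = 50  # minimum sample size from step 2


def check_training_set(samples, min_samples=MIN_SAMPLES):
    """Return warnings for a list of (file_name, layout_label) pairs.

    Hypothetical helper: it only inspects the list you pass in.
    """
    warnings = []

    # Step 2: train with at least 50 invoices.
    if len(samples) < min_samples:
        warnings.append(
            f"only {len(samples)} invoices; at least {min_samples} are recommended"
        )

    # Note: a model trained on a single format may fail on other formats.
    layouts = Counter(label for _, label in samples)
    if len(layouts) == 1:
        only_layout = next(iter(layouts))
        warnings.append(
            f"all invoices use layout '{only_layout}'; "
            "invoices of other formats may not be extracted"
        )

    return warnings


if __name__ == "__main__":
    # Hypothetical sample set: 40 invoices, all from one vendor layout.
    samples = [(f"invoice_{i:03d}.pdf", "vendor_a") for i in range(40)]
    for warning in check_training_set(samples):
        print("warning:", warning)
```

An empty list of warnings means the set meets both conditions. It does not guarantee extraction accuracy, which still depends on the engine and on the quality of the invoices.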
For more information on how each engine operates, read the corresponding article from this list: