Docsumo is an intelligent document processing solution that excels at extracting information and structuring it from unstructured sources. Powered by AI, it is able to turn data from invoices, receipts, payrolls, Acord forms, and various documents into an electronic format for automated processing. If you’re new to Docsumo or if it’s your first time scanning and reading information from physical documents, this handy guide will walk you through the process.
Training your document using Docsumo entails two phases – creating a new document type and using it.
An API can be used to process multiple documents of the same category. For example, driver’s license, PAN Cards, and Aadhar are classified under Indian KYC. Likewise, Insurance Compliance Documents comprise of Acord 130, Acord 25, and other types of Acord forms.
Here is an overview on how to create, train and use Docsumo APIs to read and process your documents.
1. Before you get acquainted with the Docsumo user interface, make sure you sign up for an account on the platform and log in. To access the Docsumo dashboard, visit app.docsumo.com and enter your email and password.
2. The basic workflow is that you use a pre-trained API to scan and read multiple documents. Pre-trained APIs on the platform use machine learning algorithms to intelligently read documents and extract key value pairs. However, you can always train the model for a new document type.
3. Go to Document Types > Create New Document Type. Upload your first new document
4. Set your key identifiers and choose the appropriate data types for each field. This step involves a drag and drop approach where you create bounding boxes to help the model detect different fields on the document. You can refer to our key value pair extraction guide for a complete overview of how key-value pairs work.
5. Select the ‘Add field’ button to add more fields and define the data types for each. To create a section, click on ‘Add Section’.
6. Each section is a different part of your document and each section has its own set of key value pair fields. Similarly, you can extract line items in a different section with ‘Add Table’.
7. Click on Save and Close. Docsumo will display a pop-up asking if you want to apply the changes to all your new documents or to both existing and new documents. Select whichever is appropriate for your needs and finish.
8. At this point, the document is annotated. After that, go back to 'Document Types' and 'Upload' at least 20 document files. Once uploaded, you can 'Review' the uploaded files. All the uploaded files have keys listed and you can find values extracted for some of them. If you don't see the value extracted for a data variable, help the software to capture the value for it.
9. Once all the fields are extracted accurately, 'Approve' it. Repeat the process for few documents. After doing it for a few documents, the software starts to capture data for all the fields accurately. Go ahead and 'Review' and 'Approve' at least 20 documents.
To use your newly trained API for reading and extracting information from documents, simply follow these steps:
1. Log in to the Docsumo dashboard
2. Select APIs & Services
3. Choose the appropriate API model for your document type. If you’re planning to upload Acord Form-25, you need to select the ‘Acord 25’ API. Under Actions, toggle the button and make sure it is set to Enabled.
4. Go to Document Types, find your API module and hit Upload
5. The trained Docsumo will automatically read your document and structure the information. If any errors are found or any details are missing, it will flag them and alert you instantly
6. You can click on Review to do a manual review of your documents after the automated extraction is done.
And that is how you train a new API and teach it how to read new documents. You can repeat these steps to create new document types for different purposes. Docsumo supports bulk uploading documents and has batch processing benefits too. Once you have your pre-trained APIs set up and running, it takes mere minutes to get your documents processed and converted into electronic formats.
The best part about using Docsumo is that you can convert your documents into different file types such as JSON, XML, and CSV. So, what are you waiting for? Improve your document processing times and enjoy 99% accurate automated data extraction using the power of Docsumo APIs.
For more info, sign up for a free demo on our website and try out Docsumo today!
In today’s dynamic business world, filing and archiving official documents in the digital form makes it handy, and works wonders in the future or in unforeseen circumstances.
With an automated data extraction solution, loan documents can automatically be processed end-to-end without any human errors and delays. Automation in loan document processing prevents downtimes, eliminates data redundancy, and allows companies to respond faster to client queries. By combining machine learning with deep learning and OCR, companies can eliminate huge costs, derive actionable insights, and streamline loan processing and approvals through efficient data extraction and analysis.
Mortgage lenders receive multiple identity and income verification documents along with different forms from loan applicants in a variety of formats and styles. Traditional OCR solutions fail to extract data from these semi-structured documents and that’s why more and more lenders are adopting intelligent document processing solutions. IDP solutions not only extract data correctly, they are able to validate extracted data against predefined rules in order to improve accuracy.
Intelligent Document Processing is an automation technology that captures information from a myriad of documents and data sources, extract data, and organizes it for further processing. IDP solutions enable businesses to seamlessly integrate with core processes, eliminate manual labour, address challenges faced in reading different document layouts, and meeting legal & compliance requirements. Accurate data is the foundation of every organization, and IDP assists businesses in dealing with the complexity of processing huge volumes of documents, helping them automate manual data entry processes, and move away from traditional semi-automated OCR workflows.