27 April, 2022

AI for document classification in the insurance industry

When process automation meets AI, it not only increases performance and efficiency but also minimizes errors. This is how Iris Global has managed to streamline daily operations and reduce risks in crucial business processes.

Document management is one of the key activities in the insurance sector: every procedure involves an exchange of documents that gather the information the company needs to carry it out.

Process automation is already a first step toward streamlining daily operations. However, Iris Global understood that this was not enough: they needed to go a step further to increase efficiency. Indeed, simplifying manual work would free the teams from tasks that added little value. But how could the RPA (Robotic Process Automation) solution they planned to implement in the company be made more capable? The answer lay in Artificial Intelligence and Machine Learning, in which eºmergya specializes.

A smart Robotic Process Automation solution

A Robotic Process Automation solution is not "intelligent": it can trigger procedures automatically, but it cannot extract information, interpret it or classify it. For this reason, although it is usually a first step in improving processes, it was not sufficient for the major change that Iris Global was considering.

This digital transformation served several of the company's goals. On the one hand, improving customer service and user experience: by reducing the time spent on manual work, the teams can attend to users with more dedication. On the other hand, minimizing operational risk while increasing efficiency and performance.

To address this change, an Agile methodology was proposed that would start with one or two use cases of great importance for the core business but with low or medium technical complexity.
This would lay the groundwork for future iterations.

Automatic document classification

How many documents can an insurance company receive by email in a single day? ID cards, burofaxes, bank account certificates, invoices... An RPA tool can automatically store all these files, but it cannot classify them under tags or other criteria.

To overcome this obstacle we trained an NLP (Natural Language Processing) AutoML model capable of processing the information contained in a document, its title, the subject of the email that sent it, and so on. The model interprets the content and assigns the document to one of the 23 established typologies. In addition, thanks to the implementation of Vision API, it supports different file formats: JPEG, JPG, PNG, PDF, GIF, TIFF and WEBP.

Thanks to Artificial Intelligence, every time Iris Global receives a document related to the management of claims from any line of business, the Machine Learning model processes the information it contains, "understands" it, and automatically classifies the file. In this way, the procedures carried out daily in the company have been sped up considerably.

Information extraction

One of the main disadvantages of traditional document digitization is that the information the documents contain cannot be processed. Any management software requires data in a key-value format in order to work. However, files usually contain unstructured information and are often stored as images (in JPG, PDF, or similar formats). Thus, such digitally archived forms may have evidential value but are of little use for processing.

In contrast, Artificial Intelligence allows us to detect any data included in a file, link it to specific tags and store it in formats that can be processed by any system.
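To make the idea of a key-value format concrete, here is a minimal sketch of turning unstructured text (for example, the output of an OCR step) into a record that any system can consume. It is only an illustration: the field names, the sample text and the regular expressions are hypothetical, and simple pattern rules stand in for the trained extraction model, which is not described in detail here.

```python
import json
import re

# Hypothetical field patterns; in the real solution a trained ML model,
# not hand-written rules, would locate these values.
FIELD_PATTERNS = {
    "claim_number": re.compile(r"Claim\s+Number[:\s]+([A-Z0-9-]+)", re.IGNORECASE),
    "policyholder": re.compile(r"Policyholder[:\s]+([^\n]+)", re.IGNORECASE),
    "amount": re.compile(r"Amount[:\s]+([\d.,]+)", re.IGNORECASE),
}


def extract_fields(text: str) -> dict:
    """Return a key-value record; fields that are not found are set to None."""
    record = {}
    for key, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[key] = match.group(1).strip() if match else None
    return record


# Made-up sample standing in for text recovered from a scanned document.
sample_text = """Invoice
Claim Number: CLM-2022-0042
Policyholder: Jane Doe
Amount: 1,250.00"""

record = extract_fields(sample_text)
print(json.dumps(record, indent=2))
```

The point of the sketch is the output shape: once each value is linked to a tag, the record can be serialized (here as JSON) and passed to any management software.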
Therefore, in addition to the document classification use case, we trained an information extraction ML model for Iris Global that identifies the various fields found across the wide variety of processed documents, some as specific as the NIS (Claim Number). In total, our model is able to extract information from 7 different document types, from an ID card to a death certificate.

Automated classification of 90% of documents

The ML models developed share a common architecture and use solutions offered in the Google Cloud Platform, such as Vision API and AutoML NLP. The document classification and information extraction models have been applied to two essential processes within the company: settlements and document management.

So far, 90% of the documents processed by the model have been classified automatically. In total, the implemented solution is expected to classify more than 400,000 documents per year in the settlement process and another 250,000 in document management. As for the other use case, information has been correctly extracted from 75% of the documents involved.
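As a rough illustration of how the classification flow described in this article could be wired together, the sketch below checks the file format, obtains text from the file, and passes the title, email subject and content to a classifier. It is an assumption-laden outline, not the implemented solution: `classify_document`, `fake_ocr` and `fake_model` are hypothetical names, and the two injected functions are placeholders for the text extraction and the trained model, so no real cloud API is called.

```python
from pathlib import Path
from typing import Callable, Optional

# File formats listed in the article as supported.
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".pdf", ".gif", ".tiff", ".webp"}


def classify_document(
    filename: str,
    content: bytes,
    email_subject: str,
    extract_text: Callable[[bytes], str],
    predict_label: Callable[[str], str],
) -> Optional[str]:
    """Return a document type label, or None if the file format is unsupported."""
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        return None  # leave the file for manual handling
    body_text = extract_text(content)
    # Combine the signals mentioned in the article: title, email subject, content.
    model_input = "\n".join([Path(filename).stem, email_subject, body_text])
    return predict_label(model_input)


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_ocr(content: bytes) -> str:
    return content.decode("utf-8")


def fake_model(text: str) -> str:
    return "invoice" if "invoice" in text.lower() else "other"


label = classify_document(
    "scan_001.pdf", b"Invoice no. 17", "Claim documents", fake_ocr, fake_model
)
print(label)
```

Passing the text extraction and prediction steps in as functions keeps the flow testable with simple stand-ins and leaves room to swap in the real services later.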