AI and Analytics: what implications for ECM solutions?

- 24 Nov 2022
In the medium term, ECM solutions are shifting from content management to knowledge management. Artificial intelligence, already available for limited use on the Hyland Nuxeo and Alfresco platforms, combined with large-scale analysis of document content, should drive significant progress in this "content intelligence".
WHICH CHALLENGES MUST ECM SOLUTIONS MEET?
ECM solutions now face the following challenges:
- Change of scale (exponentially growing content volumes)
- Greater variety of content, both structured and unstructured (text, images, videos, audio files, chatbot logs, digital interaction traces, etc.).
- New functional requirements (suggesting relevant items automatically, cross-referencing them with other searches without being confined to a structured query, supporting all types of business scenarios, etc.)
To meet these challenges, the objectives include the following:
- Recognize the type of a document;
- Analyze the images or content of a document to extract its intent or main subject;
- Automate workflows;
- Extract metadata automatically;
- Extract text from audio or video media to enable qualification and indexing.
Meeting these objectives lets you extract more value from your document holdings while automating content processing and document life cycle management.
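As a rough illustration of two of these objectives (recognizing a document type and extracting metadata automatically), here is a minimal Python sketch. It is not how any ECM product works: the keyword lists, the `DocumentInfo` structure and the function names are all hypothetical, and simple keyword counting stands in for what would be a trained model in a real system.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword lists per document type (illustrative only);
# a real system would rely on a trained model rather than keyword counts.
TYPE_KEYWORDS = {
    "contract": ("agreement", "party", "clause", "hereby"),
    "invoice": ("invoice", "amount due", "vat", "payment"),
}

# Only ISO-style dates (YYYY-MM-DD) are recognized in this sketch.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


@dataclass
class DocumentInfo:
    doc_type: str
    metadata: dict = field(default_factory=dict)


def recognize_type(text: str) -> str:
    """Return the type whose keywords occur most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(keyword) for keyword in keywords)
        for doc_type, keywords in TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_metadata(text: str) -> dict:
    """Pull a few simple metadata fields out of the raw text."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "word_count": len(text.split()),
    }


def analyze(text: str) -> DocumentInfo:
    """Combine type recognition and metadata extraction for one document."""
    return DocumentInfo(recognize_type(text), extract_metadata(text))
```

Calling `analyze()` on the extracted text of a document returns a suggested type and a small metadata dictionary that could then be stored alongside the document for indexing.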
Taking contract management as an example:
Today, the best that ECM solutions can do is classify your content by examining the metadata tags and keywords in the documents.
- An ECM solution lets the user select the document type "contract" and file the document with other legal contracts based on this metadata.
- AI will take the ECM solution to the next level, not only by classifying the document as a contract (by directly recognizing the type of document), but also by evaluating it to check that it appears, on the face of it, to be a valid contract and that it contains the clauses deemed necessary for this type of use.
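To make the contract example more concrete, the check for "clauses deemed necessary" could be sketched as follows. This is only an illustration under strong assumptions: the clause names and cue phrases in `REQUIRED_CLAUSES` are invented for the example, and plain substring matching stands in for the language understanding an actual AI service would provide.

```python
from __future__ import annotations

# Hypothetical clauses and cue phrases (illustrative only); a real
# deployment would define these with legal experts or learn them from data.
REQUIRED_CLAUSES = {
    "parties": ("between", "party"),
    "termination": ("termination", "terminate"),
    "governing_law": ("governing law", "governed by"),
    "confidentiality": ("confidential",),
}


def missing_clauses(text: str) -> list[str]:
    """List the required clauses for which no cue phrase appears in the text."""
    lowered = text.lower()
    return [
        name
        for name, cues in REQUIRED_CLAUSES.items()
        if not any(cue in lowered for cue in cues)
    ]


def review_contract(text: str) -> dict:
    """Summarize whether the text contains every required clause."""
    missing = missing_clauses(text)
    return {"looks_valid": not missing, "missing_clauses": missing}
```

A result with a non-empty `missing_clauses` list could, for instance, route the document to a legal reviewer instead of filing it directly.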
THE CONTRIBUTION OF AI
Today, two categories of AI are being applied to ECM: one called "generic" and the other "personalized" or "contextualized". The two approaches can be complementary.
- Generic AI: It relies on services from existing providers, generally the GAFAM companies (led by Google and Amazon with tools such as Google Vision, Amazon Rekognition and Amazon Textract, to name just a few). The advantages are that these tools are already proven in many uses beyond ECM and that connectors already exist for the main ECM solutions on the market. The disadvantage is that the AI model called is not specialized for the business data and metadata held in the ECM solution; moreover, issues of sensitive-data protection or privacy compliance (GDPR) can arise when using these services.
- Personalized (or contextualized) AI: It helps meet the objectives described above, i.e., it makes better use of the metadata and characteristics of the corpus of documents being analyzed, but it requires a project-specific modeling and training effort. Hyland Nuxeo claims an "information intelligence" capability, obtained by coupling its platform with a contextual AI service available in the Nuxeo Insight offering.
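The "project-specific modeling and training effort" behind a personalized approach can be pictured with a toy classifier trained on an organization's own labelled documents. The sketch below uses a small naive Bayes model written in plain Python. This technique is chosen purely for illustration and is not a description of how Nuxeo Insight or any other product works; the class name, the tokenization and the sample corpus are all assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class TinyNaiveBayes:
    """Toy multinomial naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocabulary = set()

    def train(self, labelled_docs):
        """Learn word frequencies from (text, label) pairs."""
        for text, label in labelled_docs:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Laplace (add-one) smoothing handles words unseen for this label.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled corpus standing in for an organization's own documents.
corpus = [
    ("supplier agreement termination clause", "contract"),
    ("service agreement governing law clause", "contract"),
    ("invoice amount due payment", "invoice"),
    ("invoice vat payment reminder", "invoice"),
]

model = TinyNaiveBayes()
model.train(corpus)
```

The model only knows the vocabulary and labels of the corpus it was trained on. That is the trade-off described above: better alignment with business metadata, in exchange for collecting and labelling representative documents.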
In conclusion, although AI is fairly recent in ECM applications, it already covers many uses (image classification, user experience improvement, deduplication, etc.). These capabilities are likely to expand, especially to help qualify unstructured data or to extend what can be done with documents beyond the initially planned uses (search, creation of new classification plans, metadata enrichment, etc.).