KDM
Knowledge Discovery Machine (KDM) is our solution for easily converting documents (paper, PDF, digital text, image, audio, video, …) into actionable, classified data, optionally enriched with external content and Alternative Data.

Dedicated to Legal Advisory, Asset Management, Family Offices, Credit & Risk Management
Every organization produces and receives millions of textual documents, reports, contracts, presentations, audio/video materials and other forms of information.
There is also an exponential increase in the online availability of information: news, blogs, forums, review sites and social media are all full of textual data. But it is not realistic to review and categorize this data carefully by relying on human work alone.
Employees spend 1.8 hours every day searching for and gathering internal information alone. On average, that's 9.3 hours per week (23% of total weekly working hours). That is too much!
Source: McKinsey

Source: International Data Corporation
TURN DOCUMENTS (IN ANY FORM) INTO ACTIONABLE DATA
FinScience applies proprietary AI algorithms to summarize content, extract and categorize data from huge volumes of documents.

From paper to digital text analytics
We use NLP (Natural Language Processing) and machine / deep learning technologies developed in-house and by PaperLit (a tech company specialised in the digital transformation of publishing and part of our group, Datrix).
Content digitalization
- OCR technology to extract text content from PDF scans
- Assignment of a “quality score” to the extraction carried out
- Transformation into an editable digital text format
Key-sentences identification
- Starting from the sample of loan transferability clauses provided by the client, generation of a list of key sentences of interest to train the ML engine
- Mapping of these sentences to the possible output classes (transferable, non-transferable, etc.)
Documents analysis
- NLP analysis of the entire documentation
- Within each case, the “interesting” text parts are identified based on their similarity and/or proximity to the Key-Sentences highlighted. This generates an interest score for future analysis
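As an illustration of the interest score described above, here is a minimal sketch that scores each passage by its highest bag-of-words cosine similarity to any key sentence. It is not FinScience's proprietary method: the function names, key sentences and passages are hypothetical, and only similarity (not proximity) is modelled.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def interest_score(passage, key_sentences):
    """Highest similarity between a passage and any key sentence."""
    passage_vec = Counter(tokenize(passage))
    return max(
        (cosine_similarity(passage_vec, Counter(tokenize(k))) for k in key_sentences),
        default=0.0,
    )


# Hypothetical key sentences and passages, for illustration only.
key_sentences = [
    "the loan may be transferred to a third party",
    "the lender may not assign this loan without consent",
]
passages = [
    "The loan may be transferred to a third party at any time.",
    "Interest is payable quarterly in arrears.",
]

scores = [interest_score(p, key_sentences) for p in passages]
# Passages sorted from most to least interesting.
ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
```

A production system would more likely use sentence embeddings than raw word counts, but the ranking logic stays the same: compare every passage with every key sentence and keep the best match.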
Documents ranking
- Based on the previous results, a classifier is generated for each possible output class to map the characteristics of the loans and their status according to the clauses
- A final summary document is automatically created with all the results
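To make the classification and summary steps concrete, the sketch below trains a small multinomial naive Bayes classifier on labelled key sentences and then counts the predicted classes. This is a swapped-in textbook technique, not the classifier FinScience actually uses, and it builds one multi-class model rather than one classifier per output. All training sentences, loan identifiers and function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_sentences):
    """Count words per class and sentences per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for sentence, label in labelled_sentences:
        class_counts[label] += 1
        word_counts[label].update(tokenize(sentence))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the class with the highest naive Bayes log-probability."""
    word_counts, class_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        # Laplace (add-one) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled key sentences, for illustration only.
training = [
    ("the loan may be freely transferred to a third party", "transferable"),
    ("the lender may assign its rights without consent", "transferable"),
    ("the loan shall not be transferred or assigned", "non-transferable"),
    ("assignment of this loan is prohibited", "non-transferable"),
]
model = train(training)

# Hypothetical clauses keyed by loan identifier.
clauses = {
    "loan-001": "This loan is prohibited from assignment",
    "loan-002": "The lender may freely assign the loan to a third party",
}
results = {loan_id: classify(text, model) for loan_id, text in clauses.items()}
summary = Counter(results.values())  # number of loans per predicted class
```

The `summary` counter stands in for the final summary document: in practice it would be rendered into a report alongside the clause text and the scores behind each decision.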
An innovative methodology
Four main steps




At FinScience we have faced many challenges, such as:
- using unsupervised machine learning to group together similar documents and summarise their content;
- extracting the emotions contained in a text through proprietary algorithms based on deep learning;
- defining the keywords in a text: the words in the text are represented as nodes in a network, and we try to determine which nodes are the most important within it, similar to how Google’s famous PageRank works;
- using supervised learning to classify a large number of legal documents starting from a group of tags.
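The network-based keyword definition mentioned above can be sketched with a TextRank-style approach: words become nodes, words that occur close together are linked, and PageRank scores the nodes. This is a generic illustration under stated assumptions, not FinScience's implementation. The stopword list, window size, damping factor and sample text are all hypothetical choices.

```python
import re
from collections import defaultdict

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
             "for", "on", "by", "with", "be", "may", "or"}


def build_graph(text, window=2):
    """Link each word to the words that follow it within `window` positions."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected word graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] / len(graph[m]) for m in graph[n])
            for n in nodes
        }
    return rank


def keywords(text, top_n=3):
    """Return the `top_n` highest-ranked words in the text."""
    rank = pagerank(build_graph(text))
    return sorted(rank, key=rank.get, reverse=True)[:top_n]


# Hypothetical sample text, for illustration only.
sample = ("The loan agreement allows the loan transfer. The loan buyer "
          "reviews the loan contract and the loan clauses.")
top_keywords = keywords(sample, top_n=3)
```

Because "loan" co-occurs with every other content word in the sample, it becomes the hub of the graph and receives the highest rank, which is the intuition behind treating well-connected words as keywords.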
To solve these problems, FinScience needs to combine knowledge of the given field, for example by collaborating with lawyers in the case of legal applications, with technical knowledge of algorithms and programming.
FinScience has experience in the datafication of contracts underlying non-performing loans (NPL), the buying and selling of real estate, the development of alternative investment indicators and quantamental strategies, ESG evaluations of companies (in particular, measuring the distance between in-house sustainability reports and public sentiment), and improving the reliability of SME default risk estimation models.
Key benefits
Turn documents into valuable data
Find out more about FinScience’s services and solutions most suited to your business.