Machine learning to detect fraud
Author: Tommaso Motta
What is fraud detection
Fraud detection and prevention are sets of activities intended to stop third parties from obtaining money or property through false pretenses. These are challenging problems because, without appropriate tools and systems, a single fraud can quickly turn into a large financial loss.
Fraud detection and prevention are applied in many industries, both private, such as insurance and banking, and public, such as national governments.
It is important to distinguish fraud prevention from fraud detection. Prevention takes place before a fraud attempt, and its goal is to reduce the risk of future fraud. Detection takes place during a fraud attempt, and its goal is to identify the fraud and stop it.
Types of frauds
Fraud, in legal terms, is intentional deception to secure an unfair or unlawful gain, or to deprive a victim of a legal right. There are many ways to commit fraud. In banking, one of the most common types is customer account takeover, where someone illegally gains access to a victim’s bank account. In insurance, fraud can take the form of embezzling insurance premiums, known as premium diversion fraud. In government, fraud includes billing for unnecessary procedures and overcharging for items that cost much less.
In general, the most common types of fraudulent activities that still occur today are:
- Denial of service: this attack seeks to make a machine or a network resource unavailable to its intended users by temporarily or indefinitely disrupting the services of a host connected to the Internet. It is typically accomplished by flooding the targeted machine or resource with superfluous requests in an attempt to overload the system and prevent some or all legitimate requests from being fulfilled.
- Malware: it is any software intentionally designed to cause damage to a computer, server, client or computer network. There are different kinds of malware, such as worms and Trojan horses. Programs are also considered malware if they secretly act against the interests of the computer user.
- Phishing: it is the fraudulent attempt to obtain sensitive information or data, such as usernames, passwords and credit card details, by posing as a trustworthy entity in an electronic communication. Typically carried out through email spoofing and instant messaging, phishing often pushes users to enter personal information on a fake website that looks almost identical to the legitimate one.
- Ransomware: it is a type of malware that threatens to publish the victim’s data or permanently block access to it unless a ransom is paid. This class of malware is a criminal moneymaking scheme that can be installed through deceptive links in an e-mail message, instant message or website. It can lock a computer screen or encrypt important, predetermined files with a password. More advanced ransomware uses cryptoviral extortion, which encrypts the victim’s files and demands a ransom payment to decrypt them.
Fraud detection techniques
Because fraud has increased dramatically and causes losses of billions of dollars worldwide each year, modern fraud detection techniques are continually being developed and applied in many business fields. Fraud detection involves monitoring the behavior of populations of users in order to estimate, detect, or avoid undesirable behavior.
Fraud detection systems can be divided into those based on statistical data analysis techniques and those based on artificial intelligence.
Statistical data analysis techniques include:
- Calculating statistical parameters: averages, quantiles and performance metrics are computed, and the available data is compared against them to find possible fraud. For example, the parameters can be the average number of calls per month or the average delay in bill payment.
- Regression analysis: it is used to determine the relationship between two or more variables of interest. This method helps to analyze and discover relationships among variables and to predict actual results.
- Probability distributions and models: they describe how different business activities behave under normal conditions, so that deviations from them can be spotted.
- Data matching: it is used to compare two sets of collected data. The process can be performed based on algorithms or programmed loops, where processors perform sequential analyses of each individual piece of a data set, matching it against each individual piece of another data set.
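Two of the statistical techniques listed above, calculating statistical parameters and data matching, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names, the threshold of three standard deviations, the `policy_id` field and the sample figures are all hypothetical and chosen for the example.

```python
from statistics import mean, stdev


def flag_outliers(history, new_values, threshold=3.0):
    """Return the new values lying more than `threshold` sample standard
    deviations away from the mean of the historical values."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return [v for v in new_values if v != mu]
    return [v for v in new_values if abs(v - mu) / sigma > threshold]


def unmatched_records(claims, policies):
    """Data matching: return the claims whose policy_id does not appear
    in the set of known policies."""
    known_ids = {p["policy_id"] for p in policies}
    return [c for c in claims if c["policy_id"] not in known_ids]


# Hypothetical data: number of calls per month for one customer.
monthly_calls = [102, 98, 110, 95, 105, 99, 101, 97]
print(flag_outliers(monthly_calls, [104, 480]))

# Hypothetical data: claims checked against a list of valid policies.
policies = [{"policy_id": "P-001"}, {"policy_id": "P-002"}]
claims = [{"policy_id": "P-001"}, {"policy_id": "P-999"}]
print(unmatched_records(claims, policies))
```

A flagged value or an unmatched record is not proof of fraud; in practice it would only mark a case for closer review.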
The main AI techniques are:
- Data mining: it is the process used to extract usable data from a larger set of raw data. It involves exploring and analyzing large blocks of information to uncover meaningful patterns and trends, including those related to fraud.
- Neural networks: they are computing systems that, thanks to their flexibility, can be applied to complex pattern recognition and prediction problems. They can be used for classification, clustering, generalization and forecasting.
- Machine learning: it builds a mathematical model from sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Hence it can automatically identify the characteristics of fraud.
- Pattern recognition: it is the automated recognition of regularities in data through the use of computer algorithms. Thanks to these regularities, the data can be classified into different categories.
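The idea of learning from training data, as described for machine learning above, can be illustrated with a very small classifier. The sketch below uses a nearest-centroid rule, chosen here only because it fits in a few lines; it is not the method of any particular fraud detection product. The features (amount in thousands, number of transactions in the last hour), the labels and the data are hypothetical.

```python
from math import dist
from statistics import mean


def train_centroids(samples, labels):
    """Learn one centroid (the mean feature vector) for each class."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        # zip(*rows) groups the values feature by feature.
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids


def predict(centroids, sample):
    """Assign the sample to the class whose centroid is closest
    in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], sample))


# Hypothetical training data:
# (amount in thousands, transactions in the last hour)
training_samples = [
    (0.02, 1), (0.05, 2), (0.03, 1),   # labeled as legitimate
    (4.0, 9), (5.5, 12), (3.8, 10),    # labeled as fraudulent
]
training_labels = ["legit", "legit", "legit", "fraud", "fraud", "fraud"]

model = train_centroids(training_samples, training_labels)
print(predict(model, (0.04, 1)))
print(predict(model, (4.5, 11)))
```

No rule such as "amounts above a given value are fraud" is written by hand: the decision boundary comes from the labeled examples. A realistic system would need many more examples, scaled features and a careful evaluation of false positives.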
How can machine learning be applied to fraud detection?
Machine learning, as said before, is a subset of AI. It stands out because it can analyze large amounts of information and learn how to make decisions about the data, much as a human would. Machine learning is well suited to fraud detection and prevention because of its speed, scalability, efficiency and accuracy.
FinScience is deeply involved in this field. It has developed proprietary machine learning algorithms capable of collecting and managing huge amounts of data, which provide a concrete way to find anomalies. To make these algorithms easier to use, FinScience has also developed a platform that monitors specific data, spots trends before they become widespread, analyzes specific topics or phenomena in depth (Knowledge Graph) and presents the results to the user through an easy-to-read dashboard.