Retailers’ average shrink rate of 1.44% costs the U.S. retail economy over $48B, as reported in the NRF 2017 National Retail Security Survey. The survey also notes that Loss Prevention staffs are not growing: 55% of respondents expect staff sizes to remain flat, and another 21% anticipate cuts. Retailers thus have less headcount than ever to tackle this age-old problem, making software aids increasingly necessary.
The traditional software approach to fraud detection in many domains, including retail, uses rule-based filters to flag suspicious activity. This approach faces several challenges: rules yield only a binary flag rather than a graded measure of risk, and they miss any suspicious activity they were not written to catch.
Machine learning can help. Taking a page from credit-card fraud detection, where rules-based approaches have similar deficiencies, we can apply machine learning to supplement the rules and address these challenges. Machine learning here is not a replacement for the rules-based approach but works in concert with it; in fact, the rules are an integral part of training some of the machine-learning models. We can employ multiple machine-learning techniques, both unsupervised and supervised.
A prime candidate for an unsupervised approach is the 1-class Support Vector Machine (SVM) anomaly detection provided by the Oracle Advanced Analytics (OAA) engine. Used this way, the SVM becomes a detector of outliers, that is, of unusual behavior. SVM-based learning is normally supervised, but the 1-class implementation in OAA makes it unsupervised. Strictly speaking, our particular use of the 1-class SVM is “partially supervised,” in that suspicious activity is removed from the training data beforehand. And how do we remove the suspicious activity? Not with human intelligence, but simply with the existing rules-based filters, so the rules remain an integral part of our machine-learning-based fraud detection.
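The idea can be sketched outside of OAA as well. The following illustrative example uses scikit-learn's `OneClassSVM` as a stand-in for the OAA engine, with synthetic transaction data and a toy amount-threshold rule playing the role of the existing rules-based filters; all names and thresholds here are assumptions for illustration, not the production configuration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic transaction features: [amount, items_per_basket] (illustrative).
normal = rng.normal(loc=[50.0, 5.0], scale=[10.0, 2.0], size=(500, 2))
suspicious = rng.normal(loc=[500.0, 1.0], scale=[50.0, 0.5], size=(10, 2))
transactions = np.vstack([normal, suspicious])

# A stand-in for the existing rules-based filter: flag very large amounts.
rule_flagged = transactions[:, 0] > 200.0

# Train only on activity the rules did NOT flag, so the 1-class SVM
# learns what "normal" looks like -- the "partially supervised" step.
detector = OneClassSVM(nu=0.05, gamma="scale")
detector.fit(transactions[~rule_flagged])

# decision_function: more negative means more anomalous; rescale to a
# 0-100 risk score where higher numbers indicate more anomaly.
raw = detector.decision_function(transactions)
risk = 100.0 * (raw.max() - raw) / (raw.max() - raw.min())

print(f"mean risk (rule-flagged):  {risk[rule_flagged].mean():.1f}")
print(f"mean risk (not flagged):   {risk[~rule_flagged].mean():.1f}")
```

The rule-flagged transactions, which the model never saw during training, land far from the learned region of normal behavior and receive much higher risk scores.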
This approach remedies some of the deficiencies of a rules-only approach by producing a graded risk score rather than a binary flag, and by scoring activity that the rules do not cover.
Let's take a closer look. The figure below shows a portion of a report where, in addition to the usual columns and a rules-based Risk Type, we have a numerical Risk Score between 0 and 100 generated by the anomaly detector (higher scores indicate greater anomaly). Rules-based classification can thus be combined with the anomaly score in a single report. The anomaly detector also generates scores for accounts that the rules did not identify as risky.
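A report like this is easy to query once both signals sit side by side. The hypothetical rows and column names below are invented for illustration, but they show the payoff: accounts the rules missed can still surface via a high Risk Score.

```python
import pandas as pd

# Hypothetical report rows: a rules-based Risk Type alongside the
# anomaly detector's 0-100 Risk Score (all values illustrative).
report = pd.DataFrame({
    "account":    ["A-101", "A-102", "A-103", "A-104"],
    "risk_type":  ["Excessive Refunds", None, None, "Manual Discounts"],
    "risk_score": [92, 78, 12, 55],
})

# Accounts the rules did not flag but the anomaly detector scored highly.
missed = report[report["risk_type"].isna() & (report["risk_score"] >= 70)]
print(missed)
```

Here account A-102 carries no rules-based Risk Type yet scores 78, so it would surface for investigation purely on the strength of the anomaly score.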
Together with 1-class SVM anomaly detection, we can also employ supervised-learning classification algorithms. Here, we do rely on the user to explicitly flag historical cases that are known to be fraudulent. In addition, the user may assign a “degree of suspiciousness” to cases, ranging from “definitely fraudulent” for cases that investigation showed were truly fraud, to “not suspicious” for cases the user knows for sure are not fraud. The classification algorithm will learn whatever labeling the users have provided, whether a simple “definitely fraudulent” flag or a more refined gradation of suspicion.
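A minimal sketch of this supervised side, again using scikit-learn rather than the OAA engine: the graded labels (2 = definitely fraudulent, 1 = somewhat suspicious, 0 = not suspicious) and the synthetic features are assumptions for illustration, and the classifier simply learns whatever gradation the users supplied.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Historical cases with user-provided labels at whatever granularity the
# retailer uses: 2 = definitely fraudulent, 1 = suspicious, 0 = not suspicious.
X = np.vstack([
    rng.normal([500.0, 1.0], [40.0, 0.4], size=(40, 2)),   # fraud-like
    rng.normal([200.0, 3.0], [40.0, 1.0], size=(40, 2)),   # borderline
    rng.normal([50.0, 5.0], [10.0, 2.0], size=(200, 2)),   # normal
])
y = np.array([2] * 40 + [1] * 40 + [0] * 200)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Score a new case: class probabilities across the gradation of suspicion.
case = np.array([[480.0, 1.0]])
print(dict(zip(clf.classes_, clf.predict_proba(case)[0].round(2))))
```

Because the labels form a gradation, the output probabilities can be read as a spectrum of suspicion rather than a single yes/no verdict.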
This case-by-case method allows per-retailer configuration of how the supervised and unsupervised approaches are combined. For example, at a retailer that has been adept at finding cases that were “definitely fraud,” the supervised algorithm’s “fraud” verdict may be reliable enough that we configure the system to run anomaly detection only on the cases the supervised-learning algorithms do not classify as “definitely fraud.” For such retailers, anomaly detection serves as a kind of fallback, and the overall system is more reliable than relying on the supervised-learning algorithms alone.
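This routing policy is simple to express in code. The function below is a hypothetical sketch (the label strings, the 70-point threshold, and the `trust_supervised` switch are all invented for illustration) of how a trusted supervised verdict can short-circuit the pipeline while anomaly detection handles everything else.

```python
def route_case(supervised_label: str, anomaly_score: float,
               trust_supervised: bool = True) -> dict:
    # If this retailer trusts its supervised "definitely fraud" verdicts,
    # accept them directly and skip anomaly detection for those cases.
    if trust_supervised and supervised_label == "definitely_fraud":
        return {"verdict": "fraud", "source": "supervised"}
    # Otherwise fall back to the unsupervised anomaly score
    # (0-100, higher = more anomalous; 70 is an illustrative threshold).
    verdict = "review" if anomaly_score >= 70 else "ok"
    return {"verdict": verdict, "source": "anomaly"}

print(route_case("definitely_fraud", 10))  # supervised verdict wins
print(route_case("unknown", 85))           # anomaly detection falls back
```

Flipping `trust_supervised` per retailer is one way to express the configurable weighting between the two approaches.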
The figure below illustrates how anomaly detection works. The red stars are anomalies because they lie far from the bulk of the black triangles. The black triangles thus constitute normal behavior, while the red stars are anomalies that could indicate fraudulent activity.
In general, we can configure the weighting between supervised and unsupervised approaches, according to how effective the retailer believes its fraud detection processes are. Investigation of fraud requires human effort, and retailers may differ in the amount of effort they devote to it and how proficient they are in detecting fraud.
Another way in which fraud detection at retailers differs from credit-card fraud detection is the need to use different data for each broad class of fraud. At the typical retailer there are two: account-level fraud, committed through customer accounts, and cashier-level fraud, committed by employees at the register.
We need to train separate models for account-level fraud and for cashier-level fraud. The cashier data poses an unusual data-science problem: the number of data points is bounded by the number of cashiers at a retailer, whereas the number of customer accounts is effectively unbounded. To obtain reliable models for the cashier case, we therefore greatly increase the number of cashier features used in the models, so that the amount of data fed into the model per cashier increases.
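Widening the per-cashier feature set can be as simple as aggregating many behavioral statistics from the transaction log. The toy log and feature names below are assumptions for illustration, not the actual feature set; the point is that each cashier row carries many columns even when there are few cashiers.

```python
import pandas as pd

# Hypothetical transaction log. With only a handful of cashiers, we widen
# the feature set per cashier rather than relying on many rows.
tx = pd.DataFrame({
    "cashier": ["c1", "c1", "c1", "c2", "c2", "c2"],
    "amount":  [20.0, 35.0, 15.0, 22.0, 400.0, 30.0],
    "refund":  [0, 1, 0, 0, 1, 1],
    "voided":  [0, 0, 1, 0, 1, 0],
})

# One row per cashier, many engineered columns (illustrative aggregates).
features = tx.groupby("cashier").agg(
    n_tx=("amount", "size"),
    avg_amount=("amount", "mean"),
    max_amount=("amount", "max"),
    refund_rate=("refund", "mean"),
    void_rate=("voided", "mean"),
)
print(features)
```

In practice, many more aggregates (per shift, per register, per time window) would be added in the same fashion to give the model enough signal per cashier.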
While it is possible to adopt some techniques from credit-card fraud detection, fraud detection for retail has significantly different requirements and calls for a machine-learning approach tailored specifically to retail. The extra development investment, however, pays off in the system's usability, its effectiveness at finding fraud, and its per-retailer configurability.
To learn more about how machine learning helps retailers in all areas of the business, check out: