Please use this identifier to cite or link to this item: https://repository.iimb.ac.in/handle/2074/11158
Title: Spotting earnings manipulation: using machine learning for financial fraud detection
Authors: Rahul, Kumar; Seth, Nandini; Dinesh Kumar, U
Keywords: Accrual Manipulation;Bagging;Boosting;Data Analytics;Earnings Manipulation;Ensemble Methods;Gaussian Model;Sampling;Simulation;Supervised Learning;Unsupervised Learning
Issue Date: 2018
Publisher: Springer Verlag
Abstract: Earnings manipulation and accounting fraud lead to reduced firm valuation in the long run and to public distrust in the company and its management. Yet manipulation of accruals to hide liabilities and inflate earnings has been a long-standing fraudulent practice among many listed firms. As auditing is time-consuming and restricted to a sample of entries, fraud is either not detected or detected belatedly. We believe that supervised machine learning models can be used to identify high-risk firms early enough for auditing by the regulator. We also discuss unsupervised learning methods for anomaly detection. Since the proportion of manipulators is much lower than that of non-manipulators, the biggest challenge in predicting earnings manipulation is the imbalance in the data, which leads to biased results for conventional statistical models. In this paper, we build ensemble models to detect accrual manipulation by borrowing theory from the seminal work done by Beneish. We also showcase a novel simulation-based sampling technique to efficiently handle imbalanced datasets and illustrate our results on data from listed Indian firms. We compare existing ensemble models, establishing the superiority of fairly simple boosting models, whilst commenting on the shortcomings of the area under the ROC curve as a performance metric for imbalanced datasets. The paper makes two major contributions: (i) a functional contribution of suggesting an easily deployable strategy to identify high-risk companies; (ii) a methodological contribution of suggesting a simulation-based sampling approach that can be applied in other cases of highly imbalanced data so that the entire dataset is used in modeling.
URI: https://repository.iimb.ac.in/handle/2074/11158
ISBN: 9783030041908; 9783030041915
ISSN: 0302-9743
DOI: 10.1007/978-3-030-04191-5_29
Appears in Collections: 2010-2019

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.