Tax payment default prediction using genetic algorithm-based variable selection

Research output: Contribution to journal › Article › Scientific › peer-review

28 Citations (Scopus)

Abstract

According to statistics from the Finnish tax authorities, about 12% of all active firms in Finland had unpaid taxes at the end of 2015. In monetary terms, this translates to over 3 billion euros in unpaid taxes, a highly significant amount given that the total amount of taxes collected during 2015 was 49 billion euros. Considering the economic significance of unpaid taxes, relatively little research has been done on identifying tax-defaulting firms. The objective of this study is to develop a genetic algorithm-based decision support tool for predicting tax payment defaults. More specifically, a genetic algorithm is used to determine an optimal or near-optimal subset of variables for a linear discriminant analysis (LDA) model that classifies the examined firms as either defaulting or non-defaulting. The tool also provides information about the importance of individual variables in predicting a tax default. The dataset consists of Finnish limited liability firms that have defaulted on employer contribution taxes or on value-added taxes, and the total number of available variables is 72. The results show that variables measuring solvency, liquidity, and the payment period of trade payables are important in predicting tax defaults. The best-performing model comprises three non-linearly transformed variables and has a predictive accuracy of 73.8%.
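The general idea of genetic algorithm-based variable selection for an LDA classifier can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not specify the chromosome encoding, fitness function, genetic operators, or the non-linear transformations used, so everything below (a binary mask per candidate subset, validation accuracy as fitness, tournament selection, uniform crossover, bit-flip mutation, elitism, a hand-written two-class Fisher LDA with equal priors, and the synthetic data with 12 variables instead of 72) is an assumption chosen for illustration. The function names `lda_accuracy`, `fitness`, and `ga_select` are hypothetical.

```python
import numpy as np

def lda_accuracy(X_train, y_train, X_test, y_test):
    """Two-class Fisher LDA with equal priors (assumed); returns test accuracy."""
    m0 = X_train[y_train == 0].mean(axis=0)
    m1 = X_train[y_train == 1].mean(axis=0)
    X0 = X_train[y_train == 0] - m0
    X1 = X_train[y_train == 1] - m1
    # Pooled within-class covariance, lightly regularised for numerical stability.
    Sw = (X0.T @ X0 + X1.T @ X1) / (len(X_train) - 2)
    Sw += 1e-6 * np.eye(Sw.shape[0])
    w = np.linalg.solve(Sw, m1 - m0)        # discriminant direction
    threshold = (w @ (m0 + m1)) / 2         # midpoint between projected class means
    pred = (X_test @ w > threshold).astype(int)
    return float((pred == y_test).mean())

def fitness(mask, X_tr, y_tr, X_va, y_va):
    """Fitness of a variable subset = validation accuracy of LDA on that subset."""
    if not mask.any():
        return 0.0                          # an empty subset cannot classify
    return lda_accuracy(X_tr[:, mask], y_tr, X_va[:, mask], y_va)

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=30, generations=25,
              p_mut=0.05, seed=0):
    """Search over boolean variable masks; returns (best_mask, best_fitness)."""
    rng = np.random.default_rng(seed)
    n_vars = X_tr.shape[1]
    pop = rng.random((pop_size, n_vars)) < 0.2   # sparse random initial masks
    best_mask, best_fit = None, -1.0
    for _ in range(generations):
        fits = np.array([fitness(ind, X_tr, y_tr, X_va, y_va) for ind in pop])
        i = int(fits.argmax())
        if fits[i] > best_fit:
            best_fit, best_mask = float(fits[i]), pop[i].copy()
        # Tournament selection: the fitter of two random individuals becomes a parent.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fits[a] >= fits[b], a, b)]
        # Uniform crossover with a randomly permuted partner.
        partners = parents[rng.permutation(pop_size)]
        take_parent = rng.random((pop_size, n_vars)) < 0.5
        children = np.where(take_parent, parents, partners)
        # Bit-flip mutation.
        children ^= rng.random((pop_size, n_vars)) < p_mut
        children[0] = best_mask                  # elitism: keep the best mask so far
        pop = children
    return best_mask, best_fit

# Synthetic stand-in data (not the Finnish firm dataset): 12 variables,
# of which only variables 0 and 3 carry class information.
rng = np.random.default_rng(1)
n, n_vars = 400, 12
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, n_vars))
X[:, 0] += 1.5 * y
X[:, 3] -= 1.0 * y
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]

mask, acc = ga_select(X_tr, y_tr, X_va, y_va)
print("selected variables:", np.flatnonzero(mask), "validation accuracy:", acc)
```

Because the fitness here is measured on the same validation split that guides the search, the reported accuracy is optimistic; a separate hold-out set or cross-validation would be needed for an unbiased estimate.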
Original language: English
Peer-reviewed scientific journal: Expert Systems with Applications
Volume: 88
Issue number: December
Pages (from-to): 368-375
Number of pages: 8
ISSN: 0957-4174
Publication status: Published - 20.07.2017
MoE publication type: A1 Journal article - refereed

Keywords

  • 512 Business and Management
  • Tax default
  • Discriminant analysis
  • Genetic algorithms
  • Variable selection
