Stacked-SVM: A dynamic SVM framework for telephone fraud identification from imbalanced CDRs

Author：

Chang, Qingqing | Lin, Shaofu | Liu, Xiliang

Abstract：

Recent years have witnessed the rampancy of telephone fraud along with the development of modern communication technology. The challenges of telephone fraud identification lie mainly in two aspects: (1) telephone fraud records are typical imbalanced data because of their heterogeneous spatial-temporal distribution, which biases models towards predicting the majority class; (2) traditional evaluation metrics in imbalanced learning mainly rely on accuracy or precision, neglecting the completeness of telephone fraud identification in real-world implementations. In response to the limitations of traditional methods, we propose the Stacked-SVM framework based on heterogeneous ensemble learning and support vector machines (SVMs). First, we employ both edited nearest neighbors (ENN) and adaptive synthetic sampling (ADASYN) to alleviate the curse of dimensionality in imbalanced data resampling. Second, we propose the optimal linear combination strategy in the iteration of Stacked-SVM and demonstrate its validity with the help of Kullback-Leibler divergence. Finally, we construct the Stacked-SVM framework with respect to the constraints of the loss function in SVM. We further compare its performance under different evaluation metrics (i.e., accuracy, precision, recall, F1-score, and AUC value) with four other traditional telephone fraud identification methods, namely Logistic Regression, Isolation Forest, SVM with random parameter settings, and optimized SVM. We evaluate Stacked-SVM in a series of experiments on real telephone fraud data sets in the form of call detail records (CDRs) from a Chinese domestic telecom operator. The experimental results show that the proposed Stacked-SVM achieves a 93.83% recall and an 82.96% accuracy in telephone fraud identification, proving more precise and robust than the other models. © 2019 ACM.
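
The abstract's second challenge, that accuracy alone hides missed fraud on imbalanced data, can be illustrated with a small sketch. This is not the authors' code or data: the labels below are made up (95 legitimate calls, 5 fraudulent), and the helper names `confusion_counts` and `classification_metrics` are hypothetical. It only shows how the metrics named in the abstract are computed from a confusion matrix.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dict (0.0 when undefined)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up toy labels: 0 = legitimate call, 1 = fraudulent call.
y_true = [0] * 95 + [1] * 5

# Baseline that always predicts the majority class.
always_legit = [0] * 100

# Hypothetical detector: 10 false alarms, 4 of the 5 frauds caught.
flagging = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

baseline = classification_metrics(y_true, always_legit)
detector = classification_metrics(y_true, flagging)

print("baseline:", baseline)
print("detector:", detector)
```

On this toy set the majority-class baseline scores higher on accuracy than the detector while catching no fraud at all, which is why recall is reported alongside accuracy when the goal is completeness of identification.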

Keyword：

Support vector machines | Telecommunication industry | Iterative methods | Logistic regression | Crime | Telephone sets

Author Community：

[ 1 ] [Chang, Qingqing] Beijing University of Technology, Faculty of Information Technology, Beijing, China
[ 2 ] [Lin, Shaofu] Beijing Institute of Smart City, Beijing University of Technology, Beijing, China
[ 3 ] [Liu, Xiliang] Beijing Institute of Smart City, Beijing University of Technology, Beijing, China

Reprint Author's Address：

[Liu, Xiliang] Beijing Institute of Smart City, Beijing University of Technology, Beijing, China

Email：

liuxl@bjut.edu.cn

Related Keywords：

The Combination Forecasting Model of Telecommunication User Tricking Account Overdraft Limit Based on Logistic Regression and SVM
2019, 2019 IEEE International Conference on Artificial Intelligence and Computer Applications, ICAICA 2019
Customer Transaction Fraud Detection Using Xgboost Model
2020, 2020 International Conference on Computer Engineering and Application, ICCEA 2020
Classification method based on SVM for human gene sequences
2008, Journal of Beijing University of Technology
The application of autoencoder in classification of the eye movement data
2015, 4th International Conference on Information Science and Cloud Computing, ISCC 2015

Source ：

Year： 2019

Page： 112-120

Language： English

Cited Count：

WoS CC Cited Count： 0

SCOPUS Cited Count： 3

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

Faculty of Information Technology
