
Authors:

Chen, Junhua | Tian, Miao | Qi, Xingming | Wang, Wenxing | Liu, Youjun (Scholar: 刘有军)

Indexed in:

EI, Scopus, SCIE

Abstract:

The reconstruction of cross-cut shredded text documents (RCCSTD) is an important problem in forensics and a real, complex issue for information security and judicial investigations. It can be considered a special kind of greedy square jigsaw puzzle and has attracted the attention of many researchers. Clustering fragments into several rows is a crucial and difficult step in RCCSTD. However, existing approaches achieve low clustering accuracy. This paper therefore proposes a new clustering algorithm based on horizontal projection and a constrained seed K-means algorithm to improve the clustering accuracy. The constrained seed K-means algorithm draws upon expert knowledge and has the following characteristics: 1) the first fragment in each row is easy to distinguish, and the unidimensional signals extracted from it can be used as the initial clustering center; 2) two or more prior fragments cannot be clustered together. To improve the splicing accuracy within rows, a penalty coefficient is added to a traditional cost function. Experiments were carried out on 10 text documents. The accuracy of the clustering algorithm was 99.1% and the overall splicing accuracy was 91.0%, according to our measurements. The algorithm was compared with two other approaches and was found to offer significantly improved performance in terms of clustering accuracy. Based on our experimental results, our approach obtained the best results on the RCCSTD problem. Moreover, a more complex and realistic problem, the reconstruction of cross-cut shredded dual text documents (RCCSDTD), was also attempted. Satisfactory results were obtained for RCCSDTD problems in some cases; to the authors' best knowledge, our method is the first feasible approach for the RCCSDTD problem. The developed system is fundamentally an expert system specifically applied to solving RCCSTD problems. (C) 2019 Elsevier Ltd. All rights reserved.
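The row-clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binarized fragments (1 = ink), a Euclidean distance between projection signals, and a plain mean update for the cluster centers, none of which the abstract specifies. The function names `horizontal_projection`, `constrained_seed_kmeans`, and `make_fragment` are hypothetical, and the penalty coefficient used for splicing is not covered here.

```python
import numpy as np


def horizontal_projection(fragment):
    """Sum the ink pixels of each pixel row of a binary fragment (1 = ink)."""
    return np.asarray(fragment, dtype=float).sum(axis=1)


def constrained_seed_kmeans(signals, seed_indices, n_iter=20):
    """Cluster projection signals into rows with seeded, constrained K-means.

    signals      : array of shape (n_fragments, height), one signal per fragment
    seed_indices : index of the known first fragment of each row;
                   seed k defines cluster k (assumption of this sketch)
    """
    signals = np.asarray(signals, dtype=float)
    seeds = np.asarray(seed_indices)
    k = len(seeds)
    # Characteristic 1: the seed signals are the initial cluster centers.
    centers = signals[seeds].copy()
    labels = np.full(len(signals), -1)
    for _ in range(n_iter):
        # Euclidean distance from every signal to every center, shape (n, k).
        dists = np.linalg.norm(signals[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Characteristic 2: each seed stays in its own cluster,
        # so two seeds can never be clustered together.
        new_labels[seeds] = np.arange(k)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            # Never empty: cluster j always contains at least its seed.
            centers[j] = signals[labels == j].mean(axis=0)
    return labels


def make_fragment(offset, rng, height=30, width=20):
    """Synthetic fragment for the demo: 3-pixel text lines every 10 pixels."""
    frag = np.zeros((height, width), dtype=int)
    for top in range(offset, height, 10):
        frag[top:top + 3] = rng.integers(0, 2, size=(min(3, height - top), width))
    return frag


rng = np.random.default_rng(0)
# Three shredded rows of four fragments each; text lines sit at different heights.
fragments = [make_fragment(offset, rng) for offset in (0, 3, 6) for _ in range(4)]
signals = np.array([horizontal_projection(f) for f in fragments])
labels = constrained_seed_kmeans(signals, seed_indices=[0, 4, 8])
```

In this toy setup the fragments at positions 0, 4, and 8 play the role of the easily identified first fragment of each row. Real shredded pages would need preprocessing (binarization, orientation) before the projection is meaningful.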

Keywords:

Ant colony algorithm; Constrained seed K-means algorithm; Horizontal projection; Penalty coefficient; Reconstruction of cross-cut shredded documents (RCCSTD)

Author affiliations:

  • [ 1 ] [Chen, Junhua]Beijing Univ Technol, Coll Life Sci & Bioengn, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 2 ] [Tian, Miao]Beijing Univ Technol, Coll Life Sci & Bioengn, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 3 ] [Qi, Xingming]Beijing Univ Technol, Coll Life Sci & Bioengn, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 4 ] [Wang, Wenxing]Beijing Univ Technol, Coll Life Sci & Bioengn, 100 Pingleyuan, Beijing 100124, Peoples R China
  • [ 5 ] [Liu, Youjun]Beijing Univ Technol, Coll Life Sci & Bioengn, 100 Pingleyuan, Beijing 100124, Peoples R China

Corresponding author:

  • Liu, Youjun (刘有军)

    [Liu, Youjun]Beijing Univ Technol, Coll Life Sci & Bioengn, 100 Pingleyuan, Beijing 100124, Peoples R China


Source:

EXPERT SYSTEMS WITH APPLICATIONS

ISSN: 0957-4174

Year: 2019

Volume: 127

Pages: 35-46

Impact factor: 8.500 (JCR 2022)

ESI subject: ENGINEERING

ESI highly cited threshold: 52

JCR quartile: 1

Citation counts:

WoS Core Collection citations: 8

Scopus citations: 10

ESI Highly Cited Paper listings: 0

