Author:

Li, Dian | Wang, Weidong | Zhao, Yang

Indexed by:

EI Scopus

Abstract:

As big data and artificial intelligence continue to gain importance, the reliable quality of datasets has become a crucial factor in algorithm performance and result reliability. However, many datasets lack standardization and quality control, which can lead to potential issues in software development and data analysis. In this paper, we propose an objective dataset evaluation algorithm that utilizes multiple metrics, including unified naming conventions and document annotations, for statistical analysis and filtering to ensure reliable data. Our approach scores datasets using reasonable criteria, and we use visualization techniques inspired by remote sensing to compare and visualize reliable dataset quality. Our research demonstrates that our evaluation method is feasible and can assist software developers in enhancing the reliable quality and efficiency of software development by improving dataset quality. Our study concentrates on datasets obtained from GitHub using a web crawler, and our approach establishes a standardized technique for evaluating reliable dataset quality. © 2023 IEEE.

Keyword:

Python | Quality control | Information retrieval systems | Java programming language | Visualization | Remote sensing | Software design | Web crawler

Author Affiliations:

  • [ 1 ] [Li, Dian] Beijing University of Technology, Faculty of Information Technology, Department of Software Engineering, Beijing, China
  • [ 2 ] [Wang, Weidong] Beijing University of Technology, Faculty of Information Technology, Department of Software Engineering, Beijing, China
  • [ 3 ] [Zhao, Yang] Beijing University of Technology, Faculty of Information Technology, Department of Software Engineering, Beijing, China

Year: 2023

Page: 263-270

Language: English

SCOPUS Cited Count: 1

ESI Highly Cited Papers on the List: 0
