Abstract:
As big data and artificial intelligence continue to gain importance, the reliability and quality of datasets have become crucial factors in algorithm performance and result reliability. However, many datasets lack standardization and quality control, which can lead to issues in software development and data analysis. In this paper, we propose an objective dataset evaluation algorithm that uses multiple metrics, including unified naming conventions and document annotations, for statistical analysis and filtering to ensure reliable data. Our approach scores datasets against reasonable criteria, and we use visualization techniques inspired by remote sensing to compare and visualize dataset reliability and quality. Our research demonstrates that the evaluation method is feasible and can assist software developers in improving dataset quality, thereby enhancing the reliability and efficiency of software development. Our study concentrates on datasets obtained from GitHub using a web crawler, and our approach establishes a standardized technique for evaluating dataset reliability and quality. © 2023 IEEE.
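The abstract describes scoring datasets with multiple metrics (e.g., unified naming conventions, document annotations) and filtering on the resulting score, but does not give the actual criteria or formula in this record. The sketch below is only a hypothetical weighted-sum illustration of that idea: the metric names (naming_consistency, annotation_coverage, schema_completeness), the weights, and the threshold are assumptions, not the authors' method.

```python
# Illustrative sketch of multi-metric dataset quality scoring and filtering.
# All metric names, weights, and the threshold are hypothetical, not taken from the paper.
from dataclasses import dataclass


@dataclass
class DatasetMetrics:
    naming_consistency: float   # fraction of files following a unified naming convention (0..1)
    annotation_coverage: float  # fraction of files with documentation/annotations (0..1)
    schema_completeness: float  # fraction of expected fields present (0..1)


# Hypothetical weights; the paper presumably defines its own criteria.
WEIGHTS = {
    "naming_consistency": 0.40,
    "annotation_coverage": 0.35,
    "schema_completeness": 0.25,
}


def quality_score(m: DatasetMetrics) -> float:
    """Weighted aggregate of per-metric scores, scaled to 0..100."""
    total = (
        WEIGHTS["naming_consistency"] * m.naming_consistency
        + WEIGHTS["annotation_coverage"] * m.annotation_coverage
        + WEIGHTS["schema_completeness"] * m.schema_completeness
    )
    return 100.0 * total


def filter_reliable(datasets: dict[str, DatasetMetrics], threshold: float = 70.0) -> dict[str, float]:
    """Keep only datasets whose aggregate score meets the (assumed) threshold."""
    scores = {name: quality_score(m) for name, m in datasets.items()}
    return {name: s for name, s in scores.items() if s >= threshold}


if __name__ == "__main__":
    crawled = {
        "repo-a/dataset": DatasetMetrics(0.95, 0.80, 0.90),
        "repo-b/dataset": DatasetMetrics(0.40, 0.30, 0.60),
    }
    print(filter_reliable(crawled))  # -> {'repo-a/dataset': 88.5}
```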
Year: 2023
Page: 263-270
Language: English
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0