Abstract:
The Hadoop Distributed File System (HDFS) has been widely used in clusters to build large-scale, high-performance storage systems. However, HDFS is designed mainly to handle large files, so its performance when processing massive numbers of small files is relatively low: huge numbers of small files impose a heavy metadata burden on the HDFS NameNode. Focusing on this problem, an approach to improve the I/O performance of small files on HDFS is introduced. The main idea is to merge the small files in the same directory into one large file and build an index for each small file, which improves the storage efficiency of small files and reduces the metadata burden on the NameNode. Furthermore, a cache strategy is presented to improve the reading efficiency of small files on HDFS. The relevant design, data structures, and implementation are described. Experimental results indicate that the proposed method can improve the efficiency of processing massive numbers of small files on HDFS.
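The merge-and-index idea described in the abstract can be illustrated with a minimal sketch. The function names and the (offset, length) index layout below are illustrative assumptions, not the paper's actual data structures: each small file is appended to one merged blob, and a per-file index records where its bytes live, so a read touches only one merged-file entry instead of one NameNode object per small file.

```python
import io

def merge_small_files(files):
    """Merge a {name: bytes} mapping into one blob plus an index.

    The index maps each small-file name to (offset, length) inside
    the merged blob; this stands in for the per-file index the paper
    builds when merging files from the same directory.
    """
    index = {}
    buf = io.BytesIO()
    for name, data in files.items():
        index[name] = (buf.tell(), len(data))  # record position before appending
        buf.write(data)
    return buf.getvalue(), index

def read_small_file(blob, index, name):
    """Recover one small file from the merged blob via the index."""
    offset, length = index[name]
    return blob[offset:offset + length]

# Usage: merge two small files, then read one back through the index.
files = {"a.txt": b"hello", "b.txt": b"world!"}
blob, index = merge_small_files(files)
restored = read_small_file(blob, index, "b.txt")
```

On a real cluster the merged blob would be a single HDFS file (so the NameNode tracks one object rather than thousands), and the index would be consulted on the client or DataNode side; the cache strategy the abstract mentions would additionally keep hot index entries and blocks in memory.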