The processing of massive numbers of small files is a challenge in the design of distributed file systems. Currently, the combined-block-storage approach is prevalent. However, this approach relies on traditional file systems such as ExtFS, which can be inefficient when accessing small files located at random positions on disk. This paper focuses on optimizing the performance of data servers in accessing massive numbers of small files. We present a Flat Lightweight File System (iFlatLFS) to manage small files, based on a simple metadata scheme and a flat storage architecture. iFlatLFS is designed to replace the traditional file system on data servers and can be deployed underneath distributed file systems that store massive numbers of small files. iFlatLFS greatly simplifies the original data access procedure. The new metadata proposed in this paper occupies only a fraction of the metadata size required by traditional file systems. We have implemented iFlatLFS on CentOS 5.5 and integrated it into an open source Distributed File System (DFS) called Taobao FileSystem (TFS), which is developed by Alibaba, a top B2C service provider in China, and manages over 28.6 billion small photos. We have conducted extensive experiments to verify the performance of iFlatLFS. The results show that when the file size ranges from 1KB to 64KB, iFlatLFS is faster than Ext4 by 48% and 54% on average for random read and write in the DFS environment, respectively. Moreover, after integration into TFS, iFlatLFS-based TFS is faster than the existing Ext4-based TFS by 45% and 49% on average for random read access and hybrid access (a mix of read and write accesses), respectively.
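To make the idea of combined-block storage with a flat, compact index more concrete, the following is a minimal Python sketch. It is not the iFlatLFS implementation: the abstract does not describe the on-disk layout, so the class name `FlatBlockStore`, the 20-byte index record, and the in-memory dictionary are all assumptions chosen for illustration. It only shows the general pattern of packing many small files into one large block file and locating each with a single index lookup followed by a single positioned read.

```python
import os
import struct
import tempfile

# Hypothetical fixed-size index record: file id, offset, size.
# The abstract does not specify the real iFlatLFS metadata layout.
INDEX_ENTRY = struct.Struct("<QQI")  # 8 + 8 + 4 = 20 bytes per file


class FlatBlockStore:
    """Toy flat store: many small files packed into one block file."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # file_id -> (offset, size), held in memory
        # Create the block file if it does not exist yet.
        open(path, "ab").close()

    def put(self, file_id, data):
        # Append the payload and record where it landed.
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        self.index[file_id] = (offset, len(data))

    def get(self, file_id):
        # One index lookup, then one positioned read; no directory walk.
        offset, size = self.index[file_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(size)

    def index_bytes(self):
        # Index size if serialized with the fixed-size record above.
        return len(self.index) * INDEX_ENTRY.size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = FlatBlockStore(os.path.join(tmp, "block.dat"))
        store.put(1, b"photo-a")
        store.put(2, b"photo-bb")
        print(store.get(2))
        print(store.index_bytes())
```

In this sketch each file costs a fixed 20 bytes of index, which is the kind of saving a flat scheme aims for compared with per-file inodes and directory entries. A real system would also need persistence of the index, deletion, and crash recovery, none of which are modeled here.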
Songling Fu received the BS degree in electronic science and technology from Harbin Institute of Technology, Harbin, China, in 2001, and the MS and PhD degrees in computer science and technology from National University of Defense Technology, Changsha, China, in 2003 and 2014, respectively. His research interests include parallel and distributed computing, high-performance computer systems, operating systems, and cloud computing.