• English
    • العربية
  • العربية
  • Login
  • QU
  • QU Library
  •  Home
  • Communities & Collections
View Item 
  •   Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Faculty Contributions
  • College of Engineering
  • Computer Science & Engineering
  • View Item
  • Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Faculty Contributions
  • College of Engineering
  • Computer Science & Engineering
  • View Item
  •      
  •  
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Hadoop Perfect File: A fast and memory-efficient metadata access archive file to face small files problem in HDFS

    Thumbnail
    View/Open
    Publisher version (You have accessOpen AccessIcon)
    Publisher version (Check access options)
    Check access options
    Hadoop Perfect File A fast and memory-efficient metadata access archive file to face small files problem in HDFS.pdf (1.439Mb)
    Date
    2021-10-01
    Author
    Zhai, Yanlong
    Tchaye-Kondi, Jude
    Lin, Kwei Jay
    Zhu, Liehuang
    Tao, Wenjun
    Du, Xiaojiang
    Guizani, Mohsen
    ...show more authors ...show less authors
    Metadata
    Show full item record
    Abstract
    HDFS faces several issues when it comes to handling a large number of small files. These issues are well addressed by archive systems, which combine small files into larger ones. They use index files to hold relevant information for retrieving a small file content from the big archive file. However, existing archive-based solutions require significant overheads when retrieving a file content since additional processing and I/Os are needed to acquire the retrieval information before accessing the actual file content, therefore, deteriorating the access efficiency. This paper presents a new archive file named Hadoop Perfect File (HPF). HPF minimizes access overheads by directly accessing metadata from the part of the index file containing the information. It consequently reduces the additional processing and I/Os needed and improves the access efficiency from archive files. Our index system uses two hash functions. Metadata records are distributed across index files using a dynamic hash function. We further build an order-preserving perfect hash function that memorizes the position of a small file's metadata record within the index file.
    URI
    https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85108089031&origin=inward
    DOI/handle
    http://dx.doi.org/10.1016/j.jpdc.2021.05.011
    http://hdl.handle.net/10576/35578
    Collections
    • Computer Science & Engineering [‎2429‎ items ]

    entitlement


    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department

    Contact Us | Send Feedback
    Contact Us | Send Feedback | QU

     

     

    Home

    Submit your QU affiliated work

    Browse

    All of Digital Hub
      Communities & Collections Publication Date Author Title Subject Type Language Publisher
    This Collection
      Publication Date Author Title Subject Type Language Publisher

    My Account

    Login

    Statistics

    View Usage Statistics

    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department

    Contact Us | Send Feedback
    Contact Us | Send Feedback | QU

     

     

    Video