• English
    • العربية
  • العربية
  • Login
  • QU
  • QU Library
  •  Home
  • Communities & Collections
  • Help
    • Item Submission
    • Publisher policies
    • User guides
    • FAQs
  • About QSpace
    • Vision & Mission
View Item 
  •   Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Faculty Contributions
  • College of Business and Economics
  • Accounting & Information Systems
  • View Item
  • Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Faculty Contributions
  • College of Business and Economics
  • Accounting & Information Systems
  • View Item
  •      
  •  
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Big data velocity management-from stream to warehouse via high performance memory optimized index join

    Thumbnail
    View/Open
    Big_Data_Velocity_ManagementFrom_Stream_to_Warehouse_via_High_Performance_Memory_Optimized_Index_Join.pdf (1.344Mb)
    Date
    2020-10-23
    Author
    Naeem, M. Asif
    Mirza, Farhaan
    Khan, Habib Ullah
    Sundaram, David
    Jamil, Noreen
    Weber, Gerald
    ...show more authors ...show less authors
    Metadata
    Show full item record
    Abstract
    Efficient resource optimization is critical to manage the velocity and volume of real-time streaming data in near-real-time data warehousing and business intelligence. This article presents a memory optimisation algorithm for rapidly joining streaming data with persistent master data in order to reduce data latency. Typically during the transformation phase of ETL (Extraction, Transformation, and Loading) a stream of transactional data needs to be joined with master data stored on disk. To implement this process, a semi-stream join operator is commonly used. Most semi-stream join operators cache frequent parts of the master data to improve their performance, this process requires careful distribution of allocated memory among the components of the join operator. This article presents a cache inequality approach to optimise cache size and memory. To test this approach, we present a novel Memory Optimal Index-based Join (MOIJ) algorithm. MOIJ supports many-to-many types of joins and adapts to dynamic streaming data. We also present a cost model for MOIJ and compare the performance with existing algorithms empirically as well as analytically. We envisage the enhanced ability of processing near-real-time streaming data using minimal memory will reduce latency in processing big data and will contribute to the development of highperformance real-time business intelligence systems.
    URI
    https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85102896549&origin=inward
    DOI/handle
    http://dx.doi.org/10.1109/ACCESS.2020.3033464
    http://hdl.handle.net/10576/37715
    Collections
    • Accounting & Information Systems [‎555‎ items ]

    entitlement


    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department

    Contact Us | Send Feedback
    Contact Us | Send Feedback | QU

     

     

    Home

    Submit your QU affiliated work

    Browse

    All of Digital Hub
      Communities & Collections Publication Date Author Title Subject Type Language Publisher
    This Collection
      Publication Date Author Title Subject Type Language Publisher

    My Account

    Login

    Statistics

    View Usage Statistics

    About QSpace

    Vision & Mission

    Help

    Item Submission Publisher policiesUser guides FAQs

    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department

    Contact Us | Send Feedback
    Contact Us | Send Feedback | QU

     

     

    Video