• English
    • العربية
  • العربية
  • Login
  • QU
  • QU Library
  •  Home
  • Communities & Collections
  • About QSpace
    • Vision & Mission
  • Help
    • Item Submission
    • Publisher policies
    • User guides
      • QSpace Browsing
      • QSpace Searching (Simple & Advanced Search)
      • QSpace Item Submission
      • QSpace Glossary
View Item 
  •   Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Faculty Contributions
  • College of Engineering
  • Computer Science & Engineering
  • View Item
  • Qatar University Digital Hub
  • Qatar University Institutional Repository
  • Academic
  • Faculty Contributions
  • College of Engineering
  • Computer Science & Engineering
  • View Item
  •      
  •  
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    A survey of clustering algorithms for big data: Taxonomy and empirical analysis

    Thumbnail
    View/Open
    A_Survey_of_Clustering_Algorithms_for_Big_Data_Taxonomy_and_Empirical_Analysis.pdf (5.103Mb)
    Date
    2014
    Author
    Fahad, Adil
    Alshatri, Najlaa
    Tari, Zahir
    Alamri, Abdullah
    Khalil, Ibrahim
    Zomaya, Albert Y.
    Foufou, Sebti
    Bouras, Abdelaziz
    ...show more authors ...show less authors
    Metadata
    Show full item record
    Abstract
    Clustering algorithms have emerged as an alternative powerful meta-learning tool to accurately analyze the massive volume of data generated by modern applications. In particular, their main goal is to categorize data into clusters such that objects are grouped in the same cluster when they are similar according to specific metrics. There is a vast body of knowledge in the area of clustering and there has been attempts to analyze and categorize them for a larger number of applications. However, one of the major issues in using clustering algorithms for big data that causes confusion amongst practitioners is the lack of consensus in the definition of their properties as well as a lack of formal categorization. With the intention of alleviating these problems, this paper introduces concepts and algorithms related to clustering, a concise survey of existing (clustering) algorithms as well as providing a comparison, both from a theoretical and an empirical perspective. From a theoretical perspective, we developed a categorizing framework based on the main properties pointed out in previous studies. Empirically, we conducted extensive experiments where we compared the most representative algorithm from each of the categories using a large number of real (big) data sets. The effectiveness of the candidate clustering algorithms is measured through a number of internal and external validity metrics, stability, runtime, and scalability tests. In addition, we highlighted the set of clustering algorithms that are the best performing for big data. 2013 IEEE.
    DOI/handle
    http://dx.doi.org/10.1109/TETC.2014.2330519
    http://hdl.handle.net/10576/41731
    Collections
    • Computer Science & Engineering [‎2485‎ items ]

    entitlement


    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department

    Contact Us
    Contact Us | QU

     

     

    Home

    Submit your QU affiliated work

    Browse

    All of Digital Hub
      Communities & Collections Publication Date Author Title Subject Type Language Publisher
    This Collection
      Publication Date Author Title Subject Type Language Publisher

    My Account

    Login

    Statistics

    View Usage Statistics

    About QSpace

    Vision & Mission

    Help

    Item Submission Publisher policies

    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department

    Contact Us
    Contact Us | QU

     

     

    Video