    When rank order isn't enough: New statistical-significance-aware correlation measures

    Date
    2018
    Author
    Kutlu M.
    Elsayed T.
    Hasanain M.
    Lease M.
    Abstract
    Because it is expensive to construct test collections for Cranfield-based evaluation of information retrieval systems, a variety of lower-cost methods have been proposed. The reliability of these methods is often validated by measuring rank correlation (e.g., Kendall's τ) between known system rankings on the full test collection vs. observed system rankings on the lower-cost one. However, existing rank correlation measures do not consider the statistical significance of score differences between systems in the observed rankings. To address this, we propose two statistical-significance-aware rank correlation measures, one of which is a head-weighted version of the other. We first show empirical differences between our proposed measures and existing ones. We then compare the measures while benchmarking four system evaluation methods: pooling, crowdsourcing, evaluation with incomplete judgments, and automatic system ranking. We show that use of our measures can lead to different experimental conclusions regarding reliability of alternative low-cost evaluation methods.
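As a point of reference for the baseline the abstract mentions, the following is a minimal sketch of plain Kendall's τ (the tau-a variant, which counts tied pairs as neither concordant nor discordant) between two system rankings. It is not the significance-aware measures proposed in the paper, which are not specified in this record. The function name `kendall_tau`, the system names, and the scores are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def kendall_tau(scores_full, scores_low_cost):
    """Kendall's tau-a between two {system: score} dicts over the same systems.

    Illustrative baseline only: pairs tied in either dict count as neither
    concordant nor discordant, and no significance testing is applied.
    """
    if set(scores_full) != set(scores_low_cost):
        raise ValueError("both score dicts must cover the same systems")
    systems = sorted(scores_full)
    if len(systems) < 2:
        raise ValueError("need at least two systems to compare rankings")

    concordant = discordant = 0
    for a, b in combinations(systems, 2):
        # The product is positive when both collections order the pair the same way.
        agreement = (scores_full[a] - scores_full[b]) * (
            scores_low_cost[a] - scores_low_cost[b]
        )
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1

    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores: the low-cost collection swaps sysA and sysB.
full = {"sysA": 0.40, "sysB": 0.35, "sysC": 0.30, "sysD": 0.20}
low_cost = {"sysA": 0.38, "sysB": 0.39, "sysC": 0.25, "sysD": 0.22}
tau = kendall_tau(full, low_cost)  # 5 concordant pairs, 1 discordant, 6 pairs total
```

In this toy example the single swapped pair (sysA, sysB) differs by only 0.01 on the low-cost collection, yet it lowers τ as much as a large swap would. That insensitivity to the size of score differences is the kind of limitation the abstract describes.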
    DOI/handle
    http://dx.doi.org/10.1145/3269206.3271751
    http://hdl.handle.net/10576/13016
    Collections
    • Computer Science & Engineering [2485 items]

    Qatar University Digital Hub is a digital collection operated and maintained by the Qatar University Library and supported by the ITS department
