• Annotator rationales for labeling tasks in crowdsourcing 

      Kutlu, Mucahid; McDonnell, Tyler; Elsayed, Tamer; Lease, Matthew (Elsevier, 2020, Article)
      When collecting item ratings from human judges, it can be difficult to measure and enforce data quality due to task subjectivity and lack of transparency into how judges make each rating decision. To address this, we ...
    • ArabicWeb16: A new crawl for today's Arabic Web 

      Suwaileh, Reem; Kutlu, Mucahid; Fathima, Nihal; Elsayed, Tamer; Lease, Matthew (Association for Computing Machinery, Inc., 2016, Conference Paper)
      Web crawls provide valuable snapshots of the Web which enable a wide variety of research, be it distributional analysis to characterize Web properties or use of language, content analysis in social science, or Information ...
    • Crowd vs. Expert: What can relevance judgment rationales teach us about assessor disagreement? 

      Kutlu, Mucahid; McDonnell, Tyler; Barkallah, Yassmine; Elsayed, Tamer; ... more authors (ACM, 2018, Conference Paper)
      While crowdsourcing offers a low-cost, scalable way to collect relevance judgments, lack of transparency with remote crowd work has limited understanding about the quality of collected judgments. In prior work, ...
    • Efficient Test Collection Construction via Active Learning 

      Rahman, Md Mustafizur; Kutlu, Mucahid; Elsayed, Tamer; Lease, Matthew (Association for Computing Machinery, 2020, Conference Paper)
      To create a new IR test collection at low cost, it is valuable to carefully select which documents merit human relevance judgments. Shared task campaigns such as NIST TREC pool document rankings from many participating ...
    • The many benefits of annotator rationales for relevance judgments 

      McDonnell, Tyler; Kutlu, Mucahid; Elsayed, Tamer; Lease, Matthew (International Joint Conferences on Artificial Intelligence, 2017, Conference Paper)
      When collecting subjective human ratings of items, it can be difficult to measure and enforce data quality due to task subjectivity and lack of insight into how judges arrive at each rating decision. To address this, we ...
    • Mix and match: Collaborative expert-crowd judging for building test collections accurately and affordably 

      Kutlu, Mucahid; McDonnell, Tyler; Sheshadri, Aashish; Elsayed, Tamer; Lease, Matthew (CEUR-WS, 2018, Conference Paper)
      Crowdsourcing offers an affordable and scalable means to collect relevance judgments for information retrieval test collections. However, crowd assessors may show higher variance in judgment quality than trusted assessors. ...
    • Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to Ensure Quality Relevance Annotations 

      Goyal, Tanya; McDonnell, Tyler; Kutlu, Mucahid; Elsayed, Tamer; Lease, Matthew (AAAI Press, 2018, Conference Paper)
      While peer-agreement and gold checks are well-established methods for ensuring quality in crowdsourced data collection, we explore a relatively new direction for quality control: estimating work quality directly from ...