Mix and match: Collaborative expert-crowd judging for building test collections accurately and affordably
Author | Kutlu, Mucahid |
Author | McDonnell, Tyler |
Author | Sheshadri, Aashish |
Author | Elsayed, Tamer |
Author | Lease, Matthew |
Available date | 2024-02-21T08:22:11Z |
Publication Date | 2018-08 |
Publication Name | CEUR Workshop Proceedings |
Citation | Goyal, T., McDonnell, T., Kutlu, M., Elsayed, T., & Lease, M. (2018, June). Your behavior signals your reliability: Modeling crowd behavioral traces to ensure quality relevance annotations. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing (Vol. 6, pp. 41-49). |
ISSN | 1613-0073 |
Abstract | Crowdsourcing offers an affordable and scalable means to collect relevance judgments for information retrieval test collections. However, crowd assessors may show higher variance in judgment quality than trusted assessors. In this paper, we investigate how to effectively utilize both groups of assessors in partnership. We study how agreement in judging is correlated with three factors: relevance category, document rankings, and topical variance. Based on this, we then propose two collaborative judging methods in which some document-topic pairs are assigned to in-house assessors for relevance judging while the rest are assessed by crowd workers. Experiments on two TREC collections show encouraging results when we distribute work intelligently between our two groups of assessors. |
Sponsor | This work was made possible by NPRP grant# NPRP 7-1313-1-245 from the Qatar National Research Fund (a member of Qatar Foundation). |
Language | en |
Publisher | CEUR-WS |
Subject | Crowdsourcing; Evaluation; Information retrieval; Relevance |
Type | Conference Paper |
Pagination | 41-49 |
Volume Number | 2167 |
This item appears in the following Collection(s)
- Computer Science & Engineering [2402 items]