Rater Equivalence: Evaluating Classifiers in Human Judgment Settings
arXiv:2106.01254v2 Announce Type: replace
Abstract: In many decision settings, the definitive ground truth is either non-existent or inaccessible. We introduce a framework for evaluating classifiers based solely on human judgments. In such cases, it is helpful to compare automated classifiers…
