This article discusses how digital experiments in the social sciences can be used to evaluate accuracy and bias in medical decision-making and human-computer interaction. The authors recruited a large number of physicians for an experiment that measured diagnostic accuracy with and without AI assistance. The experiment focused on skin diseases and followed methods from algorithmic auditing, which are used to identify errors and bias in machine learning algorithms. Its results offer insight into the performance of physicians working alone and of physician-machine partnerships.
