Two Ethical Dilemmas in Conducting Human Subjects Research on Amazon Mechanical Turk (MTurk)

March 22, 2023 Huichuan Xia

Huichuan Xia
Department of Information Management, Peking University

Amazon Mechanical Turk (MTurk) is perhaps the most popular crowd work platform among academic scholars for collecting research data, because it lets them reach a large and diverse population of Internet users (the so-called crowd workers) and obtain data quickly and at relatively low cost. Because crowd workers are human subjects, scholars in the U.S. must get approval from an Institutional Review Board (IRB) before they can collect data from crowd workers on MTurk, and they must abide by the Belmont principles (i.e., respect for persons, beneficence, and justice).

However, conducting academic research on MTurk (more broadly, I call it crowd work-based research) is not without dispute. First, MTurk is designed for businesses and individuals to reduce their costs in outsourcing tasks to experts, not for academic research. So, crowd workers taking tasks on MTurk differ in nature from research volunteers. Second, crowd workers are motivated by money, but the payment standard on MTurk is obscure and commonly very low. So, if scholars pay crowd workers at a higher (or, as many scholars believe, fairer) rate than is common on MTurk, such as a federal or state minimum wage, they may exert undue influence on crowd workers to select well-paid tasks over others. Third, as the flip side of the second dispute, if scholars pay crowd workers too little, crowd workers may feel that scholars are exploiting their labor and time.

Related to the disputes above, I identified two further ethical dilemmas in conducting human subjects research (i.e., research with crowd workers) on MTurk, summarized in Table 1 below.

Table 1. Two dilemmas in crowd work-based research

Dilemmas	D1: Dehumanization vs. Respect for persons	D2: Monetary incentives and reputational risks vs. Beneficence
Descriptions	Crowd work dehumanizes crowd workers as a commodity or data source, but scholars and IRBs expect crowd workers to be autonomous human beings	Monetary incentives and reputational risks are concrete rewards and harms for crowd workers, yet they are not counted as research benefits or risks under the Belmont principles
Consequences	Crowd workers cannot readily switch between voluntary participation in academic tasks and money-driven participation in non-academic tasks	Scholars and IRBs may diverge in how to frame research benefits and demarcate research risks in crowd work-based research
Comparison with traditional surveys	Internet panels or sociological surveys have less undue influence from payment than crowd work-based research	—
Comparison with biomedical research	—	Biomedical research has more concrete research benefits and less reputational harm on subjects than crowd work-based research on crowd workers

The first dilemma is dehumanization vs. respect for persons. Dehumanization has been an intrinsic problem on MTurk since its birth because the platform treats crowd workers more often as “mechanical” data providers than as “individual” human beings. For example, MTurk provides no contract, legal protection, or minimum wage to crowd workers. On the other side, respect for persons is an essential ethical principle for scholars to obey; it means treating research participants as autonomous humans with free will and independent minds. So, when conducting research on MTurk, scholars should ensure that crowd workers choose freely and voluntarily to participate in their study. However, as noted above, this can become a dilemma. If scholars choose to pay crowd workers at a high or fair standard (e.g., a minimum wage that is higher than most task payments on MTurk), the relatively high payment rate could compromise crowd workers’ “voluntary” participation: they may take a research task not because of its meaningfulness or contribution to science but because of its payment rate. If scholars instead pay less to avoid influencing voluntary participation, crowd workers may regard them as exploitative. This dilemma is less pronounced in traditional sociological surveys, because traditional survey takers are more often authentic “volunteers” who are not motivated by money and do not need to switch between academic and non-academic tasks.

The second dilemma is monetary incentives and reputational risks vs. beneficence. Besides payment, MTurk has a reputation system meant to deter crowd workers from providing poor-quality data. This system tracks two things: (1) how many tasks a crowd worker has finished, and (2) how many of those finished tasks have been approved by task requesters. To ensure data quality, scholars often prefer to recruit crowd workers who have finished more than 1,000 tasks and have a 95% approval rate. However, once a crowd worker’s submission has been rejected, the rejection is permanently recorded in their reputation on MTurk and can significantly lower their approval rate, which affects whether scholars will recruit them in the future. Since most crowd workers are motivated by money, and academic research tasks often pay much better than non-academic tasks, many crowd workers are very concerned about being rejected. At the same time, the overall quality of the data that crowd workers submit on MTurk has been declining. On the other side, the ethical principle of beneficence has traditionally required scholars not to treat compensation as a benefit to research participants (for the same reason that they should avoid any “undue influence” on voluntary participation), and it seldom regards rejecting a participant’s submission (often due to poor data quality) as a research risk to that participant. So, here the second dilemma arises: should payment be taken as a research benefit, and rejection as a research risk, to crowd workers on MTurk? By comparison, traditional biomedical research offers more concrete benefits (e.g., a research subject’s health improvement after participating in a drug trial) and poses less reputational harm to research participants.
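To make the stakes of a single rejection concrete, here is a minimal sketch of the approval-rate arithmetic described above. It is an illustration under stated assumptions, not MTurk's actual API or its exact formula: the `WorkerRecord` class and both helper functions are hypothetical names, the approval rate is assumed to be approved tasks divided by finished tasks, and the thresholds (more than 1,000 finished tasks, a rate of at least 95%) are one reading of the common recruiting practice mentioned above.

```python
from dataclasses import dataclass


@dataclass
class WorkerRecord:
    """Hypothetical summary of a crowd worker's history (not an MTurk API object)."""
    completed: int  # tasks the worker has finished
    approved: int   # finished tasks that requesters approved

    @property
    def approval_rate(self) -> float:
        """Assumed formula: approved tasks as a percentage of finished tasks."""
        if self.completed == 0:
            return 0.0
        return 100.0 * self.approved / self.completed


def is_eligible(worker: WorkerRecord,
                min_completed: int = 1000,
                min_rate: float = 95.0) -> bool:
    """Assumed recruiting filter: more than 1,000 tasks and at least a 95% rate."""
    return worker.completed > min_completed and worker.approval_rate >= min_rate


def after_rejection(worker: WorkerRecord) -> WorkerRecord:
    """Record one more finished task that the requester rejected."""
    return WorkerRecord(completed=worker.completed + 1, approved=worker.approved)


# A worker sitting just above the threshold: 951 approvals out of 1,001 tasks.
worker = WorkerRecord(completed=1001, approved=951)
print(is_eligible(worker))                   # passes the filter
print(is_eligible(after_rejection(worker)))  # one rejection pushes the rate below 95%
```

Under these assumptions, a worker near the cutoff loses access to tasks that use this filter after a single rejection, which is one way to see why rejection can feel like a concrete harm to crowd workers even though it is seldom framed as a research risk.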

I propose that these two ethical dilemmas in conducting human subjects research on MTurk deserve more investigation and discussion in the future.

Reference

Xia, H. (2023). What scholars and IRBs talk when they talk about the Belmont principles in crowd work-based research. Journal of the Association for Information Science and Technology (JASIST), 74(1), 67–80. https://doi.org/10.1002/asi.24724

Cite this article in APA as: Xia, H. (2023, March 29). Two ethical dilemmas in conducting human subjects research on Amazon Mechanical Turk (MTurk). Information Matters, Vol. 3, Issue 3. https://informationmatters.org/2023/03/two-ethical-dilemmas-in-conducting-human-subjects-research-on-amazon-mechanical-turk-mturk/
