Pill or bill? Influence of monetary incentives on the perceived riskiness and the ethical approval of clinical trials

Janine Hoffart; Benjamin Scheibehenne

In clinical trials, incentivizing human research subjects with large amounts of money is often considered unethical, as it may coerce people to participate. This argument implies that people perceive rewards (i.e., incentives) independently of risks (i.e., probability of side effects) or that they assume that larger rewards are associated with lower risks. However, past research on risk perception indicates that people associate higher rewards with higher risks. To test whether people treat incentives in clinical trials as a proxy for risk, we conducted an online experiment (N = 483) in which people estimated the riskiness of hypothetical clinical trials. We manipulated the monetary incentives that participants of the clinical trials were offered. The results show that people expect more side effects if the monetary incentives for participation are higher. Results further show that the majority of participants were more likely to ethically approve a trial if it offered a high monetary incentive. In contrast to existing ethical guidelines, these results suggest that paying large rewards may be less problematic because people implicitly associate them with higher risk and because they trade off risks and financial benefits.

1 Introduction

In medical research, it is an essential ethical requirement for clinical trials to inform participants about the risks and benefits of participation (Hansson, 2006). Typically, the benefits include monetary incentives that reimburse research participants for their time, effort, and other possible discomforts. Risks include possible side effects and their respective likelihoods. How such compensation should be structured is an important question in medical research (Grady, 2005; Wertheimer, 2010).

A simple expected utility model of the decision to participate in a clinical trial predicts that people weigh the risks and benefits against each other and participate if the benefits outweigh the risks. From this perspective, monetary incentives should increase with increasing riskiness of trials to compensate for the risk. However, the Council for International Organizations of Medical Sciences (CIOMS, 2016) argues that large incentives serve as undue inducements because they entrap people into participating in studies against their better judgment. Following this line of argument, some ethicists and researchers worry that large incentives coerce people to participate in clinical trials (e.g., Macklin, 1981; McNeill, 1997). In line with this, the CIOMS guidelines state that “the level of compensation should not be related to the level of risk that participants agree to undertake” (CIOMS, 2016; pp. 53–54). The underlying rationale is to prevent a poorer person from accepting a clinical trial by virtue of its compensation irrespective of the risk. While this perspective is adopted by many institutional review boards (IRBs), there is an ongoing debate in bioethics about whether monetary incentives can be coercive and whether financial payments should be regarded as a benefit that can be traded off against the risk of participation (Baron, 2006; Largent & Lynch, 2017; Largent, Grady, Miller & Wertheimer, 2013; Wertheimer, 2010).

Websites that advertise participation in clinical trials often emphasize financial benefits (Wertheimer, 2010) and research on risk perception shows that people often expect a correlation between risks and benefits (Slovic, 1987). These insights suggest that the CIOMS guidelines, which prohibit a link between risk and compensation, presumably conflict with participants’ intuitions and may even deprive them of a relevant cue for risk assessment. In particular, a first stream of research on risk perception suggests that people often expect a negative correlation between risks and benefits. For instance, participants in an experiment judged safer activities and technologies to be more beneficial than riskier activities and technologies (Alhakami & Slovic, 1994). Likewise, probabilities of attractive outcomes, such as hitting a large lottery jackpot are often overestimated (Irwin, 1953). In the ethical context at hand, the assumption that larger benefits come with lower risks than smaller benefits suggests that high monetary incentives may lower participants’ risk-estimates of clinical trials. This rationale would be in line with the CIOMS reasoning as high incentives would elicit an unwarranted sense of safety.

On the other hand, Edwards (1962) conjectured that “our world is so constructed that the more desirable objects are harder to get” (p. 49). More recently, Pleskac and Hertwig (2014) provided empirical evidence for this claim and found that in many environments higher payoffs occur with lower probabilities than smaller payoffs. Further, when probability information is missing, people infer reward probabilities from reward magnitudes and expect larger rewards to be less likely than smaller rewards (Hoffart, Rieskamp & Dutilh, 2019; Pleskac & Hertwig, 2014). This risk-reward heuristic dovetails with the idea that people believe in fair bets and believe that expected values are similar across comparable situations (Osherson, 1995). Applied to clinical trials, the risk-reward heuristic predicts that people expect larger risks when incentives are higher. Stated differently, low financial incentives may also lead participants to infer low risks. In line with this, Cryder, London, Volpp and Loewenstein (2009) reported that participants in a behavioral experiment perceived research trials as riskier when incentives were larger.

Another study on clinical trials, however, did not find a relationship between subjective risk judgments and incentives for participation (Bentley & Thacker, 2004). One possible explanation for these diverging findings lies in the way that risk was measured in the two studies: Cryder, London, Volpp and Loewenstein (2009) asked participants “How risky do you believe this study would be for participants?” without further specifying what risk relates to. Bentley and Thacker (2004), on the other hand, asked participants to evaluate both the likelihood and the severity of negative events with five distinct items, which they pooled in their analysis. These findings suggest that in the context at hand, it is important to differentiate between the severity of an adverse outcome and the probability of its occurrence.

In summary, the “risk-reward” hypothesis predicts that higher monetary compensation in clinical trials will increase the expected probability of side effects while the “desirability” hypothesis predicts a decrease in the expected probability as incentives increase. Finally, the null hypothesis predicts that people perceive risks and rewards as independent.

2 Method

To empirically test the three hypotheses described above and to better understand to what extent higher payments are judged as coercive, we conducted an online experiment on Amazon Mechanical Turk. The study design and statistical analysis were preregistered on the Open Science Framework (https://osf.io/b4wtr). Participants in our experiment read a hypothetical advertisement for a clinical trial that aimed to test a new vaccine against Ebola for women, adapted from an experiment by Ambuehl, Niederle and Roth (2015). We manipulated between subjects whether hypothetical participants in the vaccine trial would be reimbursed with $500 or $10,000.¹ After participants had read the text and passed an attention check, we asked them to estimate, first, how many out of 1,000 women participating in the clinical trial would suffer from mild side effects and, second, how many would suffer from very severe side effects. We asked separately about mild and severe side effects because a Trial X may be judged as riskier than a Trial Y for either of two reasons: a) more side effects occur overall in Trial X; or b) the absolute number of side effects is similar across trials, but relatively more severe (compared to mild) side effects occur in Trial X. The side effect judgments served as our estimate of perceived riskiness.

In addition, participants stated how ethical they perceived different compensation schemes to be (i.e., no money, $500, and $10,000) on a scale from 1 (completely unethical) to 7 (completely ethical) and whether they themselves would participate in such a trial for a) a reimbursement of $500 and b) a reimbursement of $10,000 (possible responses: yes, no, don’t know). We asked these questions because we planned to analyze whether people who believe it is more ethical to reimburse participants of medical trials with little money (referred to as “ethicists” by Ambuehl et al., 2015) expect a greater increase in side effects when incentives are larger than do people who believe that it is more ethical to reimburse participants of medical trials with much money (referred to as “economists” by Ambuehl et al., 2015). Further, we assessed whether participants would approve the trial if they were part of an ethical committee on a scale from 1 (definitively reject) to 7 (definitively approve). We asked this question to explore which factors (i.e., incentive and/or personal expectations about the likelihood of side effects) influence approval decisions.

2.1 Participants

In total, we collected data from 483 participants (223 women, 258 men, and 3 did not respond, collected in two batches). The sample size was preregistered and determined by the budget we had available for the study. Participants were remunerated with $1 for their participation. As preregistered, we controlled for possible outliers in people’s risk estimates, which were measured in an open answer format, by excluding the 20% most extreme data points (i.e., the lowest 10% and the highest 10% of data points) within each experimental condition. On average, participants in the final sample (N = 371, 169 women, 201 men, and 1 did not respond) were 36 years old (SD = 10.23, range = 18–72).
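
The trimming rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code (their analyses were run in R): the function names, the example data, and the choice to round the number of dropped points per tail down are all hypothetical, since the text does not say how rounding or ties were handled.

```python
def trim_extremes(values, proportion=0.10):
    """Return the values with the lowest and highest `proportion` removed.

    Assumption: the number of points dropped per tail is rounded down.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # points dropped in each tail
    return ordered[k:len(ordered) - k]


def trim_by_condition(estimates_by_condition, proportion=0.10):
    """Apply the trimming separately within each experimental condition."""
    return {
        condition: trim_extremes(estimates, proportion)
        for condition, estimates in estimates_by_condition.items()
    }


# Hypothetical estimates (number of women out of 1,000 expected to suffer side effects)
example = {
    "low_incentive": [0, 5, 10, 20, 50, 100, 150, 200, 400, 1000],
    "high_incentive": [0, 10, 30, 60, 100, 200, 250, 300, 600, 1000],
}
trimmed = trim_by_condition(example)
```

With ten estimates per condition, one value is dropped from each tail, which removes the 0 and 1000 responses in this made-up example.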

2.2 Statistical analyses

To analyze whether incentives influence side effect expectations, we estimated a negative binomial regression model with the estimated number of side effects as dependent variable and incentive condition as predictor. To analyze approval decisions, we estimated ordinal logistic regressions with incentive condition and side effect expectations (the estimated number of side effects) as predictors. Statistical analyses were conducted in R. We based our inferences about whether a hypothesis was supported by the data on Bayes factors (calculated from the models’ BICs; Kass & Raftery, 1995) that compare a regression model including a predictor of interest with the simpler nested model without this predictor.
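
For readers unfamiliar with the BIC-based approach, the following Python sketch shows the standard approximation BF ≈ exp((BIC_null − BIC_full) / 2), which follows from the relation discussed by Kass and Raftery (1995). It is a minimal illustration, not the authors' R code; the function and argument names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def bayes_factor_from_bic(bic_null, bic_full):
    """Approximate Bayes factor in favor of the full model over the null model.

    Values above 1 favor the model that includes the predictor of interest;
    equal BICs give a Bayes factor of exactly 1.
    """
    return math.exp((bic_null - bic_full) / 2.0)
```

Under this approximation, a BIC difference of 2 ln(65) ≈ 8.35 in favor of the fuller model corresponds to a Bayes factor of about 65.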

3 Results

In summary, the results show that participants’ side effect judgments confirmed the prediction of the risk-reward heuristic. In addition, for most participants, higher incentives increased approval ratings. In the following, we report both findings in more detail.

3.1 Side effect estimates

Side effect expectations for all side effects (the sum of mild and very severe) differed between conditions as predicted by the risk-reward hypothesis: In the high incentive condition, participants expected that more women would suffer from side effects (M = 237.26, SD = 200.63) than in the low incentive condition (M = 170.15, SD = 142.91). To statistically test whether incentives influence side effect expectations, we estimated a negative binomial regression model with the estimated number of side effects as dependent variable and incentive condition as predictor. Comparing this regression with a nested “intercept only” null model that assumes no difference between the conditions yields a Bayes factor (BF) of 65, indicating strong evidence for the risk-reward hypothesis (Jeffreys, 1961). These results replicate previous findings by Cryder, London, Volpp and Loewenstein (2009). When the analysis is run on the total (i.e., non-trimmed) data across all 483 participants, the model comparison yields a Bayes factor that is close to one, suggesting no effect of incentives. A closer look at the distribution of the side effect estimates in the non-trimmed sample reveals that many participants submitted estimates of either 0 or 1000 in the free answer format that we used. We suspect that these participants did not take the task seriously and therefore introduced noise into the data. In the case at hand, the trimmed data therefore presumably yield more robust and reliable estimates.

Figure 1 illustrates that, for both mild and severe side effects, participants in the high incentive condition expected more women to suffer from side effects than participants in the low incentive condition. The previously described regression model, estimated separately for mild and severe side effect judgments, confirmed these results (BF_mild = 31; BF_severe = 11), providing strong evidence for the risk-reward hypothesis for both dependent variables.

We also preregistered a separate comparison of judgments for participants who believe that larger incentives are more ethical than lower incentives (i.e., “economists”) and for participants who believe that lower incentives are more ethical than larger incentives (i.e., “ethicists”). Only 11 participants (3% of the sample) showed a (weakly) monotonic decrease in their approval ratings as monetary incentives increased from $0 to $500 to $10,000. The vast majority (n = 294, 79%) showed a monotonic increase, suggesting that larger incentives boost approval ratings. The remaining 18% of participants could not be classified into either of the two categories because they were either indifferent (6%) or gave inconsistent answers (12%).² Exploratory analyses indicate that participants categorized as “ethicists” were slightly older than the rest of the sample (45 years vs. 36 years). However, given the low number of “ethicists” in our sample, we refrained from further group comparisons.
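
One way to make this classification concrete is sketched below in Python. The exact rule used in the original analysis is not spelled out beyond “(weakly) monotonic”, so the operationalization here is an assumption: ratings that never decrease (and rise at least once) across $0, $500, and $10,000 count as “economist”, ratings that never increase (and fall at least once) count as “ethicist”, identical ratings count as indifferent, and everything else counts as inconsistent. The function name and labels are hypothetical.

```python
def classify_rater(rating_0, rating_500, rating_10000):
    """Classify a participant from three ratings on the 1-7 scale.

    Assumed rule (not taken verbatim from the paper):
      - all ratings equal            -> "indifferent"
      - weakly increasing, not flat  -> "economist"
      - weakly decreasing, not flat  -> "ethicist"
      - anything else                -> "inconsistent"
    """
    ratings = (rating_0, rating_500, rating_10000)
    if rating_0 == rating_500 == rating_10000:
        return "indifferent"
    pairs = list(zip(ratings, ratings[1:]))
    if all(a <= b for a, b in pairs):
        return "economist"
    if all(a >= b for a, b in pairs):
        return "ethicist"
    return "inconsistent"
```

For example, ratings of (2, 5, 7) would be classified as “economist” and ratings of (6, 6, 3) as “ethicist” under this assumed rule.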

3.2 Do side effect judgments influence ethical approval judgments?

In the high incentive condition, participants were more likely (Mdn = 6, on a scale from 1 to 7) to approve the clinical trial than in the low incentive condition (Mdn = 5). To statistically test this difference, we estimated an ordinal logistic regression using the polr() function in R with incentive condition as predictor and approval ratings as dependent variable. Next, we compared this regression to a reduced “null” model that did not include the predictor but was otherwise identical. The model that included the condition as a predictor predicted the data better than the null model (BF = 6). In addition, higher approval ratings were associated with lower estimates for the number of side effects (Spearman’s rank correlation = −0.13, p = 0.013). Adding participants’ side effect estimates as a second predictor to the ordinal logistic regression further improved model fit over the model with just one predictor (BF = 7), indicating that approval ratings depend on both monetary incentives and individual expectations of side effects.
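
As a reminder of what the rank correlation reported above measures, the following Python sketch computes Spearman's coefficient as the Pearson correlation of average ranks (ties receive the mean of the ranks they span). It uses only the standard library and is an illustration rather than the authors' R code; the function names are hypothetical, and it does not compute the p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation between two equally long sequences."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because approval ratings on a 7-point scale contain many ties, the average-rank treatment matters for data of this kind.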

4 Discussion

Here, we experimentally tested whether people expect the magnitude of monetary rewards for participating in clinical research trials to foreshadow the riskiness of the trial. We contrasted three hypotheses: first, the “desirability” hypothesis, which predicts that monetary incentives decrease side effect estimates; second, the risk-reward hypothesis, which predicts a positive correlation between incentives and side effect estimates; and third, the null hypothesis, which predicts no link between incentives and side effect estimates. Consistent with the risk-reward hypothesis, participants in our study expected more mild and more very severe side effects when participation in the trial was incentivized with $10,000 instead of $500.

In our sample, a minority of 3% of participants were less likely to approve a trial when the monetary incentives were high. The behavior of this minority dovetails with current ethical guidelines stating that monetary incentives must not be used to compensate for risks (CIOMS, 2016). In contrast, for the vast majority of participants in our sample, higher monetary incentives led to increased ethical approval ratings despite the fact that they associated higher payments with higher risks. The data further show that incentives and personal risk expectations contribute individually to the approval decision. The results support the notion that research participants implicitly trade off financial benefits and the risk of participation (Wertheimer, 2010).

In a US survey of 1,380 individuals who are part of IRBs or who self-identified as interested in IRBs, the majority of respondents disagreed with the statement that “Researchers should be permitted to consider the offer of money as compensation for risk or as a benefit in risk-benefit assessment” (Largent et al., 2013). This view contrasts with the behavior of the vast majority of participants in our sample. This contrast may be due to qualitative differences in the ethical reasoning and intuition of IRB members compared to our sample of MTurk participants (see also Baron, 2006, for a critique of common IRB procedures).

It should be noted, though, that our results are based on hypothetical choices. Future research should confirm these results in a context where decisions have real consequences. Also, the high incentive of $10,000 was considerably higher than what is commonly paid in clinical trials. While the amounts we chose allowed for a stronger experimental manipulation, a design with only two incentive levels would miss a possible (albeit unlikely) U-shaped or inverted U-shaped relationship.

The discussion about how participants in potentially harmful trials should be compensated has mostly centered on the argument that large incentives may be coercive. However, our results show that people also understand monetary incentives as signals of potential harm, which in turn should mitigate the allure of the money. Practitioners and researchers should consider this intuition when explaining potential consequences of participation in clinical trials and when determining payment schemes for patients.

To conclude, common sayings suggest that “you get what you pay for” and that “there is no such thing as a free lunch”. In line with these sayings, it seems that participants in our experiment intuitively expected a catch when offered high incentives. This is consistent with previous results in the domain of monetary gambles in natural environments (Pleskac & Hertwig, 2014; Hoffart, Rieskamp & Dutilh, 2019). On a broader level, these results suggest that research on judgment and decision making in an economic context should take people’s prior beliefs and expectations about environmental regularities into account. When ignoring such expectations, researchers may miss important behavioral patterns that stem from subjective beliefs about environmental regularities.

5 References

Alhakami, A. S., & Slovic, P. (1994). A psychological study of the inverse relationship between perceived risk and perceived benefit. Risk Analysis, 14, 1085–1096.

Ambuehl, S., Niederle, M., & Roth, A. E. (2015). More money, more problems? Can high pay be coercive and repugnant? American Economic Review: Papers & Proceedings, 105, 357–360.

Baron, J. (2006). Against Bioethics. Cambridge, MA: MIT Press.

Bentley, J. P., & Thacker, P. G. (2004). The influence of risk and monetary payment on the research participation decision making process. Journal of Medical Ethics, 30, 293–298.

Christopoulos, G. I., Liu, X.-X., & Hong, Y.-y. (2017). Toward an Understanding of Dynamic Moral Decision Making: Model-Free and Model-Based Learning. Journal of Business Ethics, 144, 699–715.

Council for International Organizations of Medical Sciences (CIOMS). (2016). International ethical guidelines for health-related research involving humans. Geneva, Switzerland: CIOMS.

Cryder, C. E., London, A. J., Volpp, K. G., & Loewenstein, G. (2009). Informative inducement: Study payment as a signal of risk. Social Science & Medicine, 70, 455–464.

Edwards, W. (1962). Utility, subjective probability, their interaction, and variance preferences. Journal of Conflict Resolution, 6, 42–51.

Grady, C. (2005). Payment of clinical research subjects. Journal of Clinical Investigation, 115, 1681–1687.

Hansson, S. O. (2006). Informed Consent Out of Context. Journal of Business Ethics, 63, 149–154.

Hoffart, J. C., Rieskamp, J., & Dutilh, G. (2019). How environmental regularities affect people’s information search in probability judgments from experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45, 219–231.

Irwin, F. W. (1953). Stated expectations as functions of probability and desirability of outcomes. Journal of Personality, 21, 329–335.

Jeffreys, H. (1961). Theory of probability (3rd ed.). Oxford: Oxford University Press.

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.

Largent, E., Grady, C., Miller, F. G., & Wertheimer, A. (2013). Misconceptions about coercion and undue influence: reflections on the views of IRB members. Bioethics, 27, 500–507.

Largent, E. A., & Lynch, H. F. (2017). Paying research participants: the outsized influence of “undue influence”. IRB, 39, 1.

Macklin, R. (1981). ‘Due’ and ‘undue’ inducements: On paying money to research subjects. IRB: Ethics & Human Research, 3, 1–6.

McNeill, P. (1997). Paying people to participate in research: Why not? Bioethics, 11, 390–396.

Osherson, D. N. (1995). Probability judgment. In E. E. Smith, & D. N. Osherson (Eds.), Thinking (pp. 35–75). Cambridge, MA: MIT Press.

Pleskac, T. J., & Hertwig, R. (2014). Ecologically rational choice and the structure of the environment. Journal of Experimental Psychology: General, 143, 2000–2019.

Slovic, P. (1987). Perception of risk. Science, 236, 280–285.

Wertheimer, A. (2010). Rethinking the ethics of clinical research: widening the lens. Oxford University Press.

Appendix: Written instructions in our online experiment

Suppose that you are a member of an ethics committee, and you will have to decide whether or not to approve the following clinical trial. Pay close attention. The following question will be based on this text.

The E.M.C.A. Medical Research Institute has developed a new vaccine to prevent infection with the Ebola Virus. In rats and chimps the vaccine successfully prevents infection with the virus and causes no measurable side effects.

The institute now seeks to enlist 1000 female participants to investigate whether the vaccine causes side effects in women. This is important to know, as it will determine whether the vaccine can be given to female healthcare workers in regions affected by the disease.

Each of the 1000 participants will be injected with the vaccine and then monitored in weekly intervals for two months. The total time required to participate if no side effects occur is about 40 hours. Participants will not be exposed to the virus; the study only tests for side effects of the vaccine. Since no side effects occurred in the animal studies, the Institute’s experts consider it unlikely that they will occur in humans.

However, nobody knows for sure. This is why the experiment needs to be run. In case unexpected side effects occur, they might range from very mild, such as a day of nausea, to very severe, such as persistent migraines. Side effects will be treated free of charge, if treating them is medically possible. An affected woman will not, however, receive treatment for any unrelated medical problems, and she will not receive any other compensation for suffering these side effects. The only compensation to any participant is the money paid to her when she agrees to take part in the study, before she is injected with the vaccine. Study participation invitations will be put up in both rich and poor neighborhoods.

The Institute will compensate each woman who participates with $500 (five hundred US Dollar) [$10,000 (ten thousand US Dollar)] for the risk the participants take, and the total of 40 hours required to participate in the study.

University of Geneva, Faculty of Economics and Management (GSEM), 40 bd du Pont D’Arve, CH-1211 Geneva, Switzerland. E-mail: benjamin.scheibehenne@unige.ch.

¹ See the Appendix for the instruction text that participants saw on the computer screen.

² In our original preregistration, we proposed to define these groups by comparing approval ratings only for $500 and $10,000. As this approach does not take the $0 condition into account, we decided to deviate from our original plan. Under the preregistered rule, 40 participants were classified as “ethicists”.