The Lie Deflator – The effect of polygraph test feedback on subsequent (dis)honesty

Despite its controversial status, the lie detection test is still a popular organizational instrument for credibility assessment. Due to its popularity, we examined the effect of the lie-detection test feedback on subsequent moral behavior. In three studies, participants could cheat to increase their monetary payoff in two consecutive phases. Between these two phases the participants underwent a mock polygraph test and were randomly assigned Deception Indicated (DI) or No Deception Indicated (NDI) feedback. Then, participants engaged in the second phase of the task and their level of dishonesty was measured. Study 1 showed that both NDI and DI feedback (but not the control condition) reduced cheating behavior on the subsequent task. However, Study 2 showed that the mere presence of the lie-detection test (without feedback) did not produce the same effect. When the role of the lie detector as a moral reminder was cancelled out in Study 3, feedback had no effect on the magnitude of cheating behavior. However, cheaters who were given NDI feedback exhibited a lower level of physiological arousal than cheaters who were given DI feedback. These results suggest that lie detection tests can be used to promote honesty in the field, and that, while feedback type does not affect the magnitude of cheating, NDI may allow people to feel better about cheating.

1 Introduction

The controversy as to the validity of lie detector tests is very much alive among practitioners and researchers (e.g., Palmatier & Rovner, 2015). Most of the criticism has been directed against the Control Question Test (CQT) (e.g., Fiedler, Schmid & Stahl, 2002; Iacono & Ben-Shakhar, 2019; Verschuere et al., 2011; but see Ginton, 2017 for a favorable evaluation). While most consider the test to be scientifically flawed (e.g., Furedy, 1993; Gallai, 1999; Iacono & Lykken, 1997; Meijer & Verschuere, 2010), others consider it to be a useful indicator to detect involvement in a specific incident (Gibson, 2001), and experienced practitioners are indeed capable of accurately identifying cheating (O’Sullivan, Frank, Hurley & Tiwana, 2009). The importance of this debate notwithstanding, the current work was not designed to examine the scientific validity of the polygraph test. Rather, due to its popularity on a worldwide scale (Raskin, Honts & Kircher, 2014), we examined the effects of the outcome of polygraph tests (i.e., the examiner’s feedback) on moral judgments and the behavior of the examinees themselves.

More specifically, the main goal of the current work is to determine how feedback on the results of a polygraph test affects people’s moral behavior on a subsequent task. One possible way in which the polygraph test feedback could have an effect on people’s moral behavior is through moral reminders (Mazar, Amir & Ariely, 2008). Research shows that, although people want to maintain a positive moral self, they also want to benefit from unethical behavior. This conflict leads to a psychological tension dubbed ethical dissonance, which, if strong enough, inhibits dishonest behavior (Barkan, Ayal & Ariely, 2015). The rationale for moral reminders (Ayal, Gino, Barkan & Ariely, 2015; Mazar et al., 2008) is that when reminded of moral standards, people may be less likely to violate ethical rules. Moral reminders work since they increase the ethical dissonance associated with cheating behavior. In addition, according to the Bounded Ethicality approach, since people are constrained by cognitive limitations, even well-intentioned individuals can make questionable ethical choices (Bazerman & Tenbrunsel, 2012). Thus, moral reminders can highlight the moral code and draw people’s attention to their own transgressions by making what is morally wrong and morally right more salient.

In one key demonstration of moral reminders, Mazar et al. (2008) gave participants the opportunity to cheat after recalling either the 10 Commandments or 10 books they had recently read. The results showed that participants presented with a moral reminder (e.g., recalling the 10 Commandments) cheated less (less overstating of the number of math problems solved correctly) relative to the control condition (e.g., recalling 10 books). Mazar et al. suggested that moral reminders increase participants’ awareness of moral standards, which facilitates honest behavior. More recently, moral reminders have been found to decrease dishonesty among students in a Finnish business school (Grym & Lilijander, 2016) and among Amazon Mechanical Turk workers (Hwang, 2015). In addition, it has been argued that even the display of ethical cues in people’s physical surroundings can act as moral reminders and reduce the level of dishonesty (Ayal et al., 2015; Shu, Mazar, Gino, Ariely & Bazerman, 2012). In this sense, polygraph test results can also be viewed as a moral reminder, since this device is designed to objectively identify dishonest individuals. After a polygraph test, people should be more aware of their moral standards, and cheat less regardless of whether they get positive (No Deception Indicated, henceforth NDI) or negative (Deception Indicated, henceforth DI) feedback. This led to the following hypothesis:

Hypothesis 1:

Despite the support for the effectiveness of moral reminders (e.g., Grym & Lilijander, 2016; Mazar et al., 2008; Pruckner & Sausgruber, 2013; Welsh & Ordóñez, 2014), recent studies have failed to replicate the facilitating effect of moral reminders on honest behavior. For example, in a registered replication report describing the aggregated results of 25 replications of Mazar et al.’s (2008) original experiment, there was no reduction in dishonest behavior after participants were asked to recall the 10 Commandments (Verschuere et al., 2018). Similarly, Schild, Heck, Ścigała and Zettler (in press) examined the effectiveness of the REVISE (REminding, VIsibility, and SElf-engagement) framework (Ayal et al., 2015) in reducing dishonest behavior. While the authors found that both visibility and self-engagement reduced dishonesty, no effect was found for ethical priming (as a form of moral reminder).

To account for this lack of observed effects in the replications, Amir, Mazar and Ariely (2018) introduced six factors that might limit the effectiveness of moral reminders. For example, given the growth of research in the field of behavioral ethics, and the increased popularity of books and online material including TED talks on the topic targeted to the general public, direct examinations of moral reminders may be transparent to participants who are aware of the fact that their behavior is being monitored. In line with this suggestion, Ayal et al. (2015) argued that to design effective moral reminders, one must maintain saliency and avoid adaptation; reminders should be changed and re-actualized periodically. Moreover, Amir et al. (2018) called for an inquiry into moral reminders on a more conceptual level rather than in terms of the efficacy of specific manipulations (i.e., recalling the 10 Commandments). Thus, another goal of the current manuscript was to test for moral reminders on a conceptual level with a completely different manipulation.

Specifically, in three studies, we tested the prediction of the moral reminder approach about the effect of lie-detector feedback on subsequent (dis)honest behavior. The first study investigated people’s honesty before and after undergoing a polygraph test and receiving DI or NDI feedback. The second study compared dishonesty before and after a polygraph test without feedback, to explore whether the test itself serves as a moral reminder and affects participants’ dishonest behavior. Finally, the third study incorporated an actual physiological measure to better understand how polygraph feedback affects the galvanic skin response (GSR) associated with dishonest behavior (Hochman, Glöckner, Fiedler & Ayal, 2016; Wang, Spezio & Camerer, 2010). The findings should shed light on the mechanisms underlying moral behavior after polygraph test feedback. Although there are numerous works on polygraph tests, to the best of our knowledge this is the first attempt to systematically examine the effects of polygraph feedback on subsequent dishonest behavior.

2 Study 1

Study 1 examined how DI and NDI feedback on a polygraph test after an initial cheating task affected participants’ moral behavior in a subsequent task administered immediately after receiving feedback. This tested whether polygraph feedback could serve as a moral reminder that curbs dishonesty. Importantly, in this study we used false rather than actual feedback. Participants were randomly allocated to either the DI or the NDI feedback condition regardless of their cheating behavior in the first phase. This deception was crucial to prevent a selection bias in which participants who received DI feedback were those who were mainly dishonest and participants who received NDI feedback were mainly honest. Such a random feedback mechanism enabled us to directly examine the effect of the polygraph feedback without individual differences in moral dispositions which might serve as confounds.

2.1 Method

2.1.1 Participants

Ninety-nine students from an Israeli University (58 males, 41 females) volunteered to participate in the study. We recruited as many participants as possible during the course of a full academic semester. Thus, sample size was not determined a priori. The mean age was 24.81 years (SD=3.109). Participants received up to 40 ILS (approx. $11.00) for their participation, but actual payment was contingent upon their selections during the task.

2.1.2 Design and procedure

Participants engaged in the flexible dot task (Hochman et al., 2016) in two stages, in a between-subjects design with three polygraph feedback conditions. The flexible dot task is a computerized perceptual task in which participants are presented with a square divided down the middle by a vertical line into two parts. On each trial, 25 non-overlapping red dots appear in different arrangements within the square. Some dots appear on the right-hand side of the square and some on the left side. The position of the dots on each side is mirrored across the middle line, with the exception that there are more dots (1, 3, or 5) on one side. The dots appear for only 2 seconds and then disappear. Once they have disappeared, the participants are required to indicate which side of the square (right or left) contained more dots. Participants are fully informed about the procedure. Although participants are clearly instructed to indicate which side has more dots, their payment depends on their selections. Specifically, participants were paid 5 cents for indicating there were more dots on the right side of the square, and 0.5 cents for indicating there were more dots on the left. Thus, on certain trials the correct response and the incentivized response were in conflict. This payment scheme is similar to previous uses of the dots task to measure levels of dishonesty (e.g., Ayal & Gino, 2011; Gino, Norton, & Ariely, 2010; Mazar & Zhong, 2009), as it allows participants to maximize their earnings by violating the clear instructions to select the side that has more dots (for a detailed review of this task and its analysis see Hochman et al., 2016).
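
To illustrate the incentive structure just described, the following minimal sketch computes the payment for a single response and flags trials on which the correct and the better-paid responses conflict. The payment values come from the text; the function names and the example dot counts are hypothetical and are not taken from the original task software.

```python
# Minimal sketch of the dots-task incentives. Function names are
# hypothetical; only the two payment values are taken from the text.
PAY_RIGHT = 5.0  # cents paid for answering "right"
PAY_LEFT = 0.5   # cents paid for answering "left"


def payoff(choice: str) -> float:
    """Return the payment (in cents) for a single response."""
    return PAY_RIGHT if choice == "right" else PAY_LEFT


def is_conflict_trial(dots_left: int, dots_right: int) -> bool:
    """A trial is a conflict trial when 'left' is the correct answer,
    because the better-paid answer ('right') is then incorrect."""
    return dots_left > dots_right


# Example: 13 dots on the left, 12 on the right (25 in total).
print(is_conflict_trial(13, 12))         # True
print(payoff("right") - payoff("left"))  # 4.5 cents gained by answering "right"
```

On such a conflict trial, answering "right" earns ten times as much as the correct answer, which is what makes a systematic surplus of "right" errors interpretable as cheating.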

The flexible dots task began with 10 practice trials and was then conducted in two phases (each consisting of 100 trials). After completing the first phase, participants were taken to an adjacent room where they were randomly assigned to one of three conditions: NDI Polygraph feedback, DI Polygraph feedback, and Control. In both the NDI and DI Polygraph feedback conditions, participants were asked about their dishonesty during the dots task while we monitored their physiological arousal. This was done in a mock situation simulating a semi-automated version of a classic lie-detection examination, a common instrument used to test people’s integrity in the US, Europe, Israel, and South Africa (Honts & Reavy, 2015; Raskin et al., 2014; see further elaboration in Study 3). All participants were notified prior to the test that if the polygraph determined they had told the truth, they would be paid 5 ILS (approx. $1.50) in addition to the amount they accumulated in the other parts of the study. In fact, out of fairness, at the end of the entire study all the participants were given this extra sum. Feedback (NDI or DI) was provided at random, and was not based on whether the participants actually cheated or not. Participants (in both studies) were debriefed about this deception and its purpose immediately upon completion of the study. Participants were asked to deny cheating on the test and were told that the polygraph could determine whether they had lied or not. As each participant might mistakenly identify more dots on the right-hand side in some of the trials, they were told that we were looking for those who purposely and repeatedly chose to cheat to increase their personal gain.

In the NDI polygraph feedback condition, the examiner informed the participants that they did not cheat on the dots task. In the DI polygraph feedback condition, the examiner informed the participants they had cheated on the dots task. Thus, while the feedback was random, in each feedback condition we had participants whose feedback matched their actual cheating behavior. Finally, in the Control condition, participants completed a mathematical filler task, and received brief feedback on their performance (i.e., accuracy).

After completing the polygraph test/filler task, participants returned to the first room, where they were asked to complete the second phase of the dots task, which was identical to the first. This served to assess how the polygraph feedback affected their cheating behavior immediately after the test.

2.2 Results and discussion

To examine cheating levels, we focused on the errors participants made in the dots task. Specifically, we classified errors into two types: detrimental errors; i.e., trials in which participants chose “left” when there were more dots on the right-hand side of the square, and beneficial errors; i.e., trials in which participants chose “right” when there were more dots on the left side of the square. The cheating level was calculated as the difference between the percentage of beneficial and detrimental errors, which captured the extent to which participants deliberately violated the instructions to increase personal gain (see Hochman et al., 2016). The two percentages used to calculate this difference are available in the data (in Excel format, with graphs) for all three studies.
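
The cheating-level measure can be sketched as follows. This is a minimal illustration under one stated assumption: each error percentage is taken relative to the trials on which that error type is possible (beneficial errors out of trials with more dots on the left, detrimental errors out of trials with more dots on the right). The text does not spell out the denominators, and the function name and example data are hypothetical.

```python
def cheating_level(trials):
    """Difference between the percentage of beneficial and detrimental errors.

    trials: list of (correct_side, chosen_side) pairs, each "left" or "right".
    Assumption: each percentage uses the trials on which that error type
    is possible as its denominator.
    """
    left_correct = [t for t in trials if t[0] == "left"]
    right_correct = [t for t in trials if t[0] == "right"]
    if not left_correct or not right_correct:
        raise ValueError("need trials with each correct side")

    # Beneficial error: chose the better-paid "right" when "left" was correct.
    beneficial = sum(1 for _, chosen in left_correct if chosen == "right")
    # Detrimental error: chose the lower-paid "left" when "right" was correct.
    detrimental = sum(1 for _, chosen in right_correct if chosen == "left")

    pct_beneficial = 100.0 * beneficial / len(left_correct)
    pct_detrimental = 100.0 * detrimental / len(right_correct)
    return pct_beneficial - pct_detrimental


# Hypothetical participant: 100 trials, 50 with each correct side.
example = (
    [("left", "right")] * 20 + [("left", "left")] * 30
    + [("right", "left")] * 5 + [("right", "right")] * 45
)
print(cheating_level(example))  # 30.0 (40% beneficial minus 10% detrimental)
```

Subtracting detrimental errors removes the share of mistakes that would be expected from honest perceptual error alone, so a positive value reflects a bias toward the better-paid response.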

A 2 X 3 repeated-measure ANOVA using phase (first & second) as a within-subject factor and condition (NDI, DI & Control) as a between-subject factor was conducted to predict cheating level. This analysis revealed a significant main effect for phase (F(1, 96) = 16.916, p < 0.0001, Partial η² = 0.150), but not for condition (F(2, 96) = 0.976, p = 0.381, Partial η² = 0.02). There was also a significant condition by phase interaction (F(2, 96) = 9.478, p < 0.0001, Partial η² = 0.165).
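
As a reading aid, partial η² can be recovered from an F statistic and its degrees of freedom through the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the values reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of phase: F(1, 96) = 16.916
print(round(partial_eta_squared(16.916, 1, 96), 3))  # 0.15 (reported as 0.150)
# Condition by phase interaction: F(2, 96) = 9.478
print(round(partial_eta_squared(9.478, 2, 96), 3))   # 0.165
# Main effect of condition: three conditions, so 2 numerator df, F = 0.976
print(round(partial_eta_squared(0.976, 2, 96), 2))   # 0.02
```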

Planned contrasts were conducted to probe this significant interaction. These analyses revealed that cheating level was virtually the same for all three conditions on the first phase of the dots task (F(2, 96) = 0.229, p = 0.80). Specifically, the mean difference between beneficial and detrimental errors was 24.21 (SD = 22.45) in the Control condition, 22.74 (SD = 18.39) in the DI condition, and 26.25 (SD = 22.51) in the NDI condition. However, a different pattern emerged for the second phase of the dots task. In the Control condition there was a slight increase in cheating level, with a mean of 26.30 (SD = 27.74), which did not reach significance (t(32) = −0.849, p = 0.402). By contrast, participants exhibited a significant decrease in cheating behavior in both the DI (t(33) = 5.154, p < 0.0001) and the NDI conditions (t(31) = 4.719, p < 0.0001) (Mean = 13.35, SD = 15.22 and Mean = 19.63, SD = 25.60 respectively). Although the decrease in the NDI feedback condition was smaller than in the DI feedback condition, the difference between the decrease in the two conditions was not significant (F(1, 64) = 1.414, p = 0.239, Partial η² = 0.022). The results of this analysis are summarized in Figure 1.

In addition, we conducted a two-way ANOVA using cheating in the first phase (as a continuous variable) and condition to assess cheating level in the second phase. This analysis was done to examine if cheating in the first phase (which might represent a tendency to cheat) affected the cheating level in the second phase. The analysis revealed a significant main effect for condition (F(2, 23) = 3.645, p < 0.05) and cheating level in the first phase (F(43,22.2) = 11.020, p < 0.0001). Thus, it seems that the initial cheating level, and not just condition, had an effect on cheating level in the second phase. Importantly, a significant condition X cheating in the first phase interaction was found (F(22,31) = 2.597, p < 0.01), suggesting that the reduction in cheating level after the feedback moderated the effect of the initial cheating level.

Thus, the pattern of results supports Hypothesis 1, in that cheating behavior decreased under both feedback conditions. Presumably, the feedback (whether positive or negative) served as an external cue drawing the participants’ attention to internal honesty and thus decreasing dishonest behavior (e.g., Mazar et al., 2008). However, one interpretation of the results is that the polygraph test itself served as a moral reminder, and not the feedback. To test for this possibility, in Study 2 we used a polygraph test without feedback.

3 Study 2

Study 2 was designed to determine whether the mere presence of a lie detector test could serve as a moral reminder by itself, even without feedback. To do so, we measured cheating behavior before and after a polygraph test without feedback, and compared it to cheating behavior before and after recalling the Ten Commandments. Although recalling the Ten Commandments did not reduce cheating behavior in a recent replication project (Verschuere et al., 2018), we expected this type of moral reminder to be effective among Israeli participants (Amir et al., 2018).

3.1 Method

3.1.1 Participants

One hundred and fifteen students from an Israeli University (54 males, 61 females) volunteered to participate in the study. We recruited as many participants as possible during the course of a full academic semester. Thus, sample size was not determined a priori. The mean age was 24.59 years (SD=5.296). Participants received up to 40 ILS (approx. $11.00) for their participation, but actual payment was contingent upon their selections during the task.

3.1.2 Design and procedure

Participants engaged in the same dots task as used in Study 1 in two stages, in a between-subjects design with two conditions. After completing the first phase, participants were taken to an adjacent room where they were randomly assigned to one of two conditions: the Ten Commandments reminder or the Lie detector test without feedback. In the Ten Commandments condition, participants were asked to recall the Ten Commandments and write them down on a piece of paper. They were allotted 2 minutes for that purpose. In the Lie detector test without feedback condition, participants were asked about their dishonesty during the dots task. As in Study 1, we monitored their physiological arousal in a mock situation simulating a semi-automated version of a classic, realistic lie-detection test. Importantly, no feedback was given after the test. Next, participants in both conditions completed the second phase of the dots task and were debriefed.

3.2 Results and discussion

A 2 X 2 repeated-measure ANOVA using phase (before and after) as a within-subject factor and condition (Lie detector test without feedback vs. Ten Commandments) as a between-subject factor was conducted to predict cheating level. This analysis revealed no significant main effect for phase (F(1,109)=0.230, p = 0.633, Partial η²=0.002), or for condition (F(1,109)=0.361, p = 0.549, Partial η²=0.003). However, there was a marginally significant condition by phase interaction (F(1,109)=3.801, p = 0.054, Partial η²=0.034).

Planned contrasts were conducted to further explore the interaction effect. In the Ten Commandments condition, there was a significant reduction in cheating after recalling the Commandments (from M=37.38, SD=38.35 to M=32.69, SD=27.56; t(54)=2.446, p=0.018, d=0.140). By contrast, there was no significant difference in cheating level before (M=36.76, SD=30.67) and after (M=39.60, SD=31.00) the lie detector test without feedback (t(55)= −0.852, p = 0.398, d=0.092). The results of this analysis are summarized in Figure 2.
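
The text does not state how the effect sizes were computed. As a hedged illustration, the sketch below uses one common convention, the mean difference divided by the root mean square of the two standard deviations; this choice is an assumption, although it does reproduce the two d values reported above. The function name is hypothetical.

```python
from math import sqrt


def cohens_d(mean_1: float, sd_1: float, mean_2: float, sd_2: float) -> float:
    """Absolute mean difference over the root mean square of the two SDs.

    Assumption: this pooling rule is not stated in the text; it is one
    common convention that happens to match the reported values.
    """
    pooled_sd = sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return abs(mean_1 - mean_2) / pooled_sd


# Ten Commandments condition, before vs. after (reported d = 0.140)
print(round(cohens_d(37.38, 38.35, 32.69, 27.56), 3))  # 0.14
# Lie detector test without feedback, before vs. after (reported d = 0.092)
print(round(cohens_d(36.76, 30.67, 39.60, 31.00), 3))  # 0.092
```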

This pattern of results suggests that, as in the original results reported in Mazar et al. (2008), recalling the Ten Commandments reduced cheating behavior. Most importantly, however, the test in itself did not serve as a moral reminder. The mere existence of a lie detection test with no credible feedback given to the examinees had no effect on subsequent cheating behavior. Study 3 examined the possible effects of polygraph test feedback as a justification as well as its effects on the physiological responses (dissonance) associated with it.

4 Study 3

The previous studies suggested that the polygraph test in itself does not serve as a moral reminder, but may do so when associated with feedback. Thus, polygraph test feedback may reduce subsequent dishonest behavior by increasing awareness of moral standards. Nevertheless, previous research has suggested that people use justifications to defuse ethical dissonance (Hochman et al., 2016), so that they can profit from dishonest behavior while preserving a positive moral self (Barkan et al., 2015). To the extent that people act dishonestly to the point they can justify their behavior and maintain a positive self-image (Ayal & Gino, 2011; Shalvi, Gino, Barkan & Ayal, 2015), polygraph test feedback may also serve as a justification. However, this could happen only if behavior and feedback do not align. Consider, for example, a person who cheated extensively but received NDI feedback on a polygraph test. This person has “objective” confirmation of honesty and thus might continue to cheat to increase personal gain while maintaining a positive self-image. By contrast, DI feedback delivered to someone who cheated might hinder subsequent dishonesty, either because clear feedback from an authority makes it harder to interpret an immoral act as being within the bounds of morality, or simply because of a deterrence effect. Thus, we hypothesized that:

Hypothesis 2:

Increasing the alignment between behavior and feedback should reduce dishonest behavior.

Since a possible justification effect of the polygraph test could be masked by the stronger moral reminder effect, Study 3 examined the effect of the polygraph feedback while we offset its role as a moral reminder. To do so, in the first phase of the study participants were asked to provide the names of their parents. Participants were told in advance that they could choose whether to provide actual names or fictitious names, and that in a subsequent phase they would undergo a polygraph test in which they would be tested on whether they had provided true or fictitious names. To make sure the initial phase (before the actual polygraph test) offset the role of the polygraph as a moral reminder, we explicitly informed participants in advance of all stages of the experiment that honesty was important (Pruckner & Sausgruber, 2013), but that they could deliberately choose to lie for a chance to win a monetary prize. Since we directly emphasized morality in the first stage, it seemed reasonable to assume that the polygraph test, which followed immediately afterwards, would not serve as an additional reminder.

In addition, in Study 3 we also used the polygraph test to examine physiological arousal in response to cheating behavior. The polygraph test uses the GSR to estimate the likelihood that an individual is dishonest. As a person becomes more or less stressed, skin conductance increases or decreases proportionally (Andreassi, 2000). GSR levels are considered the most indicative measure of deception (e.g., Barland & Raskin, 1975; Krapohl, 2013). As shown in previous research, physiological arousal can serve as an index of ethical dissonance in that justifications reduce ethical dissonance and its associated arousal (Hochman et al., 2016). Thus, if polygraph test feedback serves as a justification for dishonesty, it can be hypothesized that:

Hypothesis 3:

Participants who cheat should exhibit lower levels of physiological arousal after receiving NDI feedback than after receiving DI feedback.

4.1 Method

4.1.1 Participants

Sixty-three students from an Israeli University (40 males, 23 females) volunteered to participate in the study. We recruited as many participants as possible during the course of a full academic semester. Thus, sample size was not determined a priori. The mean age was 25.46 years (SD=5.5). Participants received up to 30 ILS (approx. $7.50) for their participation, but actual payment was contingent upon their selections on the task.

4.1.2 Design and procedure

First, participants were asked to fill in the choice declaration task stating the first names of their parents. Participants were told that honesty was important but that the design allowed them to earn a bonus of 5 ILS ($1.50) if they cheated without being detected by the polygraph. Thus, participants could choose to be honest and provide their parents’ real names, or lie and provide fictitious names to try to increase their monetary payoff. As in Study 1, after completing the first phase, participants were taken to an adjacent room where they were randomly assigned to one of two between-subjects conditions: NDI Polygraph feedback and DI Polygraph feedback. In both conditions, participants were tested about their dishonesty during the names task while we monitored their physiological arousal. Participants were asked to deny providing fictitious names and were informed they could get a 5 ILS ($1.50) bonus if they could trick the examiner (that is, lie about their parents’ names and get NDI feedback). As in Study 1, however, feedback was provided at random regardless of the participants’ choice.

After the first polygraph test, participants engaged in 100 trials of the dots task (as in the second phase of Studies 1 and 2). Upon completion, the participants underwent a second polygraph test in which they were asked about their performance on the dots task. Here again, we monitored their physiological arousal. This yielded two physiological findings from two lie detector tests.

Psychophysiological measures

Participants’ GSR was recorded using the Limestone Technologies© system (Data Pac_USB Ltd.) connected to a laptop computer. The stimulus questions on the test and GSR signals were recorded using Cogito software (Ltd.) by SDS © (Suspect Detection Systems). Two 24k gold-plated electrodes were attached to the palmar surface of the participants’ index and ring fingertips of their right hand. The data were sampled at 60 Hz with 16 bits per sample. The GSR data were down-sampled to 30 Hz after smoothing by 3 sample kernels. The data were then separated using wavelets to analyze the peaks of tension and temporal responses by an in-house MATLAB script.
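
The smoothing and down-sampling step can be illustrated with a short sketch. It is not the in-house MATLAB script, and it rests on assumptions the text does not confirm: the 3-sample kernel is treated as a uniform moving average, edge samples are averaged over the available neighbours, and the 60 Hz to 30 Hz reduction keeps every second sample. The wavelet separation is not shown, and all function names are hypothetical.

```python
def smooth3(signal):
    """Centered 3-sample moving average (assumed kernel shape).

    At the two edges, only the available samples are averaged.
    """
    n = len(signal)
    smoothed = []
    for i in range(n):
        window = signal[max(0, i - 1):min(n, i + 2)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def downsample_by_2(signal):
    """Keep every second sample, e.g. to go from 60 Hz to 30 Hz."""
    return signal[::2]


# Toy 60 Hz trace (arbitrary units), for illustration only.
raw_60hz = [0.0, 3.0, 6.0, 3.0, 0.0, 3.0]
gsr_30hz = downsample_by_2(smooth3(raw_60hz))
print(gsr_30hz)  # [1.5, 4.0, 2.0]
```

Smoothing before discarding samples limits the aliasing of fast fluctuations into the slower, down-sampled trace.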

The GSR measure was the baseline-to-peak amplitude difference in the 0.5-s to 8-s time window from the response provided by the participant. Measures included arousal in response to 6 changing probable lie control (PLC) questions and 3 changing irrelevant questions. Arousal in response to the control questions was compared to arousal in response to 2 relevant test questions (see Appendix for all questions). This question format is consistent with the most common polygraph test known as the Control Question Test (CQT; see Honts & Reavy, 2015; Raskin, 1986; Vrij & Fisher, 2016). Despite the controversy concerning CQT (e.g., Iacono & Ben-Shakhar, 2019), we used this protocol due to its popularity, as well as laboratory and field research in forensic settings that supports the rationale and the validity of the PLC version of the CQT (see Honts, 2004; Raskin & Honts, 2002). Given the constraints of the laboratory setting and the use of multiple tests, we implemented a semi-automated version of the standard CQT test. The procedure included a pre-test phase that included a brief explanation of the upcoming process followed by a short discussion of all the test questions asked by the interviewer (research assistants). Participants gave their informed consent, conforming as closely as possible to standard CQT procedures. Participants were then seated in front of the computer and were connected to the machine. The exam itself was fully computerized. Research has shown that automation of polygraph tests increases lie detection accuracy (Honts & Amato, 2007; Kircher & Raskin, 1988) and is viewed more positively by both examiners and examinees (Novoa, Malagon & Krapohl, 2017). Furthermore, this short automated polygraph test has been used successfully worldwide in 15 different countries as well as international terminals (Suspect Detection Systems. Ltd).
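
The baseline-to-peak measure defined at the start of this passage can be sketched as below. The window limits (0.5 s to 8 s after the response) come from the text; the 30 Hz rate matches the down-sampled signal; treating the first sample of the window as the baseline is an assumption made for illustration, as are the function and parameter names.

```python
def baseline_to_peak(signal, response_index, rate_hz=30,
                     start_s=0.5, end_s=8.0):
    """Peak minus baseline within the post-response window.

    Assumption: the baseline is the first sample of the window; the
    text does not specify how the baseline was defined.
    """
    start = response_index + int(round(start_s * rate_hz))
    end = response_index + int(round(end_s * rate_hz))
    window = signal[start:end + 1]
    if not window:
        raise ValueError("window lies outside the recording")
    baseline = window[0]
    return max(window) - baseline


# Toy 30 Hz trace: flat at 1.0, with a brief rise shortly after the response.
trace = [1.0] * 15 + [1.0, 1.5, 2.5, 2.0] + [1.0] * 300
print(baseline_to_peak(trace, response_index=0))  # 1.5
```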

Questions were presented to participants on the computer screen and through a headset. Answers were recorded orally via a microphone attached to the headset. All participants answered all the questions in a changing order in 3 consecutive rounds, and their average GSR in response to the relevant questions relative to their responses to the control questions was calculated to obtain the final index, on a scale we dubbed the Sympathetic Arousal Index. GSR or EDA (electrodermal activity) is the most frequently used measure in the field of physiological lie detection (Vrij, 2000) and is considered to be the most sensitive and critical parameter out of the three channels used in polygraph examination (Kircher & Raskin, 2002). The Sympathetic Arousal Index ranges from −1.2 to 1.8 and rises as GSR levels increase in response to the relevant questions (relative to the control questions). This continuous index allows for a precise analysis of physiological arousal levels on the relevant questions.

4.2 Results and discussion

4.2.1 Behavioral results

An examination of the proportion of cheaters on the names task revealed that 55.6% chose to cheat and 44.4% chose honesty. To test the justification hypothesis, we calculated the mean cheating levels separately for participants who chose to be honest and for participants who chose to be dishonest on the names task as a function of the feedback they received on the polygraph test. This analysis is presented in Figure 3. The analysis revealed that for participants who were honest on the names task, the mean cheating level was 24.0 (SD=30.27) among those who received NDI feedback, and 31.0 (SD=29.34) among those who received DI feedback. For participants who were dishonest on the names task, the mean cheating level was 54.06 (SD=36.81) among those who received NDI feedback, and 57.35 (SD=41.95) among those who received DI feedback. A 2 X 2 ANOVA using initial choice (honesty or dishonesty) and feedback type (NDI or DI) as independent measures and cheating behavior as the dependent measure revealed a significant main effect for choice (F(1,59)=9.823, p=0.003, Partial η²=0.143), but not for feedback type (F(1,59)=0.327, p=0.569, Partial η²=0.006). The interaction between the two factors was not significant (F(1,59)=0.042, p=0.838, Partial η²=0.001).

This pattern of results runs counter to Hypothesis 2 since even when the polygraph test did not serve as a moral reminder, the alignment between behavior on the names task and the feedback obtained from the test did not affect subsequent cheating behavior. Moreover, although caution should be exercised when comparing results from different studies, the higher level of cheating across all participants relative to the second phase of Study 1 (18.42; SD=23.06 vs. 43.14; SD=37.47, t(128) = −4.561, p<0.0001, Cohen’s d=0.795) suggests that as we expected, the polygraph test did not serve as a moral reminder in Study 3.

Importantly, the difference in cheating magnitude between Studies 1 and 3 may also stem from the unique characteristics of Study 3. More specifically, in the names task, participants were encouraged by an authority figure to trick the lie-detection test and, as a result, the examiner. Thus, it could be argued that they might have felt that they were expected to cheat, or that cheating the examiner is an acceptable norm that allows individuals to increase personal profit. Nevertheless, this interpretation also suggests that people should feel that it is acceptable to cheat more after NDI feedback from the examiner (compared to DI), but this was not supported by our results.

The significant difference between participants who chose to be dishonest on the names task and those who chose to be honest is worth noting since the former exhibited a much higher mean cheating level on the dots task (55.66; SD=38.84 vs. 27.5; SD=29.46). This finding supports the convergent validity of our measures and specifically our claim that the difference between beneficial and detrimental errors on the dots task represents cheating behavior, since people who chose to be dishonest in the names task were also more likely to make more errors that were beneficial to them.

4.2.2 Physiological results

The behavioral results in Study 3 suggested that the feedback type did not affect subsequent cheating behavior. To test whether feedback type had any effect on the psychological tension the participants experienced when facing a moral conflict (Barkan et al., 2015), we examined their physiological arousal during the polygraph test. Since our aim was to examine the relationship between physiological arousal and actual cheating behavior, we compared the arousal of participants who cheated to a large extent on the dots task (based on the median split which was 25.00), to arousal in participants who cheated to a small extent. This comparison was done separately for participants who received DI and NDI feedback on the first polygraph test. To do so, a sympathetic arousal index was calculated for each participant on each polygraph task. Then, to isolate the participants’ arousal in response to their behavior on the dots task, the final sympathetic arousal index was calculated as the difference between the index on the dots task and the index on the names task.
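
The two preparation steps described above (the median split on cheating level and the difference score between the two polygraph tasks) can be sketched in Python as follows. All records are made-up values for illustration, the field names are hypothetical, and the rule that ties at the median fall into the low group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median

# Made-up records for illustration only: cheating level on the dots task and
# the sympathetic arousal index (SAI) on each polygraph task.
participants = [
    {"id": 1, "cheating": 10.0, "sai_names": 0.2, "sai_dots": 0.1},
    {"id": 2, "cheating": 20.0, "sai_names": -0.3, "sai_dots": 0.0},
    {"id": 3, "cheating": 30.0, "sai_names": 0.5, "sai_dots": 0.25},
    {"id": 4, "cheating": 60.0, "sai_names": 0.0, "sai_dots": 0.75},
]


def final_arousal_index(record: dict) -> float:
    """Index on the dots task minus index on the names task."""
    return record["sai_dots"] - record["sai_names"]


def split_by_median(records: list) -> tuple:
    """Median split on cheating level; ties go to the low group (assumption)."""
    cut = median(r["cheating"] for r in records)
    high = [r for r in records if r["cheating"] > cut]
    low = [r for r in records if r["cheating"] <= cut]
    return cut, high, low


cut, high, low = split_by_median(participants)
print("median:", cut)
print("high cheaters:", [(r["id"], final_arousal_index(r)) for r in high])
print("low cheaters:", [(r["id"], final_arousal_index(r)) for r in low])
```

Each group would then be broken down further by feedback type (DI or NDI) before the means are compared.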

This analysis revealed that for participants who cheated to a large extent on the dots task, the mean sympathetic arousal index of those who received DI feedback on the names task was 0.15 (SD=0.93), and −0.49 (SD=1.03) for those who received NDI feedback. Since a higher sympathetic arousal index represents a higher likelihood that the participant was dishonest, this result suggests that participants who cheated to a large extent were less likely to be detected as dishonest after receiving NDI feedback on the polygraph test. A similar pattern was observed for participants who cheated to a low extent. The mean sympathetic arousal index for low cheaters who received DI feedback was −0.01 (SD=0.58) and −0.18 (SD=0.77) for participants who received NDI feedback. This analysis appears in Figure 4.

A 2 × 2 ANOVA using group (high vs. low cheaters) and feedback type (NDI or DI) as independent measures and the sympathetic arousal index as the dependent measure revealed a marginally significant main effect for feedback type (F(1,59)=3.604, p=0.06, Partial η²=0.058), but not for group (F(1,59)=0.106, p=0.746, Partial η²=0.002). In addition, no interaction was found between the two factors (F(1,59)=1.269, p=0.264, Partial η²=0.021). This pattern of results suggests that NDI feedback on the polygraph test can decrease the psychological tension associated with cheating behavior. Research has indicated that justifications can decrease ethical dissonance (Hochman et al., 2016). Thus, even if the feedback itself did not directly increase cheating behavior, it allowed the participants to feel better about their cheating.

5 General Discussion

Polygraph devices are popularly depicted as scientific lie detectors (Gibson, 2001). In the current study, we examined whether the perceived role of the polygraph (Synnott, Dietzel & Ioannou, 2015) had an effect on behavior. Specifically, we explored whether polygraph test feedback could serve as a signal orienting subsequent moral behavior. Based on the idea of moral reminders (Mazar et al., 2008), we tested the hypothesis that feedback or the examination itself, regardless of the feedback type, could serve as a moral reminder that promotes honesty.

In three studies, we found empirical support for the moral reminder hypothesis as well as its boundary conditions. In Study 1, we showed that polygraph test feedback (compared to the control) significantly reduced dishonesty on a subsequent task, regardless of whether participants received NDI or DI feedback. Study 2 suggests that feedback might be essential, and that a test without feedback might not serve as a moral reminder. Finally, in Study 3, we showed that the feedback on the polygraph test had no significant effect on the magnitude of cheating behavior when its role as a moral reminder was offset by an earlier manipulation. Previous research indicates that actively interacting with ethical cues in the surroundings (Ayal et al., 2015; Gino, Moore & Bazerman, 2010; Mazar et al., 2008; Shu et al., 2012) makes morality more salient and reduces dishonesty. Moral reminders have been shown to affect dishonest behavior even when people are simply made aware of the importance of moral standards (e.g., that honesty is important, Pruckner & Sausgruber, 2013), without any actual interaction with the surroundings. In line with these findings, we show that even a subtle and indirect intervention that emphasizes moral standards can have an influence on participants’ ethical behavior. This suggests that moral reminders may work best when they come as a surprise and are presented just before a temptation to cheat becomes available. Thus, in repeated trial situations, where multiple opportunities to cheat exist, moral reminders may require reactivation every now and then to maintain the saliency of ethical standards and avoid habituation (Ayal et al., 2015).

One possible alternative explanation for our results is the bogus pipeline (BPL) phenomenon (Jones & Sigall, 1971), in which individuals are made to believe that their responses to questions will be independently verified by an infallible lie detector, but in fact, no lie detector is used. As a result of fear of getting caught, people react by telling the truth more (Roese & Jamieson, 1993). While the current results cannot completely rule out the BPL explanation, the fact that in Study 2 the polygraph test had no effect suggests that the feedback on the test itself was the main factor influencing subsequent cheating level. It is reasonable to assume that the polygraph test is not the only honesty verification method whose feedback may serve as a moral reminder. However, since polygraph tests are generally considered to be scientific lie detectors (Gibson, 2001), they have a real-life potential to decrease dishonesty, regardless of feedback type.

We also tested whether the polygraph test feedback served as a justification to cheat. The idea of self-signaling (Bem, 1967) suggests that past behavior may act as a signal that guides future behavior (Lee, Hochman, Prince & Ariely, 2016), and can affect internal and external moral standards (Gino et al., 2010). Thus, feedback from a professional examiner based on a presumably objective measure of morality might reinforce and validate previous behavior and influence the way people perceive their own morality. However, we did not find evidence that this was the case.

There are several possible reasons why the polygraph test feedback did not serve as a justification for the magnitude of subsequent moral behavior. The participants may have perceived the polygraph test and its outcome as an unreliable and invalid indicator of moral behavior and standards. In line with this possibility, research has shown that polygraph tests in mock juror studies have virtually no effect on the final verdict (Myers & Arbuthnot, 1997; Spanos, Myers, Dubreuil & Pawlak, 1992). However, the polygraph test tends to be perceived as an objective method to detect dishonesty (Gibson, 2001). Although the type of feedback had no effect on the magnitude of dishonest behavior, NDI feedback alone reduced the physiological arousal associated with dishonesty. Ethical dissonance (Ayal & Gino, 2011) suggests that acting dishonestly comes at the psychological cost of increased tension and discomfort (Barkan et al., 2015). As people become less aware of their own dishonesty, this tension lessens, thus enabling them to continue their wrongdoings (Hochman et al., 2016). Thus, while feedback itself might not serve as a justification to increase cheating behavior, it may serve to justify the existing level of cheating behavior. This interpretation points to an interesting possible distinction between different types of justifications. Whereas some justifications can promote cheating behavior (Shalvi et al., 2015), others may be more internal, and serve to reduce the guilt associated with actual dishonesty. Further research is required to examine this possibility.

One limitation of the current research is the lack of a second control condition in Study 2, in which participants engage in the dots task without a moral reminder. This is especially true given the debate about the effectiveness of moral reminders in reducing dishonest behavior (Amir et al., 2018; Schild et al., in press; Verschuere et al., 2018). While the results of the control condition in Study 1 suggest that the absence of a reminder leads to an effect similar to that of a polygraph test without feedback, this should be further explored in future research. Another limitation is that in Study 3 we employed a names task, in which participants could lie about the fact that they lied. In other words, given that participants received a bonus for reporting they had lied and did not get caught, they could have used their parents’ real names but reported that they lied. However, the categorization of participants into honest and dishonest was based on their own admission and not on the polygraph test, and thus this limitation cannot be construed as having reduced the validity of the test. Moreover, the fact that those who reported lying on the names task also cheated more on the subsequent dots task and those who were honest in the names task cheated less in the dots task suggests that at least most of the participants who decided to cheat on the preliminary task were accurately considered as dishonest and vice versa.

Another limitation is that some choices (not all) that we count as cheating could in fact result from rational response bias. If participants have little or no idea which side of the square had more dots, they must guess. In such cases, the rational choice would be to guess the side with the higher payoff. Such responses would therefore inflate their cheating score.
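
A small expected-value sketch in Python may help make this concern concrete. It assumes, as noted earlier, that cheating is scored as beneficial errors minus detrimental errors, and it treats a fixed share of trials as pure guesses. The trial count, guess rate, and balanced design below are hypothetical values chosen for illustration, not parameters of the studies.

```python
def expected_error_gap(
    n_trials: int,
    guess_rate: float,
    p_choose_high_pay: float,
    share_low_pay_correct: float = 0.5,
) -> float:
    """Expected (beneficial - detrimental) errors produced by guessing alone.

    A beneficial error is a guess on the high-payoff side when the low-payoff
    side was correct; a detrimental error is the reverse.
    """
    guessed = n_trials * guess_rate
    beneficial = guessed * share_low_pay_correct * p_choose_high_pay
    detrimental = guessed * (1 - share_low_pay_correct) * (1 - p_choose_high_pay)
    return beneficial - detrimental


# Hypothetical setting: 100 trials, 25% of them too ambiguous to judge.
unbiased_guesser = expected_error_gap(100, 0.25, p_choose_high_pay=0.5)
payoff_guesser = expected_error_gap(100, 0.25, p_choose_high_pay=1.0)
print(unbiased_guesser, payoff_guesser)
```

In this illustration an unbiased guesser contributes nothing to the score on average, whereas a participant who always guesses the higher-paying side adds beneficial errors without ever knowingly misreporting.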

Finally, a more general limitation of these studies is that the polygraph test feedback was random, and not contingent on actual cheating behavior. This random manipulation was designed to avoid selection bias and allow for a sufficient (and equal) number of participants in each feedback condition. However, this could have affected the way participants perceived the feedback. Nevertheless, despite this randomization, the feedback was aligned with almost half of the participants’ behavior, and their results were virtually the same. Future research should examine whether similar results would be obtained for genuine rather than random polygraph feedback. In addition, in the current studies we examined the effect of the polygraph test feedback on immediate subsequent behavior. Further research is needed to determine whether the role of the polygraph test as a moral reminder has a long-lasting effect on people’s moral behavior, as well as on how their ethicality is perceived by others.

6 References

Amir, O., Mazar, N., & Ariely, D. (2018). Replicating the effect of the accessibility of moral standards on dishonesty: Authors’ response to the replication attempt. Advances in Methods and Practices in Psychological Science, 1(3), 318–320.

Andreassi, J. L. (2000). Psychophysiology: Human behavior & physiological response (4th ed.). New Jersey: Lawrence Erlbaum Associates.

Ayal, S., & Gino, F. (2011). Honest rationales for dishonest behavior. In M. Mikulincer & P. R. Shaver (Eds.), The Social Psychology of Morality: Exploring the Causes of Good and Evil. Washington, DC: American Psychological Association.

Ayal, S., Gino, F., Barkan, R., & Ariely, D. (2015). Three principles to REVISE people’s unethical behavior. Perspectives on Psychological Science, 10(6), 738–741.

Barkan, R., Ayal, S., & Ariely, D. (2015). Ethical dissonance, justifications, and moral behavior. Current Opinion in Psychology, 6, 157–161.

Barland, G. H., & Raskin, D. C. (1975). An evaluation of field techniques in detection of deception. Psychophysiology, 12, 321–330.

Bazerman, M. H., & Tenbrunsel, A. E. (2012). Blind spots: Why we fail to do what’s right and what to do about it. Princeton: Princeton University Press.

Bem, D. J. (1967). Self-perception: An alternative interpretation of cognitive dissonance phenomena. Psychological Review, 74(3), 183–200.

Fiedler, K., Schmid, J., & Stahl, T. (2002). What is the current truth about polygraph lie detection? Basic and Applied Social Psychology, 24, 313–324.

Furedy, J. J. (1993). The “control” question “test” (CQT) polygrapher’s dilemma: Logico-ethical considerations for psychophysiological practitioners and researchers. International Journal of Psychophysiology, 15, 263–267.

Gallai, D. (1999). Polygraph evidence in federal courts: Should it be admissible? American Criminal Law Review, 36, 87–116.

Gibson, M. (2001). The truth machine: Polygraphs, popular culture and the confessing body. Social Semiotics, 11(1), 61–73.

Gino, F., Ayal, S., & Ariely, D. (2009). Contagion and differentiation in unethical behavior: The effect of one bad apple on the barrel. Psychological science, 20(3), 393–398.

Gino, F., Moore, D. A., & Bazerman, M. H. (2010). See no evil: When we overlook other people’s unethical behavior. In R. M. Kramer, A. E. Tenbrunsel, & M. H. Bazerman (Eds.), Organization and management series. Social decision making: Social dilemmas, social values, and ethical judgments (pp. 241–263). New York, NY, US: Routledge/Taylor & Francis Group.

Gino, F., Norton, M. I., & Ariely, D. (2010). The counterfeit self. The deceptive costs of faking it. Psychological Science, 21(5), 712–720.

Ginton, A. (2017). Examining different types of comparison questions in a field study of CQT polygraph technique: Theoretical and practical implications. Journal of Investigative Psychology and Offender Profiling, 14(3), 281–293.

Grym, J., & Liljander, V. (2016). To cheat or not to cheat? The effect of a moral reminder on cheating. Nordic Journal of Business, 65, 18–37.

Hochman, G., Glöckner, A., Fiedler, S., & Ayal, S. (2016). “I can see it in your eyes”: Biased processing and increased arousal in dishonest responses. Journal of Behavioral Decision Making, 29(2–3), 322–335.

Honts, C. R. (2004). The psychophysiological detection of deception. In P. Granhag & L. Strömwall (Eds.), Detection of deception in forensic contexts. London: Cambridge.

Honts, C., & Amato, S. (2007). Automation of a screening polygraph test increases accuracy. Psychology, Crime, & Law, 13, 187–199.

Honts, C. R., & Reavy, R. (2015). The comparison question polygraph test: A contrast of methods and scoring. Physiology and Behavior, 143, 15–26.

Hwang, H. (2015). Moral reminder as a way to improve worker performance on amazon mechanical turk. Proceedings of the Third AAAI Conference on Human Computation and Crowdsourcing. San Diego, CA., 12–13.

Iacono, W. G., & Ben-Shakhar, G. (2019). Current status of forensic lie detection with the comparison question technique: An update of the 2003 National Academy of Sciences report on polygraph testing. Law and Human Behavior, 43(1), 86–98.

Iacono, W. G., & Lykken, D. T. (1997). The validity of the lie detector: Two surveys of scientific opinion. Journal of Applied Psychology, 82, 426–433.

Jones, E. E., & Sigall, H. (1971). The bogus pipeline: A new paradigm for measuring affect and attitude. Psychological Bulletin, 76, 349–364.

Kircher, J. C., & Raskin, D. C. (1988). Human versus computerized evaluation of polygraph data in a laboratory setting. Journal of Applied Psychology, 73, 291–302.

Kircher, J. C., & Raskin, D. C. (2002). Computer methods for the psychophysiological detection of deception. In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 287–326). San Diego, CA: Academic Press.

Krapohl, D. J. (2013). Polygraph principles: A literature review. Polygraph, 42(1), 35–60.

Lee, C.-Y., Hochman, G., Prince, S. E., & Ariely, D. (2016). Past actions as self-signals: How acting in a self-interested way influences environmental decision making. PLOS ONE, 11(7).

Myers, B., & Arbuthnot, J. (1997). Polygraph testimony and juror judgments: A comparison of the guilty knowledge test and the control question test. Journal of Applied Social Psychology, 27, 1421–1437.

Mazar, N., Amir, O., & Ariely, D. (2008). The dishonesty of honest people: A theory of self-concept maintenance. Journal of Marketing Research, 45(6), 633–644.

Mazar, N., & Zhong, C. (2009). Do green products make us better people? Psychological Science, 21(4), 494–498.

Meijer, E. H., & Verschuere, B. (2010). The polygraph and the detection of deception. Journal of Forensic Psychology Practice, 10, 325–338.

Novoa, M., Malagon, F., & Krapohl, D. (2017). Attitudes of polygraph examiners and examinees. Polygraph, 46(2), 172–186.

O’Sullivan, M., Frank, M. G., Hurley, C. M., & Tiwana, J. (2009). Police lie detection accuracy: the effect of lie scenario. Law and Human Behavior, 33, 530–538.

Palmatier, J. J., & Rovner, L. (2015). Credibility assessment: Preliminary Process Theory, the polygraph process, and construct validity. International Journal of Psychophysiology, 95(1), 3–13.

Pruckner, G., & Sausgruber, R. (2013). Honesty on the streets: A field study on newspaper purchasing. Journal of the European Economic Association, 11, 661–679.

Raskin, D. C. (1986). The polygraph in 1986: Scientific, professional and legal issues surrounding application and acceptance of polygraph evidence. Utah Law Review, 29, 29–74.

Raskin, D. C., & Honts, C. R. (2002). The comparison question test. In M. Kleiner (Ed.), Handbook of polygraph testing (pp. 1–49). London: Academic Press.

Raskin, D. C., Honts, C. R., & Kircher, J. C. (2014). Credibility assessment: scientific research and applications. Amsterdam: Academic Press.

Roese, N. J., & Jamieson, D. W. (1993). Twenty years of bogus pipeline research: A critical review and meta-analysis. Psychological Bulletin, 114, 363–375.

Schild, C., Heck, D. W., Ścigała, K. A., & Zettler, I. (in press). Revisiting REVISE: (Re)Testing unique and combined effects of REminding, VIsibility, and SElf-engagement manipulations on cheating behavior. Journal of Economic Psychology.

Shalvi, S., Gino, F., Barkan, R., & Ayal, S. (2015). Self-serving justifications: Doing wrong and feeling moral. Current Directions in Psychological Science, 24(2), 125–130.

Shu, L. L., Mazar, N., Gino, F., Ariely, D., & Bazerman, M. H. (2012). Signing at the beginning makes ethics salient and decreases dishonest self-reports in comparison to signing at the end. Proceedings of the National Academy of Sciences of the United States of America, 109(38), 15197–200.

Spanos, N. P., Myers, B., Dubreuil, S. C., & Pawlak, A. E. (1992). The effects of polygraph evidence and eyewitness testimony on the beliefs and decisions of mock jurors. Imagination, Cognition, & Personality, 12, 103–113.

Synnott, J., Dietzel, D., & Ioannou, M. (2015). A review of the polygraph: history, methodology and current status. Crime Psychology Review, 1, 59–83.

Verschuere, B., Ben-Shakhar, G., & Meijer, E. (Eds.). (2011). Memory detection: Theory and application of the Concealed Information Test. Cambridge, UK: Cambridge University Press.

Verschuere, B., Meijer, E. H., Jim, A., Hoogesteyn, K., Orthey, R., McCarthy, R. J., ... Barbosa, F. (2018). Registered replication report on Mazar, Amir, and Ariely (2008). Advances in Methods and Practices in Psychological Science, 1, 299–317.

Vrij, A. (2000). Detecting lies and deceit: The psychology of lying and the implications for professional practice. Chichester, UK: Wiley.

Vrij, A., & Fisher, R. P. (2016). Which lie detection tools are ready for use in the criminal justice system? Journal of Applied Research in Memory and Cognition, 5, 302–307.

Wang, J. T-Y., Spezio, M., & Camerer, C. F. (2010). Pinocchio’s pupil: Using eyetracking and pupil dilation to understand truth telling and deception in sender-receiver games. American Economic Review, 100(3), 984–1007.

Welsh, D. T., & Ordóñez, L. D. (2014). Conscience without cognition: The effects of subconscious priming on ethical behavior. Academy of Management Journal, 57, 723–742.

7 Appendix

2a - In the previous task, did you select the side that would give higher payoffs regardless of whether it was the side with more dots? (Study 1 & 2)

2b - In the previous task, did you provide fictitious names for your parents? (Study 3)

Corresponding author. School of Psychology, Interdisciplinary Center (IDC) Herzliya, Israel. Email: ghochman@idc.ac.il.