President Joe Biden recently found himself embroiled in a debate with rape researchers. As part of a push for funding to clear DNA backlogs, Biden stated that “the average rapist rapes about 6 times.” On the other side, Prof Mary Koss and others argued that this is an overestimate, and that the average rapist commits only 2 offenses.
On the one hand, this all seems rather pedantic and overlooks the damage done to victims. On the other hand, and without losing sight of how victims have been affected, we need to get to the bottom of it: Are we talking mostly about people who commit one-offs, or about a felonious few who are responsible for many offenses? Public debates about rape and how to protect the public hinge on the facts of the matter.
So, 6 versus 2: Who’s right?
We can readily recount news stories on either side. For example, many concluded that the Stanford rape case is an example of an alleged rape that was the product of a huge misunderstanding, and that a promising young man’s life was needlessly ruined.
However, we also remember the prolific rapist in Manchester who drugged and raped dozens of men, with some investigators opining that he had more than 300 victims.
In this article, I won’t dwell on case studies, but instead focus on statistical matters, and analyse many cases.
I conclude from my analysis, first, that neither Biden nor the rape researchers are correct.
Second, the felonious few commit a terrifying number of offenses. In order to catch them, we must ensure that victims are interviewed properly by criminal investigators.
This also means that rape investigations should start with the premise that the perpetrator was likely a repeat offender.
Third, victim accounts must be kept on file, even when charges are not sought, so that serial sexual offenses can be linked, perpetrators apprehended, and many more future offenses prevented.
In his quote above, it is likely that Biden was drawing on research conducted by Lisak and Miller in 2002. In the study, a total of 1,882 men on a university campus completed an anonymous questionnaire, in private, ostensibly about “childhood experiences and adult functioning”. Among the questions were items on sexual offending that were phrased in behaviourally explicit language, without using terms that require the respondent to identify as an offender (e.g., "rape," "assault," "abuse," or "battery"), a methodology pioneered by Koss and Oros (1982). To test reliability, a subset of participants completed the survey and were also interviewed; their interview responses were largely consistent with their questionnaire responses (Kappa = .75).
Of the men, 120 (6.4% of the sample) admitted engaging in behaviours consistent with rape and attempted rape. These men were responsible for a total of 483 acts between them.
Below is a figure from the paper that plots the number of people who committed single and multiple rapes.
The first thing to note about the distribution above is its unusual shape. Many of the things researchers measure are approximately normally distributed. That is, how often people engage in a given behaviour is often distributed like a symmetrical bell-shaped curve that looks like this:
One key thing about normally distributed variables is that extreme scores (both small and large) are relatively improbable. But not everything is normally distributed: personal wealth and sexual offending are two examples. The vast majority of people are not wealthy and do not sexually offend, but there are a few extreme exceptions (like Epstein and Weinstein).
Note that the above distribution of data on sexual offending reported in Lisak and Miller does not follow the bell shape. It looks more like what’s called a “Pareto” or a “gamma” distribution. These types of distributions are not a nice normal bell shape, but rather skewed. Computing an average (i.e., the arithmetic mean) to characterise what most people do is misleading when the data are heavily skewed. In a Pareto distribution the average, which is about 6 offenses in Lisak and Miller, is pulled upwards by extreme observations.
Let’s look at why an average of about 6 offenses does not characterise the men in the sample. First, we can tell by looking at Figure 1 above that there are 34 offenders who each raped 2 times, and together as a group they committed a total of 64 rapes. However, the remaining 42 repeat offenders committed a total of 375 acts.
I’ll repeat that – 42 repeat offenders between them self-reported committing a total of 375 acts of rape/attempted rape. That's astonishing. Among these offenders, 11 reported committing between 9 and 50 offenses.
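The arithmetic behind these figures can be checked directly. The short Python sketch below uses only the numbers quoted in this article, taking the group totals as reported rather than recomputing them from per-person counts; the variable names are mine.

```python
# Figures as quoted in this article from Lisak and Miller (2002).
total_offenders = 120   # men admitting rape or attempted rape
total_acts = 483        # acts reported across all 120 men
two_act_men = 34        # repeat offenders in the two-act group
two_act_total = 64      # acts attributed to that group in the text
heavy_men = 42          # the remaining repeat offenders
heavy_total = 375       # acts attributed to those 42 men

repeat_men = two_act_men + heavy_men        # 76 repeat offenders
repeat_acts = two_act_total + heavy_total   # 439 acts
single_men = total_offenders - repeat_men   # 44 one-time offenders
single_acts = total_acts - repeat_acts      # 44 acts, i.e. one each

mean_repeat = repeat_acts / repeat_men      # the "about 6" figure
mean_all = total_acts / total_offenders     # mean across all 120 offenders
heavy_share_men = heavy_men / total_offenders
heavy_share_acts = heavy_total / total_acts

# Median among the 120 offenders: sorted by number of acts, the 60th and 61st
# men both fall inside the two-act group (positions 45 to 78).
mid_low, mid_high = total_offenders // 2, total_offenders // 2 + 1
median_is_two = single_men < mid_low and mid_high <= single_men + two_act_men

print(f"mean, repeat offenders: {mean_repeat:.2f}")
print(f"mean, all offenders:    {mean_all:.2f}")
print(f"median is 2:            {median_is_two}")
print(f"{heavy_share_men:.0%} of offenders account for {heavy_share_acts:.0%} of acts")
```

On these figures the mean among repeat offenders is about 5.8, yet the median offender in the full group of 120 reported 2 acts, and 35% of the offenders account for roughly 78% of all acts.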
And here’s the real kicker: the logic of the Pareto distribution is that most research studies are unlikely to sample by chance the most prolific offenders in society at large. They would have to have a very large sample size to capture extreme exceptions.
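To put a rough number on this, suppose (purely as an assumption for illustration) that the most prolific offenders make up a fraction p of the population and that a study draws n people independently at random. The chance that the sample contains at least one of them is 1 - (1 - p)^n. The Python sketch below evaluates this; the value of p is hypothetical, while n = 1,882 is the Lisak and Miller sample size.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """Chance that a simple random sample of n people includes at least
    one person from a group that makes up a fraction p of the population."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical prevalence: 1 person in 10,000 is an extremely prolific offender.
p_extreme = 1 / 10_000

for n in (1_882, 10_000, 50_000):
    print(f"n = {n:>6}: {prob_at_least_one(p_extreme, n):.0%}")
```

Under these assumptions, a sample the size of Lisak and Miller's would include such a person less than one time in five. Real surveys are not simple random samples, so this is only a guide to the order of magnitude.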
Lisak and Miller’s research was roundly criticised in a recent Washington Post article. In it, several researchers were interviewed, and they commented negatively on the old age of the Lisak and Miller study, and problems with the fact that a convenience sample from a university campus was used.
The researchers offered averages from their own survey research, which turned out to be an average of 2 rapes per rapist. Another study reported in the article looked at the total number of DNA matches in a DNA database and found that the repeat offenders (defined as those who had two or more matching samples) raped 2 victims.
Yet, as discussed above, an average does not characterise what offenders do, because offending is not normally distributed.
But wait, you might say, the DNA research is convincing. Is there something wrong with Lisak and Miller’s data?
No, there isn’t anything wrong with it. Here’s a bit of fascinating research from RAND on repeat offending across many types of crimes. (Warning, if you read the report you might go blind trying to follow the algebra.) Their data clearly show the Pareto principle in action once again:
Many other studies find this too. See here for a review in relation to sexual offending.
A general explanation for this pattern is straightforward. Most people don’t commit crime. They are deterred for whatever reason. Some try it once but are apprehended or simply stop, while others “age out” of offending. Amongst those who do commit crime, most commit a single offence. Still, there are a few who are extremely good at evading detection and are prolific in their offending.
This pattern means that we can reconcile the findings from the DNA database study with the survey data. A Pareto or gamma distribution for the number of victims per rapist implies both that:
A) most men commit no rapes, and amongst repeat rapists the most common number of victims is 2; and
B) a disproportionate number of rapes are committed by a small number of prolific offenders.
Thus, we would expect to observe in the DNA database research that the most common number of offenses committed by a repeat rapist is the minimum number of offenses required to count as a repeat rapist. This is the tall left-most bar for repeat offenders in both of the crime plots above.
But this does not mean that most rapes are due to people on their first or second offence. Rather, it means that a disproportionate number of rapes are due to just a few perpetrators: those in the rightmost bars.
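Both points can hold at once, as a small calculation shows. The Python sketch below assumes, purely for illustration, that the share of offenders with k offences is proportional to 1/k², capped at 1,000 offences. The exponent and the cap are my assumptions, not estimates from any of the studies discussed.

```python
MAX_OFFENCES = 1_000   # assumed cap on offences per offender
EXPONENT = 2.0         # assumed power-law exponent

# Unnormalised share of offenders with exactly k offences.
weight = {k: k ** -EXPONENT for k in range(1, MAX_OFFENCES + 1)}

# A) Among repeat offenders (k >= 2), the most common count is the minimum, 2.
repeat = {k: w for k, w in weight.items() if k >= 2}
most_common_repeat = max(repeat, key=repeat.get)

# B) Offenders with 10 or more offences: share of offenders vs share of offences.
all_offenders = sum(weight.values())
all_offences = sum(k * w for k, w in weight.items())
prolific_offenders = sum(w for k, w in weight.items() if k >= 10) / all_offenders
prolific_offences = sum(k * w for k, w in weight.items() if k >= 10) / all_offences

print(f"most common count among repeat offenders: {most_common_repeat}")
print(f"{prolific_offenders:.1%} of offenders commit {prolific_offences:.1%} of offences")
```

With these assumed values, roughly 6% of offenders account for a little over 60% of offences, even though the most common count among repeat offenders is 2.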
This pattern is the same as observed in book sales, where a few best-sellers account for almost all sales, while most books are only bought by the author and their mum. The same pattern is observed in stock market moves, where the movements on a few extreme days are much more important than thousands of average days.
But wait, there’s more
I am a big fan of using multiple methods to triangulate an answer. There are drawbacks to questionnaires, after all: participants may underreport because they fear that their responses will not be anonymous, or they may overreport because they do not take the task seriously.
The above studies with rapists did not sample convicted offenders specifically, so we can’t verify that any of the offenses occurred. Also, all of the studies drew on samples gathered on university campuses. Perhaps these offenders were at the start of their offending careers. Maybe the data underestimate offending because the most prolific undetected offenders never tell anyone about their offenses, which is why they haven’t been caught. So, all things considered, it is important to look elsewhere too for clues.
In one such study by Ahlmeyer, McKee and English (2000), convicted sex offenders were questioned while they were hooked up to a polygraph. Inmates reported on average attacking 184 victims and committing 528 offenses. Parolees reported on average 7 victims and 23 offenses. The averages are again skewed, and don’t do a good job of characterising the men. The range in their data is 2,593 victims and 6,094 offences. Because a range is the largest value minus the smallest, this suggests that the most prolific offender in their data was responsible for over 2,500 victims and 6,000 offences.
Another piece of research by Abel and colleagues (1987) interviewed convicted sex offenders and guaranteed them immunity from further prosecutions. It found that repeat offenders admitted committing an average of 7 offenses; but again, an average is misleading here.
As we know, police often only have the victim’s account to work from in investigating the offense. What victims remember, and especially, how victims are interviewed, is crucial in determining the quality of the account.
A high-quality account can lead the police to forensic evidence and bystanders that can corroborate the victim’s version of events, as well as link serial offenses, where one perpetrator is responsible for many rapes.
It’s hard to argue that victims are falsely remembering or falsely reporting rape when people who do not know each other all provide information that points to the same suspect.
It is clear from the analysis above that investigators must begin with the premise that it is likely that the perpetrator is a repeat offender. The police must also maintain a database of crime reports and compare new victim accounts to the records they have on hand to link crimes and catch prolific offenders.
Appendix – More research in case you are interested
As students of criminology would tell us, theory and research on repeat offending have been going on for decades.
The Philadelphia Birth Cohort Study was one of the first to investigate the life course of offending (Wolfgang et al. 1972). It concluded that 18% of offenders are responsible for more than half of all criminal offences. Chronic offenders such as these have been variously referred to as the “miscreant many”, the “power few”, or the “felonious few” (Sherman et al. 2016).
Another famous study found that repeat offenders tend to be more versatile than those who commit one-off offenses (Farrington, 1978). This means that they commit many different types of offenses rather than specialising in just one type of offense. What is more, this research also found that perpetrators of violence commit more criminal acts than non-violent offenders.
More recent research has focused on the notion of “crime harm”, or how severe an offense is. As an example, rape and other crimes against the person are more severe than drug or criminal damage offenses.
This research finds that crime harm is highly concentrated: 80% of crime harm is linked to just 7% of all detected offenders. Further, those with more varied offence histories cause the most harm.
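Concentration figures of this kind ("x% of offenders account for y% of crime") can be computed from per-offender counts or harm scores. The Python sketch below shows one way to do it. The function name and the toy counts are hypothetical and are not data from any study cited here.

```python
def share_of_offenders_for(counts: list[float], target_share: float) -> float:
    """Smallest fraction of offenders, taken from the most prolific downwards,
    whose combined total reaches target_share of the overall total."""
    if not counts or sum(counts) <= 0:
        raise ValueError("counts must contain at least one positive value")
    total = sum(counts)
    running = 0.0
    for rank, value in enumerate(sorted(counts, reverse=True), start=1):
        running += value
        if running >= target_share * total:
            return rank / len(counts)
    return 1.0


# Toy data (hypothetical): offences, or harm scores, for ten offenders.
toy_counts = [1, 1, 1, 1, 1, 1, 2, 2, 3, 30]
print(share_of_offenders_for(toy_counts, 0.5))
print(share_of_offenders_for(toy_counts, 0.8))
```

In the toy data, a single offender out of ten accounts for more than half of the total, and three out of ten account for more than 80%.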