Human sexuality is a complicated, multi-faceted part of our identities. However, we tend to collapse it along relatively few axes (for example, straight to gay, or vanilla to kinky), obscuring much of the complexity of what people are interested in. As part of my longstanding interest in this topic, I recently came across kinklists on Reddit: structured charts meant to convey which sexual acts or kinks people are interested in, and how strongly. A blank kinklist is shown below (Figure 1.1).
Figure 1.1: A blank kinklist, showing the categories available for rating
In these charts, users code each interest on a 5-point scale from lowest to highest: "No", "Maybe", "Okay", "Like" and "Favorite", with an option to not enter anything. An example is shown below.
Figure 1.2: A completed kinklist for the author.
This data, if aggregated on a large enough scale, might be a good opportunity to explore what people are interested in, and what they are not.
Using the Reddit API, we can collect every kinklist posted to the subreddits where they are most commonly found (dppprofiles, dirtypenpals, exxxchange) and download the associated image files, which are usually hosted on Imgur.
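As a rough illustration of one piece of this step, the sketch below pulls Imgur links out of the text of a post. It is not the script used for this project. The function name `extract_imgur_links` and the URL pattern are assumptions about what Imgur links typically look like, and fetching the posts themselves through the Reddit API is left out.

```python
import re

# Assumed pattern for Imgur links (direct images, albums and galleries).
# Real posts may contain link formats that this does not cover.
IMGUR_LINK = re.compile(
    r"https?://(?:i\.|m\.)?imgur\.com/(?:a/|gallery/)?[A-Za-z0-9]+"
    r"(?:\.(?:png|jpe?g|gif))?"
)


def extract_imgur_links(post_text):
    """Return the unique Imgur links in a post, in order of appearance."""
    matches = IMGUR_LINK.findall(post_text)
    # dict.fromkeys drops duplicates while preserving the original order.
    return list(dict.fromkeys(matches))


if __name__ == "__main__":
    sample = "My list: https://i.imgur.com/abc123.png (also https://i.imgur.com/abc123.png)"
    print(extract_imgur_links(sample))
```

Keeping this as a pure function of the post text means it can be checked on sample strings without touching the network.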
Once we have these images, we can use image processing software (ImageJ) along with some custom scripts to automatically extract the color for each item and map it to a five-point scale, from 1 (no) to 5 (favorite), with omitted values ignored for the purposes of analysis. There are 202 separate categories in the standard kinklist (longer and shorter variations exist, but they were excluded from this analysis), so each user's interests are described by a vector of length 202. From the title of the post which contains the kinklist, we can usually (but not always) find the user's self-reported age and gender (see the note below on data privacy).
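The color-to-score mapping described above can be sketched as a nearest-color lookup. This is a minimal illustration in Python, not the ImageJ workflow itself. The RGB values in `REFERENCE_COLORS` are placeholders that I am assuming for the sake of the example and would need to be measured from a real blank kinklist, and the function names are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Placeholder reference colors (assumed, not measured from the real chart).
# The key is the score; None stands for an item that was not entered.
REFERENCE_COLORS = {
    None: (255, 255, 255),  # not entered
    1: (146, 0, 0),         # No
    2: (219, 108, 0),       # Maybe
    3: (253, 253, 107),     # Okay
    4: (35, 253, 34),       # Like
    5: (109, 181, 254),     # Favorite
}


def classify_color(rgb):
    """Return the score whose reference color is closest to the sampled color."""
    return min(REFERENCE_COLORS, key=lambda score: dist(rgb, REFERENCE_COLORS[score]))


def colors_to_vector(sampled_colors):
    """Map one sampled color per category to a list of scores (None = omitted)."""
    return [classify_color(rgb) for rgb in sampled_colors]


if __name__ == "__main__":
    # Three sampled pixels standing in for the 202 categories of a real chart.
    print(colors_to_vector([(250, 250, 250), (140, 5, 5), (110, 180, 250)]))
```

Picking the nearest reference color rather than requiring an exact match gives some tolerance for JPEG compression and resized images, at the cost of silently misclassifying charts that use a different color scheme.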
As of June 29th, 2019, I have scraped 2464 profiles. What follows, arranged loosely into sections, is my attempt to make sense of this data and to see what (if anything) interesting can be learned from it.
Before we begin, I would like to stipulate three brief disclaimers. First, I am not a statistician, and this is not a rigorous statistical analysis. There will almost certainly be errors and omissions. Second, this sample is likely to be very unrepresentative of the broader population, and so should not be overly generalized. Third, we should remember that these data reflect what people fantasize about, not what they may want in reality. In addition, we cannot have any confidence that the people filling these out are who they say they are, or that they are actually interested in what they say they are interested in. For example, some people identifying as female here may actually be male, and vice versa. People are also likely to misreport their age. These issues will introduce a large amount of bias and uncertainty into the data. The data presented here should be considered "for fun". At the bottom of Part Three is a brief note about how to interpret the graphs contained in this series.
This series is divided into eight parts (so far, with more coming):
Part One: Introduction (this)
Part Two: Population Description and General Trends
Part Three: Physical Attributes and Multi-Partner Sex
Part Four: Oral and Anal Sex
Part Five: Clothing, Toys and Manual Manipulation
Part Six: BDSM
Part Seven: Pain
Part Eight: Miscellaneous Fetishes
Glossary of Terms
A note on data privacy
These data are publicly available, with links scraped from Reddit and the corresponding data downloaded automatically from Imgur. The only identifying information is the Reddit username, which is itself pseudonymous. Despite this, I have not retained any links between the Reddit username and the data. All extracted vectors are assigned a random number in place of the username, with only the gender and the age (if available) being retained. Usernames are kept in a separate list (without any associated kinklists or data) for the sole purpose of ensuring that the kinklist for a single user isn't downloaded multiple times during the scraping process. Kinklist image files are deleted after the values are extracted, and there is no way for me to find an individual user's kink information from the data that I retain. Because turnabout is fair play, I have used my own kinklist above as the example, and so I am the only person whose kinklist data can be matched to an identity in this project.
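For readers curious how this kind of separation might look in practice, here is a minimal sketch, assuming each scraped record is a tuple of username, gender, age and score vector. The record layout and the name `anonymize` are hypothetical and not taken from the scripts used in this project.

```python
import secrets


def anonymize(records):
    """Split records into anonymized rows and a separate list of seen usernames.

    Each input record is assumed to be (username, gender, age, vector).
    """
    seen_usernames = set()
    used_ids = set()
    rows = []
    for username, gender, age, vector in records:
        if username in seen_usernames:
            continue  # only keep one kinklist per user
        seen_usernames.add(username)
        # Draw a random identifier that is unrelated to the username.
        random_id = secrets.randbits(64)
        while random_id in used_ids:
            random_id = secrets.randbits(64)
        used_ids.add(random_id)
        rows.append({"id": random_id, "gender": gender, "age": age, "vector": vector})
    # Sorting means the order of usernames says nothing about the order of rows.
    return rows, sorted(seen_usernames)


if __name__ == "__main__":
    rows, usernames = anonymize([("user_b", "F", 27, [1, 5]), ("user_a", "M", None, [3, 4])])
    print(len(rows), usernames)
```

The usernames are returned sorted rather than in scrape order, since an identically ordered list could otherwise be lined up with the rows to recover the link.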