The OKCupid Experiment: When Data Met Desire
The Story: How a Mathematician Hacked Love
Chris McKinlay, a lonely mathematician whose story was widely reported in 2014, sat at his computer and realized that the online dating platform OKCupid was nothing more than a massive, unstructured dataset.
Every user profile was a vector of choices: multiple-choice answers encoding beliefs, preferences, and quirks, all available behind a few layers of HTML.
So, like any good data scientist facing uncertainty, McKinlay didn't rely on luck.
He built a web scraper.
The Scraping
Night after night, his program quietly collected answers from thousands of women's profiles, focusing on those whose photos or interests fit his general type.
Each profile became a row in his growing dataset; each question an attribute; each answer a signal.
By the end, he had a dataset richer than those of most market research firms.
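To make the "row, attribute, signal" framing concrete, here is a minimal sketch of how scraped answers could be arranged into a table and then into numerical vectors. The user IDs, question IDs, and answers are invented for illustration, and `build_table` and `one_hot` are hypothetical helper names, not anything McKinlay is known to have written.

```python
# Hypothetical data: user IDs, question IDs, and answers are invented.
profiles = {
    "user_a": {"q1": "yes", "q2": "often"},
    "user_b": {"q1": "no", "q2": "never", "q3": "cats"},
    "user_c": {"q1": "yes", "q3": "dogs"},
}


def build_table(profiles):
    """Return (user_ids, questions, rows); unanswered questions become None."""
    questions = sorted({q for answers in profiles.values() for q in answers})
    user_ids = sorted(profiles)
    rows = [[profiles[u].get(q) for q in questions] for u in user_ids]
    return user_ids, questions, rows


def one_hot(questions, rows):
    """Expand every observed (question, answer) pair into its own 0/1 column."""
    columns = sorted(
        {(q, a) for row in rows for q, a in zip(questions, row) if a is not None}
    )
    vectors = []
    for row in rows:
        answered = set(zip(questions, row))
        vectors.append([1 if col in answered else 0 for col in columns])
    return columns, vectors


user_ids, questions, rows = build_table(profiles)
columns, vectors = one_hot(questions, rows)

for user, vector in zip(user_ids, vectors):
    print(user, vector)
```

One-hot encoding is only one possible choice here. It treats every answer as an unordered category, which suits multiple-choice questions but throws away any ordering among answers such as "never", "sometimes", and "often".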
The Clustering
Then came the analysis.
He used unsupervised learning, clustering the women based on how similarly they answered OKCupid's questions.
The result? Distinct behavioral "types" of women: clusters defined not by superficial traits, but by underlying values and preferences.
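The text above does not name the clustering algorithm. As one plausible illustration, the sketch below uses a k-modes style procedure, a k-means variant for categorical data in which distance is the number of mismatched answers and each cluster centre is the most common answer per question. The data, the choice of `k`, and the naive "first k rows" initialization are all assumptions made for the example.

```python
from collections import Counter


def mismatch(a, b):
    """Count the questions on which two answer lists differ (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, k, iterations=10):
    """Cluster categorical rows; a centre holds the most common answer per question."""
    centres = [list(r) for r in rows[:k]]  # naive deterministic start: first k rows
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: attach each row to its nearest centre.
        labels = [
            min(range(k), key=lambda c: mismatch(row, centres[c])) for row in rows
        ]
        # Update step: recompute each centre as the per-question mode of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centres[c] = [
                    Counter(col).most_common(1)[0][0] for col in zip(*members)
                ]
    return labels, centres


# Hypothetical answers to three questions from six profiles.
rows = [
    ["yes", "often", "cats"],
    ["no", "never", "dogs"],
    ["yes", "often", "dogs"],
    ["no", "never", "cats"],
    ["yes", "sometimes", "cats"],
    ["no", "never", "dogs"],
]

labels, centres = k_modes(rows, k=2)
print(labels)
print(centres)
```

Each centre reads like a composite "type": the set of answers a typical member of that cluster would give. A real analysis would also need to handle unanswered questions and try several values of `k` and several starting points.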
He didn't stop there.
The Optimization
For each cluster he found appealing, McKinlay created a customized profile of himself, carefully optimized to resonate with that type.
Each version of his profile was a targeted experiment, designed to score high on OKCupid's internal compatibility algorithm, effectively reverse-engineering attraction.
When women from a given cluster browsed the corresponding version of his profile, they saw a man who seemed astonishingly compatible, because, statistically, he was.
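The text does not spell out how the compatibility score is computed, so the following is only a simplified sketch of the general idea: each person marks which answers they accept and how much each question matters, each side's satisfaction is the weighted share of acceptable answers, and the match is the geometric mean of the two. The weights, question IDs, and profile data are illustrative assumptions, not OKCupid's actual parameters.

```python
import math

# Assumed importance weights; illustrative values, not taken from the text.
WEIGHTS = {"irrelevant": 0, "a little": 1, "somewhat": 10, "very": 50}


def satisfaction(prefs, other_answers):
    """Weighted share of `prefs` satisfied by `other_answers` on shared questions.

    `prefs` maps question -> (set of acceptable answers, importance label).
    """
    earned = possible = 0
    for question, (acceptable, importance) in prefs.items():
        if question not in other_answers:
            continue  # only questions both people answered count
        weight = WEIGHTS[importance]
        possible += weight
        if other_answers[question] in acceptable:
            earned += weight
    return earned / possible if possible else 0.0


def match_score(a_prefs, a_answers, b_prefs, b_answers):
    """Geometric mean of how well each side satisfies the other."""
    return math.sqrt(
        satisfaction(a_prefs, b_answers) * satisfaction(b_prefs, a_answers)
    )


# Hypothetical cluster member and two candidate versions of one profile.
her_prefs = {"q1": ({"yes"}, "very"), "q2": ({"often", "sometimes"}, "a little")}
her_answers = {"q1": "yes", "q2": "often"}
my_prefs = {"q1": ({"yes"}, "somewhat"), "q2": ({"often"}, "somewhat")}

candidates = {
    "version_1": {"q1": "yes", "q2": "never"},
    "version_2": {"q1": "no", "q2": "often"},
}
scores = {
    name: match_score(my_prefs, answers, her_prefs, her_answers)
    for name, answers in candidates.items()
}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Under these assumptions, `version_1` wins because it satisfies the question she marked as very important, even though it misses the minor one. That asymmetry is what makes per-cluster tuning possible: a profile only needs to line up with the questions a cluster weights most heavily.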
The Outcome
The experiment worked.
Over several weeks, McKinlay went on dozens of dates, each one a field test of his model's predictions.
And then, as poetic justice would have it, one of those data-driven matches became his wife.
It was, in every sense, a love story optimized through algorithms.
What He Actually Did (The Technical Lens)
| Phase | Method | Data Science Concept |
|---|---|---|
| Data collection | Web scraping of public OKCupid profiles | Data acquisition / ETL |
| Feature extraction | Transforming questions and answers into numerical vectors | Feature engineering |
| Pattern discovery | Clustering profiles into behavioral archetypes | Unsupervised learning |
| Profile optimization | Crafting a tailored profile per cluster for maximum compatibility | Personalization / optimization |
| Field testing | Measuring response rate per cluster | A/B testing / reinforcement |
McKinlay didn't hack a system. He modeled it, and then used that model to influence human behavior in the real world.
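The "field testing" row of the table can be illustrated with a basic A/B comparison. The sketch below applies a standard two-proportion z-test to the response rates of two profile versions. The counts are invented for the example, and nothing in the story says McKinlay ran this particular test.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both response rates are equal."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: replies received out of profile views, per version.
z, p_value = two_proportion_z(success_a=30, n_a=200, success_b=12, n_b=200)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the gap in response rates is unlikely to be noise, but it says nothing about why one version performs better, and the normal approximation becomes unreliable when the counts are small.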
The Deeper Significance
This story captures the fusion of mathematics and human emotion, where algorithms don't just predict; they persuade.
- OKCupid's compatibility algorithm was built to help users find matches.
- McKinlay's meta-algorithm optimized himself to the algorithm.
- The result was an arms race of inference: human intuition versus machine pattern recognition.
It's a striking example of how data science can amplify human intent, for better or worse.
The Ethical Undercurrent
- Consent: None of the users whose data he scraped agreed to be analyzed in this way.
- Manipulation: By tailoring profiles per cluster, he blurred the line between authenticity and strategy.
- Reflection: Yet, was he lying, or simply presenting different facets of himself to different audiences?
This moral ambiguity mirrors the modern internet: every algorithmic system invites people to game it, whether for attention, profit, or love.
"Whenever a system quantifies human behavior, someone will optimize to exploit it."
Lessons for Data Scientists
| Principle | What McKinlay Taught Us |
|---|---|
| Data ≠ Truth | Context defines meaning. A dating answer is not an absolute fact; it's a performative signal. |
| Optimization changes the system | Once you optimize for compatibility, the nature of "compatibility" shifts. |
| Every dataset hides ethics | Data scraped without consent can still yield powerful but morally ambiguous insights. |
| Feedback loops are everywhere | His "model" influenced behavior, which in turn validated his model: a classic reinforcing loop. |
Broader Connections
- In Machine Learning: This is human-in-the-loop optimization, where feedback reinforces selection.
- In Marketing: Similar clustering and segmentation models guide personalized advertising.
- In AI Ethics: It foreshadows today's debates on algorithmic manipulation and consent.
- In Psychology: It suggests that identity itself can be parameterized and tuned for resonance.
Reflection Prompts
- Would you call McKinlay's method clever experimentation or digital manipulation?
- Where is the line between self-optimization and deception?
- How many modern systems (recommendation engines, political campaigns, influencer algorithms) now do the same at scale?
- If you had McKinlay's tools, would you have done it?