The OKCupid Experiment: When Data Met Desire

How Chris McKinlay used web scraping and got lucky.

Tue, Nov 25th
Tags: ethics, data science, strategy, perspective, system thinking, web scraping, pattern recognition
Created: 2025-12-15 | Updated: 2025-12-15

The Story: How a Mathematician Hacked Love

In 2012, a lonely mathematician named Chris McKinlay sat at his computer and realized that the online dating platform OKCupid was, in effect, a massive dataset waiting to be organized.

Every user profile was a vector of choices: thousands of multiple-choice answers covering beliefs, preferences, and quirks, all freely available behind a few layers of HTML.
So, like any good data scientist facing uncertainty, McKinlay didn’t rely on luck.
He built a web scraper.

The Scraping

Night after night, his program quietly collected answers from thousands of women’s profiles, focusing on those whose photos or interests fit his general type.
Each profile became a row in his growing dataset; each question an attribute; each answer a signal.

By the end, he had a dataset richer than those held by most market research firms.
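
The "one row per profile, one attribute per question" layout described above can be sketched as a simple pivot. This is a minimal illustration, not McKinlay's code: the record format, the profile and question identifiers, and the `build_answer_table` helper are all hypothetical, and the collection step itself is deliberately left out.

```python
from collections import defaultdict

# Hypothetical scraped records: (profile_id, question_id, answer).
# Every identifier and answer here is invented for illustration.
records = [
    ("p1", "q_smoking", "never"),
    ("p1", "q_religion", "agnostic"),
    ("p2", "q_smoking", "sometimes"),
    ("p2", "q_pets", "dogs"),
    ("p3", "q_religion", "agnostic"),
    ("p3", "q_pets", "cats"),
]


def build_answer_table(records, missing=None):
    """Pivot (profile, question, answer) triples into one row per profile."""
    by_profile = defaultdict(dict)
    questions = set()
    for profile_id, question_id, answer in records:
        by_profile[profile_id][question_id] = answer
        questions.add(question_id)

    # A fixed, sorted column order keeps every row aligned to the same questions.
    columns = sorted(questions)
    rows = {
        profile_id: [answers.get(q, missing) for q in columns]
        for profile_id, answers in sorted(by_profile.items())
    }
    return columns, rows


columns, rows = build_answer_table(records)
for profile_id, row in rows.items():
    print(profile_id, dict(zip(columns, row)))
```

Unanswered questions are kept as explicit `None` values rather than dropped, since most users answer only a subset of the questions and any later analysis has to decide how to treat those gaps.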

The Clustering

Then came the analysis.
He used unsupervised learning, clustering the women based on how similarly they answered OKCupid’s questions.

The result? Distinct behavioral “types” of women: clusters defined not by superficial traits, but by underlying values and preferences.
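
Because the answers are categorical rather than numeric, one standard way to cluster them is k-modes, which swaps the means and Euclidean distance of k-means for per-column majority answers and a count of mismatches. The sketch below is a generic, simplified k-modes on invented data. It is an assumption that it resembles what McKinlay ran; the toy answers, the `k_modes` function, and the farthest-point initialization are illustrative choices, not details from the story.

```python
from collections import Counter


def hamming(a, b):
    """Count the positions where two answer vectors disagree."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(members):
    """Most common answer in each column across the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))


def pick_initial_modes(rows, k):
    """Deterministic start: first row, then repeatedly the row farthest from all chosen modes."""
    modes = [rows[0]]
    while len(modes) < k:
        modes.append(max(rows, key=lambda r: min(hamming(r, m) for m in modes)))
    return modes


def k_modes(rows, k, max_iter=20):
    """Cluster categorical rows; returns (labels, modes)."""
    modes = pick_initial_modes(rows, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins the mode it disagrees with least.
        new_labels = [
            min(range(k), key=lambda c: hamming(row, modes[c])) for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering has converged
        labels = new_labels
        # Update step: each mode becomes the per-column majority answer of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                modes[c] = column_modes(members)
    return labels, modes


# Invented toy answers to four questions (not real OKCupid data).
rows = [
    ("yes", "no", "often", "city"),
    ("yes", "no", "often", "rural"),
    ("yes", "no", "rarely", "city"),
    ("no", "yes", "never", "rural"),
    ("no", "yes", "never", "city"),
    ("no", "yes", "rarely", "rural"),
]

labels, modes = k_modes(rows, k=2)
print(labels)
print(modes)
```

Each resulting mode reads as a "typical" answer sheet for its cluster, which is what makes the clusters interpretable as types rather than just groups of row indices.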

He didn’t stop there.

The Optimization

For each cluster he found appealing, McKinlay created a customized profile of himself, carefully optimized to resonate with that type.
Each version of his profile was a targeted experiment, designed to score high on OKCupid’s internal compatibility algorithm, effectively reverse-engineering attraction.

When women from a given cluster browsed the corresponding version of his profile, they saw a man who seemed astonishingly compatible, because, statistically, he was.
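
To see why a profile can be tuned toward a cluster at all, it helps to look at how a compatibility score of this kind responds to its inputs. The sketch below follows the formula OKCupid has described publicly: each person's weighted satisfaction with the other's answers, combined as a geometric mean and reduced by a margin of error that shrinks as more questions are shared. The specific weights, the `1/n` margin, and the profile data structure are assumptions made for illustration and may not match the production system.

```python
import math

# Importance weights assumed from OKCupid's public explanation of its matching.
IMPORTANCE = {"irrelevant": 0, "a little": 1, "somewhat": 10, "very": 50, "mandatory": 250}


def satisfaction(rater, other):
    """Fraction of the rater's weighted points that the other's answers earn."""
    earned = possible = 0
    for question, pref in rater.items():
        if question not in other:
            continue  # only questions both people answered count
        weight = IMPORTANCE[pref["importance"]]
        possible += weight
        if other[question]["answer"] in pref["accepts"]:
            earned += weight
    return earned / possible if possible else 0.0


def match_percent(a, b):
    """Geometric mean of mutual satisfaction, minus an assumed 1/n margin of error."""
    shared = len(a.keys() & b.keys())
    if shared == 0:
        return 0.0
    score = math.sqrt(satisfaction(a, b) * satisfaction(b, a))
    return max(0.0, score - 1 / shared) * 100


# Hypothetical profiles: each question maps to the person's own answer,
# the answers they would accept from a match, and how much they care.
viewer = {
    "q1": {"answer": "yes", "accepts": {"yes", "no"}, "importance": "somewhat"},
    "q2": {"answer": "yes", "accepts": {"yes"}, "importance": "somewhat"},
    "q3": {"answer": "yes", "accepts": {"yes"}, "importance": "very"},
}
profile_base = {
    "q1": {"answer": "yes", "accepts": {"yes"}, "importance": "very"},
    "q2": {"answer": "no", "accepts": {"no", "maybe"}, "importance": "a little"},
}
# Same answers, plus one more question on which the two genuinely agree.
profile_more = dict(profile_base)
profile_more["q3"] = {"answer": "yes", "accepts": {"yes"}, "importance": "very"}

print(round(match_percent(profile_base, viewer), 1))
print(round(match_percent(profile_more, viewer), 1))
```

In this toy case nothing about the first two answers changes, yet the score rises once the profile answers an additional question the viewer cares about. Choosing which questions to answer, and how heavily to weight them, is one lever such a score exposes.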

The Outcome

The experiment worked.
Over several weeks, McKinlay went on dozens of dates, each one a validation of his model’s predictive power.
And then, as poetic justice would have it, one of those data-driven matches became his wife.

It was, in every sense, a love story optimized through algorithms.


What He Actually Did (The Technical Lens)

| Phase | Method | Data Science Concept |
|---|---|---|
| Data collection | Web scraping of public OKCupid profiles | Data acquisition / ETL |
| Feature extraction | Transforming questions and answers into numerical vectors | Feature engineering |
| Pattern discovery | Clustering profiles into behavioral archetypes | Unsupervised learning |
| Profile optimization | Crafting a profile per cluster for maximum compatibility | Personalization / optimization |
| Field testing | Measuring response rate per cluster | A/B testing / reinforcement |

McKinlay didn’t hack a system. He modeled it, and then used that model to influence human behavior in the real world.
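
The field-testing phase, comparing response rates across clusters, amounts to comparing two proportions. A minimal sketch of that comparison is a two-proportion z-test, shown below with invented counts. The story reports no such figures, and the function name and numbers are purely illustrative.

```python
import math


def two_proportion_z_test(replies_a, sent_a, replies_b, sent_b):
    """Two-sided z-test for a difference between two response rates."""
    rate_a = replies_a / sent_a
    rate_b = replies_b / sent_b
    # Pooled rate under the null hypothesis that both clusters respond equally.
    pooled = (replies_a + replies_b) / (sent_a + sent_b)
    # Note: std_err is zero (and the test undefined) if nobody or everybody replied.
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_a - rate_b) / std_err
    # erfc(|z| / sqrt(2)) equals the two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: profile variant A shown to cluster A, variant B to cluster B.
z, p_value = two_proportion_z_test(replies_a=30, sent_a=100, replies_b=15, sent_b=100)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value only says the difference in reply rates is unlikely to be noise. It does not say why one cluster responded more, which is where the feedback-loop caveat in the lessons below applies.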


The Deeper Significance

This story captures the fusion of mathematics and human emotion, where algorithms don’t just predict, they persuade.

  • OKCupid’s compatibility algorithm was built to help users find matches.
  • McKinlay’s meta-algorithm optimized himself to the algorithm.
  • The result was an arms race of inference: human intuition versus machine pattern recognition.

It’s a striking example of how data science can amplify human intent, for better or worse.


The Ethical Undercurrent

  1. Consent: None of the users whose data he scraped agreed to be analyzed in this way.
  2. Manipulation: By tailoring profiles per cluster, he blurred the line between authenticity and strategy.
  3. Reflection: Yet, was he lying, or simply presenting different facets of himself to different audiences?

This moral ambiguity mirrors the modern internet: every algorithmic system invites people to game it, whether for attention, profit, or love.

“Whenever a system quantifies human behavior, someone will optimize to exploit it.”


Lessons for Data Scientists

| Principle | What McKinlay Taught Us |
|---|---|
| Data ≠ Truth | Context defines meaning. A dating answer is not an absolute fact; it is a performative signal. |
| Optimization changes the system | Once you optimize for compatibility, the nature of “compatibility” shifts. |
| Every dataset hides ethics | Data scraped without consent can still yield powerful but morally ambiguous insights. |
| Feedback loops are everywhere | His “model” influenced behavior, which in turn validated his model: a classic case of reinforcement. |

Broader Connections

  • In Machine Learning: This is human-in-the-loop optimization, where feedback reinforces selection.
  • In Marketing: Similar clustering and segmentation models guide personalized advertising.
  • In AI Ethics: It foreshadows today’s debates on algorithmic manipulation and consent.
  • In Psychology: It suggests that identity itself can be parameterized and tuned for resonance.

Reflection Prompts

  • Would you call McKinlay’s method clever experimentation or digital manipulation?
  • Where is the line between self-optimization and deception?
  • How many modern systems (recommendation engines, political campaigns, influencer algorithms) now do the same at scale?
  • If you had McKinlay’s tools, would you have done it?