.com Review
-----------
Q&A with Christian Rudder, cofounder of OkCupid and author of
Dataclysm
-----------------------------------------------------------------
Christian Rudder ( http://www..com/Christian-Rudder/e/B00LNLNS68 )
As more of our social interaction happens on social media, how
much can researchers learn about us from our online interactions?
Well, they can only learn what we tell them, but in the age of
Facebook and Google, that’s become pretty much everything. To the
extent that friendship, anger, sex, love, and whatever else
happen online, we can investigate them.
Your search history tells us what kind of jokes you like. Your
Facebook network reveals not just your friendships, but in some
cases the state of your marriage. Your preferences on OkCupid
tell us what you find sexy, and your reaction to the strangers
the site offers up tells us how you judge people. The articles
you “like” tell us not just about your politics, but even predict
your intelligence.
You fold in data points like these for millions and millions of
people, and you start to get a whole new picture of humankind.
In Dataclysm you’re taking this flood of information and putting
it to an entirely new use: understanding human nature. So what
have you found?
I tried really hard to avoid the numerical dog and pony show.
There are of course lots of interesting one-off factoids, but I
mostly found what I (and probably you) have always known: that
people are gentle, mean, stupid, lusty, lonely, kind, foolish,
shrewd, shallow, and endlessly complex. Dataclysm’s central idea
isn’t necessarily what we can see using big data; it’s the fact
of the vision itself. That we can get real data on even the most
private moments in people’s lives is an astounding thing. It’s
like the second advent of reality television, but this time
without the television part. Just the reality.
Are you worried about any of this?
I have mixed feelings about the implications. I myself almost
never tweet, post, or share anything about my personal life. At
the same time, I’ve just spent three years writing about how
interesting all this data is, and I cofounded OkCupid. My hope is
that this ambivalence makes me a trustworthy guide through the
thicket of technology and data. I admire the knowledge that
social data can bring us; I also fear the consequences.
You have a lot to say about race in the book, and you use data
to shed light on the many ways it affects the way we interact
with one another. What surprised you about your research in this
area? Did you find anything unsurprising?
The data on race was surprising only in its stubborn
predictability—for all the glitzy technology, the results
could’ve been from the 1950s. I grew up in Little Rock and
graduated from Central High, the first school in the South to be
integrated: Eisenhower, the National Guard, mobs of white people
screaming at nine black children, that’s Central. The school
embraces its history and is now over half black. I’m no brave
crusader, but race (and racism) were part of my education. So
when, in researching the book, I unpacked three separate
databases and found that in every one white people gave black
people short-shrift, I wasn’t shocked, you know? Asians and
Latinos apply the same penalty to African Americans that white
folks do, which says something about how even (relatively) recent
additions to the “American experience” have acquired its biases.
What makes this moment in time—and this set of data—different
from the massive data surveys of the past, such as Pew, Gallup,
or the Kinsey Institute?
The data in my book is almost all passively observed—there’s no
questionnaire, no contrived experiment to simulate “real life.”
This data is real life. Online you have friends, lovers, enemies,
and intense moments of truth without a thought for who’s
watching, because ostensibly no one is—except of course the
computers recording it all. This is how digital data circumvents
that old research obstacle: people’s inability to be honest when
the truth makes them look bad. Digital data’s ability to get at
the private mind like this is unprecedented and very powerful.
Review
------
An NPR Best Book of 2014
A Globe & Mail Best Book of 2014
A Brain Pickings Best Science Book of 2014
A Bloomberg Best Book of 2014
One of Hudson Booksellers' 5 Best Business Books of 2014
Goodreads Semifinalist for Best Nonfiction Book of the Year
Finalist for the Los Angeles Times Book Prize
"Most data-hyping books are vapor and slogans. This one has the
real stuff: actual data and actual analysis taking place on the
page. That’s something to be praised, loudly and at length.
Praiseworthy, too, is Rudder’s writing, which is consistently
zingy and mercifully free of Silicon Valley business gabble."
—Jordan Ellenberg, Washington Post
"As a researcher, Mr. Rudder clearly possesses the statistical
acumen to answer the questions he has posed so well. As a writer,
he keeps the book moving while fully exploring each topic,
revealing his graphs and charts with both explanatory and
narrative skill. Though he forgoes statistical particulars like
p-values and confidence intervals, he gives an approachable,
persuasive account of his data sources and results. He offers
explanations of what the data can and cannot tell us, why it is
sufficient or insufficient to answer some question we may have
and, if the latter is the case, what sufficient data would look
like. He shows you, in short, how to think about data."
—Wall Street Journal
"Rudder is the co-founder of the dating site OKCupid and the data
scientist behind its now-legendary trend analyses, but he is also
— as it becomes immediately clear from his elegant writing and
wildly cross-disciplinary references — a lover of literature,
philosophy, anthropology, and all the other humanities that make
us human and that, importantly in this case, enhance and ennoble
the hard data with dimensional insight into the richness of the
human experience...an extraordinarily unusual and dimensional
lens on what Carl Sagan memorably called ‘the aggregate of our
joy and suffering.’"
—Maria Popova, Brain Pickings
"Fascinating, funny, and occasionally howl-inducing...[Rudder] is
a quant with soul, and we’re lucky to have him."
—Elle
"There's another side of Big Data you haven't seen—not the one
that promised to use our digital world to our advantage to
optimize, monetize, or systematize every last part of our lives.
It's the big data that rears its ugly head and tells us what
we don't want to know. And that, as Christian Rudder demonstrates
in his new book, Dataclysm, is perhaps an equally worthwhile
pursuit. Before we heighten the human experience, we should
understand it first."
—TIME
"At a time when consumers are increasingly wary of online
tracking, Rudder makes a powerful argument in Dataclysm that the
ability to tell so much about us from the trails we leave is as
potentially useful as it is pernicious, and as educational as it
may be unsettling. By explaining some of the insights he has
gleaned from OkCupid and other social networks, he demystifies
data-mining and sheds light on what, for better or for worse, it
is now capable of."
—Financial Times
"Dataclysm is a well-written and funny look at what the numbers
reveal about human behavior in the age of social media. It’s both
profound and a bit disturbing, because, sad to say, we’re
generally not the kind of people we like to think — or say — we
are."
—Salon
"For all its data and its seemingly dating-specific
focus, Dataclysm tells the story set forth by the book's
subtitle, in an entertaining and accessible way. Informative,
eye-opening, and (gasp) fun to read. Even if you’re not a
giant stat head."
—Grantland
"[Rudder] doesn’t wring or clap his hands over the big-data
phenomenon (see N.S.A., Google ads, that sneaky Fitbit) so much
as plunge them into big data and attempt to pull strange
creatures from the murky depths."
—The New Yorker
"A hopeful and exciting journey into the heart of data
collection...[Rudder's] book delivers both insider access and a
savvy critique of the very machinery he is employed by. Since
he's been in the data mines and has risen above them, Rudder
becomes a singular and trustworthy guide."
—The Globe and Mail
"Compulsively readable — including for those with no particular
affinity for numbers in and of themselves — and surprisingly
personal. Starting with aggregates, Rudder posits, we can zoom in
on the details of how we live, love, fight, work, play, and age;
from numbers, we can derive narrative. There are few characters
in the book, and few anecdotes — but the human story resounds
throughout."
—Refinery29
"Rudder’s lively, clear prose…makes heady concepts understandable
and transforms the book’s many charts into revealing
truths…Rudder teaches us a bit about how wonderfully peculiar
humans are, and how we go about hiding it."
—Flavorwire
"Dataclysm is all about what we can learn about human minds and
hearts by analyzing the massive ongoing experiment that is the
internet."
—Forbes
"The book reads as if it's written (well) by a curious child
whose parents beg him or her to stop asking "what-if" questions.
Rudder examines the data of the website he helped create with
unwavering curiosity. Every turn presents new questions to be
answered, and he happily heads down the rabbit hole to resolve
them."
—U.S. News
"A wonderful march through infographics created using data
derived from the web…a fun, visual book—and a necessary one at
that."
—The Independent (UK), 2014's Best Books on the Internet and
Technology
"This is the best book that I've read on data in years, perhaps
ever. If you want to understand how data is affecting the present
and what it portends for the future, buy it now."
—Huffington Post
"Rudder draws from big data sets – Google searches, Twitter
updates, illicitly obtained Facebook data passed shiftily between
researchers like bags of weed – to draw out subtle patterns in
politics, sexuality, identity and behaviour that are only
revealed with distance and aggregation…Dataclysm will entertain
those who want to know how machines see us. It also serves as a
call to action, showing us how server farms running everything
from home shopping to homeland security turn us into easily
digested data products. Rudder's message is clear: in this
particular sausage factory, we are the pigs."
—New Scientist
"Dataclysm offers both the satisfaction of confirming stereotypes
and the fun of defying them…Such candor is disarming, as is Mr.
Rudder’s puckish sense of humor."
–Pittsburgh Post-Gazette
"Studying human behavior is a little like exploring a jungle:
it's messy, hard, and easy to lose your way. But Christian Rudder
is a consummate guide, revealing essential truths about who we
are. Big Data has never been so fun."
—Dan Ariely, author of Predictably Irrational
"Dataclysm is a book full of juicy secrets—secrets about who we
love, what we crave, why we like, and how we change each other’s
minds and lives, often without even knowing it. Christian Rudder
makes this mathematical narrative of our culture fun to read and
even more fun to discuss: You will find yourself sharing these
intriguing data-driven revelations with everyone you know."
—Jane McGonigal, author of Reality Is Broken
"In the first few pages of Dataclysm, Christian Rudder uses
massive amounts of actual behavioral data to prove what I always
believed in my heart: Belle and Sebastian is the whitest band
ever. It only gets better from there."
—Aziz Ansari
"It’s unheard of for a book about Big Data to read like a guilty
pleasure, but Dataclysm does. It’s a fascinating, almost
voyeuristic look at who we really are and what we really want."
—Steven Strogatz, Schurman Professor of Applied Mathematics,
Cornell University, author of The Joy of x
"Smart, revealing, and sometimes sobering, Dataclysm affirms what
we probably suspected in our darker moments: When it comes to
romance, what we say we want isn't what will actually make us
happy. Christian Rudder has tapped the tremendous wealth of data
that the Internet offers to tease out thoughts on topics like
beauty and race that most of us wouldn’t cop to publicly. It's a
riveting read, and Rudder is an affable and humane guide."
—Adelle Waldman, author of The Love Affairs of Nathaniel P.
"Christian Rudder has written a funny and profound book about
important issues. Race, love, sex—you name it. Are we the sum of
the data we produce? Read this book immediately and see if you
can answer the question."
—Errol Morris
"Big Data can be like a 3D movie without 3D glasses—you know
there's a lot going on but you're mainly just disoriented. We
should feel fortunate to have an interpreter as skilled (and
funny) as Christian Rudder. Dataclysm is filled with insights
that boil down Big Data into byte-sized revelations."
—Michael Norton, Harvard Business School, coauthor of Happy Money
"With a zest for both the profound and the wacky, Rudder
demonstrates how the information we provide individually tells a
vast deal about who we are collectively. A visually engaging read
and a fascinating topic make this a great choice not just for
followers of Nate Silver and fans of infographics, but for just
about anyone who, by participating in online activity, has
contributed to the data set."
—Library Journal
"Demographers, entrepreneurs, students of history and sociology,
and ordinary citizens alike will find plenty of provocations and,
yes, much data in Rudder's well-argued, revealing pages."
—Kirkus Reviews
About the Author
----------------
Christian Rudder is a co-founder and former
president of the dating site OkCupid, where he authored the
popular OkTrends blog. He graduated from Harvard in 1998 with a
degree in math and later served as creative director for
SparkNotes. He has appeared on Dateline NBC and NPR's "All Things
Considered" and his work has been written about in the New York
Times and the New Yorker, among other places. He lives in
Brooklyn with his wife and daughter.
Excerpt. © Reprinted by permission. All rights reserved.
--------------------------------------------------------
1.
Wooderson’s Law
Up where the world is steep, like in the Andes, people use
funicular railroads to get where they need to go—a pair of cable
cars connected by a pulley far up the hill. The weight of the one
car going down pulls the other up; the two vessels travel in
counterbalance. I’ve learned that that’s what being a parent is
like. If the years bring me low, they raise my daughter, and,
please, so be it. I surrender gladly to the passage, of course,
especially as each new moment gone by is another I’ve lived with
her, but that doesn’t mean I don’t miss the days when my hair was
actually all brown and my skin free of weird spots. My girl is
two and I can tell you that nothing makes the arc of time more
clear than the creases in the back of your hand as it teaches
plump little fingers to count: one, two, tee.
But some guy having a baby and getting wrinkles is not news. You
can start with whatever the Oil of Olay marketing department is
running up the pole this week—as I’m writing it’s the idea of
“color correcting” your face with a creamy beige paste that is
either mud from the foothills of Alsace or the very essence of
bullshit—and work your way back to myths of Hera’s jealous rage.
People have been obsessed with getting older, and with getting
uglier because of it, for as long as there’ve been people and
obsession and ugliness. “Death and taxes” are our two eternals,
right? And depending on the next government shutdown, the latter
is looking less and less reliable. So there you go.
When I was a teenager—and it shocks me to realize I was closer
then to my daughter’s age than to my current thirty-eight—I was
really into punk rock, especially pop-punk. The bands were
basically snottier and less proficient versions of Green Day.
When I go back and listen to them now, the whole phenomenon seems
supernatural to me: grown men brought together in trios and
quartets by some unseen force to whine about girlfriends and what
other people are eating. But at the time I thought these bands
were the shit. And because they were too cool to have posters, I
had to settle for arranging their album covers and flyers on my
bedroom wall. My parents have long since moved—twice, in fact.
I’m pretty sure my old bedroom is now someone else’s attic, and I
have no idea where any of the paraphernalia I collected is. Or
really what most of it even looked like. I can just remember it
and smile, and wince.
Today an eighteen-year-old tacks a picture on his wall, and
that wall will never come down. Not only will his
thirty-eight-year-old self be able to go back, pick through
the detritus, and ask, “What was I thinking?,” so can the rest of
us, and so can researchers. Moreover, they can do it for all
people, not just one guy. And, more still, they can connect that
eighteenth year to what came before and what’s still to come,
because the wall, covered in totems, follows him from that
bedroom in his parents’ house to his dorm room to his first
apartment to his girlfriend’s place to his honeymoon, and, yes,
to his daughter’s nursery. Where he will proceed to paper it over
in a billion updates of her eating mush.
A new parent is perhaps most sensitive to the milestones of
getting older. It’s almost all you talk about with other people,
and you get actual metrics at the doctor’s every few months. But
the milestones keep coming long after babycenter.com and the
pediatrician quit with the reminders. It’s just that we stop
keeping track. Computers, however, have nothing better to do;
keeping track is their only job. They don’t lose the scrapbook,
or travel, or get drunk, or grow senile, or even blink. They just
sit there and remember. The myriad phases of our lives, once gone
but to memory and the occasional shoebox, are becoming permanent,
and as daunting as that may be to everyone with a drunk selfie on
Instagram, the opportunity for understanding, if handled
carefully, is self-evident.
What I’ve just described, the wall and the long accumulation of a
life, is what sociologists call longitudinal data—data from
following the same people, over time—and I was speculating about
the research of the future. We don’t have these capabilities
quite yet because the Internet, as a pervasive human record, is
still too young. As hard as it is to believe, even Facebook,
touchstone and warhorse that it is, has only been big for about
six years. It’s not even in middle school! Information this deep
is still something we’re building toward, literally, one day at a
time. In ten or twenty years, we’ll be able to answer questions
like . . . well, for one, how much does it mess up a person to
have every moment of her life, since infancy, posted for everyone
else to see? But we’ll also know so much more about how friends
grow apart or how new ideas percolate through the mainstream. I
can see the long-term potential in the rows and columns of my
databases, and we can all see it in, for example, the promise of
Facebook’s Timeline: for the passage of time, data creates a new
kind of fullness, if not exactly a new science.
Even now, in certain situations, we can find an excellent proxy,
a sort of flash-forward to the possibilities. We can take groups
of people at different points in their lives, compare them, and
get a rough draft of life’s arc. This approach won’t work with
music tastes, for example, because music itself also evolves
through time, so the analysis has no control. But there are fixed
universals that can support it, and, in the data I have, the
nexus of beauty, sex, and age is one of them. Here the
possibility already exists to mark milestones, as well as lay
bare vanities and vulnerabilities that were perhaps till now just
shades of truth. So doing, we will approach a topic that has
consumed authors, painters, philosophers, and poets since those
vocations existed, perhaps with less art (though there is an art
to it), but with a new and glinting precision. As usual, the good
stuff lies in the distance between thought and action, and I’ll
show you how we find it.
I’ll start with the opinions of women—all the trends below are
true across my sexual data sets, but for specificity’s sake, I’ll
use numbers from OkCupid. This table lists, for a woman, the age
of men she finds most attractive. If I’ve arranged it unusually,
you’ll see in a second why.
Reading from the top, we see that twenty- and
twenty-one-year-old women prefer twenty-three-year-old
guys; twenty-two-year-old women like men who are twenty-four,
and so on down through the years to women at fifty, who we see
rate forty-six-year-olds the highest. This isn’t survey data,
this is data built from tens of millions of preferences expressed
in the act of finding a date, and even from just following along
the first few entries, the gist of the table is clear: a woman
wants a guy to be roughly as old as she is. Pick an age in black
under forty, and the number in red is always very close. The
broad trend comes through better when I let lateral space reflect
the progression of the values in red:
That dotted diagonal is the “age parity” line, where the male and
female years would be equal. It’s not a canonical math thing,
just something I overlaid as a guide for your eye. Often there is
an intrinsic geometry to a situation—it was the first science
for a reason—and we’ll take advantage wherever possible. This
particular line brings out two transitions, which coincide with
big birthdays. The first pivot point is at thirty, where the
trend of the red numbers—the ages of the men—crosses below the
line, never to cross back. That’s the data’s way of saying that
until thirty, a woman prefers slightly older guys; afterward, she
likes them slightly younger. Then at forty, the progression
breaks free of the diagonal, going practically straight down for
nine years. That is to say, a woman’s tastes appear to hit a
wall. Or a man’s looks fall off a cliff, however you want to
think about it. If we want to pick the point where a man’s sexual
appeal has reached its limit, it’s there: forty.
The two perspectives (of the woman doing the rating and of the
man being rated) are two halves of a whole. As a woman gets
older, her standards evolve, and from the man’s side, the rough
1:1 movement of the red numbers versus the black implies that as
he matures, the expectations of his female peers mature as
well—practically year-for-year. He gets older, and their
viewpoint accommodates him. The wrinkles, the nose hair, the
renewed commitment to cargo shorts—these are all somehow
satisfactory, or at least offset by other virtues. Compare this
to the free fall of scores going the other way, from men to
women.
This graph—and it’s practically not even a graph, just a table
with a couple columns—makes a statement as stark as its own
negative space. A woman’s at her best when she’s in her very
early twenties. Period. And really my plot doesn’t show that
strongly enough. The four highest-rated female ages are twenty,
twenty-one, twenty-two, and twenty-three for every group of
guys but one. You can see the general pattern below, where I’ve
overlaid shading for the top two quartiles (that is, top half) of
ratings. I’ve also added some female ages as numbers in black on
the bottom horizontal to help you navigate:
Again, the geometry speaks: the male pattern runs much deeper
than just a preference for twenty-year-olds. And after he hits
thirty, the latter half of our age range (that is, women over
thirty-five) might as well not exist. Younger is better, and
youngest is best of all, and if “over the hill” means the
beginning of a person’s decline, a straight woman is over the
hill as soon as she’s old enough to drink.
Of course, another way to put this focus on youth is that males’
expectations never grow up. A fifty-year-old man’s idea of
what’s hot is roughly the same as a college kid’s, at least with
age as the variable under consideration—if anything, men in
their twenties are more willing to date older women. That pocket
of middling ratings in the upper right of the plot, that’s your
“cougar” bait, basically. Hikers just out enjoying a nice day,
then bam.
In a mathematical sense, a man’s age and his sexual aims are
independent variables: the former changes while the latter never
does. I call this Wooderson’s law, in honor of its most famous
proponent, Matthew McConaughey’s character from Dazed and
Confused.
Unlike Wooderson himself, what men claim they want is quite
different from the private voting data we’ve just seen. The
ratings above were submitted without any specific prompt beyond
“Judge this person.” But when you ask men outright to select the
ages of women they’re looking for, you get much different
results. The gray space below is what men tell us they want when
asked:
Since I don’t think that anyone is intentionally misleading us
when they give OkCupid their preferences—there’s little
incentive to do that, since all you get then is a site that gives
you what you know you don’t want—I see this as a statement of
what men imagine they’re supposed to desire, versus what they
actually do. The gap between the two ideas just grows over the
years, although the tension seems to resolve in a kind of
pathetic compromise when it’s time to stop voting and act, as
you’ll see.
The next plot (the final one of this type we’ll look at)
identifies the age with the greatest density of contact attempts.
These most-messaged ages are described by the darkest gray
squares drifting along the left-hand edge of the larger swath.
Those three dark verticals in the graph’s lower half show the
jumps in a man’s self-concept as he approaches middle age. You
can almost see the gears turning. At forty-four, he’s
comfortable approaching a woman as young as thirty-five. Then,
one year later . . . he thinks better of it. While a nine-year
age difference is fine, ten years is apparently too much.
It’s this kind of calculated no-man’s-land—the balance between
what you want, what you say, and what you do—that real romance
has to occupy: no matter how people might vote in private or what
they prefer in the abstract, there aren’t many fifty-year-old
men successfully pursuing twenty-year-old women. For one thing,
social conventions work against it. For another, dating requires
reciprocity. What one person wants is only half of the equation.