"Everybody Lies" by Seth Stephens-Davidowitz
Above: "Everybody Lies - Big Data, New Data and What the Internet Can Tell Us About Who We Really Are." Seth Stephens-Davidowitz - 284 Pages.
I completed reading this book today.
This book was tossed on the table at La Societe Deux Magots, Wasatch Bagel by Kilamanjaro a month or two ago. I'm about the fourth ROMEO of the group to read it. I'll toss it back on the table the next time I'm at an LSDM colloquy.
The author has a Harvard PhD in economics, is a former Google data scientist, and is now a contributing op-ed writer for the New York Times.
Social science as a proper scientific discipline was once suspect. Freud, for example, was nothing but a guy with theories, most of which were unsubstantiated. Now... no longer. With the advent of the internet and eight trillion gigabytes of data, asking the right questions, making the right searches, can make available hard data about ourselves and the world. What for Freud was hypothesis, today can be certitude. Social science, now, is big data science.
The author is a baseball fan. Making the right Google and Facebook inquiries, he was able to determine that the average age at which an American male decides what major league baseball team he will follow for the rest of his life is eight (for females, it's twenty-two).
I thought back to the experience I had in Queens in 1951, when I was seven years old and my Dad took me to a Bayside, Queens sporting goods store to buy me a baseball cap. The store clerk showed me my choices: Dodgers, Giants or Yankees. Going on nothing more than uninformed impulse, I chose a Giants cap. The brash New York store clerk said, "Whaaaa? Giants? You don't want a Giants cap... you want a Yankees cap!" And so, I took his advice... bought the Yankees cap and have been, more or less, a Yankees fan ever since.
My son, Rudy, Jr., and Mwah (sic) would catch six or eight Yankees home games, in the cheap seats, a year while we lived in the New York area.
If you can find out the average age of adopting a team on the internet, what else can you find? Think of the marketing and advertising possibilities alone.
In the book's section on "zooming in," we learn that cheating on taxes correlates with the number of tax professionals living in a given area. We learn that the number of famous people per location, as determined by having a Wikipedia entry, correlates with being raised in a big city or a college town. A Wikipedia search shows that a disproportionate number of high achievers have a foreign-born parent. Some of these conclusions seem intuitive, but with big data searches we no longer need to rely on intuition... we can know for sure.
And then there's A/B testing. Correlation is not causality. But if you want to test for causality, use big data A/B testing. Schools in one state of India were having terrible luck with teacher absenteeism. Teachers were absent forty percent of the time. Would higher pay change the results? A test group of teachers was paid an extra $2.00 a day, while a control group's pay stayed the same. Absenteeism in the test group went down to near zero, and student performance improved significantly. So now, in this example, we know more than that teacher absenteeism correlates with low pay: because only the pay differed between the two groups, we know that raising teacher pay causes attendance to go up and, with it, student performance.
If Google wants to know how to get more people to click on ads on its sites, it may try two shades of blue in ads - one shade for Group A, another for Group B. Google can then compare click rates. Surprisingly, big differences in click response can be engineered by just changing colors or the size and shape of the click boxes.
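For the curious, here is a rough sketch of what "comparing click rates" can look like in practice. It is my own illustration, not anything from the book or from Google: it assumes a standard two-proportion z-test, the function name `two_proportion_z_test` is hypothetical, and the click counts are made up.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (rate_a, rate_b, z, two-sided p-value) for two click rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate, assuming (null hypothesis) both shades perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Made-up numbers for illustration only: 10,000 viewers see each shade of blue.
rate_a, rate_b, z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"Group A: {rate_a:.2%}  Group B: {rate_b:.2%}  z = {z:.2f}  p = {p:.4f}")
```

A small p-value suggests the gap between the two shades is unlikely to be luck alone. Because viewers are assigned to Group A or Group B at random, a real gap can be credited to the color itself.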
Facebook now runs a thousand A/B tests per day, which means that a small number of engineers at Facebook start more randomized, controlled experiments in a given day than the entire pharmaceutical industry starts in a year.
Imagine, thousands of companies and entities A/B testing daily to get you to click more!
With conclusions ranging from strange-but-true to thought-provoking to disturbing, Stephens-Davidowitz explores the power of internet search and its deeper potential: revealing biases within us, information we can use, good or bad, to change our culture, and the questions we're afraid to ask that might be essential to our health, both emotional and physical. To a degree not appreciated by most, all of us are touched by big data every day, and its influence is multiplying. "Everybody Lies" challenges us to think differently about how we see it and the world.
If you liked the Freakonomics books in the 2000s, you'll enjoy this book. It's full of stranger-than-fiction insights, all derived from data mining. It's also scary... seeing up close the tools that big data companies use to manage and manipulate our thinking.
Oh. The title, "Everybody Lies," derives from the fact that what you click on on the internet is not necessarily what you tell people outwardly. That's it: digital truth serum!