Everybody Lies

by Seth Stephens-Davidowitz

In his thought-provoking book "Everybody Lies," Seth Stephens-Davidowitz takes readers on a fascinating journey into the world of big data and its profound implications for understanding human behavior. As the title suggests, the central premise of the book is that people often lie or misrepresent themselves in surveys and social situations, but their true thoughts, desires, and behaviors are revealed through their online searches and digital footprints.

Introduction

We live in an age of unprecedented data collection. Every click, search, and online interaction leaves a digital trace, creating vast pools of information about human behavior. Stephens-Davidowitz argues that this "big data" offers a more accurate and unfiltered view of human nature than traditional research methods like surveys or interviews. By analyzing these massive datasets, we can uncover surprising truths about ourselves and society that were previously hidden or misunderstood.

The Power of Big Data

A New Kind of Data Science

The author begins by explaining that data science is more intuitive than many people think. He uses the example of his grandmother, who, at 88 years old, had developed her own "data-driven" theories about relationships based on years of observation. While her intuitions weren't always accurate, this anecdote illustrates how we all naturally seek patterns and make predictions based on the information available to us.

However, Stephens-Davidowitz emphasizes that true data science goes beyond intuition. It requires rigorous analysis of large datasets to confirm or refute our initial hypotheses. This is where big data shines, as it allows us to test our assumptions against a much larger and more diverse pool of information than personal experience alone.

Google: A Window into Human Nature

One of the most powerful sources of big data explored in the book is Google searches. The author argues that Google has revolutionized our ability to understand human behavior because:

  1. It provides a constant stream of new information
  2. People are more likely to be honest in their searches than in surveys or face-to-face interactions
  3. The sheer volume of data allows for detailed analysis of specific subgroups and regions

Stephens-Davidowitz gives several examples of how Google search data has been used to gain insights into various aspects of human behavior, from tracking the spread of influenza to uncovering hidden sexual preferences.

The Four Key Advantages of Big Data

Throughout the book, the author highlights four main reasons why big data is so powerful:

  1. It offers entirely novel information that was previously unavailable
  2. It is more honest, as people reveal more in their online behavior than they admit in surveys
  3. It allows for detailed analysis of small subsets within larger populations
  4. It makes A/B testing easier and more cost-effective

Let's explore each of these advantages in more detail.

Novel Information

Big data provides access to information that was previously difficult or impossible to obtain. For example, before the advent of search engines and social media, researchers had to rely on surveys, interviews, and official reports to gather information about public health, economic trends, or social attitudes. Now, we can analyze real-time data from millions of online interactions to gain insights into these areas.

The author cites the work of Google engineer Jeremy Ginsberg, who demonstrated that flu-related Google searches could be used to track the spread of influenza across geographical areas and over time. This approach provided more timely and granular information than traditional methods of disease surveillance.

Honesty in Data

One of the most compelling arguments in "Everybody Lies" is that big data reveals truths that people are unwilling or unable to admit in surveys or face-to-face interactions. Stephens-Davidowitz explains the concept of "social desirability bias," which leads people to give answers that make them look better or more socially acceptable.

He provides numerous examples of how survey answers diverge from what official records and online behavior show:

  • In a survey of University of Maryland graduates, only 2% admitted to having a GPA lower than 2.5, while official records showed the actual number was 11%.
  • Analysis of pornography searches reveals sexual interests and fantasies that people are unlikely to disclose in surveys or to their partners.
  • Google searches related to racist attitudes are more prevalent in areas where survey data suggests lower levels of racial prejudice.

By examining what people search for when they think no one is watching, we can gain a more accurate picture of human desires, fears, and beliefs.

Zooming In on Small Subsets

The vast amount of data available allows researchers to analyze specific subgroups and geographical areas with a level of detail that was previously impossible. Stephens-Davidowitz highlights the work of Harvard professor Raj Chetty, who used tax records to investigate economic mobility in the United States.

Chetty's research revealed that while the overall chances of a poor American becoming rich were lower than in countries like Denmark and Canada, there were significant variations between different cities and regions within the United States. This level of granularity allows policymakers and researchers to identify areas where the "American Dream" is still alive and to study the factors that contribute to economic mobility.

Easier A/B Testing

The fourth advantage of big data is its ability to facilitate large-scale randomized controlled experiments, also known as A/B tests. These tests are crucial for establishing causal relationships rather than mere correlations.

Stephens-Davidowitz explains how big data makes A/B testing easier and more cost-effective:

  • Online platforms can quickly test different versions of websites, advertisements, or product features with large numbers of users.
  • Results can be analyzed in real-time, allowing for rapid iteration and improvement.
  • The scale of these experiments gives them more statistical power than traditional small-scale studies, so smaller effects can be detected reliably.

The author cites Barack Obama's 2008 presidential campaign as an example of successful A/B testing using big data. The campaign team tested various combinations of pictures and text on their website to determine which layout was most effective at encouraging sign-ups and donations.
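To make the role of scale concrete, here is a minimal sketch of how the result of a two-version test might be checked with a standard two-proportion z-test. The function name and all visitor and sign-up counts are hypothetical illustrations, not figures from the book or from the campaign.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for version A
    conv_b / n_b: conversions and visitors for version B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: an 8% versus 9% sign-up rate at two sample sizes.
large = two_proportion_z_test(800, 10_000, 900, 10_000)
small = two_proportion_z_test(8, 100, 9, 100)
print("10,000 visitors per version:", large)
print("100 visitors per version:", small)
```

With these made-up counts, the same one-point gap clears the usual 5% significance threshold at 10,000 visitors per version but not at 100, which is the sense in which scale makes online experiments more informative.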

Limitations of Big Data

While Stephens-Davidowitz is generally enthusiastic about the potential of big data, he also acknowledges its limitations:

Too Many Variables

When dealing with datasets that have a large number of variables, it can be challenging to extract reliable insights. The author uses the example of behavioral geneticist Robert Plomin's attempt to find a gene linked to intelligence. Initially, Plomin thought he had discovered a correlation between the IGF2R gene and high IQ, but subsequent studies failed to replicate this finding.

This example illustrates the risk of finding spurious correlations in large datasets with many variables. As the number of potential relationships increases, so does the likelihood of identifying patterns that occur by chance rather than reflecting genuine causal connections.
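The effect can be illustrated with a short simulation. The sketch below is not taken from the book: it generates variables of pure random noise, tests each one against an equally random outcome, and counts how many look "significant" at the 5% level. The function names, sample sizes, and the use of the Fisher z approximation for the p-value are all assumptions made for illustration.

```python
import math
import random


def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests of a
    true null hypothesis comes out significant at level alpha."""
    return 1 - (1 - alpha) ** num_tests


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious_hits(num_variables, num_people, alpha=0.05, seed=0):
    """Count noise variables that correlate 'significantly' with a noise outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(num_people)]
    hits = 0
    for _ in range(num_variables):
        variable = [rng.gauss(0, 1) for _ in range(num_people)]
        r = pearson_r(variable, outcome)
        # Fisher z transformation: an approximate two-sided p-value for r.
        z = math.atanh(r) * math.sqrt(num_people - 3)
        p_value = math.erfc(abs(z) / math.sqrt(2))
        if p_value < alpha:
            hits += 1
    return hits


print(chance_of_false_positive(100))        # chance of at least one false hit in 100 tests
print(count_spurious_hits(200, 100))        # "significant" noise variables out of 200
```

Because every variable here is random by construction, any hit is spurious. At a 5% threshold, roughly one test in twenty is expected to pass by chance, so the more variables are tested, the more false leads appear.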

Lack of Qualitative Insights

Big data excels at measuring quantifiable aspects of human behavior, but it often fails to capture the nuances of human experience. The author points out that while companies like Facebook can easily track clicks, likes, and other measurable interactions, these metrics don't necessarily reflect users' emotional experiences or satisfaction with the platform.

To address this limitation, many companies supplement their big data analysis with traditional qualitative research methods, such as surveys, focus groups, and interviews. They also employ psychologists and sociologists to help interpret the data and provide context for the patterns they observe.

Ethical Considerations

As the book progresses, Stephens-Davidowitz delves into the ethical implications of big data, particularly when it comes to government use of this information.

Privacy Concerns

The author raises important questions about the balance between utilizing big data for public good and protecting individual privacy. He discusses the hypothetical scenario of using search data to identify individuals at risk of suicide, highlighting the potential benefits and drawbacks of such an approach.

While it might seem beneficial to alert authorities when someone searches for suicide-related terms, Stephens-Davidowitz argues that this would be impractical and potentially invasive. He notes that there are millions of suicide-related searches each month, but far fewer actual suicides, meaning that most interventions would be unnecessary and could infringe on people's privacy.
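The arithmetic behind this argument can be sketched in a few lines. The summary gives no exact figures, so the monthly counts below are hypothetical placeholders chosen only to show the shape of the problem, and the function name is likewise an illustration.

```python
def share_of_flags_that_are_true_cases(flagged_searches, true_cases):
    """Upper bound on the fraction of flagged searches tied to a real case.

    Assumes, generously, that every true case produced exactly one flagged search.
    """
    if flagged_searches <= 0:
        raise ValueError("flagged_searches must be positive")
    return min(true_cases, flagged_searches) / flagged_searches


# Hypothetical monthly figures, not taken from the book.
monthly_searches = 3_000_000
monthly_cases = 4_000

share = share_of_flags_that_are_true_cases(monthly_searches, monthly_cases)
print(f"{share:.2%} of flagged searches would correspond to a real case")
```

Under these assumed numbers, about 0.13% of flagged searches would correspond to a real case, so well over 99% of individual interventions would target people who were never at risk.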

Appropriate Use of Data

The author suggests that a more ethical and effective approach would be to use big data at a regional or population level rather than targeting individuals. For example, state or local authorities could use aggregated search data to identify areas with high rates of suicide-related searches and implement targeted prevention programs in those regions.
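A region-level analysis of this kind might look like the following sketch. The log format, region names, and keyword list are hypothetical; the point is only that the computation needs a region and a query, never a user identifier.

```python
from collections import Counter


def regional_rates(search_log, topic_terms):
    """Return, per region, the share of searches that mention any topic term.

    search_log: iterable of (region, query) pairs with no user identifiers
    topic_terms: lowercase keywords that mark a search as topic-related
    """
    totals = Counter()
    matches = Counter()
    for region, query in search_log:
        totals[region] += 1
        if any(term in query.lower() for term in topic_terms):
            matches[region] += 1
    return {region: matches[region] / totals[region] for region in totals}


# Toy data for illustration only.
log = [
    ("North", "crisis hotline number"),
    ("North", "weather tomorrow"),
    ("South", "weather tomorrow"),
    ("South", "pasta recipes"),
]
rates = regional_rates(log, topic_terms=("hotline",))
for region, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(region, rate)
```

Because only regional totals leave the function, the output can guide where to place prevention programs without pointing at any individual searcher.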

This approach allows for the benefits of big data analysis while maintaining individual privacy and avoiding the potential for government overreach.

Implications for Society

Throughout "Everybody Lies," Stephens-Davidowitz explores how big data is reshaping our understanding of various aspects of society, including:

Politics and Public Opinion

The author discusses how analysis of search data can reveal political attitudes and biases that people may not express openly. For example, he found that racist search terms were more common in areas that voted against Barack Obama in the 2008 and 2012 elections, suggesting that racial attitudes played a larger role in voting behavior than many people admitted.

Economics and Social Mobility

The book delves into how big data analysis can shed light on economic trends and opportunities. The author's discussion of Raj Chetty's work on economic mobility highlights how data-driven research can inform policy decisions and help identify areas where interventions might be most effective.

Health and Well-being

Stephens-Davidowitz explores how big data can be used to track and predict health trends, from the spread of infectious diseases to mental health issues. He also discusses the potential for using search data to identify early warning signs of various health conditions.

Sexuality and Relationships

The book contains numerous examples of how big data reveals hidden aspects of human sexuality and relationship dynamics. From uncovering unusual sexual interests to challenging assumptions about gender roles, the author demonstrates how online behavior provides a more honest picture of human desires than traditional research methods.

The Future of Big Data

In the final sections of the book, Stephens-Davidowitz speculates on the future implications of big data for society:

Personalization and Prediction

As data collection and analysis techniques become more sophisticated, we may see increasingly personalized products, services, and experiences tailored to individual preferences and behaviors.

Scientific Discovery

Big data has the potential to accelerate scientific research across various fields, from medicine to social sciences, by allowing researchers to identify patterns and test hypotheses more quickly and efficiently.

Ethical Challenges

As big data becomes more integrated into our lives, we will need to grapple with complex ethical questions about privacy, consent, and the appropriate use of personal information.

Education and Skills

The growing importance of big data may lead to changes in education and job markets, with an increased emphasis on data literacy and analytical skills.

Conclusion

"Everybody Lies" presents a compelling case for the transformative power of big data in understanding human behavior and society. Seth Stephens-Davidowitz argues that by analyzing the vast amounts of information generated through our online activities, we can uncover truths about ourselves that were previously hidden or misunderstood.

The book's key messages include:

  1. People often lie or misrepresent themselves in surveys and social situations, but their true thoughts and behaviors are revealed through their online searches and digital footprints.

  2. Big data offers several advantages over traditional research methods, including access to novel information, honesty in data collection, the ability to analyze small subsets of populations, and easier A/B testing.

  3. While big data is a powerful tool, it has limitations, particularly when dealing with datasets containing too many variables or when trying to capture qualitative aspects of human experience.

  4. The use of big data raises important ethical considerations, especially regarding privacy and the appropriate use of personal information by governments and corporations.

  5. Big data analysis has the potential to reshape our understanding of various aspects of society, from politics and economics to health and relationships.

As we move further into the age of big data, Stephens-Davidowitz encourages readers to think critically about the information they encounter and to consider the potential benefits and risks of this new approach to understanding human behavior. He also challenges us to be more honest with ourselves and others about our true thoughts and desires, recognizing that the aggregate data often reveals a more complex and nuanced picture of humanity than we might admit to in public.

Ultimately, "Everybody Lies" serves as both a fascinating exploration of human nature and a call to embrace the potential of big data while remaining mindful of its limitations and ethical implications. As we continue to generate vast amounts of data through our daily interactions with technology, the insights gleaned from this information have the power to transform our understanding of ourselves and the world around us.