The Bestseller Code

by Jodie Archer and Matthew L. Jockers

Introduction

Have you ever wondered what makes a book become a bestseller? Is it just luck, or is there a secret formula behind literary success? In "The Bestseller Code," authors Jodie Archer and Matthew L. Jockers set out to answer these questions using an innovative approach: data science and machine learning.

The book explores the patterns and characteristics that define bestselling novels. By running thousands of books through a text-analysis algorithm, the authors report surprising findings about what makes a book resonate with readers and climb to the top of the charts.

Whether you're an aspiring author, a publishing industry professional, or simply a book lover curious about the mechanics of literary success, "The Bestseller Code" offers a fresh perspective on the art and science of creating popular fiction.

The Challenge of Predicting Bestsellers

For as long as bestseller lists have existed, publishers and critics alike have struggled to predict which books will capture the public's imagination. The first bestseller list appeared in 1891 in The Bookman, a London literary magazine. Even then, it was clear that popularity didn't always align with critical acclaim.

This disconnect between commercial success and literary quality persists to this day. Critics often scratch their heads over the runaway success of books like "Fifty Shades of Grey" or "The Da Vinci Code," which they consider poorly written. Yet these books continue to fly off the shelves, defying conventional wisdom about what makes a "good" book.

The sheer volume of books published each year makes predicting bestsellers even more challenging. With around 50,000 fiction titles released annually in the United States alone (not counting e-books), the odds of any single book making it to the New York Times bestseller list are slim. Only about 200 novels achieve this feat each year: roughly 0.4 percent of those titles, or about one in 250.
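The arithmetic behind those odds is easy to check. A quick sketch, using only the two rounded figures quoted above:

```python
# Rounded figures quoted in the summary above.
fiction_titles_per_year = 50_000   # approximate US fiction titles, excluding e-books
nyt_bestsellers_per_year = 200     # approximate novels reaching the NYT list

share = nyt_bestsellers_per_year / fiction_titles_per_year
one_in = fiction_titles_per_year // nyt_bestsellers_per_year

print(f"{share:.1%} of titles, or about 1 in {one_in}")
# prints: 0.4% of titles, or about 1 in 250
```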

Given these daunting statistics, publishers have long relied on gut instinct and past performance to guide their decisions about which books to acquire and promote. But what if there were a more scientific way to identify potential bestsellers?

The Birth of the Bestseller-ometer

Enter the "bestseller-ometer," an algorithm developed by the authors over five years of intensive research. This groundbreaking tool analyzes the components of bestselling novels and identifies patterns that contribute to their success.

The results are impressive: the algorithm can predict with 80-90% accuracy which books will make it to the New York Times bestseller list. It does this without considering an author's name, reputation, or previous sales history – focusing solely on the content of the book itself.

For example, the algorithm gave Dan Brown's "Inferno" a 95.7% chance of becoming a bestseller and Michael Connelly's "The Lincoln Lawyer" a 99.2% chance. Both books indeed reached the number one spot on the bestseller list.

While not perfect (it gave Kathryn Stockett's "The Help" only a 50% chance of success), the algorithm represents a significant leap forward in understanding what makes a book commercially successful.

This technology could be a game-changer for the publishing industry, which has long struggled to identify the next big thing. Consider the case of J.K. Rowling's Harry Potter series: the first book was rejected by 12 different publishers before finding a home. Had those publishers had access to the bestseller-ometer, they might have seen the 95% chance of success it gives the book and avoided missing out on one of the biggest literary phenomena of our time.

The Importance of Topics in Bestsellers

One of the key factors the algorithm considers is a book's topic – not to be confused with its genre. While bookstores organize titles into broad categories like science fiction, mystery, or young adult, it's the specific topics within these genres that often determine a book's success.

Two topics, in particular, stand out as hugely popular across multiple genres: love and crime. The presence of these themes, in varying degrees, is far more important to a book's commercial success than its genre classification.

Take Jodi Picoult's "House Rules," for example. The algorithm breaks down its topics as follows:

  • Kids (23%)
  • Crime (10%)
  • Legal settings (7%)
  • Domestic situations (6%)
  • Close relationships (2%)

While the dominant topic is "kids," the combined presence of crime and relationships significantly contributed to the algorithm's prediction of the book's bestseller status.

The algorithm determines these topics through a process called topic modeling, which examines every word in context. This allows it to distinguish between uses of the same word: for instance, "body" in a sexual context versus a criminal one.
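The idea of reading a word in context can be illustrated with a toy sketch. This is not the authors' model, and it is not real topic modeling, which learns topics statistically from a large corpus. The cue-word lists, the window size, and the `sense_of` helper are all hypothetical, chosen only to show how neighbouring words can separate two uses of "body".

```python
import re

# Hypothetical cue words for two topics; a real topic model learns these from data.
TOPIC_CUES = {
    "crime": {"police", "detective", "murder", "morgue", "victim", "found"},
    "intimacy": {"kiss", "touch", "warm", "skin", "bed", "lips"},
}

def sense_of(word, text, window=5):
    """Guess the topic of each occurrence of `word` from nearby cue words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    guesses = []
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        # Look at up to `window` tokens on either side of the target word.
        nearby = set(tokens[max(0, i - window): i + window + 1])
        scores = {topic: len(nearby & cues) for topic, cues in TOPIC_CUES.items()}
        best = max(scores, key=scores.get)
        guesses.append(best if scores[best] > 0 else "unknown")
    return guesses

print(sense_of("body", "The detective found the body near the river."))
print(sense_of("body", "She felt his warm skin as he pressed his body close."))
```

The first call labels the occurrence as crime and the second as intimacy, because different cue words fall inside the window around "body".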

Interestingly, crime emerges as the most successful topic identified by the algorithm. The more crime-related nouns a book contains, the greater its chances of success.

Contrary to popular belief, sex doesn't sell nearly as well when it comes to novels. Among bestsellers analyzed, the topic of sex appears only 0.0009% of the time.

The Power of Emotion in Storytelling

While readers might not consciously consider the quality of prose when choosing a book, what they do seek is an emotional journey. This explains why books like "Fifty Shades of Grey" can achieve massive success despite poor critical reviews – they deliver the emotional rollercoaster readers crave.

The algorithm takes into account the emotional arcs of a story, recognizing that a reader's emotions often mirror those of the main characters. It can chart these emotional beats on a graph, showing the ups and downs throughout the narrative. Books with more dramatic emotional fluctuations tend to have a higher chance of success.
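A rough way to picture such a chart: split a text into equal slices, score each slice by its emotional vocabulary, and count how often the curve changes direction. The sketch below assumes a tiny hand-made word list and a simple positive-minus-negative score; the summary does not say how the authors actually measure emotion, so the lexicons and both function names are hypothetical.

```python
import re

# Tiny hypothetical sentiment lexicons; real systems use far larger resources.
POSITIVE = {"love", "joy", "smile", "hope", "happy", "laugh"}
NEGATIVE = {"fear", "cry", "hate", "loss", "pain", "angry"}

def emotional_arc(text, n_segments):
    """Score equal-sized slices of the text as positive minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    bounds = [len(words) * k // n_segments for k in range(n_segments + 1)]
    arc = []
    for start, end in zip(bounds, bounds[1:]):
        chunk = words[start:end]
        arc.append(sum(w in POSITIVE for w in chunk) - sum(w in NEGATIVE for w in chunk))
    return arc

def turning_points(arc):
    """Count how often the arc switches between rising and falling."""
    deltas = [b - a for a, b in zip(arc, arc[1:]) if b != a]
    return sum(1 for d1, d2 in zip(deltas, deltas[1:]) if (d1 > 0) != (d2 > 0))

story = ("They laugh with joy. Then fear and pain. "
         "Again hope and love. Finally loss and hate.")
arc = emotional_arc(story, 4)
print(arc, turning_points(arc))  # [2, -2, 2, -2] 2
```

A story with more turning points per slice has the kind of frequent ups and downs described above.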

"Fifty Shades of Grey" is a prime example of this phenomenon. Despite its controversial content and mixed critical reception, the algorithm gave it a 90% chance of becoming a bestseller. This is because the book's main topic isn't sex, but rather an intimate human relationship with little conflict – a theme that resonates strongly with readers.

The emotional chart for "Fifty Shades of Grey" resembles the rhythmic beat of techno music, with frequent ups and downs. Interestingly, the only other novel to achieve this same pattern was Dan Brown's "The Da Vinci Code," another massive bestseller.

The Simplicity of Bestselling Prose

When it comes to writing style, the algorithm reveals a surprising truth: bestselling authors tend to avoid fancy phrases and opt for simple, straightforward prose.

This doesn't mean that bestselling authors lack a distinctive voice. In fact, many have such a strong style that their words act as a linguistic fingerprint, recognizable by computer analysis. This was demonstrated when J.K. Rowling was correctly identified as the author of "The Cuckoo's Calling," published under the pseudonym Robert Galbraith.

However, the style that leads to bestseller status isn't about clever wordplay or innovative phrasing. Instead, it's characterized by common, even boring, sentence structures. The algorithm measures style by examining various factors such as syntax, sentence length, and the use of common words like "a," "the," and "of."

Some interesting trends emerge from this analysis:

  • Bestsellers use the word "do" twice as often as non-bestsellers
  • They use the word "very" half as often
  • Bestselling books tend to have short, clean sentences with fewer adjectives and adverbs
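Measurements like these come down to counting. The following is a minimal sketch, assuming a plain-text manuscript and naive sentence splitting on end punctuation; `style_profile` and its default word list are hypothetical, and the sketch leaves out syntax and adjective or adverb counts, which would need a part-of-speech tagger.

```python
import re

def style_profile(text, tracked=("do", "very", "the", "a", "of")):
    """Return average sentence length and per-1,000-word rates for tracked words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        raise ValueError("text contains no words")
    rates = {w: 1000 * words.count(w) / len(words) for w in tracked}
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "per_1000": rates,
    }

sample = "I do not know. Do you? It was very, very late."
profile = style_profile(sample)
print(round(profile["avg_sentence_length"], 2))
print({w: round(r, 1) for w, r in profile["per_1000"].items()})
```

Comparing the `per_1000` rates of two manuscripts gives the kind of "twice as often" or "half as often" contrast listed above.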

While this approach might not result in the most literary prose, it creates a smooth reading experience that appeals to a broad audience. The key to bestseller success, it seems, is keeping things simple and accessible.

The Gender Factor in Writing Style

An intriguing discovery made by the algorithm is that female authors, or authors the algorithm perceives as female, tend to score higher when it comes to style. This advantage persists even when other factors like plot and theme are isolated.

When looking at debut books, nine out of the ten novels deemed most likely to succeed were written by women. However, it's worth noting that the algorithm's ability to predict an author's gender isn't perfect – it's accurate only 71% of the time.

The misclassifications are revealing:

  • The algorithm was 99% sure that James Patterson's romance novel "Suzanne's Diary for Nicholas" was written by a woman
  • It also thought Patterson's thriller "Four Blind Mice" was the work of a female author
  • Conversely, Toni Morrison's work was mistaken for that of a male writer due to its more sophisticated literary style

Upon closer examination, it became clear that what the algorithm recognizes is a mix of both cultural and gender signals. Male authors tend to use a more sophisticated literary style, while female writers often employ a blunter, simpler style that aligns with bestseller characteristics.

This stylistic advantage for female writers may be partly due to their backgrounds. Many top female authors, like Terry McMillan, have experience in journalism, which teaches writers to use clear, concise language – a style that resonates well with bestseller readers.

In James Patterson's case, his background in advertising likely contributed to his broadly appealing style, much as journalistic experience has helped many women authors.

The Impact of Titles and Character References

The title of a book can play a crucial role in its success, particularly when it references a strongly written character. You might have noticed a trend in recent bestsellers with titles like "Gone Girl," "The Girl on the Train," and "The Girl with the Dragon Tattoo." While the word "girl" seems to be a common factor, it's actually the reference to the main character that makes these titles effective.

About one-fifth of all bestselling titles refer to the book's main character in some way. However, the approach has evolved over time:

  • In the 19th century, it was common to name books directly after characters (e.g., "Madame Bovary," "Oliver Twist")
  • Modern bestsellers tend to describe the character in a few simple words
  • Using "the" instead of "a" in the title creates a stronger impact (e.g., "The Client" sounds more powerful than "A Client")

The algorithm can distinguish between stereotypical and intriguing character descriptions in titles. This is why "The Girl with the Dragon Tattoo" was recognized as a potential bestseller, while the less descriptive "A Girl to Come Home To" didn't fare as well.

Characters are a primary reason why people read books, so it's crucial to make them compelling and strong. The algorithm measures character strength by analyzing the frequency of character-related words and the verbs and pronouns associated with them.

Interestingly, "need" is the most popular verb in bestsellers. When a character needs something, it drives the plot forward and creates engagement. For example, Gillian Flynn's "Gone Girl" contains 163 sentences using the word "need," signaling to the algorithm that the book has a strongly defined character and a propulsive story.
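A count like that is simple to reproduce on any text. The sketch below is an illustration, not the authors' procedure: it assumes sentences end at ".", "!" or "?" followed by whitespace, and it counts inflected forms ("needs", "needed", "needing") as well, which the summary does not specify. The function name is hypothetical.

```python
import re

# Matches "need" and its inflections as whole words, so "needle" is ignored.
NEED = re.compile(r"\bneed(?:s|ed|ing)?\b", re.IGNORECASE)

def count_need_sentences(text):
    """Count sentences that contain 'need' or one of its inflected forms."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return sum(1 for s in sentences if NEED.search(s))

sample = ("I need to find her. She needed time. "
          "We talked for hours. Do you need anything?")
print(count_need_sentences(sample))  # 3
```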

The Potential of the Bestseller-ometer

The bestseller-ometer has potential applications beyond just predicting commercial success. It could be used to recommend books to readers based on objective data rather than subjective opinions.

Imagine trying to convince your book club to read a crime novel you love, but they're skeptical about its literary merit. With the algorithm's analysis, you could present graphs and data comparing the book to thousands of others, providing a more compelling argument for your choice.

For first-time authors, the algorithm could be an invaluable tool. It's often a better predictor of success than critics, as demonstrated by Dave Eggers's "The Circle." Despite mixed critical reception, the novel scored a perfect 100% chance of success according to the algorithm, based on its appealing title, central topics, strong first sentence, and well-developed main character. True to the prediction, it appeared on many bestseller lists in 2013.

The algorithm could also serve as a teaching tool for aspiring writers. Finding one's voice and developing a strong writing style can take years, but the bestseller-ometer could guide new authors in the right direction, helping them through the revision process and increasing their chances of success.

Final Thoughts

"The Bestseller Code" offers a fascinating glimpse into the science behind literary success. By analyzing patterns across thousands of bestselling novels, Archer and Jockers have uncovered valuable insights about what makes a book resonate with readers on a massive scale.

Key takeaways from their research include:

  1. Topics matter more than genres, with crime and love being particularly powerful themes
  2. Emotional rollercoasters keep readers engaged
  3. Simple, straightforward prose is more effective than flowery language
  4. Strong, well-defined characters are crucial
  5. Titles that reference intriguing characters can boost a book's appeal

While the bestseller-ometer isn't a guarantee of success, it provides a data-driven approach to understanding what makes a book commercially viable. This information can be invaluable for authors, publishers, and anyone interested in the mechanics of storytelling.

However, it's important to remember that the algorithm focuses solely on commercial success, not literary merit or artistic value. Many great works of literature might not score highly on the bestseller-ometer, but that doesn't diminish their worth or impact.

Ultimately, "The Bestseller Code" reminds us that while writing is an art, publishing is a business. By bridging the gap between creativity and data analysis, this book offers a unique perspective on the evolving landscape of literature in the digital age.

For readers, the insights provided by Archer and Jockers can enhance our understanding of why certain books capture our imagination and become cultural phenomena. For writers, it offers a roadmap to crafting stories that have the potential to reach a wide audience.

As technology continues to shape the way we create and consume literature, tools like the bestseller-ometer may become increasingly common. While some may worry that this could lead to formulaic writing, it's more likely to serve as a guide rather than a rigid template. After all, the most successful authors are those who can surprise and delight readers while still tapping into the universal elements that make stories resonate.

In the end, "The Bestseller Code" doesn't demystify the magic of great storytelling – it simply helps us understand the science behind it. Whether you're a casual reader, an aspiring author, or a publishing industry professional, this book offers valuable insights into the complex alchemy of creating a bestseller.
