Can a computer algorithm predict the success of a book?
I know: the thought of some brainless computer ever having the insight or creative ability to influence the writing of a novel seems laughable. But this week, three computer scientists at New York’s Stony Brook University set out to show that technology might in fact help us on our endless road towards literary success.
In their paper, published by the Association for Computational Linguistics, Ashok, Feng and Choi present their findings after running a sophisticated computer algorithm to analyse over 44,000 books.
The 44,000 books were taken from Project Gutenberg and spanned multiple genres, from science fiction to poetry. The algorithm analysed the first 1000 sentences of each book and compared the stylistic elements they contained with how successful the book was. In this case, a book's ‘success’ was measured by the number of downloads it had achieved and the level of critical acclaim it had received.
Ultimately, the scientists found a correlation between stylistic elements used and how successful the book was. Here are the general correlations discovered by the study:
- Successful books used more conjunctions (“and”, “but”) to join sentences, and more prepositions (“above”, “beneath”), than less successful books.
- Successful books had a noticeably higher percentage of nouns and adjectives to describe situations than less successful books, which relied on verbs and adverbs to do the same job.
- Successful books more frequently used advanced verbs (“recognised”, “remembered”) to describe character thought-processes, rather than simple verbs such as “took” and “wanted”.
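For the curious, here is a rough idea of what this kind of measurement might look like in Python. It is only a minimal sketch, not the researchers' actual method: the function names `first_sentences` and `style_profile` and the two tiny word lists are made up for illustration, and a real study would rely on a proper part-of-speech tagger rather than word lookups.

```python
import re

# Hypothetical, deliberately tiny word lists (an assumption for this sketch).
# Words such as "to" and "so" can play other roles, so the counts are crude.
CONJUNCTIONS = {"and", "but", "or", "nor", "yet", "so"}
PREPOSITIONS = {"above", "beneath", "in", "on", "under", "over",
                "with", "from", "to", "at", "by"}


def first_sentences(text, limit=1000):
    """Naively split on ., ! or ? followed by whitespace; keep the first `limit` sentences."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return sentences[:limit]


def style_profile(text, limit=1000):
    """Return the share of words that are conjunctions or prepositions."""
    sample = " ".join(first_sentences(text, limit)).lower()
    words = re.findall(r"[a-z']+", sample)
    total = len(words)
    if total == 0:
        return {"conjunctions": 0.0, "prepositions": 0.0}
    return {
        "conjunctions": sum(w in CONJUNCTIONS for w in words) / total,
        "prepositions": sum(w in PREPOSITIONS for w in words) / total,
    }


if __name__ == "__main__":
    opening = "The cat sat on the mat. It purred and slept."
    print(style_profile(opening))
```

Run on the opening of your own manuscript, a profile like this gives you numbers you could compare between drafts, although nothing here says which values make a book successful.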
Overall, the study found that more successful books share characteristics with journalistic writing. Journalists use nouns, pronouns and prepositions more often, because these parts of speech tend to be more informative. It’s worth noting that many writers were journalists before they became novelists: think of Charles Dickens and Ernest Hemingway.
Could it be that your novel is more likely to be successful if you write like a journalist? That’s certainly what this study implies...
It remains to be seen whether this approach is applicable to ‘real-world’ fiction writing. In another study, however, a computer scientist at Bar-Ilan University in Israel created an algorithm that correctly predicts the gender of an author over 80% of the time. Clearly there are patterns in literature that computers can identify. Perhaps we should give these findings a second thought next time we sit down to edit our manuscripts.
V. Ashok, S. Feng & Y. Choi, 2013, 'Success with Style: Using Writing Style to Predict the Success of Novels', in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1753–1764, accessed 12.01.14.