Aaron Stanton is a book browser – he loves to hang out in libraries and bookstores.
At least since 2003, while still a student at the University of Idaho, he has been dreaming of a way to improve the book-browsing experience with the help of technology.
Over the past eight years, he has organized a “book genome project,” dedicated to “breaking books down into their constituent elements on a large scale,” which in turn led him to form a company, BookLamp, which is debuting today.
The product classifies books by three primary components – Story (which includes the setting), Characters, and Style. It also evaluates the content in books according to several core metrics – Pacing, Density (complexity), Action, Description, and Dialogue.
In all, there are 132 “Story DNA elements” built into the initial product, says the company’s Director of Research, Stanford English professor Dr. Matthew Jockers, as he explains how it works:
“It’s a useful tool for a reader to find books that have stylistic elements you like. The 132 story elements are the ones that, based on testing, people will understand. But we have some 550 elements that we’ve researched, many of which aren’t immediately interpretable to a human being.
“It’s analogous to the human genome,” Jockers continues. “There are genes that we know what they do, such as determining the color of your eyes, but there are others, including recessive genes, that we do not understand what they do. Computers can help us improve our understanding of those other elements.”
Thanks to partnerships with publishers, BookLamp has analyzed some 20,000 books to date, digging into titles such as The Da Vinci Code, in what Jockers calls a “macroanalysis that tries to determine what are the elements that make it successful.
“BookLamp can find a similar book, say on the backlist, that you’ve never heard of, with a similar profile. If you read it and like it just as much as The Da Vinci Code, that raises the question of why that other book wasn’t as successful. Which may indicate issues of how the books were marketed differently and so on.”
In these and other ways, what BookLamp appears to offer is a discovery engine for books on the midlist or backlist, out along the “Long Tail” of content.
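To make the idea of a “similar profile” concrete, here is a minimal sketch of how such a lookup could work. It is not BookLamp's method, which the company does not describe in detail. It assumes each book is reduced to a score between 0 and 1 on the five core metrics named above, and it swaps in plain cosine similarity as the measure of closeness. The titles, scores, and function names are all hypothetical.

```python
from math import sqrt

# The five core metrics named in the article. The 0-to-1 scale is an assumption.
METRICS = ("pacing", "density", "action", "description", "dialogue")


def cosine_similarity(a, b):
    """Return the cosine similarity of two metric profiles (dicts keyed by METRICS)."""
    dot = sum(a[m] * b[m] for m in METRICS)
    norm_a = sqrt(sum(a[m] ** 2 for m in METRICS))
    norm_b = sqrt(sum(b[m] ** 2 for m in METRICS))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(target_title, catalog, top_n=3):
    """Rank every other title in the catalog by similarity to the target."""
    target = catalog[target_title]
    scored = [
        (cosine_similarity(target, profile), title)
        for title, profile in catalog.items()
        if title != target_title
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_n]]


# Hypothetical catalog: invented titles and invented scores, for illustration only.
catalog = {
    "Bestseller": {
        "pacing": 0.90, "density": 0.30, "action": 0.80,
        "description": 0.40, "dialogue": 0.60,
    },
    "Backlist Thriller": {
        "pacing": 0.85, "density": 0.35, "action": 0.75,
        "description": 0.45, "dialogue": 0.55,
    },
    "Quiet Literary Novel": {
        "pacing": 0.20, "density": 0.90, "action": 0.10,
        "description": 0.80, "dialogue": 0.40,
    },
}

recommendations = most_similar("Bestseller", catalog, top_n=1)
```

With these made-up numbers, the fast-paced backlist title ranks ahead of the dense literary one. A real system working with hundreds of elements would presumably need to weight and normalize them, which this sketch does not attempt.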
“Humans are really good at detecting the Characters and Stories they like,” says Stanton. “Computers can measure language makeup and can do so on a scene-by-scene basis.
“For example, the book Jurassic Park, with its 231 individual scenes, starts out with a relative balance in Density (complexity) and Pacing until about a quarter of the way in, when suddenly the Density falls by 25 percent while there is a dramatic upswing in Pacing.”
This is close to the scene where the dinosaurs escape their cages, and the story remains extremely fast-paced until the very end, when these two story elements come back into balance once again.
“From the publisher’s perspective, taking books apart and knowing that it sells with these data scores is invaluable,” says Stanton. “We are seeking big publishers to try our system out. And we are seeking feedback from readers – if publishers find out that people find this valuable, we’ll be able to build a much better version over time.”
With so many variables in the content analysis engine, and with a relatively small library of titles to date, the recommendations based on any one title may be somewhat sparse at first. “We need our audience to be somewhat forgiving,” says Stanton, “but it will get better the more people use it.”
Although the system was built originally for fiction, Stanton says, “it holds true for non-fiction as well.”
Stanton likens BookLamp to a virtual bookshelf. “I don’t think it will be valuable in increasing the speed at which a reader discovers new books. Our goal instead is to offer you a bookshelf where the percentage of books you find there you actually enjoy reading is higher.”
The comparison with Pandora that BookLamp uses to describe itself goes only so far. Pandora uses a panel of about 20 music experts to evaluate which of some 480 musical elements (things like key, rhythm, tempo, instruments used, etc.) are essential to a particular song, and then recommends other songs to you on that basis and other preferences or tastes your choices reveal over time.
At least at launch, BookLamp does not customize its recommendations to individual users, although the team envisions building that capability in the future.
“You will be able to dial the story elements, tweak the settings, so it will be like browsing with a completely different toolset,” explains Stanton. “Sort of like a KAYAK for books.”
“As the corpus grows larger,” adds Jockers, “the system will encompass more themes and the algorithms will get better.”
At present, the distributed BookLamp team of nine (six FTE) is headquartered in Boise and has angel funding and a revenue stream based on deals with publishers. The service is free to users who agree to join the project, which for now resides at BookLamp.org. “We also own BookLamp.com,” says Stanton, “and we’ll move there when we grow up.”