Using Amazon for Research

by F.

Amazon has some really nice features for analyzing books. Yes, they could be better, and in time—given enough user feedback, perhaps—I imagine they will be. But still, at the moment I don’t see any cheaper alternative. For any book that has been “Search Inside!”-ed, you can use three pretty useful tools:

1. Concordance.

What the hell is a concordance? It’s a list of the words in a book, in alphabetical order, often with a number after each word. The number denotes how many times the word shows up in the book. Amazon goes one better and offers a “cloud,” which is a nice visualization tool currently popular with blogs and other places. Here’s an example. Why you should care: Looking at the concordance gives you a good idea of the key concepts in the book: which concepts show up a lot? But be careful: the concordance on Amazon (as is usual, I think) provides words, not phrases. So there could be a key phrase such as “weasel sodomite” that is used throughout the book, but it would not be in the concordance. However, it might be in the…

2. Statistically Improbable Phrases.

Or SIPs for short. SIPs are—wait for it—phrases that are statistically improbable. But relative to what? Well, relative to the set of words in all “Search Inside!”-ed books. Why you should care: This gives you an idea of what is fresh in this book relative to other “Search Inside!”-ed books. But this isn’t the whole story, because you can also see a book’s…

3. Capitalized Phrases.

Or CAPs. CAPs are, wait for it... yeah, you get the idea. Why you should care: Since in English we capitalize proper names and trademarks, “Search Inside!” for instance, this can give you an idea of the “characters” in a book. This would be even more useful for German, it seems to me, since German orthographic convention requires capitalization of nouns, and seeing the main nouns in a book would be nice.
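
If you want to try the capitalized-phrase idea on your own text, here is a minimal Python sketch. It is my guess at the general idea, not Amazon's actual method, which isn't published here. The function name `capitalized_phrases` and the sample text are made up for illustration, and the pattern only catches runs of two or more capitalized words.

```python
import re
from collections import Counter

# Two or more capitalized words in a row, e.g. "Search Inside" or "New York".
# Assumption: a single capitalized word is ignored, since it is usually
# just the start of a sentence.
CAP_PHRASE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")

def capitalized_phrases(text):
    """Count each run of consecutive capitalized words in the text."""
    return Counter(CAP_PHRASE.findall(text))

sample = (
    "Amazon calls it Search Inside. "
    "I used Search Inside on a book about New York."
)
caps = capitalized_phrases(sample)

# List the phrases, most frequent first.
for phrase, count in caps.most_common():
    print(phrase, count)
```

A sentence that happens to open with two capitalized words ("The Amazon...") will show up as a false positive, so treat the output as a rough list of candidates rather than a clean cast of characters.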

These three tools, in conjunction with the usual reading hacks (such as looking at the index for frequently used words, reading the table of contents, looking at the summaries of each chapter, looking at the front and back of each chapter, and reading each topic sentence), provide a nice way to extract what is really important about books in the first place: their conceptual content.
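
For books that haven't been “Search Inside!”-ed, the first two tools can be roughly approximated at home. The Python sketch below is only an illustration under my own assumptions: Amazon's actual scoring for SIPs isn't described here, so I've swapped in a simple ratio of how often a two-word phrase occurs in the book versus in some reference text. The names `concordance` and `improbable_phrases` and the sample strings are all hypothetical.

```python
import re
from collections import Counter

def words(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(text):
    """Return (word, count) pairs in alphabetical order."""
    return sorted(Counter(words(text)).items())

def bigrams(text):
    """Count each pair of adjacent words."""
    w = words(text)
    return Counter(zip(w, w[1:]))

def improbable_phrases(book, corpus, top=3):
    """Rank the book's two-word phrases by how much more often they occur
    in the book than in the reference corpus.

    Assumption: a plain frequency ratio with add-one smoothing stands in
    for whatever statistic Amazon really uses.
    """
    book_counts, corpus_counts = bigrams(book), bigrams(corpus)
    book_total = sum(book_counts.values()) or 1
    corpus_total = sum(corpus_counts.values()) or 1

    def score(phrase):
        book_rate = book_counts[phrase] / book_total
        corpus_rate = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        return book_rate / corpus_rate

    ranked = sorted(book_counts, key=score, reverse=True)
    return [" ".join(phrase) for phrase in ranked[:top]]

book = "the weasel ran and the weasel hid"
corpus = "the dog ran and the cat hid and the dog sat"

print(concordance(book))
print(improbable_phrases(book, corpus, top=1))
```

The concordance counts single words only, which is exactly the limitation mentioned above; the phrase ranking is what picks up a recurring pair of words that the concordance would split apart.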