The Logic of Verbal Communication

by F.

I don’t pretend to have gotten my mind around all the goodies in this paper. But fortunately some of its contents have been predigested by Plus magazine:

Various mathematical techniques have been used to analyse texts in the past, but recently a team of physicists and language researchers, led by Elisha Moses of the Weizmann Institute in Israel and Jean-Pierre Eckmann of the University of Geneva, have refined an existing model in a way that seems to do the trick rather well.

To prove the point they applied their model to twelve of the most acclaimed texts of Western culture, including War and Peace by Tolstoi, Shakespeare’s Hamlet, Don Quixote by Cervantes and Kafka’s The Metamorphosis, as well as the scientific and philosophical works Relativity: The special and the general theory by Einstein, The Critique of Pure Reason by Kant and Plato’s The Republic.

Their technique does seem to be able to spot the ideas that lie at the heart of each book, and can trace the way they interact and develop. It even allows them to identify linguistic structures that seem to be essential in helping our minds and memories reconstruct the author’s multi-dimensional world of ideas from the one-dimensional string of words on the page.

Their model is worth describing not only because it is useful, but also because it is amazingly elegant. It turns each book into a space with many dimensions in which each direction corresponds to a central idea. The narrative meanders through this conceptual forest and the path it traces out serves to understand the structure of the text.

So, what are the takeaways? One might be that it is better to keep paragraph length to around 200 words. The researchers

defined “windows of attention” of around 200 words (about a paragraph) and within these windows, they identified pairs of words that frequently occurred near each other (after eliminating “meaningless” words such as pronouns). From the resulting word lists and the frequencies with which the single words appeared in the text, the scientists’ mathematical analysis was used to construct a sort of network of “concept vectors” – linked words that convey the principal ideas of the text.
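For the curious, here is a rough sketch of just the first step described in that quote: dropping filler words, cutting the text into windows, and counting which word pairs share a window. It is not the researchers' actual method. The stopword list, the function names, the non-overlapping windows, and the choice to filter words before windowing are all my own assumptions, and the step that turns these counts into "concept vectors" is left out entirely.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for the "meaningless" words the researchers removed;
# a real analysis would use a much fuller list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "it",
             "he", "she", "they", "is", "was"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def cooccurrence_counts(words, window=200):
    """Count, for each unordered pair of distinct words, how many windows contain both.

    Windows here are consecutive, non-overlapping chunks of `window` words
    (an assumption made to keep the sketch short).
    """
    pairs = Counter()
    for start in range(0, len(words), window):
        # Use a sorted set so each pair is counted once per window
        # and always appears in the same (alphabetical) order.
        chunk = sorted(set(words[start:start + window]))
        pairs.update(combinations(chunk, 2))
    return pairs


# Tiny demo with an artificially small window.
sample = "The king and the queen. The king saw the ghost."
print(cooccurrence_counts(tokenize(sample), window=3).most_common())
```

On a whole book you would keep the default window of 200 and look at the most frequent pairs. Those are only raw counts, though; the network of linked ideas comes from the mathematical analysis the quote mentions.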

I actually find 200 words pretty long. The first quote in this post is around 230 words, and I split it into 4 bite-sized pieces for ease of reading on a computer screen. Of course, if you bite-size things too much, the text actually gets less clear.

In any event, if you’re not convinced that “simple and direct” is a good way to write, note that if the reader doesn’t get what you’re saying, they think you are the dummy, not them:

In a series of five experiments, [Oppenheimer] found people tended to rate the intelligence of authors who wrote essays in simpler language, using an easy to read font, as higher than those who authored more complex works.

“It’s important to point out that this research is not about problems with using long words, but about using long words needlessly,” said Oppenheimer. “Anything that makes a text hard to read and understand, such as unnecessarily long words or complicated fonts, will lower readers’ evaluations of the text and its author.”

Mmm. Hulk understand writer. Writer smart. Mmm.

More on this research here. And, finally, from the realm of art (in this case film), a quote from Steven Soderbergh in an interview with The Believer that should be taped to the mental refrigerator:

THE BELIEVER: What is the hardest thing about filmmaking?

STEVEN SODERBERGH: I will say, and coming from someone who’s made some of the movies and TV I’ve made, it may seem disingenuous—but the hardest thing in the world is to be good and clear when creating anything. It’s the hardest thing in the world. It’s really easy to be obscure and elliptical and so fucking hard to be good and clear. It breaks people. Because you don’t often get encouragement to do that, to be good and clear.