Do Readability Scores work?

Some seventy years after its invention, the Flesch readability test is used by marketers, policy writers and bloggers. But do Readability Scores work?

Origins of Readability Tests

In the 1940s, Rudolf Flesch, a consultant with the Associated Press, devised a method for improving the readability of newspapers. The resulting Flesch Reading Ease (1948) is a readability test to determine the level of education needed to read a text easily. It is still in use today, and most of the Search Engine Optimisation industry, blogging platforms and some low-end publishers appear completely enslaved by it.

By analysing a text and assigning a score from 0 to 100, the Flesch Reading Ease is supposed to indicate how easily the average adult will understand it. According to the formula, the ‘Goldilocks Zone’ of readability is a score between 70 and 80, roughly equivalent to the US seventh or eighth grade in school.

Unfortunately, Flesch Reading Ease results are not immediately meaningful. In the 1970s the US Navy revised the test to make its technical manuals more readable(!), producing the Flesch-Kincaid Grade Level formula. This proposes an ‘ideal’ readability score of 8, that is, an eighth grade (again) level of education, for the average sailor or member of the public.

Fundamentals of Readability Tests

Both formulae are based on two fundamentals:

  • Sentence length, which is the average number of words in a sentence in a text
  • Word length, which is the average number of syllables in a word in a text

Only the weightings differ between the two formulae; neither has any rules for tense or sentence construction. The simplistic logic is that short sentences containing short words are easier to read and understand. This is not, in itself, wrong.
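To see how little is going on under the bonnet, here is a rough sketch of both formulae in Python. The constants are the commonly published coefficients for Flesch Reading Ease and Flesch-Kincaid Grade Level. Everything else is an assumption made for illustration: the function names are hypothetical, and the syllable counter is a crude vowel-group heuristic rather than the dictionary-based counting a real tool would use.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption, not part of either formula):
    count runs of vowels and discount a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[float, float]:
    """Return (average words per sentence, average syllables per word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(words) / len(sentences), syllables / len(words)


def flesch_reading_ease(text: str) -> float:
    """Higher score means easier text."""
    words_per_sentence, syllables_per_word = text_stats(text)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def flesch_kincaid_grade(text: str) -> float:
    """Lower score means easier text (approximate US school grade)."""
    words_per_sentence, syllables_per_word = text_stats(text)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 1))
    print(round(flesch_kincaid_grade(sample), 1))
```

With this heuristic, "The cat sat on the mat." has six words, six syllables and one sentence, so it scores about 116 for Reading Ease and about -1.5 for grade level. Neither scale is really capped, and the same two averages drive both numbers in opposite directions.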

In the Flesch Reading Ease, the higher the score, the easier a text is to read. For example, a score of 60 to 70 puts it around eighth and ninth grades, understandable by 13- to 15-year-olds. Unfortunately, this still requires a conversion table of sorts to give the scores some context.
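That conversion table is easy enough to sketch as a lookup. The bands below follow the table usually published alongside the Flesch Reading Ease; individual tools may draw the lines or word the labels slightly differently, so treat the exact cut-offs as an assumption. The names FLESCH_BANDS and describe_reading_ease are made up for this example.

```python
# (lower bound, description, approximate US school level), highest band first.
# Cut-offs follow the commonly published Flesch table; tools may vary.
FLESCH_BANDS = [
    (90.0, "very easy", "5th grade"),
    (80.0, "easy", "6th grade"),
    (70.0, "fairly easy", "7th grade"),
    (60.0, "standard", "8th to 9th grade"),
    (50.0, "fairly difficult", "10th to 12th grade"),
    (30.0, "difficult", "college"),
]


def describe_reading_ease(score: float) -> tuple[str, str]:
    """Map a Flesch Reading Ease score to (description, school level)."""
    for lower_bound, description, level in FLESCH_BANDS:
        if score >= lower_bound:
            return description, level
    # Anything below 30 falls into the bottom band.
    return "very difficult", "college graduate"


if __name__ == "__main__":
    for score in (65.0, 50.4, 45.8):
        print(score, describe_reading_ease(score))
```

A score of 65 comes back as "standard", 50.4 as "fairly difficult" and 45.8 as "difficult", which is all the context the raw number ever gets.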

The majority of readability scores developed alongside these two use a scale where a lower score is easier.

There are now hundreds of readability formulae built into text analysis tools, with some tailored to specific industries, government, education or subject areas: Gunning Fog, Coleman-Liau, SMOG, FORCAST, ARI, Lix, Rix, Raygor, CEFOR, IELTS and LENSEAR, to name just a few. Flesch-Kincaid remains in common use for general-purpose analysis.

Claims in favour of Readability Tests

The use of readability scores can make certain kinds of text much more readable: product terms and conditions, technical manuals, classroom textbooks, presentations. However, some of the other ‘recommended’ uses include website copy, advertising copy, and editing novels.

The Search Engine Optimisation industry loves Readability scoring as it makes their job so much easier. Search engine algorithms are also said to embed Readability scores to maximise the breadth of audience they can promote to. Proponents cite studies showing that a better readability score decreases bounce rates, increases time on site, encourages content sharing and, ultimately, increases sales and/or advertising revenue.

Readability scores can prompt writers to look again at the state of their text and perform valuable re-writes – if it is the right kind of text, in an appropriate field of interest.

The Case for the Prosecution

The danger in relying on Readability scores is in ‘writing down’ to the lowest common denominator of the ‘average’ reader. The more general-purpose the tool, the more mechanistic the analysis, and the more likely the recommendation to dumb down any text put through it. Wave goodbye to style, creativity, rhetoric or expansive description, including metaphor and simile.

Seriously, who puts their novel through Flesch-Kincaid? Dan Brown, apparently.

A famous example is Charles Dickens’s A Tale of Two Cities. Apparently Dickens “needs improvement on readability,” and his score of 50.4 in the Flesch test translates to “fairly difficult to read.” As well as using ‘too many words’, he apparently writes more than half of his sentences in the passive voice. Flesch-Kincaid tools promote active voice right after short sentences and short words.

Let’s not go anywhere near Ulysses or True History of the Kelly Gang.

Some educational establishments compile curriculum lists of books for school year groups. Is this valuable or is it a short-cut to unambitious, lazy teaching, along an easy path of least resistance? Is this just over-privileged, intellectual whining? Do Readability Scores work?

This post scores 45.8 in the test, which is considered difficult to read. The prosecution rests.