Talk:Coleman–Liau index
This article is rated Start-class on Wikipedia's content assessment scale.
Formula
During the history of this Wikipedia page, the Coleman-Liau formula has appeared in several slightly different forms:
- (5.89 * characters/word) - (0.3 * sentences)/(100 * words) -15.8
- (5.89 * characters/word) - (3 * sentences)/(1000 * words) -15.8
- (5.89 * characters/word) − (0.3 * sentences (per 100 words)) − 15.8
It also appears on many other websites, usually in a similar form. The first two versions are completely equivalent, given that a fraction is unchanged in value when the numerator and denominator are both multiplied by 10. The third is ambiguous. A normal interpretation would be to replace the word 'per' with a division sign '/', and put a multiplication sign between '100' and 'words', getting a sub-formula 'sentences / (100 * words)' — which matches the first two formulas. On the other hand, if you read it aloud as 'sentences per hundred words', it doesn't mean the same thing at all. If you take a writing sample of 100 words containing seven sentences, the first interpretation works out to 7/(100*100), or 0.0007. The second interpretation works out to 7: that is, there are seven sentences per hundred words of the writing sample.
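To make the gap between the two readings concrete, here is a minimal sketch using the 100-word, seven-sentence sample from the paragraph above. The function names are invented for this illustration and do not come from any source.

```python
def sentences_term_literal(sentences, words):
    # Reading 1: 'per' is replaced by '/', giving sentences / (100 * words).
    return sentences / (100 * words)


def sentences_per_hundred_words(sentences, words):
    # Reading 2: the phrase read aloud, i.e. sentences per hundred words.
    return 100 * sentences / words


# Sample from the discussion: 100 words containing seven sentences.
literal = sentences_term_literal(7, 100)
per_hundred = sentences_per_hundred_words(7, 100)
print(literal, per_hundred)
```

The two values differ by a factor of 10,000 on this sample, so the choice of reading is not a matter of rounding.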
Another reason to reject the first two versions of the formula is that the second term is always negligible compared to the first. The number of characters per word has to be at least 1, even if the sample is all 'a' and 'I'. The number of sentences per word can't be more than 1, even if the sample is. all. one. word. sentences. So the first term of the formula, 5.89 * characters/word, must be at least 5.89, and the second term, (0.3 * sentences)/(100 * words) = (0.3/100) * (sentences/words), can't be any greater than 0.003. That means the second term is at most about 0.051% of the first term, and is lost in the three significant digits.
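The bound can be checked numerically. This is only a sketch of the extreme case described above (one-letter words, one-word sentences); the helper names are made up for the purpose.

```python
def first_term(chars_per_word):
    # First term of the formula as quoted in this thread.
    return 5.89 * chars_per_word


def second_term_old(sentences_per_word):
    # (0.3 * sentences) / (100 * words) == (0.3 / 100) * (sentences / words)
    return (0.3 / 100) * sentences_per_word


# Extreme case: characters/word at its minimum of 1,
# sentences/word at its maximum of 1.
smallest_first = first_term(1.0)
largest_second = second_term_old(1.0)
ratio_percent = 100 * largest_second / smallest_first
print(f"{ratio_percent:.3f}%")  # prints 0.051%
```

Any realistic sample has longer words and fewer sentences per word than this, so the ratio only gets smaller.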
If you really want to compute the number of sentences per hundred words in a writing sample, the correct formula is 100 * sentences / words. For example, if there are 18 sentences in a 200-word composition, the number of sentences per hundred words is 100 * 18 / 200 = 9. Multiplying that by the 0.3 coefficient gives 0.3 * 100 * sentences / words = 30 * sentences / words. I believe the formula should be changed to:
- (5.89 * characters/words) − (30 * sentences/words) − 15.8
I'll change it in a few days unless someone has a reference to prove otherwise. (Casual websites don't count, because they may only have reproduced the error from Wikipedia.) Gwil 19:13, 4 May 2006 (UTC)
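For anyone who wants to try the proposed form, here is a rough sketch. It uses the coefficients exactly as quoted in this thread (5.89, 0.3, 15.8), not figures checked against Coleman and Liau 1975. The function name is hypothetical, and the character count of 900 is an invented figure added to the 18-sentence, 200-word example so that the formula has all three inputs.

```python
def coleman_liau_proposed(characters, words, sentences):
    """Proposed talk-page form: 5.89 * C/W - 30 * S/W - 15.8."""
    chars_per_word = characters / words
    # Sentences per hundred words, as defined in the comment above.
    sentences_per_100_words = 100 * sentences / words
    return 5.89 * chars_per_word - 0.3 * sentences_per_100_words - 15.8


# 18 sentences in 200 words (from the example above);
# 900 characters is an assumed value for illustration only.
score = coleman_liau_proposed(characters=900, words=200, sentences=18)
print(round(score, 3))
```

Writing the second term as 0.3 times sentences per hundred words, rather than 30 * sentences/words directly, keeps the code close to the wording of the third form while giving the same value.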
- I corrected the formula. The coefficients of 5.89 and 0.3 were not accurate; it looks like they were the result of intermediate calculations based on Coleman and Liau 1975 (which doesn't actually list the exact formula used in Wikipedia) using too few significant digits. I also added an example. N.g.davies (talk) 16:48, 8 September 2010 (UTC)