Wednesday, December 4, 2013

Clarifying the Role of Text Readability Scores in Benchmark and Formative Assessment


I want to take this opportunity to address some questions that have come up regarding readability metrics for the texts appearing in Galileo K-12 Online assessments. One of the interesting aspects of readability metrics is that they attempt to quantify, mathematically, the literary and informational expression found in text. These measures provide a handy guideline for determining whether the structural complexity of a text is appropriate for typical readers at a grade level. However, numerical categorization of text has some limitations, which this blog post covers.

First, it’s important to address the purpose of testing students using text-based items.

The ATI Assessment and Instructional Design team’s goal in developing the texts and items is to provide a basis for clear measurement of students’ abilities to comprehend and analyze text, in a way that informs educators about each individual student’s progress toward standards mastery. To make sure that all students being assessed are challenged, the most effective strategy is to include texts and items of variable difficulty. This means that some items are achievable by all students, some by most students, and some only by students who are proficient in the expectations required to complete those tasks. Note that Common Core State Standards guidance recommends a broad text readability range for measuring performance on Common Core State Standards-aligned assessments.

All texts used in Galileo Online since 2002 have been analyzed using the Flesch-Kincaid Readability measure. 

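The Flesch-Kincaid formula:
Grade Level = 0.39 × (total words ÷ total sentences) + 11.8 × (total syllables ÷ total words) − 15.59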
A Flesch-Kincaid score can be calculated manually or by utilizing the Microsoft Word Spelling and Grammar tool with the Readability option enabled to get the score. As can be seen in the formula, the number of syllables/words and words/sentences in an analyzed text are the key determiners of the grade level rating of a text.

Below is an example of the paragraph above, analyzed using Flesch-Kincaid, and then revised using an understanding of the nature of the formula.

As written (Flesch-Kincaid Grade Level: 14.3, post high school):
A Flesch-Kincaid score can be calculated manually or by utilizing the Microsoft Word Spelling and Grammar tool with the Readability option enabled to get the score. As can be seen in the formula, the number of syllables/words and words/sentences in an analyzed text are the key determiners of the grade level rating of a text.

With sentences simplified (Flesch-Kincaid Grade Level: 8.1, middle school):
A Flesch-Kincaid score can be calculated manually. It can also be found utilizing the Microsoft Word Spelling and Grammar tool. This is done with the Readability option enabled to get the score. The formula calculates the number of syllables/words and words/sentences in an analyzed text. These are the key determiners of the grade level rating of a text.

With sentences simplified and polysyllabic words reduced (Flesch-Kincaid Grade Level: 4.0, elementary school):
A Flesch-Kincaid score can be found by hand. It can also be found using the MS Word Spelling and Grammar tool. This is done with the review option set to get the score. F.K. uses the number of syllables/words and words/sentences in a text. These are the key parts of the grade level rating of a text.

As can be seen from the three examples above, the readability number, while useful, is simply an analysis of the structure of the passage, not of the appropriateness of its content. The edits made the text simpler for less-accomplished readers, who are more familiar with simpler sentences and words, but they did not simplify the topic. Sometimes compound sentences and polysyllabic words are unavoidable, even at earlier grades. The presence of these complex sentences and more difficult words will affect the readability statistics of the text, but may or may not affect the students’ ability to comprehend the text. For example, longer words that the students are familiar with will not make the text more difficult to read, even though the readability formula will interpret them that way. Given these limitations, this quantitative measure is best treated as a guideline for avoiding overly complex sentence structures and longer, unfamiliar words.
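
For readers who want to see the mechanics, here is a minimal Python sketch of the Flesch-Kincaid Grade Level calculation, using the standard published constants (0.39, 11.8, and 15.59). It is an illustration only, not the tool ATI or Microsoft Word uses. The function names are hypothetical, and the syllable counter is a rough vowel-group heuristic, so its results should not be expected to match Word's scores exactly.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate (an assumption, not a dictionary lookup):
    count groups of consecutive vowels, then drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fk_grade(text: str) -> float:
    """Estimate the grade level of a passage.

    Sentences are split on ., ! and ?; hyphenated terms count as
    separate words. Both are simplifying assumptions.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)


if __name__ == "__main__":
    one_sentence = "The cat sat and the dog ran."
    two_sentences = "The cat sat. The dog ran."
    # Splitting one sentence into two lowers the words-per-sentence term.
    print(round(fk_grade(one_sentence), 2), round(fk_grade(two_sentences), 2))
```

Only two ratios enter the calculation: words per sentence and syllables per word. Nothing in it looks at the topic or at whether a word is familiar to students, which is why the score describes structure rather than content.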

One question that is often asked is what the decimal portion of a score means. A text with a score of 4.4 is not intended to be aligned to the fourth month of fourth grade; the score simply indicates that the text is structurally more complex than a text with a score of 4.2.

In addition to Flesch-Kincaid readability measures, Galileo K-12 Online incorporates The Lexile® Framework for Reading, developed by MetaMetrics®, into its benchmark texts. The model used in Lexile measures has some similarity to Flesch-Kincaid. More information on the formula used to compute Lexile measures can be found at the MetaMetrics Web site. While some grade-level equivalent information is identified, MetaMetrics has made the following statement on grade equivalence:


We all recognize that the quantitative structural scores of text are just one aspect of readability. The other aspect is qualitative: the topic and ideas presented in the text. ATI’s content specialists and each district’s teachers choose topics and themes that are meaningful to the students. We encourage all of our district partners to engage in the assessment review process, which allows the replacement of items and texts to better serve the needs of each district’s students. ATI develops items aligned to texts to serve a broad range of student abilities. We are glad to work with our partners to make sure that the texts presented in the final version of custom assessments have the support of the educators who will use them to inform instruction.

Further questions on readability and text appropriateness? Please comment below.
