Monday, October 31, 2011

Are You Searching for a Better Way to Raise Student Achievement?

As you look for a technology solution for educational management, we encourage you to experience Galileo K-12 Online. Galileo K-12 Online from Assessment Technology Incorporated is a fully integrated, research-based instructional improvement system, providing next generation comprehensive assessment and instructional tools. State standards are built in and ready for use, as are Common Core State Standards. Galileo K-12 Online management tools assist educators in establishing instructional goals reflecting the district’s curriculum, assessing goal attainment, forecasting standards mastery on statewide tests, and using assessment information to guide classroom instruction, enrichment, and reteaching interventions. ATI’s patented technology is uniquely qualified to address a district’s goals when implementing standards-based strategies to raise student achievement.

With ATI’s cutting-edge technology and flexible innovations, users can:
• Administer a full range of assessments using ATI’s next generation comprehensive assessment system, including benchmark, formative, screening and placement tests, plus interim and final course examinations, pretests and posttests, early literacy benchmarks, computerized adaptive tests, and instruments documenting instructional effectiveness.
• Import, schedule, deliver, administer (online or offline), automatically score and report on assessments created outside of Galileo K-12 Online using ASK Technology.
• Increase measurement precision using Computerized Adaptive Testing (CAT), providing high levels of efficiency in several types of assessment situations.
• Use Item Response Theory generated assessment information to guide classroom instruction, enrichment, and reteaching interventions.
• Evaluate instruction based on reliable and valid assessment information including continuously updated forecasts of student achievement on statewide tests.
• Access the Dashboard, designed to provide immediately available, actionable information relevant in the teacher’s role of enhancing student achievement.

Experience Galileo during an online overview and see how it is the better way to raise student achievement. You can visit the Assessment Technology Incorporated website, participate in an online overview by registering either through the website or by calling 1.877.442.5453 to speak with a Field Services Coordinator, or visit us at the following conferences:
• Arizona Educational Research Organization (AERO) 24th Annual Conference November 3 at the University of Phoenix, Southern Arizona Campus, Tucson, Arizona.
• Arizona Charter School Association (ACSA) 16th Annual Conference November 10 and 11 at the Westin La Paloma Resort, Tucson, Arizona.
• Illinois Association of School Boards (IASB), Illinois Association of School Administrators (IASA), and Illinois Association of School Business Officials (IASBO) Joint Annual Conference November 18 through 20 at the Hyatt Regency Chicago, Chicago, Illinois.

Thursday, October 27, 2011

English Language Arts Test Design

English tests, by their nature, require the students to do a lot of reading. When designing these tests, how much reading is reasonable? How can we best assess students’ reading comprehension abilities without creating tests that are too long and have too many texts?

The answer comes at the very beginning of the process, in assessment planning. District pacing guides are meant to ensure that all classes are being instructed and students are making progress toward mastering essential standards. In designing a benchmark assessment, districts often use their pacing guides to plan assessments, but those pacing guides, while useful in tracking progress toward state standards mastery, rarely reflect the full scope of instruction that is occurring in the classroom during a benchmark period.

Do English teachers have their students read an entire story only to teach the idea of a main character? Of course not. They teach about plot, the author’s use of language, the context and setting of the story, how it relates to the author’s life experiences, and all of the other elements of literature that compose a novel, short story, poem, or dramatic work. However, pacing guides may only emphasize one of these standards in a particular assessment period.

The pacing guide approach to assessment design, while it has the benefit of matching the district’s plans for instructing and assessing standards, has a tendency to narrow the focus of instruction so much that the assessment requires a large number of texts to measure very specific aspects of a text, and leaves students’ holistic comprehension of a text unmeasured. When the purpose of assessment is to measure student progress, this seems like an opportunity missed.

Measurement reliability is best served by long tests: the longer the test, the greater the reliability. Given our purposes and the limited class time available for assessment, we recommend 35-50 items per test.
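The link between test length and reliability is commonly described with the Spearman-Brown prophecy formula. The sketch below is only an illustration of that general formula: the starting point (a 20-item test with a reliability of 0.70) is an assumed figure, not ATI data, and the formula assumes that any added items are comparable in quality to the existing ones.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    length_factor is new length / old length (2.0 means twice as many items).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical starting point (an assumption for illustration, not ATI data).
BASE_ITEMS = 20
BASE_RELIABILITY = 0.70

for items in (20, 35, 50):
    predicted = spearman_brown(BASE_RELIABILITY, items / BASE_ITEMS)
    print(f"{items} items: predicted reliability {predicted:.2f}")
```

Under these assumptions the predicted reliability rises as items are added, but each added item contributes a little less than the one before, which is why a practical range such as 35-50 items can be a reasonable compromise with class time.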

When a pacing guide emphasizes a few core standards, it helps to clarify expectations for everyone, but when a test measures only a narrow range of standards, many more questions are required per standard. When these standards are spread across multiple genres, or focused on comparing or synthesizing texts, we start to see unintended and undesirable characteristics, namely the inclusion of too many texts on an assessment.

For example, to address reliability with a 35-50 item test, a pacing guide of five learning standards would require seven to ten items per standard. That doesn’t seem like many items to fully measure a standard, until we consider a concept like main character. How many “main” characters will a short story have? Standards like this often require a new text for each question. Eight or ten texts seem like an awful lot to read to assess whether a student understands the concept of a main character.
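The arithmetic in this example can be sketched in a few lines. The function names are hypothetical, and the one-item-per-text figure is an assumption standing in for a single-instance standard such as main character.

```python
import math


def items_per_standard(total_items, num_standards):
    """Average number of items available for each standard on the test."""
    return total_items / num_standards


def texts_needed(items_for_standard, items_per_text):
    """Texts required when each text can support only items_per_text items."""
    return math.ceil(items_for_standard / items_per_text)


NUM_STANDARDS = 5  # a pacing guide with five learning standards, as in the example

for total_items in (35, 50):
    per_standard = items_per_standard(total_items, NUM_STANDARDS)
    # Assumption: a single-instance standard supports one item per text.
    texts = texts_needed(per_standard, items_per_text=1)
    print(f"{total_items} items: {per_standard:.0f} items per standard, "
          f"{texts} texts for a single-instance standard")
```

If each text could instead support several items on the same standard, passing a larger `items_per_text` shows how quickly the number of required texts falls.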

How about comparing and contrasting two texts? When the student is asked to compare and contrast across genres, or two different authors’ explorations of a similar theme, it’s an opportunity to see students demonstrate analytical skills and synthesis of information, higher-order thinking skills we want them to develop through reading. One or two questions that compare two texts, and up to 5 or 10 more questions that require analysis of each text in depth can better measure these higher-order or holistic skills than forcing students to read 10 different texts to answer 5 questions focused only on comparison.

Some pacing guides strive for balance, incorporating elements of fiction and nonfiction in each benchmark period. This is beneficial for students and teachers, allowing them to explore different forms of reading and writing, often in relation to each other. Complications in assessment occur when too few standards or standards without any overlap of genres are implemented on the same assessment. Measuring a single standard five to eight times on the morals of folklore and mythology with the rest of the test addressing nonfiction standards will result in a number of folklore texts with very few items per text and no possibility of overlap with the nonfiction standards.

So how do we address these concerns? Assessment Technology Incorporated has worked with a number of districts to develop a text packaging system that allows districts to still emphasize the essential standards that they want reflected in each benchmark, but to reduce the number of texts that appear on the tests. This is accomplished by including other learning standards that the teachers are instructing that may not appear on that benchmark period’s pacing guide but are an important component in measuring students’ overall reading comprehension. The package also balances the number of items per standard based on the occurrence of the skill in everyday reading. For example, the main character standard would get one item, compare and contrast maybe a couple, and elements of literature might get three or four, a reasonable distribution of the types of information students would see in a normal short work of fiction.

Another approach is to focus on one or two specific genres in an assessment rather than trying to address poetry, short story, persuasive text, informational text, and dramatic works in one assessment.
By testing by genre, or by limiting repetition of POs that involve compare-and-contrast, cross-genre, cross-cultural, and single-instance items, we can reduce the number of texts.

The table below shows a test created without the text packaging approach for 2010-11, and the same test adjusted by the methods outlined above. Note that with text packaging the number of words that students had to read was cut nearly in half while the number of items on the two tests remained nearly the same. Fully utilizing texts, as the text packaging approach does, has many benefits: fewer texts for students to read on the assessment, a more thorough demonstration of understanding of each text, less repetitive questioning, and, because each test uses fewer texts, more texts left to choose from in later assessments.

*GO is a graphic organizer. It does not have a word count.
**Note this is the total number of items on the test, not the sum of the column as some items have more than one text attached.

If you are interested in exploring the text packaging options available in Galileo K-12 Online, please contact your Field Services Coordinator or Karyn White in Educational Management Services to learn more.

Monday, October 17, 2011

Question and Answer

In conversations with many clients and prospective clients over the years, I have found that questions about implementing assessment and intervention are often specific to a school or district. However, some questions are similar from one district to another. Here I address some of the questions that come up regularly.

What are the benefits of online vs. offline assessment?
While there are pros and cons to both methods of administration, there are strong reasons to lean towards online administration. Online testing saves on the cost of paper (environmentally friendly) and gives immediate access to test results. These days, students are more technologically savvy and are comfortable with online testing. I’ve observed groups of very young students navigating through online testing with confidence and ease. Some could make a valid argument that they want the testing to mimic the statewide assessment (e.g., AIMS, MCAS, CSAP, CST). Our own research seems to show that whether the students take the test online or offline, it does not seem to affect our ability to predict how the students will do on the statewide assessment.

On the other hand, offline assessments can be administered to large groups of students at the same time and reading texts/items are presented in a traditional format. Access to computers may be a limiting factor that would lead to a need for offline testing. English language learners or other specific groups of students may benefit from working from the test booklet. Offline testing involves the extra steps of printing test booklets and bubble sheets and of scanning answer sheets once test administration is completed. The plain-paper scanning available for the past several years is an improvement in scanning technology which makes the scanning task much quicker.

What grade levels should be included in our district’s assessment planning?
Recent policies have led to an emphasis on testing in grades three through 10. However, in every state, teachers at all grade levels are responsible for assessing progress toward the state standards. Teachers and students at all grade levels can benefit from a comprehensive assessment system that is aligned to state standards, provides information about mastery of standards to inform a variety of decision-making questions (e.g., questions related to instruction/intervention, screening/placement, growth) and, with regard to instruction/intervention, recommends specific actions to improve student performance.

What subject areas should we be testing?
Math, reading, science, and writing are frequently included in assessment plans because they encompass the core subject areas and most statewide assessments cover them. However, teachers in all subject areas should be encouraged to incorporate a comprehensive assessment system into their approach to instruction.

Should we build District Curriculum Aligned Assessments (DCAA) or use the Comprehensive Benchmark Assessment Series (CBAS)?
The DCAA is the optimal choice for districts that have common pacing guides (or curriculum maps) which are incorporated across the district. The DCAA are customized assessments intended to be aligned to instruction. These tests measure student accomplishment and pinpoint areas for which reteaching could be of most value.

The CBAS is designed as a comprehensive assessment to give multiple snapshots throughout the year of progress toward standards mastery. These are built by ATI using the blueprints from the statewide assessments.

How long does it take to test a student?
This depends on many factors including the length of the test, the objectives being assessed, and the number and length of the reading texts. A 40-45 item assessment will likely take a typical class period to administer. Some students will take less time and some will take more. In general, Galileo assessments are not designed to be timed. The goal is to determine what the student knows. Although it’s ultimately a district decision, enough time should be allowed for students to complete all testing.

What are the best reports for teachers and administrators?
ATI has synthesized some of the most frequently used reports into a Dashboard where teachers and administrators can easily access actionable information. For example, the Dashboard contains one-click access to the:
1. Test Monitoring reports which are in a graphical format and indicate how individual students (or the class) performed on specific assessment standards;
2. Detailed Analysis Report which links test items to state standards as well as reporting on response patterns for specific items; and
3. Intervention Alert Report which helps the teacher focus interventions by state standards and to place students into intervention groups.

I hope you found this Q&A helpful. If you have additional questions, please contact ATI at 877.442.5453.

-Baron Reyna, Field Services Coordinator

Monday, October 10, 2011

Using the Intervention Alert Report

The Intervention Alert Report lists all of the learning standards on a given assessment and displays the percentage of students who have demonstrated mastery of each standard. Any listed standard that fewer than 75 percent of students have mastered is highlighted in red. This allows users to easily identify the standards on which interventions should focus. This is an actionable report that allows the user to schedule Assignments and Quizzes, or drill down through the data to view individual Student Results.
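The highlighting rule can be illustrated with a short sketch. This is not Galileo's implementation: the function names, the standard names, and the class data are hypothetical, and the sketch assumes each student's mastery of a standard is already available as a simple yes/no value.

```python
from typing import Dict, List

MASTERY_THRESHOLD = 0.75  # standards below 75 percent mastery are highlighted


def percent_mastered(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Fraction of students who mastered each standard."""
    return {standard: sum(flags) / len(flags) for standard, flags in results.items()}


def standards_needing_intervention(results: Dict[str, List[bool]],
                                   threshold: float = MASTERY_THRESHOLD) -> List[str]:
    """Standards whose mastery rate falls below the threshold."""
    return [standard
            for standard, rate in percent_mastered(results).items()
            if rate < threshold]


# Hypothetical class of four students; True means the student mastered the standard.
class_results = {
    "Main idea": [True, True, True, False],               # 75 percent: not highlighted
    "Compare and contrast": [True, False, False, True],   # 50 percent: highlighted
    "Author's purpose": [True, True, True, True],         # 100 percent: not highlighted
}

print(standards_needing_intervention(class_results))
```

The same calculation could be run over the students in a class, a school, or a district, which mirrors the levels of aggregation at which the report can be viewed.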

The Intervention Alert Report provides a rich source of data for guiding evaluation. Each Intervention Alert Report provides information about what is happening with regard to students’ learning. The information on the report provides detailed data about students’ strengths as well as areas where additional intervention or planning will be beneficial. The Intervention Alert provides educators with an efficient means of tracking student mastery of different standards at many different levels of aggregation. The report can be run at the District-, School-, or Class-level. Educators may use the information to evaluate mastery of standards and to make plans accordingly.

1. Review your data… What does your data tell you about your classes and the students?
Which standards did students learn?
Which standards require additional focus to promote mastery?
What were the expectations for student learning?
Did the learning you expected occur?

2. Ask questions of your data… How can the data be used to provide learning opportunities for students that reflect class-level goals?
What mastery levels do most of your students fall into? For which standards?
Who are the students in each grouping? What Intervention Groups need to be
What variables that you know of could be impacting student achievement?

3. Use your data to begin the goal setting, planning and intervention process.
What expectations do you have for learning in the months ahead? For example, what type of movement do you hope to see from one mastery level to the next?
Which standards will you target?
What plans will be put in place to achieve these goals?
What can school personnel do to help you support individual students and their learning?

Monday, October 3, 2011

Measuring Student Growth with ATI’s Instructional Effectiveness Pretests and Posttests

A primary function of ATI’s Instructional Effectiveness (IE) assessments is to evaluate the amount of growth that students demonstrate between the pretest and the posttest. The IE posttests consist entirely of current grade-level content, while the IE pretests contain both prior grade-level and current grade-level content. Many Galileo users may wonder how the assessments can measure growth when part of the pretest is aligned to the prior grade level.

The IE pretests and posttests can be used to measure growth because ATI uses state-of-the-art Item Response Theory (IRT) techniques to put the student development level (DL) scores for both the pretest and the posttest on a common scale so that the scores will be directly comparable. ATI DL scores are scale scores. The calculation of scale scores takes into account the relative difficulty of the items on the assessment. For example, imagine two students take two different tests. Both get 75% correct. If one test was much easier than the other assessment, then when the students’ ability levels (DL scores) are estimated, the one who got 75% correct on the easy test should have a lower ability estimate than the student who got 75% correct on the difficult assessment. Even though both have the same raw score, the student who took the easy test will have a lower DL score than the student who took the difficult test.

When DL scores are calculated for the pretests, the prior grade-level items are treated as if they were easy items. So even though the students should do very well on those items, their DL scores will be adjusted to a slightly lower level because the IRT analysis will see that the items were very easy. The DL scores will be on a scale that is directly comparable to the posttest, and there will be room for growth in DL scores between the pretest and the posttest. In other words, if a student gets exactly the same raw score (percent correct) on the pretest and the posttest, his or her DL score on the posttest will be higher than that on the pretest because the items on the posttest will be seen as being more difficult than the items on the pretest.
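The idea that the same raw score can yield different ability estimates can be shown with a minimal sketch. It uses the Rasch (one-parameter) IRT model as a stand-in; the model ATI actually uses and the DL score scale are not described here, so the item difficulties, test lengths, and function names below are assumptions for illustration, and the results are in logits rather than DL units.

```python
import math


def p_correct(theta, difficulty):
    """Rasch model: probability that a student of ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def estimate_ability(difficulties, num_correct, lo=-6.0, hi=6.0, tol=1e-6):
    """Maximum-likelihood ability estimate under the Rasch model.

    Finds, by bisection, the theta at which the expected number correct
    equals the observed number correct. Requires a score that is neither
    zero nor perfect, because those have no finite estimate.
    """
    if not 0 < num_correct < len(difficulties):
        raise ValueError("num_correct must be strictly between 0 and the number of items")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        expected = sum(p_correct(mid, b) for b in difficulties)
        if expected < num_correct:
            lo = mid  # ability must be higher to reach the observed score
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical tests of 20 items each; difficulties are assumed values in logits.
easy_test = [-1.0] * 20   # e.g., a test weighted toward easier items
hard_test = [1.0] * 20    # e.g., a test made up of harder items

# Both students answer 15 of 20 items correctly (75 percent).
easy_ability = estimate_ability(easy_test, 15)
hard_ability = estimate_ability(hard_test, 15)

print(f"75% on the easy test: ability estimate {easy_ability:.2f}")
print(f"75% on the hard test: ability estimate {hard_ability:.2f}")
```

Because both estimates sit on the same scale, the student with 75 percent correct on the harder test receives the higher estimate. That is the sense in which an identical percent correct on a pretest with easier items and a posttest with harder items can still register as growth.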

There are a number of ways to learn first-hand about the benefits both of Galileo K-12 Online and of measuring student growth with instructional effectiveness pretests and posttests. You can visit the Assessment Technology Incorporated website, participate in an online overview by registering either through the website or by calling 1.877.442.5453 to speak with a Field Services Coordinator, or visit us at:

- Arizona School Administrators, Inc. (ASA) Fall Superintendency/Higher Education Conference, October 23 through 25 at the Prescott Resort and Conference Center, Prescott, Arizona
- Massachusetts Computer Using Educators (MassCUE) and the Massachusetts Association of School Superintendents (M.A.S.S.) Annual Technology Conference October 26 and 27 at Gillette Stadium, Foxborough, Massachusetts
- Arizona Charter School Association (ACSA) 16th Annual Conference November 10 and 11 at the Westin La Paloma Resort, Tucson, Arizona
- Illinois Association of School Boards (IASB), Illinois Association of School Administrators (IASA), and Illinois Association of School Business Officials (IASBO) Joint Annual Conference November 18 through 20 at the Hyatt Regency Chicago, Chicago, Illinois