M E M O R A N D U M
February 8, 2005
To: Dr. David Bell, Head
Department of Curriculum and Instruction
From: Sid Womack, Tim Carter, Sammie Stephenson, Connie Zimmer
(Master of Instructional Improvement Advisory Committee)
Subject: Reliability of MII measures
The MII Advisory Committee met today to examine the reliability of the rubrics and other measures used to assess candidates' work. The courses in the Master of Instructional Improvement program in which students are asked to showcase their work are, in particular, ELED6803, ELED6823, and EDFD6993. With regard to the measures used for those candidate products, we found:
Rubrics for the four instructional styles as used in ELED6803:
1. That the rubric used for assessing student work for expository lessons in ELED6803 was reliable at 0.71 split-half (uncorrected, with some missing data; actual r would be still higher) with the data from 18 graduates, significant beyond the .01 level (critical r=0.590).
2. That the rubric used for assessing student work for demonstration lessons in the same course, in which students are expected to showcase their capabilities in teaching students with exceptionalities, was reliable at 0.718 split-half (uncorrected, with some missing data; actual r would be still higher) with the data from 18 graduates, significant beyond the .01 level (critical r=0.590).
3. That the rubric used for assessing student work for discovery lessons in the same course, in which students are expected to showcase their capabilities in teaching students with exceptionalities, was reliable at 0.735 split-half (uncorrected, with some missing data; actual r would be still higher) with the data from 18 graduates, significant beyond the .01 level (critical r=0.590).
4. That the rubric used for assessing student work for individualized lessons in the same course, in which students are expected to showcase their capabilities in teaching students with exceptionalities, was reliable at 0.863 split-half (uncorrected, with some missing data; actual r would be still higher) with the data from 18 graduates, significant beyond the .01 level (critical r=0.590).
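The split-half figures in the list above can be illustrated with a minimal sketch. It assumes an odd-even split of rubric items and assumes that the Spearman-Brown formula is the correction that "uncorrected" refers to; the memo does not name the correction. The function names and the scores are hypothetical and are not the committee's data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(scores):
    """Uncorrected odd-even split-half correlation.

    scores: one list of rubric item scores per candidate.
    """
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

def spearman_brown(r_half):
    """Spearman-Brown estimate of full-length reliability (assumed correction)."""
    return 2 * r_half / (1 + r_half)

# Hypothetical rubric scores: five candidates, four items each.
example_scores = [
    [4, 4, 3, 4],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [1, 2, 1, 1],
    [4, 3, 4, 4],
]
r_half = split_half_r(example_scores)
print(round(r_half, 3), round(spearman_brown(r_half), 3))
```

Under that assumption, spearman_brown(0.71) comes to about 0.83, which agrees with the note that the corrected r would be higher than the uncorrected value.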
Validity notes for the above measures: The rubrics were developed with cooperation from students when the ELED6803 classes were first taught in 1998. They have been subject to criticism and change throughout the program and the course. For the Flanders analysis, which has been used in some sessions of the course and inferred from the teaching/learning units composed in the others, the evaluator is a Flanders-trained observer with a Scott reliability of over 0.85 (Texas A&M University). The rubrics have been carefully screened to assess expository teaching, demonstration teaching, discovery learning, or individualized instruction, and have been matched to Local Diversity and Pathwise Standards.
Some experimentation has been done in various sections of ELED6803. In some semesters, students have been directed to teach four demonstration lessons (expository, demonstration, discovery, and individualized) with adaptations made for learners with mental retardation. In other semesters, they have been asked to write detailed curriculum units, and measurements for the four instructional styles have been inferred from the reading of those units. As the findings below will indicate, more reliable measurements have resulted from in-class assessments of MII students' demonstration lessons than from their units.
ELED6823 IEP rubric: In ELED6823 (Learning Disabilities), eight graduates thus far have completed the course and have done the "Sammy" exercise, in which they take data from an anonymous psychological report and write appropriate goals, objectives, learning activities, adaptations, and/or accommodations based upon "Sammy's" data. These tests are always done in class. The reliability thus far has been 1.0. While this is encouraging, we are as cautious with this finding as we would be with a low r, considering that the number of data points is small.
In the early 2000s, other courses were sometimes substituted for this one, so data were not always available. In the past two years there has been less substitution and MII students have been more likely to take the course, so it is becoming more likely that we will have data about their ELED6823 IEP exercises.
Validity: ELED6823 students are asked to translate psychological and achievement data into a working remediation plan, much as they would in an IEP session with other professionals and parents present.
ELED6803 unit rubric: Only six units were available for data analysis. The uncorrected split-half reliability for those units was 0.31, due mostly to the small number of units and the small amount of arithmetic variance generated between the odd and even components of the rubric. If reliabilities continue in this same vein, our recommendation will be to discontinue having students write the full unit in ELED6803 and to have them teach the demonstration lessons instead. Not only do the reliabilities differ, but the data from the four teaching rubrics seem much richer and more extensive than the eight data points generated from the evaluation of the units.
EDFD6993 Action Research Rubric: The split-half reliability of MII graduates' action research data was 0.826, but with data from only six students. This correlation is significant at the .05 level, referencing a critical r of 0.811. We recommend continuing to use the rubric for this and other graduate programs that have action research projects until more data can be collected and further determinations made.
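The critical r values cited in this memo can be checked against standard t-table values. The sketch below assumes two-tailed tests with n - 2 degrees of freedom, which the memo does not state explicitly; the function names are hypothetical, and the critical t values are taken from a standard t table.

```python
from math import sqrt

def critical_r(t_crit, n):
    """Critical Pearson r for n pairs, given the critical t at df = n - 2."""
    df = n - 2
    return t_crit / sqrt(t_crit ** 2 + df)

def t_from_r(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Two-tailed critical t values from a standard t table (assumed test type).
T_CRIT_05_DF4 = 2.776    # alpha = .05, df = 4  (n = 6)
T_CRIT_01_DF16 = 2.921   # alpha = .01, df = 16 (n = 18)

print(round(critical_r(T_CRIT_05_DF4, 6), 3))    # 0.811
print(round(critical_r(T_CRIT_01_DF16, 18), 3))  # 0.59
print(t_from_r(0.826, 6) > T_CRIT_05_DF4)        # True
```

Under these assumptions the computed thresholds match the 0.811 (six students) and 0.590 (18 graduates) figures cited in the memo.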
Validity of the Action Research Rubric: Items in the action research rubric are referenced to Pathwise, ISTE, and Local Diversity Standards which were developed in 2000 by the